BRANEN and BRANES Corpora

Abstract

This paper presents two learner corpora built to investigate anaphora: the Brazilian Learners of Anaphora in English (BRANEN) and the Aprendices Brasileños de Anáfora en Español (BRANES). The texts were written by undergraduate language students during an online course on anaphora offered at a Brazilian university in 2020. The corpora provide insights into how native speakers of Brazilian Portuguese with intermediate to advanced proficiency in the foreign language learn anaphora in English and Spanish. The informants are 30 learners of English and 15 learners of Spanish, who were randomly divided into three subgroups: one group had two synchronous lessons on anaphora, another had two asynchronous lessons, and a control group took no lessons. Each participant wrote 100-150 words as the conclusion of a short story. The exercise was performed at four moments: before the course started, after the first lesson, after the second lesson, and a month after the course ended. The texts are available on Sketch Engine, a corpus manager and text analysis software, and include information about the participants’ group and testing moment. The BRANEN corpus was automatically part-of-speech tagged with the Modified English TreeTagger and has 120 documents, 1,069 sentences, and 1,678 lemmas. For the BRANES corpus, the Spanish FreeLing tagset was used; it consists of 60 documents, 543 sentences, and 1,299 lemmas. The Concordance tool was used to retrieve sentences with pronominal and zero anaphora, which were then manually and independently annotated by two anaphora experts.
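
As a quick consistency check on the figures in the abstract, the document counts follow from one text per participant per testing moment. The Python sketch below is illustrative only: the function name and the short moment labels are assumptions introduced here, not part of the corpora or of Sketch Engine, and it assumes every participant wrote a text at each of the four moments.

```python
# Testing moments named in the abstract; the short labels are hypothetical.
MOMENTS = ["pre-course", "after lesson 1", "after lesson 2", "one month later"]

# Participant counts as reported in the abstract.
PARTICIPANTS = {"BRANEN": 30, "BRANES": 15}


def expected_documents(participants: int, moments: int = len(MOMENTS)) -> int:
    """Return the document count for one text per participant per moment."""
    return participants * moments


for corpus, n in PARTICIPANTS.items():
    # Prints the corpus name followed by its expected number of documents.
    print(corpus, expected_documents(n))
```

Under that assumption, the sketch yields 120 documents for BRANEN and 60 for BRANES, which matches the totals reported for the two corpora.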



Author Information
Amanda Maraschin Bruscato, University of Algarve, Portugal
Jorge Baptista, University of Algarve, Portugal

Paper Information
Conference: ECLL2021
Stream: Applied linguistics research

This paper is part of the ECLL2021 Conference Proceedings.


To cite this article:
Bruscato, A., & Baptista, J. (2021). BRANEN and BRANES Corpora. The European Conference on Language Learning 2021: Official Conference Proceedings. ISSN: 2188-112X. https://doi.org/10.22492/issn.2188-112X.2021.3
To link to this article: https://doi.org/10.22492/issn.2188-112X.2021.3



Posted by James Alexander Gordon