Compiling Learner Corpora on Four Types of Common Errors Made by Chinese EFL Students


A number of EFL studies have been conducted in China to analyze the common types of lexico-grammatical mistakes made by college students in English writing. However, very little research has explored how instructors can make practical use of a learner corpus to help students improve their writing proficiency.

The proposed research encompasses the fields of corpus linguistics, second language acquisition, and foreign language teaching. It aims to compile four mini error-tagged corpora with specific reference to tense and aspect, articles, topicalization, and sentence structure. These have been found to be the major types of errors made by Chinese learners of English. The sample writings collected for investigation were produced by two classes of freshmen at Wenzhou Kean University (WKU) in Fall 2019 and Spring 2020. WKU is a Sino-American university in China which uses English as a medium of instruction. The learner corpus contains two drafts of four different types of academic essays, each 500 to 1,000 words long, assigned to students as the core assessment.

The research focuses on designing a multi-level error annotation system for identifying, coding, and annotating the four types of errors and their corrections. The data will be related to second language acquisition theory to describe the features of interlanguage and the factors other than L1 that contribute to shaping it. Moreover, the findings will be applied in an immediate pedagogical context by designing EFL classroom tasks in which learners themselves produce and use the corpus data.

Author Information
Elaine Y. L. Ng, Wenzhou Kean University, China

Paper Information
Conference: ACL2020
Stream: Language Learning and Teaching

The full paper is not available for this title
