Towards advanced collocation error correction in Spanish learner corpora

Gabriela Ferraro, Rogelio Antonio Nazar, Margarita Alonso Ramos, Leo Wanner

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Collocations in the sense of idiosyncratic binary lexical co-occurrences are one of the biggest challenges for any language learner. Even advanced learners make collocation mistakes in that they literally translate collocation elements from their native tongue, create new words as collocation elements, choose a wrong subcategorization for one of the elements, etc. Therefore, automatic collocation error detection and correction is increasingly in demand. However, while state-of-the-art models predict, with reasonable accuracy, whether a given co-occurrence is a valid collocation or not, only a few of them manage to suggest appropriate corrections with an acceptable hit rate. Most often, a ranked list of correction options is offered from which the learner then has to choose. This is clearly unsatisfactory. Our proposal focuses on this critical part of the problem in the context of the acquisition of Spanish as a second language. For collocation error detection, we use a frequency-based technique. To improve on collocation error correction, we discuss three different metrics with respect to their capability to select the most appropriate correction of miscollocations found in our learner corpus.

Original language: English
Pages (from-to): 45-64
Number of pages: 20
Journal: Language Resources and Evaluation
Volume: 48
Issue number: 1
State: Published - Mar 2014
Externally published: Yes

Keywords

  • CALL
  • Collocation
  • Collocation error
  • Collocation error correction
  • Collocation error detection
  • Miscollocation
