Abstract
We present a method for categorizing discourse markers. Building on previous research, in which we induced a taxonomy of discourse markers from a parallel corpus, we now propose a method for classifying new discourse markers into one or more of the categories discovered in that work. The method is based on the statistical similarity between a new marker and the emergent categories. We highlight the quantitative nature of the approach because it allows the experiments to be replicated in other languages. Furthermore, ours is a multi-label classification method, which is important because it represents a first approach to studying the polyfunctionality of discourse markers from an empirical and inductive point of view.
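
The idea of similarity-based, multi-label assignment can be illustrated with a minimal sketch. The abstract does not state which similarity measure, features, or decision rule the method uses, so everything below is an assumption made for illustration: cosine similarity, feature vectors built from hypothetical translation equivalents, the category names `contrast` and `cause`, the function names, and the 0.4 threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify_marker(marker_vec, category_centroids, threshold=0.4):
    """Return every category whose similarity reaches the threshold.

    Returning a list rather than a single best match is what makes the
    assignment multi-label. The threshold value is an assumption.
    """
    scores = {
        name: cosine(marker_vec, centroid)
        for name, centroid in category_centroids.items()
    }
    labels = sorted(name for name, s in scores.items() if s >= threshold)
    return labels, scores


# Hypothetical category profiles (not taken from the paper).
centroids = {
    "contrast": {"however": 0.8, "but": 0.6},
    "cause": {"because": 0.9, "since": 0.4},
}

# Hypothetical profile of a new marker that overlaps with both categories.
marker = {"however": 0.6, "since": 0.6, "because": 0.5}

labels, scores = classify_marker(marker, centroids)
print(labels)
```

With a lower threshold, a marker that overlaps with several categories receives several labels, which is one simple way to model polyfunctionality; raising the threshold makes the assignment more conservative.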
Translated title of the contribution | Automatic categorization of discourse markers |
---|---|
Original language | Spanish |
Pages (from-to) | 109-116 |
Number of pages | 8 |
Journal | Procesamiento de Lenguaje Natural |
Volume | 61 |
State | Published - Sep 2018 |
Externally published | Yes |