POL: Un nuevo sistema para la detección y clasificación de nombres propios

Rogelio Nazar, Patricio Arriagada

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

The purpose of this research is to develop a methodology for the detection and categorisation of named entities, or proper names (PNs), into the categories of geographical place, person and organisation. The hypothesis is that the context in which the entity occurs (a window of n words before the target), together with the components of the PN itself, may provide good estimators of the type of PN. To that end, we developed a supervised categorisation algorithm with a training phase in which the system receives a corpus already annotated by another NERC system. In these experiments, that system was FreeLing, an open-source suite of language analysers, which annotated the corpus of the Spanish Wikipedia. During this training phase, the system learns to associate the entity category with the words of the context as well as with those of the PN itself. We evaluate the results on the CoNLL-2002 corpus and on a corpus of geopolitics from the Spanish edition of the journal Le Monde Diplomatique, and compare them with those of some well-known NERC systems for Spanish.
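The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' POL implementation: the class name `CooccurrenceClassifier`, the helper `extract_features`, the relative-frequency voting scheme, the category labels and the toy training sentences are all assumptions made for illustration. The sketch only shows how a category could be associated with the n words before a PN and with the PN's own components.

```python
from collections import Counter, defaultdict


def extract_features(tokens, start, end, n=3):
    """Return the features of the PN spanning tokens[start:end].

    Features are the (up to) n lowercased words before the PN, plus
    the lowercased component words of the PN itself.
    """
    context = [f"ctx={w.lower()}" for w in tokens[max(0, start - n):start]]
    components = [f"pn={w.lower()}" for w in tokens[start:end]]
    return context + components


class CooccurrenceClassifier:
    """Hypothetical classifier: each known feature votes for the
    categories it was seen with, weighted by relative frequency."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # feature -> category counts

    def train(self, examples):
        # Each example is (tokens, start, end, category), as it might be
        # produced by an external annotator.
        for tokens, start, end, category in examples:
            for feat in extract_features(tokens, start, end, self.n):
                self.counts[feat][category] += 1

    def classify(self, tokens, start, end):
        scores = Counter()
        for feat in extract_features(tokens, start, end, self.n):
            seen = self.counts.get(feat)
            if not seen:
                continue  # feature never observed during training
            total = sum(seen.values())
            for category, count in seen.items():
                scores[category] += count / total
        if not scores:
            return None  # no evidence for any category
        return max(scores, key=scores.get)


# Toy, made-up training data standing in for an automatically annotated corpus.
training = [
    ("el presidente Barack Obama habla".split(), 2, 4, "PER"),
    ("el presidente Emmanuel Macron viaja".split(), 2, 4, "PER"),
    ("la ciudad de Santiago crece".split(), 3, 4, "LOC"),
    ("la empresa Codelco anuncia".split(), 2, 3, "ORG"),
]

clf = CooccurrenceClassifier(n=3)
clf.train(training)

person = clf.classify("dijo el presidente Pedro Castillo ayer".split(), 3, 5)
place = clf.classify("en la ciudad de Valdivia".split(), 4, 5)
unknown = clf.classify("Greenpeace protesta".split(), 0, 1)
```

In this toy setup, unseen names such as "Pedro Castillo" are categorised purely from the preceding context words, which is the kind of evidence the hypothesis relies on. A realistic system would need far more training data and some form of smoothing or feature weighting.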

Translated title of the contribution: POL: A new system for named-entity detection and categorisation
Original language: Spanish
Pages (from-to): 13-20
Number of pages: 8
Journal: Procesamiento de Lenguaje Natural
Volume: 58
Publication status: Published - Mar 2017
Published externally

Keywords

  • Named entities
  • Proper names
  • Text linguistics
