Co-occurrence graphs applied to taxonomy extraction in scientific and technical corpora

Rogelio Nazar, Jorge Vivaldi, Leo Wanner

Research output: Contribution to journal, Article, peer-reviewed

5 Citations (Scopus)


Word co-occurrence graphs have been used in computational linguistics mainly for word sense disambiguation and induction, but until very recently, not for the extraction of hypernymy relations, where the methodology most often applied is the use of lexico-syntactic patterns. In this paper, we show that it is possible to use word co-occurrence statistics to extract IS-A relations between entities in scientific and technical corpora. We exploit the fact that word co-occurrence often has a direction, that is, a term might co-occur with another, but this is very often not true the other way round. This means that one can represent co-occurrence as a directed graph and this graph resembles a taxonomy. In this paper we present an experiment with texts randomly extracted from the Spanish Wikipedia, but our findings suggest that this co-occurrence behavior is a macroscopic and intrinsic property of argumentative discourse in general.
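The directional co-occurrence idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function name `directed_cooccurrence_edges`, the `min_ratio` threshold, the use of conditional co-occurrence ratios, and the toy contexts are all assumptions made for illustration. An edge from term a to term b is kept when b appears in most contexts that contain a, while a appears in a smaller share of the contexts that contain b.

```python
from collections import Counter
from itertools import combinations


def directed_cooccurrence_edges(contexts, min_ratio=0.6):
    """Return a set of directed edges (term, frequent_companion).

    contexts  : iterable of term collections (e.g. one per sentence or document).
    min_ratio : hypothetical threshold on the share of a term's contexts
                that must also contain the companion term.
    """
    term_freq = Counter()   # number of contexts containing each term
    pair_freq = Counter()   # number of contexts containing both terms of a pair
    for context in contexts:
        terms = sorted(set(context))          # deduplicate, fix pair order
        term_freq.update(terms)
        pair_freq.update(combinations(terms, 2))

    edges = set()
    for (a, b), joint in pair_freq.items():
        a_to_b = joint / term_freq[a]         # share of a's contexts that contain b
        b_to_a = joint / term_freq[b]         # share of b's contexts that contain a
        # Keep only the asymmetric direction, if it is strong enough.
        if a_to_b >= min_ratio and a_to_b > b_to_a:
            edges.add((a, b))
        elif b_to_a >= min_ratio and b_to_a > a_to_b:
            edges.add((b, a))
    return edges


# Toy contexts (invented for illustration, not taken from the paper's corpus).
contexts = [
    ["aspirin", "drug"],
    ["aspirin", "drug"],
    ["ibuprofen", "drug"],
    ["ibuprofen", "drug"],
    ["drug"],
]
edges = directed_cooccurrence_edges(contexts)
```

In this toy input, "aspirin" and "ibuprofen" always occur with "drug", whereas "drug" occurs with each of them in only two of its five contexts, so both edges point towards "drug", the candidate hypernym. On real text such a sketch would be noisy; any frequency filtering or term extraction step would have to be added separately.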

Original language: English
Pages (from-to): 67-74
Number of pages: 8
Journal: Procesamiento de Lenguaje Natural
Status: Published - Sept. 2012
Externally published

