Ontology population using corpus statistics

Rogelio Nazar, Irene Renau Araque

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

This paper presents a combination of algorithms for automatic ontology building based mainly on lexical co-occurrence statistics. We populate an ontology with hypernymy links; strictly speaking, therefore, the result is a taxonomy of lexical units (nouns organized by hypernymy relations) rather than an ontology of formally defined concepts. A set of combined statistical procedures produces fragments of taxonomies from corpora, which a central algorithm later integrates into a unified taxonomy. Our results show that an ensemble of different components can achieve an accuracy only slightly below human performance. Finally, because our methods are based on quantitative linguistics, the proposed algorithm is not language-specific, although the experiments reported here were conducted on Spanish.

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1517
State: Published - 1 Jan 2015
Event: Joint Ontology Workshops 2015, JOWO 2015 - Episode 1: The Argentine Winter of Ontology - Buenos Aires, Argentina
Duration: 25 Jul 2015 to 27 Jul 2015
