Formalizing Predicates for Discovery Under the Lexicon Grammar Framework

Javiera Jacobsen, Walter Koza, Mirian Muñoz, Francisca Saiz

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

This paper proposes a method for the automatic analysis of predicates for discovery (PDs) in Spanish. A PD is a predicative unit that projects an argument structure (AS) whose meaning alludes to ‘something that is found by someone (or something) somewhere’ (e.g., ‘encontrar’, ‘hallar’). This type of analysis is useful in fields such as medicine, since it makes it possible to automatically identify findings of interest (diseases, test results, etc.) in large text corpora. The work is based on Lexicon Grammar (LG), which formalizes predicates according to the nature of their arguments (object classes) and their transformational possibilities. The methodology comprises three stages: (i) manual identification of PDs in a corpus of gynecology and obstetrics; (ii) elaboration of LG tables for each PD, in which object classes are categorized and possible transformations are listed; and (iii) computational modeling. For the last stage, electronic dictionaries and computer-generated grammars were built in NooJ. The algorithm, which automatically detects and generates ASs from PDs (325 grammatical sentences), was evaluated against an annotated corpus (1,000 manually annotated sentences randomly extracted from a corpus of 5 million words). The evaluation yielded 98% accuracy, 88% coverage, and a 92% F-measure.
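As a rough illustration of how the three reported scores relate, the sketch below computes an F-measure as the harmonic mean of two scores. It assumes that "accuracy" and "coverage" in the abstract correspond to precision and recall; the abstract does not state this, and the function and variable names are hypothetical. From the rounded figures it gives about 0.927, close to the reported 92%; the small gap is plausibly due to rounding in the published values.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score, the weighted harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both scores are zero, so the harmonic mean is defined as zero here.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Rounded figures from the abstract. Reading "accuracy" as precision and
# "coverage" as recall is an assumption, not something the abstract confirms.
reported_accuracy = 0.98
reported_coverage = 0.88

f1 = f_measure(reported_accuracy, reported_coverage)
print(f"F-measure from the rounded figures: {f1:.3f}")
```

Because the inputs are rounded percentages, the result should be read as a consistency check and not as a reproduction of the evaluation.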

Original language: English
Title of host publication: Formalizing Natural Languages
Subtitle of host publication: Applications to Natural Language Processing and Digital Humanities - 15th International Conference, NooJ 2021, Revised Selected Papers
Editors: Magali Bigey, Annabel Richeton, Max Silberztein, Izabella Thomas
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 62-71
Number of pages: 10
ISBN (Print): 9783030928605
State: Published - 2021
Externally published: Yes
Event: 15th International Conference, NooJ 2021 - Virtual Online
Duration: 9 Jun 2021 to 11 Jun 2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1520 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 15th International Conference, NooJ 2021
City: Virtual Online
Period: 9/06/21 to 11/06/21

Keywords

  • Automatic analyses
  • Lexicon grammar
  • Predicates for discovery
