TY - GEN
T1 - Supervised learning algorithms applied to terminology extraction
AU - Nazar, Rogelio
AU - Cabré, Maria Teresa
PY - 2012
Y1 - 2012
N2 - In this paper we present a new terminology extraction system based on supervised statistical learning algorithms, which are characterized by having a training phase with a controlled exposure to both positive and negative examples prior to the actual categorization. Contrary to the vast majority of the term extractors reported in the literature, our proposal is based on implicit knowledge rather than handcrafted explicit rules. Given a list of terms from some domain and language plus a general language reference corpus, we developed a methodology for terminology extraction and implemented it as a web application that is already available online. This tool is flexible enough to operate in different languages and domains and, as a sort of lifelong learning algorithm, it turns terminology extraction into a collaborative effort, where all users benefit from the training conducted by each individual.
KW - Computational terminography
KW - Machine learning
KW - Quantitative linguistics
KW - Terminology extraction
UR - http://www.scopus.com/inward/record.url?scp=84883330362&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84883330362
SN - 9788469543337
T3 - Proceedings of the 10th Terminology and Knowledge Engineering Conference: New Frontiers in the Constructive Symbiosis of Terminology and Knowledge Engineering, TKE 2012
SP - 209
EP - 217
BT - Proceedings of the 10th Terminology and Knowledge Engineering Conference
T2 - 10th Terminology and Knowledge Engineering Conference: New Frontiers in the Constructive Symbiosis of Terminology and Knowledge Engineering, TKE 2012
Y2 - 19 June 2012 through 22 June 2012
ER -