Abstract
This paper addresses the automatic detection of query intent in search engines. We study features that have performed well in state-of-the-art approaches and combine them with novel features extracted from click-through data, and we show that this combination yields good precision. In a second stage, we study four text-based classifiers to test the usefulness of text-based features. With a low false-positive rate (less than 10%), the proposed classifiers detect query intent in over 90% of the evaluation instances. However, because the classes are markedly imbalanced, the classifiers perform poorly at detecting transactional intents. We address this problem with a cost-sensitive learning strategy that compensates for the skewed data distribution. Finally, we explore classifier ensembles, which allow us to achieve the best performance on the task.
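The abstract does not say which cost-sensitive scheme the authors used. As an illustration of the general idea only, the sketch below weights each class inversely to its frequency, a common heuristic, so that errors on a rare class cost more. The function names, the toy label counts, and the class names other than "transactional" are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumed heuristic: rare classes get proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Cost-weighted error: each mistake costs the weight of its true class."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total


# Hypothetical, deliberately skewed toy labels (not data from the paper).
labels = ["informational"] * 6 + ["navigational"] * 3 + ["transactional"] * 1
weights = balanced_class_weights(labels)

# A classifier that always predicts the majority class looks acceptable
# under plain error but is penalised heavily once costs are applied.
majority_pred = ["informational"] * len(labels)
plain_error = sum(t != p for t, p in zip(labels, majority_pred)) / len(labels)
cost_error = weighted_error(labels, majority_pred, weights)
print(weights)
print(plain_error, cost_error)
```

With these weights, every class contributes the same total cost, so a model that ignores the rare transactional class can no longer score well simply by favouring the majority class.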
Original language | English |
---|---|
Pages (from-to) | 24-52 |
Number of pages | 29 |
Journal | Journal of Web Engineering |
Volume | 13 |
Issue number | 1-2 |
State | Published - 1 Mar 2014 |
Externally published | Yes |
Keywords
- Query categorization
- Query logs
- User intents