Comparative web search questions

Bondarenko, A.; Braslavski, P.; Völske, M.; Aly, R.; Fröbe, M.; Panchenko, A.; Biemann, C.; Stein, B.; Hagen, M.

doi:10.1145/3336191.3371848

Please use this identifier to cite or link to this resource: http://elar.urfu.ru/handle/10995/101818

Title:	Comparative web search questions
Authors:	Bondarenko, A.; Braslavski, P.; Völske, M.; Aly, R.; Fröbe, M.; Panchenko, A.; Biemann, C.; Stein, B.; Hagen, M.
Publication date:	2020
Publisher:	Association for Computing Machinery, Inc
Citation:	Comparative web search questions / A. Bondarenko, P. Braslavski, M. Völske, et al. DOI 10.1145/3336191.3371848 // WSDM 2020 - Proceedings of the 13th International Conference on Web Search and Data Mining. 2020. P. 52-60.
Abstract:	We analyze comparative questions, i.e., questions asking to compare different items, that were submitted to Yandex in 2012. Responses to such questions might be quite different from the simple “ten blue links” and could, for example, aggregate pros and cons of the different options as direct answers. However, changing the result presentation is an intricate decision such that the classification of comparative questions forms a highly precision-oriented task. From a year-long Yandex log, we annotate a random sample of 50,000 questions; 2.8% of which are comparative. For these annotated questions, we develop a precision-oriented classifier by combining carefully hand-crafted lexico-syntactic rules with feature-based and neural approaches, achieving a recall of 0.6 at a perfect precision of 1.0. After running the classifier on the full year log (on average, there is at least one comparative question per second), we analyze 6,250 comparative questions using more fine-grained subclasses (e.g., should the answer be a “simple” fact or rather a more verbose argument) for which individual classifiers are trained. An important insight is that more than 65% of the comparative questions demand argumentation and opinions, i.e., reliable direct answers to comparative questions require more than the facts from a search engine’s knowledge graph. In addition, we present a qualitative analysis of the underlying comparative information needs (separated into 14 categories like consumer electronics or health), their seasonal dynamics, and possible answers from community question answering platforms. © 2020 Copyright held by the owner/author(s).
Keywords:	QUERY LOG ANALYSIS; QUESTION ANSWERING; QUESTION CLASSIFICATION; INFORMATION RETRIEVAL; SEARCH ENGINES; SYNTACTICS; WEBSITES; COMMUNITY QUESTION ANSWERING; INDIVIDUAL CLASSIFIERS; KNOWLEDGE GRAPHS; QUALITATIVE ANALYSIS; SEASONAL DYNAMICS; DATA MINING
URI:	http://elar.urfu.ru/handle/10995/101818
Access rights:	info:eu-repo/semantics/openAccess
SCOPUS ID:	85079535949
WOS ID:	000531489300010
PURE ID:	79033a84-f346-4a57-bebf-db4f8cdaafab 12233207
ISBN:	9781450368223
DOI:	10.1145/3336191.3371848
Funding information:	This work has been partially supported by the DFG through the project “ACQuA: Answering Comparative Questions with Arguments” (grants BI 1544/7-1 and HA 5851/2-1) as part of the priority program “RATIO: Robust Argumentation Machines” (SPP 1999). We thank Yandex and Mail.Ru for granting access to the data. The study was partially conducted during Pavel Braslavski’s research stay at the Bauhaus-Universität Weimar in 2018 supported by the DAAD. We also thank Ekaterina Shirshakova and Valentin Dittmar for their help in question annotation.
Appears in collections:	Scientific publications of UrFU researchers indexed in SCOPUS and WoS CC

Files in this resource:

File	Description	Size	Format
2-s2.0-85079535949.pdf		1.45 MB	Adobe PDF
