Please use this identifier to cite or link to this item:
http://elar.urfu.ru/handle/10995/36855
Title: | A Large-Scale Community Questions Classification Accounting for Category Similarity: An Exploratory Study |
Authors: | Lezina, G.; Braslavski, P.; Браславский, П. И.; Лезина, Г. |
Issue Date: | 2015 |
Publisher: | Springer International Publishing |
Citation: | Lezina G. A Large-Scale Community Questions Classification Accounting for Category Similarity: An Exploratory Study / G. Lezina, P. Braslavski // 8th Russian Summer School on Information Retrieval, RuSSIR 2014, Communications in Computer and Information Science. – Springer International Publishing, Switzerland, 2015. – Vol. 505. – P. 332-347. |
Abstract: | The paper reports on a large-scale topical categorization of questions from the Russian community question answering (CQA) service Otvety@Mail.Ru. We used a data set containing all the questions (more than 11 million) asked by Otvety@Mail.Ru users in 2012. This is the first study on question categorization dealing with non-English data of this size. The study focuses on adjusting the category structure in order to get more robust classification results. We investigate several approaches to measuring similarity between categories: the share of identical questions, language models, and user activity. The results show that the proposed approach is promising. |
Keywords: | QUESTION TOPIC CATEGORIZATION; COMMUNITY QUESTION ANSWERING; QUESTION RETRIEVAL; LARGE-SCALE CLASSIFICATION |
URI: | http://elar.urfu.ru/handle/10995/36855 |
SCOPUS ID: | 84951806953 |
WOS ID: | 000369892500013 |
PURE ID: | 569137 |
DOI: | 10.1007/978-3-319-25485-2_13 |
Sponsorship: | 14-07-00589; RFBR; Russian Foundation for Basic Research. |
Appears in Collections: | Scientific publications of UrFU researchers indexed in SCOPUS and WoS CC |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
braslavski_russir2014.pdf | | 356.05 kB | Adobe PDF | View/Open |