Please use this identifier to cite or link to this item:
http://elar.urfu.ru/handle/10995/103284
Title: | RuBQ: A Russian Dataset for Question Answering over Wikidata |
Authors: | Korablinov, V.; Braslavski, P. |
Issue Date: | 2020 |
Publisher: | Springer Science and Business Media Deutschland GmbH |
Citation: | Korablinov V. RuBQ: A Russian Dataset for Question Answering over Wikidata / V. Korablinov, P. Braslavski. — DOI 10.1007/978-3-030-62466-8_7 // Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). — 2020. — Vol. 12507 LNCS. — P. 97-110. |
Abstract: | The paper presents RuBQ, the first Russian knowledge base question answering (KBQA) dataset. The high-quality dataset consists of 1,500 Russian questions of varying complexity, their English machine translations, SPARQL queries to Wikidata, reference answers, as well as a Wikidata sample of triples containing entities with Russian labels. The dataset creation started with a large collection of question-answer pairs from online quizzes. The data underwent automatic filtering, crowd-assisted entity linking, automatic generation of SPARQL queries, and their subsequent in-house verification. The freely available dataset will be of interest for a wide community of researchers and practitioners in the areas of Semantic Web, NLP, and IR, especially for those working on multilingual question answering. The proposed dataset generation pipeline proved to be efficient and can be employed in other data annotation projects. © 2020, Springer Nature Switzerland AG. |
Keywords: | EVALUATION; KNOWLEDGE BASE QUESTION ANSWERING; RUSSIAN LANGUAGE RESOURCES; SEMANTIC PARSING; KNOWLEDGE BASED SYSTEMS; LARGE DATASET; NATURAL LANGUAGE PROCESSING SYSTEMS; AUTOMATIC FILTERING; AUTOMATIC GENERATION; DATA ANNOTATION; KNOWLEDGE BASE; MACHINE TRANSLATIONS; QUESTION ANSWERING; QUESTION-ANSWER PAIRS; SPARQL QUERIES; SEMANTIC WEB |
URI: | http://elar.urfu.ru/handle/10995/103284 |
Access: | info:eu-repo/semantics/openAccess |
SCOPUS ID: | 85096596949 |
PURE ID: | 20220236 0428da86-88a9-4f2e-8056-6bc739fa0a8e |
ISSN: | 0302-9743 |
ISBN: | 9783030624651 |
DOI: | 10.1007/978-3-030-62466-8_7 |
Sponsorship: | We thank Mikhail Galkin, Svitlana Vakulenko, Daniil Sorokin, Vladimir Kovalenko, Yaroslav Golubev, and Rishiraj Saha Roy for their valuable comments and fruitful discussion on the paper draft. We also thank Pavel Bakhvalov, who helped collect RuWikidata8M sample and contributed to the first version of the entity linking tool. We are grateful to Yandex.Toloka for their data annotation grant. PB acknowledges support by Ural Mathematical Center under agreement No. 075-02-2020-1537/1 with the Ministry of Science and Higher Education of the Russian Federation. |
Appears in Collections: | Scientific publications of UrFU researchers indexed in SCOPUS and WoS CC |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2-s2.0-85096596949.pdf | | 544.34 kB | Adobe PDF | View/Open |