Please use this identifier to cite or link to this item:
http://elar.urfu.ru/handle/10995/101404
Title: | SberQuAD – Russian Reading Comprehension Dataset: Description and Analysis |
Authors: | Efimov, P.; Chertok, A.; Boytsov, L.; Braslavski, P. |
Issue Date: | 2020 |
Publisher: | Springer Science and Business Media Deutschland GmbH |
Citation: | SberQuAD – Russian Reading Comprehension Dataset: Description and Analysis / P. Efimov, A. Chertok, L. Boytsov, et al. — DOI 10.1007/978-3-030-58219-7_1 // Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). — 2020. — Vol. 12260 LNCS. — P. 3-15. |
Abstract: | The paper presents SberQuAD – a large Russian reading comprehension (RC) dataset created similarly to English SQuAD. SberQuAD contains about 50K question-paragraph-answer triples and is seven times larger compared to the next competitor. We provide its description, thorough analysis, and baseline experimental results. We scrutinized various aspects of the dataset that can have impact on the task performance: question/paragraph similarity, misspellings in questions, answer structure, and question types. We applied five popular RC models to SberQuAD and analyzed their performance. We believe our work makes an important contribution to research in multilingual question answering. © 2020, Springer Nature Switzerland AG. |
Keywords: | EVALUATION; MULTILINGUAL QUESTION ANSWERING; READING COMPREHENSION; RUSSIAN LANGUAGE RESOURCES; ASSOCIATION REACTIONS; NATURAL LANGUAGE PROCESSING SYSTEMS; QUESTION ANSWERING; QUESTION TYPE; RC MODELS; READING COMPREHENSION; TASK PERFORMANCE; LARGE DATASET |
URI: | http://elar.urfu.ru/handle/10995/101404 |
Access: | info:eu-repo/semantics/openAccess |
SCOPUS ID: | 85092191483 |
PURE ID: | 14123133 5932df21-7a79-4285-ae44-5931887a552d |
ISSN: | 0302-9743 |
ISBN: | 9783030582180 |
DOI: | 10.1007/978-3-030-58219-7_1 |
Sponsorship: | We thank Peter Romov, Vladimir Suvorov, and Ekaterina Artemova (Chernyak) for providing us with details about SberQuAD preparation. We also thank Natasha Murashkina for initial data processing. PB acknowledges support by Ural Mathematical Center under agreement No. 075-02-2020-1537/1 with the Ministry of Science and Higher Education of the Russian Federation. |
Appears in Collections: | Scientific publications of UrFU researchers indexed in SCOPUS and WoS CC |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2-s2.0-85092191483.pdf | | 322.02 kB | Adobe PDF | View/Open |