Please use this identifier to cite or link to this item: http://hdl.handle.net/10995/59607
Title: Improving the presentation of search results by multipartite graph clustering of multiple reformulated queries and a novel document representation
Authors: Lytkin, N.
Streltsov, S.
Perlovsky, L.
Muchnik, I.
Petrov, S.
Issue Date: 2005
Publisher: [publisher not identified]
Citation: Improving the presentation of search results by multipartite graph clustering of multiple reformulated queries and a novel document representation / N. Lytkin, S. Streltsov, L. Perlovsky, I. Muchnik, S. Petrov // Internet Mathematics 2005. Automatic Processing of Web Data [Интернет-математика 2005. Автоматическая обработка веб-данных]. Moscow, 2005.
Abstract: The goal of clustering web search results is to reveal the semantics of the retrieved documents. The main challenge is to make the clustering partition relevant to the user’s query. In this paper, we describe a method of clustering search results using a similarity measure between documents retrieved by multiple reformulated queries. The method produces clusters of documents that are most relevant to the original query and, at the same time, represent a more diverse set of semantically related queries. To cluster thousands of documents in real time, we designed a novel multipartite graph clustering algorithm that has low polynomial complexity and no manually adjusted hyperparameters. The loss of semantics caused by stem-based document representation is a common problem in information retrieval. To address this problem, we propose a novel alternative document representation in which words are represented by their synonymy groups.
URI: http://hdl.handle.net/10995/59607
Sponsorship: This work was supported by Yandex grant 110104.
Origin: Internet Mathematics 2005: Automatic Processing of Web Data. Moscow, 2005
Appears in Collections: Information Retrieval

Files in This Item:
File: IMAT_2005_26.pdf
Size: 220.48 kB
Format: Adobe PDF

