Пожалуйста, используйте этот идентификатор, чтобы цитировать или ссылаться на этот ресурс: http://elar.urfu.ru/handle/10995/101657
Название: Linear Time Maximum Segmentation Problems in Column Stream Model
Авторы: Cazaux, B.
Kosolobov, D.
Mäkinen, V.
Norri, T.
Дата публикации: 2019
Издатель: Springer
Библиографическое описание: Linear Time Maximum Segmentation Problems in Column Stream Model / B. Cazaux, D. Kosolobov, V. Mäkinen, et al. — DOI 10.1007/978-3-030-32686-9_23 // Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). — 2019. — Vol. 11811 LNCS. — P. 322-336.
Аннотация: We study a lossy compression scheme linked to the biological problem of founder reconstruction: The goal in founder reconstruction is to replace a set of strings with a smaller set of founders such that the original connections are maintained as well as possible. A general formulation of this problem is NP-hard, but when limiting to reconstructions that form a segmentation of the input strings, polynomial time solutions exist. We proposed in our earlier work (WABI 2018) a linear time solution to a formulation where minimum segment length was bounded, but it was left open if the same running time can be obtained when the targeted compression level (number of founders) is bounded and lossyness is minimized. This optimization is captured by the Maximum Segmentation problem: Given a threshold M and a set of strings of the same length n, find a minimum cost partition P where for each segment, the compression level is bounded from above by M. We give linear time algorithms to solve the problem for two different (compression quality) measures on P: the average length of the intervals of the partition and the length of the minimal interval of the partition. These algorithms make use of positional Burrows–Wheeler transform and the range maximum queue, an extension of range maximum queries to the case where the input string can be operated as a queue. For the latter, we present a new solution that may be of independent interest. The solutions work in a streaming model where one column of the input strings is introduced at a time. © 2019, Springer Nature Switzerland AG.
Ключевые слова: DYNAMIC PROGRAMMING
FOUNDER RECONSTRUCTION
PAN-GENOME INDEXING
POSITIONAL BURROWS–WHEELER TRANSFORM
RANGE MAXIMUM QUEUE
CLUSTERING ALGORITHMS
INFORMATION RETRIEVAL
POLYNOMIAL APPROXIMATION
QUEUEING THEORY
BIOLOGICAL PROBLEMS
COMPRESSION QUALITY
LINEAR-TIME ALGORITHMS
LINEAR-TIME SOLUTIONS
LOSSY COMPRESSIONS
POLYNOMIAL-TIME
RANGE MAXIMUM QUERY
RANGE MAXIMUM QUEUE
DYNAMIC PROGRAMMING
URI: http://elar.urfu.ru/handle/10995/101657
Условия доступа: info:eu-repo/semantics/openAccess
Идентификатор SCOPUS: 85075665675
Идентификатор PURE: 11343197
26e0ab02-70a8-477a-8f7d-52bd5f924fe7
ISSN: 3029743
ISBN: 9783030326852
DOI: 10.1007/978-3-030-32686-9_23
Располагается в коллекциях:Научные публикации, проиндексированные в SCOPUS и WoS CC

Файлы этого ресурса:
Файл Описание РазмерФормат 
2-s2.0-85075665675.pdf435,77 kBAdobe PDFПросмотреть/Открыть


Все ресурсы в архиве электронных ресурсов защищены авторским правом, все права сохранены.