Пожалуйста, используйте этот идентификатор, чтобы цитировать или ссылаться на этот ресурс: http://elar.urfu.ru/handle/10995/75060
Полная запись метаданных
Поле DCЗначениеЯзык
dc.contributor.authorNorri, T.en
dc.contributor.authorCazaux, B.en
dc.contributor.authorKosolobov, D.en
dc.contributor.authorMäkinen, V.en
dc.date.accessioned2019-07-22T06:43:46Z-
dc.date.available2019-07-22T06:43:46Z-
dc.date.issued2019-
dc.identifier.citationLinear time minimum segmentation enables scalable founder reconstruction / T. Norri, B. Cazaux, D. Kosolobov et al. // Algorithms for Molecular Biology. — 2019. — Vol. 14. — Iss. 1. — 12.en
dc.identifier.issn1748-7188-
dc.identifier.otherhttps://almob.biomedcentral.com/track/pdf/10.1186/s13015-019-0147-6pdf
dc.identifier.other1good_DOI
dc.identifier.otheracf43fea-9c13-46d8-9a44-0381fe8ee411pure_uuid
dc.identifier.otherhttp://www.scopus.com/inward/record.url?partnerID=8YFLogxK&scp=85065895221m
dc.identifier.urihttp://elar.urfu.ru/handle/10995/75060-
dc.description.abstractBackground: We study a preprocessing routine relevant in pan-genomic analyses: consider a set of aligned haplotype sequences of complete human chromosomes. Due to the enormous size of such data, one would like to represent this input set with a few founder sequences that retain as well as possible the contiguities of the original sequences. Such a smaller set gives a scalable way to exploit pan-genomic information in further analyses (e.g. read alignment and variant calling). Optimizing the founder set is an NP-hard problem, but there is a segmentation formulation that can be solved in polynomial time, defined as follows. Given a threshold L and a set {\mathcal {R}} = \{R-1, \ldots, R-m\} R = { R 1, ..., R m } of m strings (haplotype sequences), each having length n, the minimum segmentation problem for founder reconstruction is to partition [1, n] into set P of disjoint segments such that each segment [a,b] \in P [ a, b ] P has length at least L and the number d(a,b)=|\{R-i[a,b]:1\le i \le m\}| d (a, b) = | { R i [ a, b ]: 1 ≤ i ≤ m } | of distinct substrings at segment [a, b] is minimized over [a,b] \in P [ a, b ] P. The distinct substrings in the segments represent founder blocks that can be concatenated to form \max \{ d(a,b):[a,b] \in P \} max { d (a, b): [ a, b ] P } founder sequences representing the original {\mathcal {R}} R such that crossovers happen only at segment boundaries. Results: We give an O(mn) time (i.e. linear time in the input size) algorithm to solve the minimum segmentation problem for founder reconstruction, improving over an earlier O(mn^2) O (m n 2). Conclusions: Our improvement enables to apply the formulation on an input of thousands of complete human chromosomes. We implemented the new algorithm and give experimental evidence on its practicality. The implementation is available in https://github.com/tsnorri/founder-sequences. © 2019 The Author(s).en
dc.description.sponsorshipThis work was partially supported by the Academy of Finland (Grant 309048).en
dc.format.mimetypeapplication/pdfen
dc.language.isoenen
dc.publisherBioMed Central Ltd.en
dc.rightsinfo:eu-repo/semantics/openAccessen
dc.sourceAlgorithms for Molecular Biologyen
dc.subjectDYNAMIC PROGRAMMINGen
dc.subjectFOUNDER RECONSTRUCTIONen
dc.subjectPAN-GENOME INDEXINGen
dc.subjectPOSITIONAL BURROWS-WHEELER TRANSFORMen
dc.subjectRANGE MINIMUM QUERYen
dc.titleLinear time minimum segmentation enables scalable founder reconstructionen
dc.typeArticleen
dc.typeinfo:eu-repo/semantics/articleen
dc.typeinfo:eu-repo/semantics/publishedVersionen
dc.identifier.doi10.1186/s13015-019-0147-6-
dc.identifier.scopus85065895221-
local.affiliationDepartment of Computer Science, University of Helsinki, Pietari Kalmin katu 5, Helsinki, 00014, Finlanden
local.affiliationUral Federal University, Mira 19, Yekaterinburg, 620002, Russian Federationen
local.contributor.employeeКосолобов Дмитрий Александровичru
local.issue1-
local.volume14-
dc.identifier.wos000468292600001-
local.identifier.pure9818196-
local.description.order12-
local.identifier.eid2-s2.0-85065895221-
local.identifier.wosWOS:000468292600001-
Располагается в коллекциях:Научные публикации ученых УрФУ, проиндексированные в SCOPUS и WoS CC

Файлы этого ресурса:
Файл Описание РазмерФормат 
10.1186-s13015-019-0147-6.pdf2,95 MBAdobe PDFПросмотреть/Открыть


Все ресурсы в архиве электронных ресурсов защищены авторским правом, все права сохранены.