Next Generation Sequencing and Sequence Assembly by Ali Masoudi-Nejad, Zahra Narimani & Nazanin Hosseinkhan
Publisher: Springer New York, New York, NY
High sequence coverage can help resolve repeats, but errors in the sequence data make repeat discovery harder. Repeats that are longer than the reads require paired-end reads (paired-end [mate-pair] technologies are described in Chap. 1), and resolving them is a more complicated task than resolving repeats shorter than the read length using single reads. Inexact repeats can be separated by aligning reads at high stringency and correlating reads that share distinctive base-call patterns [11]. How each assembly algorithm resolves repeats is explained in Chap. 4. All of this, together with the size of genomes and the large number of reads, makes assembly a complicated problem that requires efficient algorithms, careful data structure design, and high-performance computing platforms. Intelligent heuristics play an important role in overcoming these difficulties.
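The effect of read length on repeats can be illustrated with a minimal Python sketch. It is not taken from the book: the function name `ambiguous_reads`, the toy genome, and the assumption of error-free single-strand reads are all illustrative choices. The sketch lists every read-length substring that occurs at more than one position, that is, every possible read that could not be placed uniquely by sequence alone.

```python
from collections import defaultdict

# Hypothetical toy genome (an assumption for illustration only):
# a 7-base repeat appears twice, separated and flanked by unique sequence.
REPEAT = "GATTACA"
GENOME = "CCGT" + REPEAT + "TGGCAAC" + REPEAT + "AGCC"


def ambiguous_reads(genome: str, read_len: int) -> dict[str, list[int]]:
    """Return every read-length substring that occurs at more than one
    position in the genome, mapped to its list of start positions.

    Assumes error-free reads taken from a single strand.
    """
    positions = defaultdict(list)
    for start in range(len(genome) - read_len + 1):
        positions[genome[start:start + read_len]].append(start)
    # Keep only substrings seen at two or more positions.
    return {read: starts for read, starts in positions.items() if len(starts) > 1}


if __name__ == "__main__":
    for read_len in (7, 8):
        print(read_len, ambiguous_reads(GENOME, read_len))
```

In this toy case, reads of length 7 (equal to the repeat length) include one read that maps to two positions, while reads of length 8 always include at least one unique flanking base and can be placed unambiguously. When reads cannot be made longer than the repeat, linking information such as paired ends has to supply that context instead.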
Assembly is not applied only to genomic data. Assembling transcriptomic data such as expressed sequence tags (ESTs), which give a view of the biological state of a cell, is also very important in practice. However, the challenges differ between kinds of data. The discontinuity of transcriptomic data results in less contiguity than in genomic data. Since repeats mainly occur in intron regions of the genome, repeats are not a major issue in assembling transcriptomic data. On the other hand, a single region of the genome can be transcribed in different patterns (i.e., from different start and end positions), which adds complexity to the assembly of transcriptomic data. Algorithmic approaches are also needed to handle other situations specific to ESTs, for example, differing expression rates (highly expressed genes), alternative splicing, and paralogous genes. These problems become even more serious when the cDNA library is contaminated with genomic sequence [12].