Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools by Vince Buffalo

Author: Vince Buffalo
Language: eng
Format: azw3, pdf
Publisher: O'Reilly Media
Published: 2015-06-30T16:00:00+00:00


Despite the convenience that comes with representing and working with genomic ranges, there are unfortunately some gritty details we need to be aware of. First, there are two different flavors of range systems used by bioinformatics data formats (see Table 9-1 for a reference) and software programs:

0-based coordinate system, with half-closed, half-open intervals.

1-based coordinate system, with closed intervals.

With 0-based coordinate systems, the first base of a sequence is position 0 and the last base’s position is the length of the sequence minus 1. In this 0-based coordinate system, we use half-closed, half-open intervals. Admittedly, these intervals can be a little unintuitive at first, so it’s easiest to borrow some notation from mathematics when explaining them. For some start and end positions, a half-closed, half-open interval is written as [start, end). A bracket indicates that a position is included in the interval (in other words, the interval is closed on this end), while a parenthesis indicates that a position is excluded from the interval (the interval is open on this end). So a half-closed, half-open interval like [1, 5) includes the bases at positions 1, 2, 3, and 4 (illustrated in Figure 9-2). You may be wondering why on earth we’d ever use a system that excludes the end position, but we’ll come to that after discussing 1-based coordinate systems. In fact, if you’re familiar with Python, you’ve already seen this type of interval system: Python’s strings (and lists) are 0-indexed and use half-closed, half-open intervals for indexing portions of a string.
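The Python behavior described above can be sketched as follows. This is a minimal illustration, not a listing from the book: the sequence is an arbitrary example string, and `to_one_based_closed` is a hypothetical helper added here only to show how the same bases would be described in a 1-based, closed system.

```python
# Arbitrary example sequence (an assumption for illustration).
seq = "CTTACTTCGAAGGCTG"

# A 0-based, half-closed, half-open interval [1, 5).
start, end = 1, 5

# Slicing includes position `start` and excludes position `end`.
sub = seq[start:end]

# The same bases picked out one position at a time: 1, 2, 3, and 4.
by_position = "".join(seq[i] for i in range(start, end))


def to_one_based_closed(start, end):
    """Hypothetical helper: map 0-based [start, end) to 1-based closed (start + 1, end)."""
    return start + 1, end


print(sub)                              # TTAC
print(sub == by_position)               # True
print(end - start)                      # 4, the number of bases in the interval
print(to_one_based_closed(start, end))  # (2, 5)
```

Note that with half-open intervals the width is simply `end - start`, whereas the equivalent 1-based closed interval [2, 5] needs `end - start + 1` to give the same count of four bases.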


