Efficient storage of multiple tracks of numeric data anchored to a genome
Description: We present a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data while retaining a small disk footprint. We have also developed utilities to load data into this format. We show that retrieving data from this format is more than 2900 times faster than a naive approach that uses wiggle files.
Authors: Michael Hoffman, Eric Roberts, Orion Buske
Keywords: Genomedata, storage, genome tracks, genomics, Python
Hoffman, M. M., Buske, O. J., & Noble, W. S. (2010). The Genomedata format for storing large-scale functional genomics data. Bioinformatics, 26(11), 1458-1459.
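
The contrast between random access and a naive wiggle scan can be illustrated with a minimal sketch. It does not use Genomedata or reproduce its on-disk layout; it only assumes a hypothetical dense track stored as one fixed-width 32-bit float per position, and a fixedStep wiggle file with a step of 1 as the baseline. All function and file names are made up for illustration.

```python
import os
import struct
import tempfile

# Assumption for this sketch: one little-endian 32-bit float per position.
VALUE = struct.Struct("<f")


def write_dense_track(path, values):
    """Write one value per genomic position as fixed-width binary."""
    with open(path, "wb") as f:
        for v in values:
            f.write(VALUE.pack(v))


def read_region(path, start, end):
    """Return values in the half-open interval [start, end) with one seek."""
    with open(path, "rb") as f:
        f.seek(start * VALUE.size)
        raw = f.read((end - start) * VALUE.size)
    return [v for (v,) in VALUE.iter_unpack(raw)]


def read_region_wiggle(path, start, end):
    """Naive baseline: scan a fixedStep wiggle file (step=1) line by line."""
    out = []
    pos = None
    with open(path) as f:
        for line in f:
            if line.startswith("fixedStep"):
                fields = dict(kv.split("=") for kv in line.split()[1:])
                pos = int(fields["start"]) - 1  # wiggle starts are 1-based
                continue
            if start <= pos < end:
                out.append(float(line))
            pos += 1
    return out


def demo():
    """Store the same toy track both ways and read the same region back."""
    values = [i * 0.5 for i in range(1000)]  # exactly representable as float32
    with tempfile.TemporaryDirectory() as tmp:
        dense = os.path.join(tmp, "chr1.f32")
        wig = os.path.join(tmp, "chr1.wig")
        write_dense_track(dense, values)
        with open(wig, "w") as f:
            f.write("fixedStep chrom=chr1 start=1 step=1\n")
            f.writelines(f"{v}\n" for v in values)
        return read_region(dense, 500, 505), read_region_wiggle(wig, 500, 505)
```

The binary reader touches only the bytes it needs, while the wiggle reader must parse every line of the file, so the gap between the two grows with file size. The 2900-fold figure reported above comes from the authors' own measurements and is not something this toy comparison reproduces.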