Simulation of user-defined, allele-specific copy number events in BAM files.

Description: Somatic copy number variations (CNVs) play a crucial role in the development of many human cancers. The broad availability of next-generation sequencing data has enabled the development of algorithms that computationally infer CNV profiles from a variety of data types, including exome and targeted sequence data, currently the most prevalent types of cancer genomics data. However, systematic evaluation and comparison of these tools remains challenging due to a lack of ground-truth reference sets. To address this need, we have developed Bamgineer, a tool written in Python that introduces user-defined, haplotype-phased, allele-specific copy number events into existing Binary Alignment Map (BAM) files, with a focus on targeted and exome sequencing experiments. As input, the tool requires a read alignment file (BAM format), lists of non-overlapping genome coordinates at which to introduce gains and losses (BED format), and an optional file defining known haplotypes (VCF format).
Authors: Soroush Samadian, Jeff P. Bruce, Trevor J. Pugh
Lab: Pugh
Version: 2
Keywords: Bamgineer, copy number variation, BAM, Python
Licensing: Apache License 2.0
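
Example: The description requires the gain and loss coordinates to be non-overlapping. The sketch below shows one way to check a BED file for overlaps before using it as input. It is an illustration only: the function name find_overlaps is hypothetical and is not part of Bamgineer, and it assumes standard BED conventions (whitespace-separated chromosome, start, and end columns, with 0-based, half-open intervals).

```python
from collections import defaultdict


def find_overlaps(bed_lines):
    """Return (chrom, interval_a, interval_b) for each overlapping pair found.

    Hypothetical helper, not part of Bamgineer. Assumes BED conventions:
    0-based, half-open intervals, so chr1:100-200 and chr1:200-300 only touch.
    """
    by_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments, and UCSC header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))

    overlaps = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        furthest = intervals[0]  # interval reaching furthest right so far
        for current in intervals[1:]:
            if current[0] < furthest[1]:
                overlaps.append((chrom, furthest, current))
            if current[1] > furthest[1]:
                furthest = current
    return overlaps


if __name__ == "__main__":
    regions = ["chr1\t100\t200", "chr1\t150\t250", "chr2\t50\t80"]
    for chrom, a, b in find_overlaps(regions):
        print(f"{chrom}: {a} overlaps {b}")
```

An empty result means the regions satisfy the non-overlap requirement. Adjacent intervals that share only a boundary coordinate are not reported.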


Samadian, S., Bruce, J. P., & Pugh, T. J. (2018). Bamgineer: Introduction of simulated allele-specific copy number variants into exome and targeted sequence data sets. PLoS Computational Biology, 14(3), e1006080.