I've read previous threads on computing coverage from BAMs, and I can't seem to find an answer to my specific quandary. I am aware that coverage can be quite easily computed from a BAM file over intervals/genes/exons/etc. provided in a BED or GFF file using Samtools Pileup, Depth, or Bedtools Coverage, or GTK, but I would also like to be able to pull out average nucleotide-specific coverages within each of the intervals specified.
Specifically, I have a bunch of aligned, indexed BAM files from whole-exome sequencing data, and I have a BED file containing the locations of coding regions of all RefSeq genes. For each BAM, I want to be able to pull out average coverage per coding region of each specific nucleotide, that is, the average coverage of A's, T's, G's and C's. Furthermore, I would like to be able query the coverage of nucleotides within specific contexts for each coding region, such as, for example, what is the average coverage of all G's in CG contexts? Similarly, I would also like to get nucleotide specific lengths for each region in the BED file.
Does anyone know if this is doable using the basic tools provided in Samtools or Bedtools? If not, how difficult would it be to pipe some output (like a pileup) into a custom python script to get this done without making my vastly under-powered machine explode?
Thanks in advance for your help!
Specifically, I have a bunch of aligned, indexed BAM files from whole-exome sequencing data, and I have a BED file containing the locations of coding regions of all RefSeq genes. For each BAM, I want to be able to pull out average coverage per coding region of each specific nucleotide, that is, the average coverage of A's, T's, G's and C's. Furthermore, I would like to be able query the coverage of nucleotides within specific contexts for each coding region, such as, for example, what is the average coverage of all G's in CG contexts? Similarly, I would also like to get nucleotide specific lengths for each region in the BED file.
Does anyone know if this is doable using the basic tools provided in Samtools or Bedtools? If not, how difficult would it be to pipe some output (like a pileup) into a custom python script to get this done without making my vastly under-powered machine explode?
Thanks in advance for your help!
Comment