I'm pretty new to NGS analysis (and this forum), and have been mapping my data with bfast. I'm interested in coverage by strand, and am trying to come up with the best way to do this.
Looking through the pysam documentation
Iterating over the bam file this way seems ideal, but Is there a way to pull out the +/- strand info from a pilupcolumn object? Or, would is doing something like this more appropriate?
My primary concern is correctness, but a secondary concern is run time. This nested for loop seems costly, is that a good assumption?
Looking through the pysam documentation
Code:
import pysam samfile = pysam.Samfile("ex1.bam", "rb" ) for pileupcolumn in samfile.pileup( 'chr1', 100, 120): print print 'coverage at base %s = %s' % (pileupcolumn.pos , pileupcolumn.n) for pileupread in pileupcolumn.pileups: print '\tbase in read %s = %s' % (pileupread.alignment.qname, pileupread.alignment.seq[pileupread.qpos])
Code:
for pileupcolumn in samfile.pileup( 'chr1', 100, 120): for pileupread in pileupcolumn.pileups: if pileupread.alignment.is_reverse: #negative strand hit negCov[ pileupcolumn.pos ] += 1 else: posCov[ pileupcolumn.pos ] += 1
Comment