Header Leaderboard Ad

Collapse

coverage by strand with pysam

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • coverage by strand with pysam

    I'm pretty new to NGS analysis (and this forum), and have been mapping my data with bfast. I'm interested in coverage by strand, and am trying to come up with the best way to do this.

    Looking through the pysam documentation
    Code:
    import pysam
    samfile = pysam.Samfile("ex1.bam", "rb" )
    for pileupcolumn in samfile.pileup( 'chr1', 100, 120):
        print
        print 'coverage at base %s = %s' % (pileupcolumn.pos , pileupcolumn.n)
        for pileupread in pileupcolumn.pileups:
            print '\tbase in read %s = %s' % (pileupread.alignment.qname,
            pileupread.alignment.seq[pileupread.qpos])
    Iterating over the bam file this way seems ideal, but Is there a way to pull out the +/- strand info from a pilupcolumn object? Or, would is doing something like this more appropriate?

    Code:
     
    for pileupcolumn in samfile.pileup( 'chr1', 100, 120):
        for pileupread in pileupcolumn.pileups:
            if pileupread.alignment.is_reverse: #negative strand hit
                negCov[ pileupcolumn.pos ] += 1
            else:
                posCov[ pileupcolumn.pos ] += 1
    My primary concern is correctness, but a secondary concern is run time. This nested for loop seems costly, is that a good assumption?
    Last edited by desaila; 10-14-2010, 08:09 AM.

  • #2
    BEDTools's genomeCoverageBed will calculate strand-specific coverage.

    http://code.google.com/p/bedtools/

    Comment

    Working...
    X