I am new to python and trying to learn. The below is an attempt to filter out secondary reads in a all bam files in a directory using pysam. I ran mark duplicates and got an error on several bam files due to secondary alignments. I added comments to each line to illustrate my thought process. Thank you .
The specific mark duplicates error:
Value was put into PairInfoMap more than once
another option
The specific mark duplicates error:
Value was put into PairInfoMap more than once
Code:
#! /usr/bin/python ## call python script import sys ## import python system functions import pysam ## import module bam = pysam.AlignmentFile(.bam, "rb") ## open bam and read click.echo("Reading BAM file") ## output message for read in bam.fetch(): if read.is_secondary=true ## not the primary alignment continue if read.has_tag('AS') and read.has_tag('XS') ## look for Primary and Secondary Alignment Score tag AS = read.get_tag('AS') ## read and store AS value XS = read.get_tag('XS') ## read and store XS value if AS >= XS ## Alignment score greater then or equal to XS (these will be printed) bam.write(read) ## open bam and only print primary alignments continue ## process next line bam.close() ## end processing
Code:
#! /usr/bin/python ## call python script import sys ## import python system functions import pysam ## import module bam = pysam.AlignmentFile(bam, "rb") ## open bam and read click.echo("Reading BAM file") ## output message for read in bam.fetch(): ## start loop and iterate over each bam if read.is_secondary: ## not the primary alignment continue if read.has_tag('AS') and read.has_tag('XS') ## look for Primary and Secondary Alignment Score tag AS = read.get_tag('AS') ## read and store AS value XS = read.get_tag('XS') ## read and store XS value if AS >= XS ## Alignment score greater then or equal to XS (these will be printed) read.write(read) ## only print primary alignments continue bam.close() ## end processing