Hello,
I am looking for a tool or script that counts the number of reads in a BAM file by position, taking into account only successfully paired reads. I would like to estimate PCR duplicates under the (possibly erroneous, I understand!) assumption that read pairs with the same genomic coordinates are copies.
For example, for the following input:
Chr1:1-500
R1--------------> <---------------R2
R1--------------> <---------------R2
R1--------------> <---------------R2
Chr1:5-505
R1--------------> <---------------R2
Chr1:10-510
R1--------------> <-----------R2
R1--------------> <-----------R2
R1-------------->
I would like to output this information:
Chr1:1-500 3
Chr1:5-505 1
Chr1:10-510 2
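To make the request concrete, here is a minimal sketch of the counting I have in mind, not a definitive implementation. It assumes the input is SAM-formatted text (for example the output of `samtools view`), that the aligner sets the proper-pair flag (0x2), and that TLEN spans the fragment from its leftmost to its rightmost mapped base. The function names `count_fragments` and `format_counts` are made up for illustration, and soft-clipping is ignored.

```python
from collections import Counter

# SAM flag bits used below (values from the SAM specification)
PAIRED = 0x1
PROPER_PAIR = 0x2
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_fragments(sam_lines):
    """Count properly paired fragments by (chrom, start, end).

    Assumption: TLEN is the signed fragment length, positive on the
    leftmost mate, so each pair is counted once via that mate.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        chrom = fields[2]
        pos = int(fields[3])   # 1-based leftmost position of this read
        tlen = int(fields[8])  # signed template length
        if not (flag & PAIRED and flag & PROPER_PAIR):
            continue  # drop unpaired or improperly paired reads
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count primary alignments only
        if tlen <= 0:
            continue  # rightmost mate; its pair is counted via the other mate
        counts[(chrom, pos, pos + tlen - 1)] += 1
    return counts


def format_counts(counts):
    """Return 'chrom:start-end count' lines sorted by coordinate."""
    return [
        f"{chrom}:{start}-{end} {n}"
        for (chrom, start, end), n in sorted(counts.items())
    ]
```

Something like `samtools view in.bam` piped into a script that calls `format_counts(count_fragments(sys.stdin))` would then print one line per distinct fragment, where `in.bam` is a placeholder file name. Any fragment with a count above 1 would be a candidate duplicate under the assumption above.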