Hi everyone,
I aligned my paired-end sequencing data using Bowtie2 and processed the resulting BAM files with Picard MarkDuplicates. However, I noticed a discrepancy between the uniquely mapped read count reported by Bowtie2 and the number of uniquely mapped reads Samtools counts after filtering.

Bowtie2 alignment statistics (bowtie2.txt):

    8536975 reads; of these:
      5180282 (60.68%) aligned concordantly 0 times (unmapped)
      1868933 (21.89%) aligned concordantly exactly 1 time (uniquely mapped)
      1487760 (17.43%) aligned concordantly >1 times (multi-mapped)
    Overall alignment rate: 39.32%

- Uniquely mapped reads reported by Bowtie2: 1,868,933
- Uniquely mapped reads counted by Samtools in the final _mdu.bam: 2,877,386
- Samtools reports ~1M more uniquely mapped reads than Bowtie2.
- Why does my final _mdu.bam file (uniquely mapped, deduplicated) contain ~1M more reads than Bowtie2’s unique count?
- Does Bowtie2 apply stricter filtering than samtools view -q 30 when classifying uniquely mapped reads?
- Could multi-mapped reads or soft-clipped alignments still be counted in _mdu.bam, even with MAPQ ≥ 30?
- How does paired-end read counting differ between Bowtie2 and Samtools? Does Bowtie2 exclude some properly paired reads?
- Verified read counts before and after duplicate marking: `samtools view -c -F 1024 CR05NFYA_S1_mdu.bam`
- Checked MAPQ distribution: `samtools view CR05NFYA_S1_mdu.bam | awk '{print $5}' | sort -n | uniq -c`
- Excluded secondary alignments: `samtools view -c -F 256 CR05NFYA_S1_mdu.bam`
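
One more check I am considering, as a rough sketch only. As far as I understand, Bowtie2's paired-end summary counts pairs, while `samtools view -c` counts individual alignment records (one per mate), so the two numbers may not be in the same units. The helper below tallies both kept records and first-mate records from SAM text, so the first-mate count can be set against Bowtie2's figure. `summarise_sam` is a name I made up (it is not a samtools command), and the MAPQ cutoff of 30 and the excluded flags (unmapped, secondary, duplicate, supplementary) are my assumptions about what the filtering should be.

```shell
# Hypothetical helper (not part of samtools): reads SAM records on stdin and
# prints how many records pass the filters, and how many of those are first mates.
# Assumed filters: drop unmapped (4), secondary (256), duplicate (1024) and
# supplementary (2048) records, and anything with MAPQ below the cutoff.
summarise_sam() {
  min_mapq="${1:-30}"
  awk -v q="$min_mapq" '
    # Test a single FLAG bit with integer arithmetic (portable across awk versions).
    function bit(flag, b) { return int(flag / b) % 2 }
    /^@/ { next }                                   # skip header lines if present
    {
      f = $2 + 0
      if (bit(f, 4) || bit(f, 256) || bit(f, 1024) || bit(f, 2048)) next
      if (($5 + 0) < (q + 0)) next                  # MAPQ filter
      records++
      if (bit(f, 64)) first_mates++                 # 64 = first mate in pair
    }
    END { printf "records=%d first_mates=%d\n", records, first_mates }
  '
}

# Intended usage with the file from the commands above:
#   samtools view CR05NFYA_S1_mdu.bam | summarise_sam 30
```

If `first_mates` comes out at roughly half of `records`, that would suggest the ~1M gap is at least partly a pairs-versus-reads difference rather than a filtering difference.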
I would appreciate any insights!
Thanks in advance!