Hello,
How can I determine the number of uniquely aligned reads?
I used TopHat2 to align my paired-end FASTQ files.
It produced the following "align_summary.txt":
Left reads:
    Input : 47466605
    Mapped : 41265217 (86.9% of input)
    of these: 5205466 (12.6%) have multiple alignments (141994 have >20)
Right reads:
    Input : 47466605
    Mapped : 41580477 (87.6% of input)
    of these: 5255753 (12.6%) have multiple alignments (143224 have >20)
87.3% overall read mapping rate.
Aligned pairs: 39852098
    of these: 4959709 (12.4%) have multiple alignments
    408853 ( 1.0%) are discordant alignments
83.1% concordant pair alignment rate.
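To make the numbers easier to reuse, here is a minimal Python sketch that pulls the three pair counts out of the summary text. The function name `parse_pair_counts` and the regular expressions are my own assumptions based on the layout shown in this post, not something TopHat2 provides.

```python
import re

# The "Aligned pairs" block, copied from the summary in this post.
SUMMARY_PAIRS = """\
Aligned pairs: 39852098
of these: 4959709 (12.4%) have multiple alignments
408853 ( 1.0%) are discordant alignments
"""

def parse_pair_counts(text):
    """Return (aligned, multiple, discordant) pair counts from the summary text.

    Hypothetical helper: it assumes the wording shown in this post.
    """
    # Only look at the part starting at "Aligned pairs", so the per-read
    # "have multiple alignments" lines are not picked up by mistake.
    block = text[text.index("Aligned pairs"):]
    aligned = int(re.search(r"Aligned pairs:\s*(\d+)", block).group(1))
    multiple = int(
        re.search(r"(\d+)\s*\(\s*[\d.]+%\)\s*have multiple alignments", block).group(1)
    )
    discordant = int(
        re.search(r"(\d+)\s*\(\s*[\d.]+%\)\s*are discordant alignments", block).group(1)
    )
    return aligned, multiple, discordant

print(parse_pair_counts(SUMMARY_PAIRS))
```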
If I compute:
39,852,098 - 4,959,709 - 408,853 = 34,483,536 uniquely aligned reads.
But when I ran HTSeq on this SAM file, I obtained:
reads assigned to a gene : 30,247,445
no_feature : 6,495,866
ambiguous : 748,775
Unless I am mistaken, the sum of these three numbers should equal the number of uniquely aligned reads (since HTSeq only counts these reads). But 37,492,086 reads were processed, not 34,483,536.
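For reference, the arithmetic behind both totals can be checked with a short Python sketch. The variable names are mine, and the subtraction assumes that the multi-mapped and discordant pairs do not overlap, which I am not sure about.

```python
# Pair counts from the "Aligned pairs" block of align_summary.txt.
aligned_pairs = 39_852_098
multiple_alignments = 4_959_709
discordant = 408_853

# My estimate of uniquely aligned reads (assumes the two subtracted
# categories do not overlap).
estimate = aligned_pairs - multiple_alignments - discordant

# Counts reported by HTSeq.
htseq_counts = {
    "assigned_to_gene": 30_247_445,
    "no_feature": 6_495_866,
    "ambiguous": 748_775,
}
htseq_total = sum(htseq_counts.values())

# Gap between what HTSeq processed and my estimate.
difference = htseq_total - estimate

print(f"estimate:    {estimate:,}")
print(f"HTSeq total: {htseq_total:,}")
print(f"difference:  {difference:,}")
```

So HTSeq processed 3,008,550 more reads than my estimate accounts for.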
So is my calculation of unique reads wrong?
Does anyone have an idea how to interpret this "align_summary.txt" file?
Thanks to all!