Hi,
I am mapping paired-end RNA-seq data using tophat2, but the alignment summary shows a very high discordant alignment rate. The only tophat options I am specifying are -p 16 and -o "DIR". Below is the output from tophat2:
Left reads:
    Input   : 88556961
    Mapped  : 76938162 (86.9% of input)
    of these: 20429665 (26.6%) have multiple alignments (622137 have >20)
Right reads:
    Input   : 88556961
    Mapped  : 75252663 (85.0% of input)
    of these: 20114304 (26.7%) have multiple alignments (621700 have >20)
Unpaired reads:
    Input   : 68008
    Mapped  : 56927 (83.7% of input)
    of these: 8045 (14.1%) have multiple alignments (9 have >20)
85.9% overall read mapping rate.

Aligned pairs: 65389463
    of these: 18622479 (28.5%) have multiple alignments
              61775607 (94.5%) are discordant alignments
4.1% concordant pair alignment rate.
The flagstat output I get is also below:
341377625 + 0 in total (QC-passed reads + QC-failed reads)
189129873 + 0 secondary
0 + 0 supplimentary
0 + 0 duplicates
341377625 + 0 mapped (100.00%:-nan%)
152190825 + 0 paired in sequencing
76938162 + 0 read1
75252663 + 0 read2
263998 + 0 properly paired (0.17%:-nan%)
130778926 + 0 with itself and mate mapped
21411899 + 0 singletons (14.07%:-nan%)
116104620 + 0 with mate mapped to a different chr
80799382 + 0 with mate mapped to a different chr (mapQ>=5)
I am using cutadapt to remove adapters and low-quality reads, and that runs fine. But when I pass the paired files on to tophat, the results don't look good. From what I have read, this happens when the mate pairs are no longer in sync between the two fastq files. Is there a way around this, so that the number of discordant alignments comes down?
I have tried aligning the fastq files with tophat2 without running them through cutadapt first; the alignment is fine, with a very low discordant alignment rate. So I'm guessing the original fastq files are good and something goes wrong at the cutadapt step.
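To test the out-of-sync suspicion directly, the read names in the two trimmed fastq files could be compared record by record. The following is only a minimal sketch: the function names are made up for illustration, and it assumes plain 4-line FASTQ records whose mate names differ at most by a trailing /1 or /2 or by a comment after the first whitespace.

```python
import gzip
import itertools


def read_names(handle):
    """Yield the read name of each 4-line FASTQ record, without a /1 or /2 mate suffix."""
    for i, line in enumerate(handle):
        if i % 4 == 0:  # header line of each record
            fields = line.rstrip("\n")[1:].split()
            name = fields[0] if fields else ""
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name


def first_mismatch(handle1, handle2):
    """Return (record_index, name1, name2) for the first differing record, or None if in sync.

    If one file has fewer records, the missing name is reported as None.
    """
    pairs = itertools.zip_longest(read_names(handle1), read_names(handle2))
    for idx, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return idx, name1, name2
    return None


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Example usage (file names are hypothetical):
# with open_fastq("trimmed_1.fastq") as f1, open_fastq("trimmed_2.fastq") as f2:
#     print(first_mismatch(f1, f2))
```

If this reports a mismatch on the cutadapt output but not on the untrimmed files, that would support the idea that the trimming step is breaking the pairing.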
Just as a note, I am not using Galaxy for the analysis.
Thanks