Hi,
I have 85 bp paired-end RNA-seq data from Illumina GA IIE (fragment size: 180 bp). I tried three mappings to a reference:
1. Using both reads of each pair.
tophat -p 2 -r 10 -o ./tophatr16unipaired Ref/Zm.seq.uniq seqs_filtered_4_1.fastq seqs_filtered_4_2.fastq
2. Using only the reads from one end of the fragments.
tophat -p 2 -r 10 -o ./tophatr16unisingle1 Ref/Zm.seq.uniq seqs_filtered_4_1.fastq
3. Using only the reads from the opposite end of the fragments.
tophat -p 2 -r 10 -o ./tophatr16unisingle2 Ref/Zm.seq.uniq seqs_filtered_4_2.fastq
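For context on the -r 10 in the commands above: TopHat's -r (--mate-inner-dist) option is the expected inner distance between the two mates, and the value 10 is consistent with the fragment size minus the two read lengths. A quick sketch of that arithmetic, assuming the 180 bp fragment size does not include adapters (the function name is only for illustration):

```python
def mate_inner_distance(fragment_size: int, read_length: int) -> int:
    """Expected gap between the two mates: fragment length minus both reads."""
    return fragment_size - 2 * read_length


print(mate_inner_distance(180, 85))  # 10
```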
But the results of the three mappings show large differences in the number of reads mapped to the reference. Around 80% of the reads mapped in experiment 2, but only around 50% in experiment 3, and only around 30% when I did the paired-end mapping in experiment 1.
Has anyone seen something like this before? Am I wrong to expect both reads of a pair to map to the reference at a similar rate?
I can't think of a cause other than a difference in read quality, but the average quality score plots from fastx look very similar for the two files.
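To make sure the three percentages are counted the same way, here is a minimal sketch of how the mapped fraction could be computed. It assumes the alignments are available as SAM text lines (for example TopHat's accepted_hits output, converted with samtools view if it is BAM) and that the total number of input reads is counted separately from the FASTQ files. The function name and arguments are hypothetical, not part of TopHat.

```python
def mapped_read_fraction(sam_lines, total_reads):
    """Fraction of input reads that have at least one reported alignment.

    sam_lines:   iterable of SAM text lines (header lines start with '@').
    total_reads: number of reads in the input FASTQ file(s); for a paired
                 run, count both mates.
    """
    mapped = set()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x4:  # SAM flag bit 0x4: read is unmapped
            continue
        # Bits 0x40 / 0x80 mark first / second mate; 0 means single-end.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        # A set ensures a multi-mapped read is counted only once.
        mapped.add((name, mate))
    return len(mapped) / total_reads
```

Counting distinct (read name, mate) pairs rather than alignment lines keeps multi-mapped reads from inflating the percentage.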
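One other thing I have not ruled out: both files were filtered, and as far as I understand TopHat expects the two FASTQ files to list mates in the same order. If the filtering dropped reads from one file but not the other, the pairs would be out of sync. A rough sketch of a check, assuming plain 4-line FASTQ records and read names that end in /1 and /2 (the helper names are my own):

```python
from itertools import zip_longest


def read_ids(fastq_lines):
    """Yield the read ID of each 4-line FASTQ record, without /1 or /2."""
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:
            continue  # only the first line of each record holds the ID
        parts = line.split()
        if not parts:
            continue  # tolerate a trailing blank line
        name = parts[0]
        if name.startswith("@"):
            name = name[1:]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name


def first_mismatch(fastq1_lines, fastq2_lines):
    """Return (record index, id1, id2) for the first out-of-sync pair, else None."""
    pairs = zip_longest(read_ids(fastq1_lines), read_ids(fastq2_lines))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return index, id1, id2
    return None


# Example use with the files from the commands above:
# with open("seqs_filtered_4_1.fastq") as f1, open("seqs_filtered_4_2.fastq") as f2:
#     print(first_mismatch(f1, f2))
```

A result of None would mean the two files list the same read IDs in the same order; anything else gives the first record where they diverge (a missing ID shows up as None when one file is shorter).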
Any advice will be greatly appreciated.
Thanks in advance,
Adarsh Jose
Iowa State University