Hello everyone,
I have been trying to process some RNA-seq data prepared using the ScriptSeq protocol and sequenced on a HiSeq machine. My pipeline is to remove adaptors with Trimmomatic, align against the transcriptome with bowtie2, and then quantify the transcripts with eXpress.
However, when I ran eXpress I got a warning message: "The observed alignments appear disproportionately in the forward-reverse order". I have been trying to understand what could cause this with paired-end data. After aligning each mate file individually against the transcriptome, I noticed that the first mate aligns to the forward strand most of the time, but the second mate seems to align to both strands. See below:
pair_1
**********************************************
Stats for BAM file(s):
**********************************************
Total reads: 5975216
Mapped reads: 2530358 (42.3476%)
Forward strand: 5836104 (97.6719%)
Reverse strand: 139112 (2.32815%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 0 (0%)
pair_2
**********************************************
Stats for BAM file(s):
**********************************************
Total reads: 5994394
Mapped reads: 2543964 (42.4391%)
Forward strand: 3587426 (59.8463%)
Reverse strand: 2406968 (40.1536%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 0 (0%)
Shouldn't reads in the second file always align to the reverse strand, or am I getting this wrong? And if so, what could have caused this? I am puzzled by the data, since the pairs are always supposed to be forward-reverse, right?
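One thing that may be worth checking: in both reports the forward and reverse counts add up to the total reads rather than the mapped reads (5836104 + 139112 = 5975216 for pair_1, 3587426 + 2406968 = 5994394 for pair_2), so unmapped reads are presumably being counted as "forward". Below is a minimal Python sketch, not part of my actual pipeline, that counts strands among mapped reads only by reading the SAM FLAG field. It assumes plain-text SAM records as input; the function name `count_strands` is just an illustrative choice.

```python
from collections import Counter

UNMAPPED = 0x4   # SAM FLAG bit: the read is unmapped
REVERSE = 0x10   # SAM FLAG bit: the read aligned to the reverse strand


def count_strands(sam_lines):
    """Count forward, reverse and unmapped reads from SAM text lines.

    Header lines (starting with '@') and blank lines are skipped.
    Unmapped reads are tallied separately instead of being treated
    as forward-strand alignments.
    """
    counts = Counter(forward=0, reverse=0, unmapped=0)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        elif flag & REVERSE:
            counts["reverse"] += 1
        else:
            counts["forward"] += 1
    return counts
```

For example, the output of `samtools view` on each BAM file could be piped into a small script that calls `count_strands(sys.stdin)`, and the forward/reverse split among mapped reads compared between the two files.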