Hey,
I'm still very uncertain when dealing with strand-specific RNA-Seq data, especially when using TopHat2 and Cufflinks, as these make use of the strand information via the library types.
I found this table for the TopHat2 / Cufflinks library type options: http://www.nature.com/nprot/journal/...12.016_T1.html
In my data I can clearly see that the R1 (forward) read maps on the sense/coding strand and the R2 (reverse) read maps on the antisense strand.
Illustration:
a) gene located on wat (+) strand
......................R1
.....................----->
--------------[############# Gene ##############]-------------------- wat (+)
--------------------------------------------------------------------------------------------- cri (-)
..........................................................<------
............................................................R2
b) gene located on cri (-) strand
.......................R2
......................----->
--------------------------------------------------------------------------------------------- wat (+)
--------------[############# Gene ##############]-------------------- cri (-)
..........................................................<-----
............................................................R1
This would mean (according to my link) that I have fr-secondstrand, as the leftmost end of the fragment (in transcript coordinates) is the first sequenced.
Is this assumption correct?
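To make the rule I am applying explicit, here is a minimal Python sketch. It assumes a paired-end, strand-specific library and only encodes how I read the table: R1 on the same strand as the gene means fr-secondstrand, R1 on the opposite strand means fr-firststrand. The function name infer_library_type is my own and is not part of TopHat2 or Cufflinks.

```python
def infer_library_type(r1_strand: str, gene_strand: str) -> str:
    """Guess the TopHat2/Cufflinks library type from where R1 maps.

    Assumption: the library is stranded and paired-end. Both arguments
    are '+' (wat) or '-' (cri). This is a hypothetical helper, not a
    TopHat2 or Cufflinks function.
    """
    for strand in (r1_strand, gene_strand):
        if strand not in ("+", "-"):
            raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    # R1 on the gene's own (sense) strand: leftmost end of the fragment
    # in transcript coordinates is sequenced first.
    if r1_strand == gene_strand:
        return "fr-secondstrand"
    # R1 on the opposite (antisense) strand: rightmost end is sequenced first.
    return "fr-firststrand"


# Case a) gene on wat (+), R1 on wat (+)
print(infer_library_type("+", "+"))
# Case b) gene on cri (-), R1 on cri (-)
print(infer_library_type("-", "-"))
```

Both of my cases a) and b) give the same answer here, which is why I think the data are fr-secondstrand.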
What I still do not get are the terms "firststrand" and "secondstrand" themselves.
My understanding of the library prep is the following (leaving out fragmentation):
1) Transcription
5' [###########Gene############] 3' coding strand
3' -------------------------------------------------- 5' template strand
5' -------------------------------------------------- 3' mRNA
2) Adapter Ligation (Let's assume the 5' adapter seq is only AATT and the 3' adapter seq only GGCC)
5' AATT------------------------------------------------GGCC 3' mRNA+Adapters
3) 1st strand synthesis
5' AATT------------------------------------------------GGCC 3' mRNA+Adapters
3' TTAA------------------------------------------------CCGG 5' 1st cDNA
4) 2nd strand synthesis
5' AATT------------------------------------------------GGCC 3' 2nd cDNA <---- identical (U->T) to mRNA
3' TTAA------------------------------------------------CCGG 5' 1st cDNA
Let's skip the PCR.
5a) Sequencing 1st cDNA strand
5'........SeqPrimer----->
3' TTAA------------------------------------------------CCGG 5' 1st cDNA
As I see it, I now get a read that is identical to a part of the mRNA sequence located at the left end.
5b) Sequencing 2nd cDNA strand
5' AATT------------------------------------------------GGCC 3' 2nd cDNA
.....................................<-----remirPqeS..........5'
Now I should get a read whose reverse complement is identical to a part of the mRNA sequence located at the right end.
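Steps 2) to 5b) can be replayed on a toy sequence. This is only a sketch of my own understanding as drawn above, not a model of any real kit: the transcript ATGGCCTTAGCA is made up, the adapters are the AATT and GGCC from step 2, and the names reverse_complement and simulate_reads are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(mrna: str, read_len: int = 4,
                   adapter5: str = "AATT", adapter3: str = "GGCC") -> dict:
    """Replay steps 2-5 of the toy library prep (assumed, simplified)."""
    # Step 2: ligate adapters; write the mRNA in DNA letters (U -> T).
    tagged = adapter5 + mrna.replace("U", "T") + adapter3
    # Step 3: 1st cDNA strand is the reverse complement of the mRNA.
    first_cdna = reverse_complement(tagged)
    # Step 4: 2nd cDNA strand is the reverse complement of the 1st strand.
    second_cdna = reverse_complement(first_cdna)
    # Step 5a: primer extends along the 1st strand as template, so the
    # read is the complement of the 1st strand, starting after the adapter.
    read_5a = reverse_complement(first_cdna)[len(adapter5):len(adapter5) + read_len]
    # Step 5b: primer extends along the 2nd strand as template.
    read_5b = reverse_complement(second_cdna)[len(adapter3):len(adapter3) + read_len]
    return {"second_cdna": second_cdna, "5a": read_5a, "5b": read_5b}


reads = simulate_reads("AUGGCCUUAGCA")
print(reads["5a"])                      # left end of the mRNA
print(reverse_complement(reads["5b"]))  # right end of the mRNA
```

In this toy run the read from 5a has the same sequence as the 2nd cDNA strand (and the mRNA), even though the 1st strand served as the template.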
With this understanding of the library prep I would say that if my R1 (forward) read is located on the sense/coding strand I would have sequenced the 1st strand first, but according to my link it must have been "secondstrand".
I hope someone can follow this and spot where I am misinterpreting either the first/secondstrand terms or the library prep.
Thanks in advance
Mario