Hi All,
I'm new to analyzing sequencing data (RNA-seq in particular). I am mapping paired-end reads from libraries that were prepared using a modified RNA-seq protocol out of the Weissman lab. I have a simple question that I cannot seem to find the answer to.
The TopHat manual describes three library types as follows:
"fr-unstranded (Standard Illumina): Reads from the left-most end of the fragment (in transcript coordinates) map to the transcript strand, and the right-most end maps to the opposite strand.
fr-firststrand (dUTP, NSR, NNSR): Same as above except we enforce the rule that the right-most end of the fragment (in transcript coordinates) is the first sequenced (or only sequenced for single-end reads). Equivalently, it is assumed that only the strand generated during first strand synthesis is sequenced.
fr-secondstrand (Ligation, Standard SOLiD): Same as above except we enforce the rule that the left-most end of the fragment (in transcript coordinates) is the first sequenced (or only sequenced for single-end reads). Equivalently, it is assumed that only the strand generated during second strand synthesis is sequenced."
Based on scripts from my lab members and brief discussions with them, it seems that fr-firststrand gives the correct mapping. My overarching question is: why? More specifically, the TopHat manual describes these options in terms of the left-most or right-most end of the fragment in transcript coordinates, but I can't find anywhere that transcript coordinates are defined. What are they? Would the left-most and right-most transcript coordinates correspond to the 5' and 3' ends of the transcript, respectively (regardless of whether the transcript is located on the sense or antisense strand)?
Thank you for your help!