Strand specific RNA-seq data from dUTP protocol

  • Strand specific RNA-seq data from dUTP protocol

    Hi all,
    We have single-end, strand-specific RNA-seq data generated with the dUTP protocol. Should the reads from this protocol map to the sense strand or the antisense strand?

    The Trinity documentation says dUTP is RF, so I think single-end reads should be antisense.
    Paired reads:
    RF: first read (/1) of fragment pair is sequenced as anti-sense (reverse(R)), and second read (/2) is in the sense strand (forward(F)); typical of the dUTP/UDG sequencing method.

    However, the TopHat website says:

    fr-firststrand (dUTP, NSR, NNSR): Same as above except we enforce the rule that the right-most end of the fragment (in transcript coordinates) is the first sequenced (or only sequenced for single-end reads). Equivalently, it is assumed that only the strand generated during first strand synthesis is sequenced.

    Can anyone give me some suggestions? Many thanks.


  • #2
    Hi, Zheng.

    Has your question been solved or answered? I am also interested in the answer.




    • #3
      I used TopHat to map dUTP strand-specific libraries with '--library-type fr-firststrand'.

      There is a nice quality control package called RSeQC.
      One of its functions, infer_experiment, tells you whether the data come from a strand-specific protocol or not.

      For single end reads the rule goes like this:
      If the read maps to the plus strand it indicates that the parental gene is on the minus strand.
      If the read maps to the minus strand it indicates that the parental gene is on the plus strand.
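
The single-end rule above can be written as a small lookup. This is only a sketch under the assumption that the library really is dUTP-style (fr-firststrand); the function names are made up for illustration and do not come from RSeQC or TopHat.

```python
# Sketch of the single-end rule for dUTP (fr-firststrand) libraries.
# All names here are illustrative, not part of any existing package.

FLIP = {"+": "-", "-": "+"}


def gene_strand_from_read(read_strand: str) -> str:
    """Return the strand of the transcribed gene implied by a read's mapped strand."""
    if read_strand not in FLIP:
        raise ValueError(f"strand must be '+' or '-', got {read_strand!r}")
    return FLIP[read_strand]


def supports_sense_transcription(read_strand: str, annotated_gene_strand: str) -> bool:
    """True when the read is consistent with transcription of the annotated gene itself."""
    return gene_strand_from_read(read_strand) == annotated_gene_strand


print(gene_strand_from_read("+"))              # read on plus -> gene on minus
print(supports_sense_transcription("-", "+"))  # read on minus, gene on plus
```

A read for which `supports_sense_transcription` returns False would point to antisense transcription at that locus, in the sense described in the next paragraph.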

      Finally, it should be stressed (please correct me if you think I am wrong) that there are no such things as sense and antisense strands. Rather, if a gene is transcribed, that is sense transcription, and if the opposite strand of the gene is transcribed, that is antisense transcription. The gene in question can be located on either strand.



      • #4
        Thanks. I will try it.


        • #5
          Hi Blanco,

          We've tried RSeQC to check the library type of our RNA-seq data, but the package doesn't give much detail about the output.

          Here are our results:

          Fraction of reads explained by "1++,1--,2+-,2-+": 0.0078
          Fraction of reads explained by "1+-,1-+,2++,2--": 0.9922
          Fraction of reads explained by other combinations: 0.0000

          If we use TopHat for mapping, should we set "--library-type" to fr-firststrand or fr-secondstrand?


          • #6
            Hi masterpiece,
            This is the same pattern I get for the dUTP method, and I use fr-firststrand in TopHat, so I assume that would also apply to your reads.
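
For anyone who wants to automate this check, here is a rough sketch that reads the infer_experiment fractions and suggests a TopHat library type. The mapping of "1+-,1-+,2++,2--" to fr-firststrand follows what is said above; the mapping of the other pattern to fr-secondstrand is assumed to be the mirror case, and the 0.8 cutoff and the function name are arbitrary choices for illustration.

```python
import re

# Strand patterns as printed by infer_experiment for paired-end data.
# Second entry follows this thread; first entry is the assumed mirror case.
PATTERN_TO_LIBRARY = {
    "1++,1--,2+-,2-+": "fr-secondstrand",
    "1+-,1-+,2++,2--": "fr-firststrand",
}

LINE_RE = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')


def suggest_library_type(report: str, min_fraction: float = 0.8) -> str:
    """Pick a library type when one strand pattern clearly dominates (cutoff is an assumption)."""
    fractions = {m.group(1): float(m.group(2)) for m in LINE_RE.finditer(report)}
    for pattern, library in PATTERN_TO_LIBRARY.items():
        if fractions.get(pattern, 0.0) >= min_fraction:
            return library
    # Neither pattern dominates: treat the data as unstranded.
    return "fr-unstranded"


report = '''Fraction of reads explained by "1++,1--,2+-,2-+": 0.0078
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9922
Fraction of reads explained by other combinations: 0.0000'''

print(suggest_library_type(report))  # fr-firststrand
```

If the two fractions come out close to 0.5 each, the sketch falls back to fr-unstranded, which is worth double-checking against the library prep protocol.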

            This is impressive strand specificity, I usually get around 97%. What method do you use for library preparation?


            • #7
              Hi Blanco,

              Sorry for late reply. Busy with other stuff lately.

              This is a human transcriptome dataset from ENCODE/Cold Spring Harbor Lab. They have done a lot of RNA-seq sequencing. You can check their method in this paper, and yes, they use the dUTP method.

              Jiang, L., Schlesinger, F., Davis, C. A., Zhang, Y., Li, R., Salit, M., Gingeras, T. R., et al. (n.d.). Synthetic spike-in standards for RNA-seq experiments, 1543–1551. doi:10.1101/gr.121095.111.

              So I think in this case, practice makes perfect, huh?
              Last edited by masterpiece; 10-15-2012, 09:22 PM.


              • #8
                Some of the RNA in a cell will be true "antisense", that is, transcripts from the strand opposite the coding strand. However, this will be non-coding RNA, with rare exceptions, so it is not clear to me whether it would be polyadenylated.

                However, this antisense RNA would be expected to hybridize to the sense strand, at least to some extent. So it could "hitchhike" into your polyA+ pool that way.



                • #9

                  I have a strand-specific question. If you run the Tuxedo pipeline with the strand-specific argument in the tophat command, do you also need to give it to cufflinks, or only to tophat? For example:

                  tophat -p 8 -G genes.gtf -o C1_R1_thout --library-type=fr-firststrand genome C1_R1_1.fq

                  cufflinks -p 8 -o C1_R1_clout C1_R1_thout/accepted_hits.bam (add --library-type here?)
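
For what it's worth, Cufflinks documents its own --library-type option with the same value names as TopHat, so one cautious approach is to pass the same value to both tools. The sketch below only builds the two command lines and runs nothing; the helper name and its defaults are hypothetical, and the file names are taken from the example above.

```python
from typing import List


def tuxedo_commands(library_type: str, sample: str, reads: str,
                    genome_index: str = "genome", gtf: str = "genes.gtf",
                    threads: int = 8) -> List[List[str]]:
    """Build tophat and cufflinks argument lists that share one --library-type value."""
    thout = f"{sample}_thout"
    clout = f"{sample}_clout"
    tophat = ["tophat", "-p", str(threads), "-G", gtf, "-o", thout,
              "--library-type", library_type, genome_index, reads]
    # cufflinks reads the alignments that tophat wrote into its output directory
    cufflinks = ["cufflinks", "-p", str(threads), "-o", clout,
                 "--library-type", library_type, f"{thout}/accepted_hits.bam"]
    return [tophat, cufflinks]


for cmd in tuxedo_commands("fr-firststrand", "C1_R1", "C1_R1_1.fq"):
    print(" ".join(cmd))
```

Keeping the library type in a single variable avoids the two steps silently disagreeing about strandedness.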