  • Very low Sn/Sp outputs from Cufflinks?

    I have obtained an RNA-seq library from my collaborator with a total of more than 100M reads of length 36bp from three Illumina sequencing lanes.

    So I tried to use tophat + cufflinks to discover some novel splice isoforms from this library. I do realize that it is ideal to use paired-end reads with longer lengths such as 75bp, but I just want to see what I can get from cufflinks. However, the outputs seem a bit disappointing after running cuffcompare:

    #--------------------|   Sn |   Sp |  fSn |  fSp
            Base level:    59.0   17.8      -      -
            Exon level:     1.7    0.4   18.6    4.0
          Intron level:     7.5   47.0    7.6   47.3
    Intron chain level:     0.1    0.1    0.1    0.1
      Transcript level:     0.0    0.0    0.0    0.0
           Locus level:     0.1    0.0    0.2    0.0

          Missed exons:   66987/206780  ( 32.4%)
           Wrong exons:  813919/958710  ( 84.9%)
        Missed introns:  167142/185318  ( 90.2%)
         Wrong introns:   11318/29587   ( 38.3%)
           Missed loci:    5737/21602   ( 26.6%)
            Wrong loci:  782485/927668  ( 84.3%)
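
    For readers checking the numbers above, here is a minimal Python sketch. It assumes cuffcompare's Sn and Sp follow the usual gene-prediction definitions, Sn = TP/(TP+FN) and Sp = TP/(TP+FP), so "Sp" behaves like what is often called precision. The function names are hypothetical, and only the missed/wrong counts quoted in the post are used; the TP/FN/FP counts behind the Sn/Sp columns are not in the output, so they are not reconstructed here.

```python
def sensitivity(tp, fn):
    """Sn as a percentage: share of reference features that were recovered."""
    return 100.0 * tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tp, fp):
    """Sp as a percentage (assumed TP/(TP+FP)): share of predictions that are correct."""
    return 100.0 * tp / (tp + fp) if (tp + fp) else 0.0


def percent(part, total):
    """Percentage rounded to one decimal, as printed in the summary."""
    return round(100.0 * part / total, 1)


# Counts copied from the cuffcompare summary in the post.
COUNTS = {
    "Missed exons": (66987, 206780),
    "Wrong exons": (813919, 958710),
    "Missed introns": (167142, 185318),
    "Wrong introns": (11318, 29587),
    "Missed loci": (5737, 21602),
    "Wrong loci": (782485, 927668),
}

if __name__ == "__main__":
    for label, (part, total) in COUNTS.items():
        print(f"{label}: {part}/{total} ({percent(part, total)}%)")
```

    The "missed" rows are fractions of reference features and the "wrong" rows are fractions of predicted features, so the very large "wrong" denominators (958710 predicted exons against 206780 reference exons) are consistent with the low Sp values at the exon and locus levels.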

    At the transcript level, both Sn and Sp are zero! Does that mean cufflinks is not supposed to be run with short single-end RNA-seq data? Is this typical, or did I do something wrong? Any input?

    - L

  • #2
    I'm bumping this post. It's strange that nobody has replied to it. Does that mean nobody knows the answer? ...

    • #3
      Reads of 36bp are really short for a Tophat + Scripture/Cufflinks approach. The software will run, but you will take a performance hit in addition to getting less informative output. We have analyzed libraries of ~200 million paired 36-mers + 42-mers (20% and 80% of the data respectively) with Tophat + Scripture/Cufflinks. The output was interesting but did not work nearly as well as an approach involving mapping reads directly to a database of junctions, transcripts and genomic sequences.

      This is not a failing of the Tophat + Cufflinks/Scripture approach; it is simply that these methods are not optimized for such short reads. TopHat attempts to identify splice junctions by splitting the reads (an oversimplification). Any method that takes this type of approach will suffer when the reads are that short.

      If you want to have decent sensitivity/specificity for detecting junctions, you could try mapping to a database of known and predicted junction sequences of suitable length... Once you are analyzing libraries like paired 75-mers, you will find that Cufflinks and Scripture shine a lot brighter... Another option is to use Trans-ABySS or some other de novo assembly approach to make longer contigs out of your 36-mers and then align these instead...
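
      As a rough illustration of what "junction sequences of suitable length" can mean, here is a minimal Python sketch, not the pipeline described above. It assumes each junction sequence keeps `read_len - min_overhang` bases on each side of the splice site, so that any read crossing the junction with at least `min_overhang` bases on both sides fits entirely inside the junction sequence. The function names, the default `min_overhang=4`, and the toy exon sequences are all hypothetical.

```python
from itertools import combinations


def junction_flank(read_len, min_overhang):
    """Bases to keep on each side of a junction for reads of length read_len."""
    if not 0 < min_overhang <= read_len // 2:
        raise ValueError("min_overhang must be between 1 and read_len // 2")
    return read_len - min_overhang


def build_junctions(exons, read_len=36, min_overhang=4):
    """Build junction sequences for every ordered exon pair of one gene.

    exons: list of (name, sequence) tuples in 5'->3' transcript order.
    Adjacent pairs stand in for known junctions; non-adjacent pairs stand in
    for predicted (exon-skipping) junctions.
    """
    flank = junction_flank(read_len, min_overhang)
    junctions = {}
    for (up_name, up_seq), (down_name, down_seq) in combinations(exons, 2):
        # Exons shorter than the flank are used whole (slicing clips safely).
        junctions[f"{up_name}_{down_name}"] = up_seq[-flank:] + down_seq[:flank]
    return junctions


if __name__ == "__main__":
    toy_exons = [("e1", "A" * 40), ("e2", "C" * 10), ("e3", "G" * 40)]
    for name, seq in build_junctions(toy_exons).items():
        print(f">{name}\n{seq}")
```

      With 36bp reads and a 4-base minimum overhang, each junction sequence is at most 64 bases long. Keeping the flanks shorter than the read length means a read cannot align to a junction sequence without actually crossing the splice site.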

      • #4
        Yeah... That's what I thought. Our library is mainly good for gene expression profiling. Thanks for sharing your experience!
