  • Very low Sn/Sp outputs from Cufflinks?

    I have obtained an RNA-seq library from my collaborator with more than 100M reads in total, each 36bp long, from three Illumina sequencing lanes.

    I tried to use tophat + cufflinks to discover novel splice isoforms in this library. I realize it would be ideal to use paired-end reads with longer lengths such as 75bp; I just want to see what I can get from cufflinks. However, the output looks disappointing after running cuffcompare:

    #--------------------|   Sn |   Sp |  fSn |  fSp
    Base level:            59.0   17.8      -      -
    Exon level:             1.7    0.4   18.6    4.0
    Intron level:           7.5   47.0    7.6   47.3
    Intron chain level:     0.1    0.1    0.1    0.1
    Transcript level:       0.0    0.0    0.0    0.0
    Locus level:            0.1    0.0    0.2    0.0

    Missed exons:     66987/206780 ( 32.4%)
    Wrong exons:     813919/958710 ( 84.9%)
    Missed introns:  167142/185318 ( 90.2%)
    Wrong introns:     11318/29587 ( 38.3%)
    Missed loci:        5737/21602 ( 26.6%)
    Wrong loci:      782485/927668 ( 84.3%)

    At the transcript level, both Sn and Sp are zero! Does that mean cufflinks is not supposed to be run on short single-end RNA-seq data? Is this typical, or did I do something wrong? Any input?

    - L
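For readers trying to interpret the summary above, here is a minimal Python sketch that recomputes the percentages on the "Missed" and "Wrong" lines from their raw counts. It assumes (this is not stated in the post) that the denominator on each "Missed" line is the number of reference features and the denominator on each "Wrong" line is the number of predicted features. The names `SUMMARY`, `pct` and `mismatches` are made up for this illustration.

```python
# Counts copied from the cuffcompare summary in the post:
# label -> (count, total, percentage as reported)
SUMMARY = {
    "Missed exons":   (66987, 206780, 32.4),
    "Wrong exons":    (813919, 958710, 84.9),
    "Missed introns": (167142, 185318, 90.2),
    "Wrong introns":  (11318, 29587, 38.3),
    "Missed loci":    (5737, 21602, 26.6),
    "Wrong loci":     (782485, 927668, 84.3),
}


def pct(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


def mismatches(summary):
    """Labels whose recomputed percentage differs from the reported one."""
    return [label for label, (count, total, reported) in summary.items()
            if pct(count, total) != reported]


for label, (count, total, reported) in SUMMARY.items():
    print(f"{label:15s} {count:>7d}/{total:<7d} -> {pct(count, total):5.1f}% "
          f"(reported {reported}%)")

# Assumption: the 'Wrong' totals are predicted features and the 'Missed'
# totals are reference features, so their ratio hints at fragmentation.
print("predicted/reference exons:", round(958710 / 206780, 1))
print("predicted/reference loci: ", round(927668 / 21602, 1))
```

Under that assumption, the assembly contains far more predicted loci than there are reference loci, which would be consistent with many short, fragmented transfrags rather than full-length transcripts.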

  • #2
    I'm bumping this post. It's strange that nobody has replied to it. Does that mean nobody knows the answer? ...



    • #3
      Reads of 36bp are really short for a Tophat + Scripture/Cufflinks approach. The software will run, but you will take a performance hit in addition to getting less informative output. We have analyzed libraries of ~200 million paired 36-mers + 42-mers (20% and 80% of the data respectively) with Tophat + Scripture/Cufflinks. The output was interesting but did not work nearly as well as an approach involving mapping reads directly to a database of junctions, transcripts and genomic sequences.

      This is not a failing of the Tophat + Cufflinks/Scripture approach; these methods are simply not optimized for such short reads. TopHat attempts to identify splice junctions by splitting the reads (an oversimplification). Any method that takes this type of approach will suffer when the reads are that short.

      If you want decent sensitivity/specificity for detecting junctions, you could try mapping to a database of known and predicted junction sequences of suitable length... Once you are analyzing libraries like paired 75-mers you will find that Cufflinks and Scripture shine a lot brighter... Another option is to use Trans-ABySS or some other de novo assembly approach to make longer contigs out of your 36-mers and then align these instead...
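To illustrate what "junction sequences of suitable length" could mean for 36bp reads, here is a rough Python sketch. It is not the poster's pipeline; it shows one common way to size a junction sequence so that any read aligning to it end to end must cross the exon-exon boundary by a minimum number of bases. The function names and the `min_anchor` value of 4 are assumptions chosen for the example.

```python
def junction_sequence(donor_exon, acceptor_exon, read_len=36, min_anchor=4):
    """Join the end of one exon to the start of the next.

    Each flank is read_len - min_anchor bases long, so a read of read_len
    bases that fits inside the result must overlap both exons by at least
    min_anchor bases. Returns (sequence, offset of the junction in it).
    """
    if not 1 <= min_anchor < read_len:
        raise ValueError("min_anchor must be between 1 and read_len - 1")
    flank = read_len - min_anchor
    left = donor_exon[-flank:]       # last `flank` bases of the upstream exon
    right = acceptor_exon[:flank]    # first `flank` bases of the downstream exon
    return left + right, len(left)


def smallest_anchor(seq_len, junction_offset, read_len):
    """Smallest overlap with either exon over all read placements."""
    anchors = []
    for start in range(seq_len - read_len + 1):
        left_overlap = junction_offset - start
        right_overlap = start + read_len - junction_offset
        anchors.append(min(left_overlap, right_overlap))
    return min(anchors)


# Toy exons (hypothetical sequences, just to show the sizes involved).
seq, offset = junction_sequence("A" * 50, "C" * 50, read_len=36, min_anchor=4)
print(len(seq), offset, smallest_anchor(len(seq), offset, 36))
```

With 36bp reads and a 4-base minimum anchor, each junction sequence is 64 bases long, and every possible read placement on it spans the junction. A real junction database would be built from annotated and predicted exon coordinates, which this sketch leaves out.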



      • #4
        Yeah...That's what I thought. Our library is mainly good for gene expression profiling. Thx for sharing your experience!
