  • Does Cufflinks Give Me Transcriptomes?

    Hi Everyone,

    I'm a beginner in this area, so please forgive any silly questions.

    My situation is that I have raw whole-genome scaffold sequences for my organism, but they are not annotated at all. I ran TopHat and Cufflinks and got some results. But do the Cuffdiff results here refer to transcripts or to scaffolds?

    I discussed this with some of my friends, and they are all concerned that what the Cuffdiff counts show me might be the scaffolds rather than the real transcripts. That would mean Cuffdiff counted how many reads fall on each scaffold, instead of on each transcript. But the Cufflinks website and manual seem to say that Cufflinks does assemble transcripts.

    I pasted some output from my gene_exp.diff:

    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
    XLOC_000001 XLOC_000001 - Scaffold1:84918-92189 5D 20D OK 310.481 1.31159 -7.88704 6.88788 5.66303e-12 1.023e-10 yes
    XLOC_000002 XLOC_000002 - Scaffold1:92592-96046 5D 20D OK 162.253 1.31639 -6.94551 4.25586 2.08245e-05 0.000137689 yes


    I appreciate any comments!

  • #2
    I'm not totally clear on what you've got there. Pretty much the only application for Cufflinks is when you have RNA-Seq reads that you align to some reference. Cufflinks processes those alignments and makes some decisions about how those reads, as aligned to the reference, might form transcripts. It then returns an annotation of the locations of those transcripts as a GTF file, which basically just shows you the start and end coordinates of the estimated exons, with attributes grouping them into transcripts. The Cufflinks suite uses IDs like "XLOC_xxxxxx" to name the gene loci it predicts from the alignments; the transcripts within each locus get IDs like "TCONS_xxxxxxxx".
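To make the "exons grouped into transcripts" idea concrete, here is a minimal Python sketch that reads exon lines in the standard 9-column GTF layout and collects them per transcript_id. The three GTF lines and the IDs "G1", "G1.1" and "G2.1" are made up for illustration; they are not from the poster's data, and real Cufflinks output carries more attributes than shown here.

```python
import re
from collections import defaultdict

# Hypothetical GTF lines (tab-separated, 9 columns). IDs and coordinates are invented.
GTF_TEXT = "\n".join([
    'Scaffold1\tCufflinks\texon\t400\t520\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1"; exon_number "2";',
    'Scaffold1\tCufflinks\texon\t100\t250\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1"; exon_number "1";',
    'Scaffold1\tCufflinks\texon\t900\t1300\t.\t-\t.\tgene_id "G2"; transcript_id "G2.1"; exon_number "1";',
])

# Matches attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def exons_by_transcript(gtf_text):
    """Return {transcript_id: sorted list of (start, end)} for exon features."""
    transcripts = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # only exon records describe transcript structure here
        attrs = dict(ATTR_RE.findall(fields[8]))
        transcripts[attrs["transcript_id"]].append((int(fields[3]), int(fields[4])))
    for exons in transcripts.values():
        exons.sort()
    return dict(transcripts)


TRANSCRIPTS = exons_by_transcript(GTF_TEXT)
for tid, exons in TRANSCRIPTS.items():
    print(tid, exons)
```

Each key is one predicted transcript and each value is its exon chain, which is all the GTF really encodes about structure.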

    Does this match your experiment?
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */



    • #3
      Thank you for responding.

      The problem is that I found more than one gene within a single Cufflinks-predicted transcript. So my friend questioned whether the assembly Cufflinks did is good or not. Is it a real transcript, whose differential expression is meaningful? Or is it a long fragment/contig that contains several genes, so that the differential expression result is not accurate?



      • #4
        So Cuffdiff won't give you FPKMs for the genomic scaffolds unless the annotation file you are feeding it lists a whole scaffold as a gene or transcript, which is definitely not the right thing to do.
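As a rough sanity check on the two rows pasted in the first post, the sketch below (Python) parses the locus column and recomputes the fold change from value_1 and value_2. It splits on whitespace because that is how the rows appear in the thread; a real gene_exp.diff is tab-separated. Taking end minus start as the approximate locus span is an assumption about the coordinate convention.

```python
import math

HEADER = ("test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 "
          "log2(fold_change) test_stat p_value q_value significant")
ROWS = [
    "XLOC_000001 XLOC_000001 - Scaffold1:84918-92189 5D 20D OK 310.481 1.31159 "
    "-7.88704 6.88788 5.66303e-12 1.023e-10 yes",
    "XLOC_000002 XLOC_000002 - Scaffold1:92592-96046 5D 20D OK 162.253 1.31639 "
    "-6.94551 4.25586 2.08245e-05 0.000137689 yes",
]


def summarize(row):
    """Extract locus span and recompute log2 fold change for one row."""
    rec = dict(zip(HEADER.split(), row.split()))
    scaffold, coords = rec["locus"].rsplit(":", 1)
    start, end = (int(x) for x in coords.split("-"))
    return {
        "gene_id": rec["gene_id"],
        "scaffold": scaffold,
        "span": end - start,  # approximate locus size in bases (assumed convention)
        "log2fc": math.log2(float(rec["value_2"]) / float(rec["value_1"])),
        "reported_log2fc": float(rec["log2(fold_change)"]),
    }


SUMMARIES = [summarize(r) for r in ROWS]
for s in SUMMARIES:
    print(s["gene_id"], s["scaffold"], s["span"],
          round(s["log2fc"], 3), s["reported_log2fc"])
```

Both loci sit on Scaffold1 and span only a few kilobases each, and the recomputed log2(value_2 / value_1) agrees with the reported column, so these rows describe two separate loci on one scaffold rather than the scaffold as a whole.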

        If you have no annotation, I wouldn't recommend using Cufflinks RABT annotation by itself. Instead, I would suggest assembling your RNA-Seq data de novo with Trinity or Trans-ABySS, then doing a genome annotation with MAKER using the RNA-Seq-derived transcripts. Once that is complete, you will have a much more reliable GTF annotation file to feed Cufflinks, and you can run the RABT annotation to add additional genes/transcripts if you wish. Though I would trust PASA more than Cufflinks for adding transcripts to your MAKER annotation file.

        I would warn you that this process is pretty involved, even for RNA-Seq veterans. But if you or your group went through the trouble of constructing a decent genome, you'd be doing yourselves a disservice by not creating a decent annotation to go with it.



        • #5
          Agreed. Also, bioinformatics with unannotated species is no simple matter. I mostly have the luxury of working with mouse data, which is nicely supported, and even that gets complicated at times. I have been collaborating with someone working with frog data, and it's a real mess trying to use Cufflinks and RNA-Seq reads to construct a real reference. For one thing, Cufflinks will provide you with real intron chains, but it doesn't do any type of biologically informed analysis to determine the end points of transcripts. Plus you're at the mercy of the randomness of RNA-Seq data. Cufflinks tries to fill in gaps and make guesses, but it's entirely based on simple thresholds: if a gap in coverage is less than 50 bases it will call that a continuous feature, and to call the end of a 3' or 5' exon it just uses a pileup cutoff threshold. In addition, it does whatever it can to report the smallest number of transcripts that "explain" the coverage, so it can easily be tricked by a complex locus.
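To illustrate what a gap threshold like that does, here is a toy Python sketch that joins covered blocks separated by fewer than 50 uncovered bases. This is not Cufflinks' actual code; the function name, the inclusive 1-based coordinates and the example blocks are all assumptions made for illustration.

```python
def merge_covered_blocks(blocks, max_gap=50):
    """Merge (start, end) covered blocks when the uncovered gap between
    neighbours is shorter than max_gap bases. Coordinates are treated as
    inclusive, so the gap between (100, 200) and (230, 300) is 29 bases."""
    merged = []
    for start, end in sorted(blocks):
        if merged and start - merged[-1][1] - 1 < max_gap:
            # Small gap (or overlap): extend the previous block.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Large gap: start a new feature.
            merged.append((start, end))
    return merged


# Hypothetical coverage blocks on one scaffold.
EXAMPLE = [(100, 200), (230, 300), (400, 450)]
MERGED = merge_covered_blocks(EXAMPLE)
print(MERGED)
```

With these made-up blocks, the 29-base gap is bridged and the 99-base gap is not. The same rule applied to two neighbouring genes with a short intergenic gap and read-through coverage would fuse them into one feature, which is one way a predicted transcript can end up containing more than one gene.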
          /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
          Salk Institute for Biological Studies, La Jolla, CA, USA */
