  • To cuffcompare or not to cuffcompare

    Dear Bioinformaticians,
    The experiment: RNA-seq transcriptomics. We simply want to find differentially expressed genes between two Illumina samples, case vs. control. We do not need novel transcripts, only differentially expressed known genes and/or known transcript isoforms.

    Am I right to assume that there is no need to use Cufflinks or Cuffcompare?
    As far as I understand, Cuffcompare does the following:
    1) Compares your assembled transcripts to a reference annotation
    2) Tracks Cufflinks transcripts across multiple experiments
    Does it do anything else that we would need?

    Our proposed approach to finding known differentially expressed genes/isoforms:
    1) Run TopHat with a RefSeq GTF file on both case and control
    2) Run cuffdiff on the two SAM files generated in step 1, together with the same GTF file used in step 1

    Please let me know whether this is a valid approach.

    Thank you,

    J.
    Last edited by poisson200; 07-29-2010, 04:16 AM.

  • #2
    There are 49 views but nobody knows the answer?

    Or is this a stupid question?


    • #3
      I would really like to help you, but my knowledge of programming is very limited.

      Olgy Wolgy


      • #4
        Hi J,
        I'm new to this field, so this is just a guess.
        I think you have to run Cufflinks on each SAM file and then run cuffcompare on the two resulting transcripts.gtf files together with your reference annotation file:
        cuffcompare -r Mus_musculus.test.gtf -R -o prefix transcripts1.gtf transcripts2.gtf

        Then you run cuffdiff on the combined output GTF file (from cuffcompare) and the two SAM files that TopHat generated in step 1:
        cuffdiff -o diff_out/ combined.gtf accepted_hits1.sam accepted_hits2.sam

        The differentially expressed genes and isoforms will be in 0_1_gene_exp.diff and 0_1_isoform_exp.diff, but it seems that there is no output for coding sequences, as mentioned in another thread: http://seqanswers.com/forums/showthread.php?t=4989
        If you have trouble with duplicate entries in your reference annotation files, check this thread: http://seqanswers.com/forums/showthread.php?t=3493


        • #5
          Thanks Olgy,
          I am sure your intelligence and knowledge of other things are vast.

          ===============================

          Dear Enrico,
          Thanks for your answer. I have since found that the following answers my questions, which are very similar to yours (the known-only case and the known-plus-novel case):

          1) Known transcripts only: find differentially expressed known RefSeq transcripts between case and control data (no novel transcripts wanted)
          a. Run TopHat on both FASTQ read files, using a RefSeq GFF file
          b. Run cuffdiff on both SAM outputs, using a RefSeq GTF file

          2) Known and novel transcripts: find differentially expressed transcripts, both novel and known RefSeq
          a. Run TopHat on both FASTQ files (without a GFF file)
          b. Run Cufflinks without a GTF file on each SAM file
          c. Run cuffcompare with the RefSeq GTF file and the two GTF files that Cufflinks produced (one per sample)
          d. Run cuffdiff on the combined GTF file and the two SAM files
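
The two workflows listed above could be sketched as a small shell script along the following lines. This is only an illustration, not a tested pipeline. The file names (`case.fastq`, `control.fastq`, `refseq.gtf`), the Bowtie index name `genome_index`, the output directory names and the `run` dry-run helper are all placeholders introduced here. It assumes TopHat writes `accepted_hits.sam` into its output directory, as in this thread (newer releases write `accepted_hits.bam`), and that Cuffcompare names its merged annotation `<prefix>.combined.gtf`, which may differ between versions.

```shell
# Dry-run sketch: by default the commands are only printed.
# Set DRY_RUN=0 to actually execute them (requires TopHat and Cufflinks).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Workflow 1: known transcripts only (TopHat with annotation, then cuffdiff).
# Arguments: Bowtie index, reference GTF, case reads, control reads.
known_only() {
    index=$1; gtf=$2; case_fq=$3; ctrl_fq=$4
    run tophat -G "$gtf" -o tophat_case "$index" "$case_fq"
    run tophat -G "$gtf" -o tophat_control "$index" "$ctrl_fq"
    run cuffdiff -o diff_out "$gtf" \
        tophat_case/accepted_hits.sam tophat_control/accepted_hits.sam
}

# Workflow 2: known and novel transcripts
# (TopHat, Cufflinks per sample, cuffcompare against the reference, cuffdiff).
known_and_novel() {
    index=$1; gtf=$2; case_fq=$3; ctrl_fq=$4
    run tophat -o tophat_case "$index" "$case_fq"
    run tophat -o tophat_control "$index" "$ctrl_fq"
    run cufflinks -o cufflinks_case tophat_case/accepted_hits.sam
    run cufflinks -o cufflinks_control tophat_control/accepted_hits.sam
    run cuffcompare -r "$gtf" -o cmp \
        cufflinks_case/transcripts.gtf cufflinks_control/transcripts.gtf
    # Assumed output name of the merged annotation: <prefix>.combined.gtf
    run cuffdiff -o diff_out cmp.combined.gtf \
        tophat_case/accepted_hits.sam tophat_control/accepted_hits.sam
}

# Placeholder inputs; replace with real paths before setting DRY_RUN=0.
known_only genome_index refseq.gtf case.fastq control.fastq
known_and_novel genome_index refseq.gtf case.fastq control.fastq
```

Printing the commands first makes it easy to check the order of the steps and the file names before anything is run.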

          Thanks for the link too.

          Kind regards,

          J


          • #6
            There is no need for Cufflinks or Cuffcompare if you are happy to stay within an existing genome annotation such as RefSeq, UCSC or Ensembl. Just take the SAM files from TopHat and go straight to Cuffdiff. There is one small trick for getting a GTF file that is suitable for Cuffdiff, i.e. one that carries attributes such as tss_id: run Cuffcompare and feed it the reference annotation (RefSeq, Ensembl, etc.) twice, and it will give you a suitable GTF file.
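
One possible reading of this trick, sketched as shell commands: the reference annotation is passed to Cuffcompare both as the `-r` reference and as the input file, and the resulting merged GTF is then given to Cuffdiff. Whether "twice" means this or two copies as input files is not stated in the post, so treat this as an assumption. The file names, the `ref_cmp` prefix, the `run` dry-run helper and the `<prefix>.combined.gtf` output name are likewise placeholders or assumptions, not something confirmed in the thread.

```shell
# Dry-run sketch: commands are printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Pass the same reference GTF as both the -r reference and the input.
# Arguments: reference GTF, output prefix.
annotate_reference() {
    ref_gtf=$1; prefix=$2
    run cuffcompare -r "$ref_gtf" -o "$prefix" "$ref_gtf"
}

annotate_reference refseq.gtf ref_cmp

# Assumed name of the merged annotation written by Cuffcompare.
run cuffdiff -o diff_out ref_cmp.combined.gtf \
    accepted_hits1.sam accepted_hits2.sam
```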

            Hope this helps.
