  • rdsqc22
    Junior Member
    • Nov 2013
    • 7

    Odd statistical differences in Cuffdiff output?

    Hi,

    I've been aligning and counting some RNA-seq reads with SHRiMP and Cuffdiff, doing the same analysis with both an older genome assembly and a newer one, and I found an interesting possible discrepancy in my Cuffdiff output. If anyone could help explain it, that would be much appreciated.

    Basically, I noticed a number of genes where the expression levels were similar between the two assemblies, yet for some reason Cuffdiff was reporting wildly different significance results between the two. For example:

    assembly  gene  locus                 sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value       q_value     significant
    Asmb 5    Gfap  10:90763148-90771847  lineA     lineN     OK      484.283  11.1909  -5.43545           1.62926    0.103258      0.394632    no
    Asmb 4    Gfap  10:92059880-92068555  lineA     lineN     OK      526.67   12.77    -5.36606           4.09233    4.27058E-005  0.00052085  yes

    Both were run with the same cuffdiff binary (Cuffdiff 2.0.2), with the exact same command (adjusted for the appropriate assembly), with an FDR of 0.05. It would stand to reason that the results should be similar between the two runs: line A is much more highly expressed than line N in both cases, and the only difference I can see that might have an effect is that the size of the gene in the assembly changed by 24 nucleotides, out of just under 10000.

    If the gene size, fold change, and FPKM values are so similar, why are the statistical values so wildly different? This does not make sense to me.
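As a quick sanity check on the numbers quoted above, here is a minimal Python sketch that recomputes the locus span and the log2 fold change from the two Gfap rows. The values are copied from the post; the helper names (`locus_span`, `log2_fold_change`) are made up for illustration, and the sketch assumes the reported fold change is log2(value_2 / value_1), which matches the sign in the output. It says nothing about how Cuffdiff derives test_stat or p_value.

```python
import math

# Values copied from the two Cuffdiff rows in the post (FPKM for lineA and lineN).
rows = {
    "Asmb 5": {"locus": "10:90763148-90771847", "value_1": 484.283, "value_2": 11.1909},
    "Asmb 4": {"locus": "10:92059880-92068555", "value_1": 526.67, "value_2": 12.77},
}

def locus_span(locus):
    """Return end - start for a 'chrom:start-end' locus string."""
    coords = locus.split(":")[1]
    start, end = (int(x) for x in coords.split("-"))
    return end - start

def log2_fold_change(value_1, value_2):
    """Assumed convention: log2(value_2 / value_1), negative when sample_2 is lower."""
    return math.log2(value_2 / value_1)

# Map each assembly label to (locus span, recomputed log2 fold change).
summary = {}
for name, row in rows.items():
    summary[name] = (
        locus_span(row["locus"]),
        log2_fold_change(row["value_1"], row["value_2"]),
    )
    print(name, summary[name])
```

The spans come out 24 nucleotides apart and the recomputed fold changes agree with the reported ones to about two decimal places, so the discrepancy is confined to the test statistic and the p/q values rather than the expression estimates themselves.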

    Thanks!
  • Wallysb01
    Senior Member
    • Feb 2011
    • 286

    #2
    Agree this is odd.

    Can you post your commands? That might help us get a little more information.

    Also, how much changed as far as number of genes in your annotation, or what percent of reads are mapping to each genome?

    You should probably update your version of cufflinks too. Even though these are the same version, we are a long way from v2.0.2 now.

    Have you tried doing this with DESeq2? Might be worth seeing if this is something that is cufflinks specific or more broadly true about something going on in your new genome and genome annotation.


    • rdsqc22
      Junior Member
      • Nov 2013
      • 7

      #3
      My command, used in both cases, is simply:

      cuffdiff --FDR 0.05 -u -b genome.fa -p 4 -L lineA,lineN -o cuffdiffout genes.gtf lineA.bam lineN.bam
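To make sure the two runs differ only in the assembly-specific inputs, the command above can be generated from one template. This is a minimal sketch, assuming each assembly's genome.fa, genes.gtf, and BAM files sit in their own directory; the directory names `asmb4` and `asmb5`, the output directory naming, and the helper `cuffdiff_command` are hypothetical. It only prints the commands and does not run cuffdiff.

```python
import shlex

def cuffdiff_command(assembly_dir, out_dir):
    """Build the cuffdiff argument list quoted above for one assembly directory."""
    return [
        "cuffdiff",
        "--FDR", "0.05",
        "-u",
        "-b", f"{assembly_dir}/genome.fa",
        "-p", "4",
        "-L", "lineA,lineN",
        "-o", out_dir,
        f"{assembly_dir}/genes.gtf",
        f"{assembly_dir}/lineA.bam",
        f"{assembly_dir}/lineN.bam",
    ]

# Hypothetical directory names, one per assembly.
commands = {
    name: cuffdiff_command(name, f"cuffdiffout_{name}")
    for name in ("asmb4", "asmb5")
}

for name, argv in commands.items():
    # shlex.join quotes arguments so the printed line can be pasted into a shell.
    print(shlex.join(argv))
```

Diffing the two printed lines is a cheap way to confirm that nothing other than the input and output paths changed between the runs.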

      We downgraded to 2.0.2 because we had run into trouble with version 2.1.1, which is what had been installed previously. Our sequencing center uses 2.0.2, which is why that version was chosen. I'm currently doing another run with 2.2.0.

      This is the older assembly used: http://www.ncbi.nlm.nih.gov/assembly/237618/
      And the newer one: http://www.ncbi.nlm.nih.gov/assembly/382928

      I'm familiar with Cuffdiff, which is why it was used. I'll try using DESeq2, though.

