  • zorin
    Junior Member
    • Feb 2014
    • 2

    cuffquant count data as input for DESeq/DEXseq

    Hi community,

    the latest cufflinks release (2.2.0) comes with two novel tools, cuffquant and cuffnorm. The latter can be used to generate expression and count tables at the level of transcripts, primary transcripts and genes, that are normalized for library size.

    I was wondering whether these normalized counts can be used with one of the 'count-based' methods like DESeq/DEXSeq/edgeR, circumventing their normalization methods.

    In other words, can I use e.g. the DESeq nbinomTest() function with these cuffnorm-generated data?

    Thanks.
  • id0
    Senior Member
    • Sep 2012
    • 130

    #2
    According to the cuffnorm documentation:
    Cuffnorm will report both FPKM values and normalized estimates for the number of fragments that originate from each gene, transcript, TSS group, and CDS group. Note that because these counts are already normalized to account for differences in library size, they should not be used with downstream differential expression tools that require raw counts as input.
    So they specifically warn against using cuffnorm counts for tools that require raw counts.

    I actually have a follow-up question. If these are just normalized counts, why can't they be used? If another tool normalizes them again, wouldn't the result be the same as if they had never been normalized? The initial normalization shouldn't lose any information.

    Comment

    • gringer
      David Eccles (gringer)
      • May 2011
      • 845

      #3
      No, this is not an appropriate thing to do for either DESeq or edgeR. They assume raw counts are used as input, and these have a particular distribution that is assumed by the programs. The programs use the assumed distribution to estimate biological variation and determine statistical significance. While your normalised count values may be similar (or the same), the probability calculations will likely be off.
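
      A toy illustration of the point above, not how DESeq or edgeR actually estimate dispersion: count-based models read the noise level off the relationship between a gene's mean and its variance, and rescaling the counts shifts that relationship. The replicate counts and the scale factor below are made up for the example.

```python
from statistics import mean, variance


def dispersion_index(values):
    """Variance-to-mean ratio; close to 1 for Poisson-like counts."""
    return variance(values) / mean(values)


# Hypothetical raw counts for one gene across five replicates.
raw = [6, 14, 10, 7, 13]

# Hypothetical library-size scaling applied by an upstream normalization step.
scale = 0.25
scaled = [c * scale for c in raw]

raw_index = dispersion_index(raw)
scaled_index = dispersion_index(scaled)

# Scaling by s multiplies the mean by s and the variance by s**2,
# so the variance-to-mean ratio is multiplied by s.
print(raw_index, scaled_index)
```

      The fold changes between conditions survive the scaling, but the apparent noise does not, which is why the probability calculations drift.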

      Comment

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        There seems to be a common misconception that tools like DESeq(2) actually store the normalized counts somewhere. They don't, in fact, which is why trying to input normalized counts will lead to no end of problems.
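
        To make that concrete, here is a minimal sketch of the median-of-ratios idea behind DESeq's size factors. It is a simplification written for this thread, not the package's code, and the count matrix is hypothetical. The size factors are kept alongside the raw counts and enter the model; once counts have been divided by them, a second pass just returns factors of 1 and the sequencing-depth information is gone.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples matrix."""
    n_samples = len(counts[0])
    # Keep only genes with positive values everywhere so logs are defined.
    usable = [row for row in counts if all(v > 0 for v in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(v) for v in row) / len(row) for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical data: sample 2 was sequenced twice as deeply as sample 1.
raw = [[10, 20], [100, 200], [50, 100], [7, 14]]

sf = size_factors(raw)
normalized = [[v / f for v, f in zip(row, sf)] for row in raw]

# Re-running on already-normalized values yields factors of 1:
# the tool can no longer tell which sample had more reads behind it.
sf_again = size_factors(normalized)
print(sf, sf_again)
```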

        Comment

        • zaki
          Member
          • Dec 2012
          • 15

          #5
          Following up on the original question:

          If we run cuffnorm with --library-norm-method set to classic-fpkm, can the count data be used with DESeq/DESeq2?

          classic-fpkm - Library size factor is set to 1 - no scaling applied to FPKM values or fragment counts. (default for Cufflinks)
          Does this mean that library size normalization was not applied, and can the count data therefore be treated as raw counts?

          Comment

          • raphael123
            Member
            • Dec 2013
            • 37

            #6
            Do you think there is a way to get the raw counts in a format that DESeq can read?
            Or a way to read the binary file? I can't find one!

            Comment

            • dpryan
              Devon Ryan
              • Jul 2011
              • 3478

              #7
              What binary file? If you mean the BAM file, just use featureCounts or htseq-count.

              Comment

              • raphael123
                Member
                • Dec 2013
                • 37

                #8
                No, the raw count table:

                Cuffquant writes a single output file, abundances.cxb, into the output directory. CXB files are binary files, and can be passed to Cuffnorm or Cuffdiff for further processing.
                I would like to analyse the raw counts with DESeq2.

                Comment

                • dpryan
                  Devon Ryan
                  • Jul 2011
                  • 3478

                  #9
                  I'm sure it's theoretically possible to read the CXB file, but since its format seems to have never been documented, you'd have to go through the source code and reverse-engineer its format. It'd be faster to just ignore it.

                  Comment

                  • raphael123
                    Member
                    • Dec 2013
                    • 37

                    #10
                    Thanks for your answer!
                    So there is no way to get the read counts from cuff-tools? Maybe I'm missing something here.

                    Comment

                    • dpryan
                      Devon Ryan
                      • Jul 2011
                      • 3478

                      #11
                      Hard to say, there are a lot of undocumented areas of those programs. It's quick enough to just use featureCounts.

                      Comment

                      • raphael123
                        Member
                        • Dec 2013
                        • 37

                        #12
                        Oh! So featureCounts is a tool for building a count table from a SAM/BAM file?
                        Thank you!

                        Comment

                        • dpryan
                          Devon Ryan
                          • Jul 2011
                          • 3478

                          #13
                          Yes, it's similar to htseq-count, though significantly faster.
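
                          For anyone who wants to see what the resulting table looks like, here is a rough sketch of reading featureCounts output into a per-gene count dictionary. It assumes the usual layout: one comment line starting with "#", then a tab-separated header with six annotation columns (Geneid, Chr, Start, End, Strand, Length) followed by one column per BAM file. The gene names, file names and numbers in the embedded example are invented, and int() assumes plain integer counts.

```python
import csv
import io

# Geneid, Chr, Start, End, Strand, Length precede the per-sample columns.
ANNOTATION_COLUMNS = 6


def read_featurecounts(handle):
    """Return (sample_names, {gene_id: [counts]}) from a featureCounts table."""
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    header = next(rows)
    samples = header[ANNOTATION_COLUMNS:]
    counts = {
        row[0]: [int(v) for v in row[ANNOTATION_COLUMNS:]]
        for row in rows
        if row  # skip blank lines
    }
    return samples, counts


# Made-up stand-in for a real output file.
example = io.StringIO(
    "# Program:featureCounts; Command:...\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tctrl.bam\ttreated.bam\n"
    "geneA\tchr1\t100\t900\t+\t801\t25\t40\n"
    "geneB\tchr2\t500\t1500\t-\t1001\t0\t3\n"
)

samples, counts = read_featurecounts(example)
print(samples, counts)
```

                          The integer columns are the raw counts that DESeq2 or edgeR expect; the annotation columns are dropped before building the count matrix.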

                          Comment

                          • gringer
                            David Eccles (gringer)
                            • May 2011
                            • 845

                            #14
                            To repeat myself, you shouldn't be using cufflinks output as input to DESeq2, because DESeq is expecting raw count data, and depends on that for its model.

                            If you want to do isoform-level analysis with a DESeq-like workflow, look at DEXSeq, which has its own method of counting by using raw counts for exon bins.

                            Comment

                            • shi
                              Wei Shi
                              • Feb 2010
                              • 236

                              #15
                              Another option is to use limma/voom, which accepts fractional counts.
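
                              As a rough sketch of why fractional counts are not a problem there: voom starts from a log2 counts-per-million transform with small offsets, which is defined for any non-negative number, integer or not. The function below covers only that transform and takes library sizes as plain column sums; the mean-variance trend and precision weights that voom estimates afterwards are not shown. The input values are hypothetical.

```python
import math


def log_cpm(counts):
    """log2 counts-per-million for a genes x samples matrix.

    Uses the offsets of the voom transform:
    log2((count + 0.5) / (library_size + 1) * 1e6).
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]


# Hypothetical fractional counts (two genes, two samples).
fractional = [[10.5, 20.25], [989.5, 1979.75]]
transformed = log_cpm(fractional)
print(transformed)
```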

                              Comment
