Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • DESeq2: genes significant under more stringent model but NOT with standard model

    hello everybody,

    I noticed some inconsistency in the DESeq2 results when calling results() under the default and the alternative hypothesis option.
    Some genes are called signifcant above a certain threshold (e.g. LFC>0.5) but not significant when no threshold is applied (i.e. just differentially expressed no matter the strength).
    The issue is caused by independent filtering, which in my case filters out lowly-expressed genes under the default (i.e. sets their padj to NA) but does not so under the alternative hypothesis.

    Is this
    (a) just some negligible consequence of independent filtering and perhaps the tradeoff for having a greater number of DEG under the default
    (b) possibly a kind of bug and filtering should be the same for both
    or
    (c) am I doing/understanding something wrong?

    pls let me know if you require additional information
    btw: I like to add the stereotypical "I am a biologist and don't know much about statistics/informatics"-excuse
    bye



    Code:
    [COLOR="Blue"]> #without independent filtering
    > #standard method
    > res <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = FALSE)
    > summary(res, alpha=0.01)[/COLOR]
    
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 547, 4% 
    LFC < 0 (down)   : 896, 6.5% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 0, 0% 
    (mean count < 0)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> res <- na.omit(res)
    > res <- res[res[,6] < 0.01,]
    > #alternative hypothesis
    > resGA <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = FALSE, 
    +                  lfcThreshold=0.5, altHypothesis="greaterAbs")
    > summary(resGA, alpha=0.01)[/COLOR]
    
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 40, 0.29% 
    LFC < 0 (down)   : 9, 0.065% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 0, 0% 
    (mean count < 0)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> resGA <- na.omit(resGA)
    > resGA <- resGA[resGA[,6] < 0.01,]
    > #is resGA subset of res?
    > table(is.element(rownames(resGA), rownames(res)))[/COLOR]
    
    TRUE 
      49 
    [COLOR="blue"]> 
    > #with independent filtering
    > #standard method
    > res <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = TRUE)
    > summary(res, alpha=0.01)[/COLOR]
    
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 549, 4% 
    LFC < 0 (down)   : 911, 6.6% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 1382, 10% 
    (mean count < 15.2)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> res <- na.omit(res)
    > res <- res[res[,6] < 0.01,]
    > #alternative hypothesis
    > resGA <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = TRUE, 
    +                  lfcThreshold=0.5, altHypothesis="greaterAbs")
    > summary(resGA, alpha=0.01)
    [/COLOR]
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 40, 0.29% 
    LFC < 0 (down)   : 9, 0.065% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 0, 0% 
    (mean count < 0)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> resGA <- na.omit(resGA)
    > resGA <- resGA[resGA[,6] < 0.01,]
    > #is resGA subset of res?
    > table(is.element(rownames(resGA), rownames(res)))[/COLOR]
    
    FALSE  TRUE 
        2    47

  • #2
    hi,

    the answer is (a). In the first run, independent filtering at 15.2 increases the total # of DEG for an FDR cutoff of 10%. (Note that 0.1 is the default target FDR for independent filtering as specified by 'alpha' argument of results(). I find that 1% is a very low FDR cutoff, but if you want to use 1% you should change 'alpha'.) And the optimal filter for the second run is no filter. So you have found some genes which were filtered for having low counts in the first run, although they are significant even for a more stringent test.

    Comment


    • #3
      thank you for the explanation and especially for the hint to set 'alpha' in results()
      at least in my very case the inconsistency has disappeared now

      bye

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Exploring the Dynamics of the Tumor Microenvironment
        by seqadmin




        The complexity of cancer is clearly demonstrated in the diverse ecosystem of the tumor microenvironment (TME). The TME is made up of numerous cell types and its development begins with the changes that happen during oncogenesis. “Genomic mutations, copy number changes, epigenetic alterations, and alternative gene expression occur to varying degrees within the affected tumor cells,” explained Andrea O’Hara, Ph.D., Strategic Technical Specialist at Azenta. “As...
        07-08-2024, 03:19 PM
      • seqadmin
        Exploring Human Diversity Through Large-Scale Omics
        by seqadmin


        In 2003, researchers from the Human Genome Project (HGP) announced the most comprehensive genome to date1. Although the genome wasn’t fully completed until nearly 20 years later2, numerous large-scale projects, such as the International HapMap Project and 1000 Genomes Project, continued the HGP's work, capturing extensive variation and genomic diversity within humans. Recently, newer initiatives have significantly increased in scale and expanded beyond genomics, offering a more detailed...
        06-25-2024, 06:43 AM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 07-19-2024, 07:20 AM
      0 responses
      35 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 07-16-2024, 05:49 AM
      0 responses
      46 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 07-15-2024, 06:53 AM
      0 responses
      56 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 07-10-2024, 07:30 AM
      0 responses
      43 views
      0 likes
      Last Post seqadmin  
      Working...
      X