
  • Beginner question for Differential Expression Analysis

    Hello,

    I am a beginner at analyzing data from an RNA-seq experiment. I was not the one who performed the bioinformatics analysis (I am more of a bench scientist), so what I have in my hands is an Excel file. I am a bit confused about how to retrieve my DE genes from it.
    I have read what p- and q-values represent. As I understand it, setting an FDR threshold is a 'safe' way to decide whether the differences reported as significant are truly significant.

    I am unsure, though, how to choose the FDR threshold. If I understand correctly, the level of 0.05 does not apply to all experiments.

    Could you please refer me to some further reading, or perhaps provide me with some tips, so that I proceed correctly with my analysis?

    I apologize if this is a very basic question. I appreciate your help.

    Regards
    Vassen

  • #2
    The raw p-values in your results are still what they are: at a per-gene level, given the dispersion model fitted to the expression values in your conditions, a small p-value means a difference this large would be unlikely if that gene were NOT differentially expressed. Statistical reality, however, shows us that when we repeatedly run a statistical test between two groups of values that DO come from the same distribution (say, split 20 values with a mean of 10 and stdev of 5 into two random groups), about 5% of those tests will return a significant p-value at the 0.05 level. So, given the large number of genes we are testing, we should expect a sizeable number of false positives (type I errors) among the genes that pass a raw p-value cutoff.
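If you want to see that 5% figure for yourself, here is a rough sketch of the experiment described above. It is a simplification: it uses a z-test with a known stdev (standard library only) rather than the test your DE software would run, and the function names, group size, and seed are my own choices for illustration.

```python
import math
import random
from statistics import NormalDist, mean


def null_p_value(rng, n_per_group=10, mu=10.0, sigma=5.0):
    """Two-sided p-value for two groups drawn from the SAME normal distribution."""
    values = [rng.gauss(mu, sigma) for _ in range(2 * n_per_group)]
    group_a, group_b = values[:n_per_group], values[n_per_group:]
    # z statistic for a difference in means when sigma is known
    z = (mean(group_a) - mean(group_b)) / (sigma * math.sqrt(2.0 / n_per_group))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_fraction(n_tests=20000, alpha=0.05, seed=1):
    """Fraction of null comparisons that come out 'significant' at alpha."""
    rng = random.Random(seed)
    hits = sum(null_p_value(rng) < alpha for _ in range(n_tests))
    return hits / n_tests


if __name__ == "__main__":
    # Should land close to alpha even though no group truly differs
    print(false_positive_fraction())
```

With 20,000 genes and no real differences at all, a fraction near 0.05 would still mean roughly a thousand genes passing a raw p < 0.05 cutoff.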

    In practice I think of the p-value and q-value (adjusted p-value, FDR, etc.) differently in different situations. If our goal is a candidate-type approach, meaning we'll be running additional experiments to verify the RNA-seq result for each gene, we may use the raw p-values to get a broader list of candidates. If we have a phenotype and we want to report the number of genes affected, or the percentage of genes enriched vs. depleted, we'll use the adjusted p-values, since that is a more general claim.
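To make the raw vs. adjusted distinction concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment, which is one common way q-values are computed. I am assuming that is what is in your Excel file, so check what your pipeline actually used. The four p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four genes
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)

# Broad candidate list (raw) vs. stricter list for a general claim (adjusted)
candidates_raw = [i for i, p in enumerate(raw) if p < 0.05]
candidates_adj = [i for i, q in enumerate(adj) if q < 0.05]
```

In this toy case three genes pass the raw cutoff but only one passes after adjustment, which is the trade-off between a broad candidate list and a defensible general claim.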

    Sometimes our experiment may yield zero significant genes by the adjusted p-values even though we know there's a phenotype. In those cases we may proceed with genes significant by raw p-value and keep in mind that we must proceed cautiously. We wouldn't do that if we were going straight into a figure with that result - we'd of course try to confirm if any of those genes appear to be different via other methods.

    Finally, keep in mind that selecting on raw p-values across thousands of genes likely gives a high type-I error rate (false positives), while selecting on adjusted p-values likely gives a high type-II error rate (real differences that are missed). A larger sample size helps, mostly by increasing power so that fewer real differences are lost after adjustment. Of course, with higher and higher sample sizes you'll also get significance calls for features with smaller and smaller effect sizes, and you'll have to start thinking in terms of "what is a significant effect?". I can't answer that one.
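As a rough illustration of the sample size point, the sketch below holds a small difference fixed and only increases the group size. It again assumes a simple z-test with known stdev, and the effect size of 0.5 is an arbitrary number picked for the example.

```python
import math
from statistics import NormalDist


def z_test_p_value(mean_diff, sigma, n_per_group):
    """Two-sided p-value for an observed difference in means, sigma known."""
    z = mean_diff / (sigma * math.sqrt(2.0 / n_per_group))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Same tiny effect every time; only the number of samples changes
    for n in (3, 30, 300, 3000):
        print(n, z_test_p_value(mean_diff=0.5, sigma=5.0, n_per_group=n))
```

The p-value shrinks steadily as n grows even though the effect itself never changes, so a p-value alone won't tell you whether a difference is large enough to matter biologically.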
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */


    • #3
      Many thanks sdriscoll!!

      Cheers
      Vassen
