  • Am I completely missing the point?

    So... Trying to get an overview of limma, limma-voom, edgeR, DESeq2, NPEBseq etc. I'm getting the feeling that the task of differential gene expression analysis is being over-complicated...?

    I'm currently looking at a count matrix derived from 95 RNA-seq samples from an Illumina HiSeq 2000 (Illumina TruSeq stranded kit). Raw reads were mapped to hg19 using STAR and then counted using HTSeq.

    The result is a count matrix with 25369 rows and 95 columns, and I have two groups in a classic case (n=15) / control (n=80) design. I then perform the following steps:

    1. Use the edgeR package to perform TMM normalisation of the raw counts
    2. For each gene, do a case vs. control t-test and a Wilcoxon test on the TMM-normalised values
    3. Apply FDR correction
    4. Sort by ascending FDR value for the t-test, and use the Wilcoxon p-value to get an idea of whether the difference is "outlier-driven"
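
Steps 2 to 4 above can be sketched roughly as follows. This is only an illustration under several assumptions: the input is a genes x samples array that has already been TMM-normalised elsewhere (for example exported from edgeR), the t-test is taken to be Welch's unequal-variance version, the "Wilcoxon test" is taken to be the two-sample rank-sum (Mann-Whitney U) test, and the FDR correction is taken to be Benjamini-Hochberg. The function names `benjamini_hochberg` and `per_gene_tests` are hypothetical.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # p_(i) * n / i for the i-th smallest p-value
    scaled = p[order] * n / np.arange(1, n + 1)
    # Make the adjusted values monotone (working down from the largest) and cap at 1
    adjusted_sorted = np.minimum(1.0, np.minimum.accumulate(scaled[::-1])[::-1])
    adjusted = np.empty(n)
    adjusted[order] = adjusted_sorted
    return adjusted


def per_gene_tests(values, is_case):
    """Run per-gene tests on a genes x samples array of normalised values.

    `is_case` is a boolean vector with one entry per sample (column).
    Returns gene indices sorted by t-test FDR, plus the per-gene statistics.
    """
    values = np.asarray(values, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    case, control = values[:, is_case], values[:, ~is_case]

    # Step 2a: Welch t-test for every gene (row) at once (assumed variant)
    t_p = stats.ttest_ind(case, control, axis=1, equal_var=False).pvalue

    # Step 2b: rank-sum test, one gene at a time
    w_p = np.array([
        stats.mannwhitneyu(c, k, alternative="two-sided").pvalue
        for c, k in zip(case, control)
    ])

    # Step 3: FDR correction of the t-test p-values
    t_fdr = benjamini_hochberg(t_p)

    # Step 4: rank genes by ascending t-test FDR
    order = np.argsort(t_fdr)
    return order, t_p, t_fdr, w_p
```

A gene near the top of `order` whose `w_p` is comparatively large would be a candidate for the "outlier-driven" case mentioned in step 4.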

    Please enlighten me as to why this simple approach is not sufficient?

    Cheers,
    Leon

  • #2
    It may be sufficient - after all, you have quite a few samples. With a small number of samples it can be hard to achieve the necessary statistical power without "borrowing variance across genes".

    Or you could use SAMseq, which is very simple to use and understand. It's based on non-parametric statistics.
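
To make "borrowing variance across genes" a little more concrete, here is a minimal sketch of the general idea: each gene's variance estimate is pulled towards a shared prior value, with weights given by the degrees of freedom. The weighted-average form is the one used for moderated variances in limma, but the prior variance and prior degrees of freedom below are simply hypothetical inputs, not values estimated from data as the real packages do, and `shrink_variances` is an illustrative name.

```python
import numpy as np


def shrink_variances(sample_var, df_residual, prior_var, prior_df):
    """Shrink per-gene variances towards a shared prior variance.

    Each result is a weighted average of the gene's own variance
    (weight: its residual degrees of freedom) and the prior variance
    (weight: the prior degrees of freedom, assumed known here).
    """
    sample_var = np.asarray(sample_var, dtype=float)
    return (prior_df * prior_var + df_residual * sample_var) / (prior_df + df_residual)


# Three genes with very different raw variances, 4 residual degrees of freedom each
raw = [0.01, 1.0, 4.0]
shrunk = shrink_variances(raw, df_residual=4, prior_var=1.0, prior_df=4)
print(shrunk)
```

With few samples the residual degrees of freedom are small, so the prior dominates and extreme per-gene variances are damped. With many samples the gene's own estimate dominates, which is one way to see why the simple approach may be adequate for a large study.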


    • #3
      To echo Kopi-o, each of these methods gives its motivation fairly early on in the corresponding paper:

      edgeR

      "Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small."

      DESeq

      "Typically, the number of replicates is small, and further modelling assumptions need to be made in order to obtain useful estimates."

      Voom

      "Borrowing information between genes is a crucial feature of the genome-wide statistical methods, as it allows for gene-specific variation while still providing reliable inference with small sample sizes."

      I'd also recommend checking out the SAMseq paper and method.


      • #4
        Run your statistical tests on log2 values. That's all I have to add. With those sample sizes you could even do permutation tests and avoid any distribution assumptions altogether.
        /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
        Salk Institute for Biological Studies, La Jolla, CA, USA */
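
The permutation-test suggestion above could look something like the following for a single gene. This is a sketch, not a recommended implementation: the test statistic (absolute difference in group means), the number of permutations, and the pseudocount of 1 added before taking log2 are all assumptions, and `permutation_pvalue` is a hypothetical name.

```python
import numpy as np


def permutation_pvalue(log_values, is_case, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = np.random.default_rng(seed)
    x = np.asarray(log_values, dtype=float)
    labels = np.asarray(is_case, dtype=bool)

    # Observed absolute difference in means between cases and controls
    observed = abs(x[labels].mean() - x[~labels].mean())

    hits = 0
    for _ in range(n_perm):
        # Shuffle the case/control labels, keeping the group sizes fixed
        shuffled = rng.permutation(labels)
        stat = abs(x[shuffled].mean() - x[~shuffled].mean())
        if stat >= observed:
            hits += 1

    # Add one to numerator and denominator so the p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Example for one gene: normalised values for 5 cases and 5 controls (made-up numbers)
normalised = np.array([900.0, 1100.0, 1000.0, 950.0, 1050.0, 10.0, 12.0, 9.0, 11.0, 8.0])
is_case = np.array([True] * 5 + [False] * 5)
log_values = np.log2(normalised + 1.0)  # pseudocount of 1 is an assumption
p = permutation_pvalue(log_values, is_case, n_perm=2000)
```

Note that the smallest p-value this can return is 1 / (n_perm + 1), so with tens of thousands of genes the number of permutations limits what can survive multiple-testing correction.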
