  • Statistical models for DE

    Hey guys,

    I have some trouble understanding why statistical models are needed for differential expression (DE) analysis in RNA-Seq. I have already used tools like DESeq2 and NOISeq, but I would also like to understand, at least partially, what these tools are doing. Unfortunately, I don't have a strong statistical background, and I have found no tutorial that explains the use of statistical models in a way I can follow. I think it would be best if someone could explain it for the Poisson model, as it seems easier to understand than the negative binomial (NB).

    What I know is that after sequencing I align my reads to the reference genome and then generate read counts for each annotated gene. Of course I cannot use these counts directly to test for DE, because of different library sizes as well as technical and biological variation.

    What I read most of the time is that people fit statistical models to the count data, for example a Poisson model (as it accounts for the technical variance).

    Question 1: Is the model fitted on the read counts of all genes? Or is each gene getting its own model?

    Question 2: In the case of the Poisson model, where do I get the lambda from? Is it calculated from my count data?

    Question 3: Once I have constructed my Poisson model, what is it used for? Do I use it to change my count data? Is it used in the statistical test? This is the step where I have absolutely no clue what is going on.

    I tried to read different publications, including the DESeq publications and, for the Poisson case, the Marioni paper from 2008. But with my limited statistical knowledge I do not get the key idea of these statistical models and how they can help me when dealing with DE in RNA-Seq.

    I really hope someone can explain this general concept in a really easy way so I can understand it.

    Cheers
    Mario

  • #2
    N.B., I'm going to completely ignore the empirical Bayes parts of this for the sake of simplicity.

    1. The model is fit to each gene, one at a time. The actual model used is the same for all of them.
    2. The lambda is part of the fit. Note that there is a lambda per group.
    3. The model is used for a statistical test, which is typically of the form, "Do groups A and B have different lambdas?"
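The three points above can be sketched in a few lines of Python. This is only an illustration, not what DESeq2 or any other tool actually runs: the gene names and counts are made up, all samples are assumed to have the same library size, and the test shown is a plain likelihood ratio test between "one lambda per group" and "one shared lambda".

```python
import math


def poisson_loglik(counts, lam):
    """Poisson log-likelihood without the log(y!) term, which cancels in the ratio."""
    if lam == 0:
        return 0.0  # only happens when every count is zero
    return sum(counts) * math.log(lam) - len(counts) * lam


def poisson_lrt(group_a, group_b):
    """Fit one gene and return (lambda_a, lambda_b, p_value)."""
    # Alternative model: each group has its own lambda (the group mean).
    lam_a = sum(group_a) / len(group_a)
    lam_b = sum(group_b) / len(group_b)
    # Null model: both groups share a single lambda (the overall mean).
    pooled = list(group_a) + list(group_b)
    lam_0 = sum(pooled) / len(pooled)

    ll_alt = poisson_loglik(group_a, lam_a) + poisson_loglik(group_b, lam_b)
    ll_null = poisson_loglik(pooled, lam_0)
    stat = max(2.0 * (ll_alt - ll_null), 0.0)

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return lam_a, lam_b, p_value


# Hypothetical counts: (replicates of group A, replicates of group B).
genes = {
    "geneX": ([10, 12, 11], [30, 28, 35]),
    "geneY": ([50, 47, 53], [49, 52, 51]),
}

# The same model is fitted to every gene, one gene at a time.
results = {name: poisson_lrt(a, b) for name, (a, b) in genes.items()}
for name, (lam_a, lam_b, p) in results.items():
    print(f"{name}: lambda_A={lam_a:.1f} lambda_B={lam_b:.1f} p={p:.3g}")
```

In this toy data geneX gets a very small p-value and geneY a large one, which is the "do groups A and B have different lambdas?" question answered per gene.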


    • #3
      First of all, thanks for your quick answer dpryan.

      But I still do not get where the lambda is coming from and what you mean by "group".


      • #4
        A group is just a group (a "Gruppe" in German), for example all samples from one condition; it has no special meaning in this context.

        Regarding lambda, each gene has some sort of expression count associated with it, normally in the form of counts per sample. These counts are then used to estimate lambda.
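As a rough illustration of "the counts are used to estimate lambda": under a Poisson model the maximum-likelihood estimate of lambda for a group is simply the mean of that group's counts. The sketch below checks this by brute force on made-up counts for one gene in one group, again assuming equal library sizes; the variable names are only for illustration.

```python
import math


def poisson_loglik(counts, lam):
    """Full Poisson log-likelihood of the counts for a candidate lambda."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)


# Hypothetical read counts of one gene in three replicates of one group.
counts = [18, 25, 20]

# Closed-form estimate: the sample mean.
lambda_hat = sum(counts) / len(counts)

# Brute-force check: try lambda = 5.0, 5.1, ..., 40.0 and keep the best one.
grid = [x / 10 for x in range(50, 401)]
best_on_grid = max(grid, key=lambda lam: poisson_loglik(counts, lam))

print("mean of counts:", lambda_hat)
print("lambda with highest likelihood on the grid:", best_on_grid)
```

Both numbers agree, so "fitting" the Poisson model to a group means, in this simple setting, taking the average count of that group.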


        • #5
          This is actually where I have a problem. I estimate lambda from the read counts of a gene (lambda = read count) and then test the null hypothesis that conditions A and B have the same lambda. So why do I use the Poisson model instead of just testing whether A and B have the same read count?

          Does anyone know a tutorial or lecture with examples?


          • #6
            You essentially are testing whether A and B have the same read count. The question is simply how you test that. One option is assuming Poisson variance, which requires estimating lambda and then doing a test. In most real cases, you'd have multiple groups of samples, so you couldn't just compare two numbers, but would need to come up with group estimates, likely accounting for differences in sequencing depth for each sample.
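One way to make this concrete is the sketch below, which is an assumption-laden toy and not the method of any particular package. Each sample's count is modelled as Poisson with mean (library size × per-read rate of its group). Under the null hypothesis of equal rates, and given the total count of the gene, the number of reads falling in group A is binomial with probability equal to group A's share of the total sequencing depth, which gives an exact test. All counts, depths and function names are invented for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability, computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def depth_aware_poisson_test(counts_a, depths_a, counts_b, depths_b):
    """Two-sided exact test of equal per-read rates in groups A and B."""
    k = sum(counts_a)                 # reads of this gene seen in group A
    n = k + sum(counts_b)             # reads of this gene seen in total
    # Under the null, group A's expected share equals its share of total depth.
    p_null = sum(depths_a) / (sum(depths_a) + sum(depths_b))
    observed = binom_pmf(k, n, p_null)
    pmfs = [binom_pmf(j, n, p_null) for j in range(n + 1)]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    p_value = sum(pmf for pmf in pmfs if pmf <= observed * (1 + 1e-7))
    return min(p_value, 1.0)


depths_a = [1_000_000, 2_000_000]
depths_b = [3_000_000, 1_500_000]

# Raw totals differ (60 vs 90 reads), but both groups have 20 reads per million.
p_same_rate = depth_aware_poisson_test([20, 40], depths_a, [60, 30], depths_b)

# Here group B has 40 reads per million, twice the rate of group A.
p_diff_rate = depth_aware_poisson_test([20, 40], depths_a, [120, 60], depths_b)

print(f"same rate:      p = {p_same_rate:.3g}")
print(f"different rate: p = {p_diff_rate:.3g}")
```

The first gene shows why comparing raw read counts directly is misleading: 60 versus 90 reads looks like a difference, but after accounting for depth the two groups have the same rate and the test gives no evidence of differential expression.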


            • #7
              Is there anyone out there who can explain the lambda estimation in more detail (preferably with an example) or who knows a good tutorial?
