Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • dpryan
    replied
    For pair-wise comparisons you need a Wald test. For "Is there a time effect, regardless of when?" you need an LRT. So yes, your assumption is absolutely correct

    Leave a comment:


  • frymor
    replied
    Originally posted by dpryan View Post
    2. Yup. This can be though of as a superset of the results from all pairwise comparisons. If it's ever DE in a pairwise comparison, it'll likely be DE in the LRT (the reverse isn't the case).
    When calculating the pair-wise comparisons of the timepoints, does it make more sense to add the parameter test="Wald", or can I keep the LRT results?

    I can see that there is a significant difference in the number of DE genes with an adjusted p-value <= 0.1

    Code:
    resTP16h_90h<- results(dds.filtered, contrast = c("hours", "90", "16"))
    resTP16h_90h.wald <- results(dds.filtered, test = "Wald", contrast = c("hours", "16", "90"))
    
    > addmargins(table(wald.test =(resTP16h_90h.wald$padj <.1), LRT.test=(resTP16h_90h$padj<.1)))
             LRT.test
    wald.test FALSE  TRUE   Sum
        FALSE   624  2958  3582
        TRUE      0  9725  9725
        Sum     624 12683 13307
    Am I correct in assuming that the LRT test done in the first results command contains not only the genes differentiating between 90h and 16h, but also all the genes in the time-points between (24h,30h,48h and 72h)?
    Last edited by frymor; 11-12-2015, 02:27 AM.

    Leave a comment:


  • frymor
    replied
    Originally posted by dpryan View Post
    1. Yup, you understood exactly. If you really want to be technical, what you're actually testing is whether including "hours" results in a better fit of the data...though the practical effect is asking for all genes changing over time. I should point out that you may not see all of these changes in direct pairwise comparisons (you'll probably see most of them though).
    I didn't expect to see all the genes, when doing a pair-wise comparison, but I expect to see all of them as a subset (or subsets, if testing multiple time-point comparisons),as the LRT test supposedly tests for all DE genes over time.

    When checking for the pair-wise comparisons using the results() function, would it be better to keep using the LRT testing method, or would it be better to use the Wald test for a more robust statistical results.

    When comparing the two tests for a specific pair of time points, I can see a difference. I have read that the Wald test calculate the LFC shrinkage for the data while the takes multiple parameters into account.
    Code:
    > resTP16h_90hwald
    log2 fold change (MLE): hours 90 vs 16 
    Wald test p-value: hours 90 vs 16 
    DataFrame with 17558 rows and 6 columns
                   baseMean log2FoldChange     lfcSE       stat      pvalue        padj
                  <numeric>      <numeric> <numeric>  <numeric>   <numeric>   <numeric>
    FBgn0085804  0.18052802      -3.642448  6.899080 -0.5279614 0.597526110          NA
    FBgn0267431 19.73070118      -2.155084  1.270286 -1.6965346 0.089784690 0.125377361
    FBgn0039987  0.08559842      -2.183327  6.937402 -0.3147183 0.752975578          NA
    FBgn0058182  0.49195220      -2.710627  5.724264 -0.4735329 0.635833037          NA
    FBgn0267430 27.36264804      -4.362455  1.481378 -2.9448633 0.003230974 0.006781261
    ...                 ...            ...       ...        ...         ...         ...
    > resTP16h_90h
    log2 fold change (MLE): hours 90 vs 16 
    LRT p-value: '~ replica + hours' vs '~ replica' 
    DataFrame with 17558 rows and 6 columns
                   baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                  <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
    FBgn0085804  0.18052802      -3.642448  6.899080  1.518897 0.9816482138           NA
    FBgn0267431 19.73070118      -2.155084  1.270286 17.423700 0.0148591771 0.0184819228
    FBgn0039987  0.08559842      -2.183327  6.937402  0.610748 0.9989315237           NA
    FBgn0058182  0.49195220      -2.710627  5.724264  3.607104 0.8237543782           NA
    FBgn0267430 27.36264804      -4.362455  1.481378 25.744205 0.0005595056 0.0007857541
    .
    Are the two method even comparable?

    Originally posted by dpryan View Post
    2. Yup. This can be though of as a superset of the results from all pairwise comparisons. If it's ever DE in a pairwise comparison, it'll likely be DE in the LRT (the reverse isn't the case).
    This is where I don't understand what happens. If a genes changes over time in the analysis across all time points, it must also be changed in at least one of the pair-wise comparisons. isn't that true?
    So why is the reverse not always the case?

    Leave a comment:


  • dpryan
    replied
    1. Yup, you understood exactly. If you really want to be technical, what you're actually testing is whether including "hours" results in a better fit of the data...though the practical effect is asking for all genes changing over time. I should point out that you may not see all of these changes in direct pairwise comparisons (you'll probably see most of them though).

    2. Yup. This can be though of as a superset of the results from all pairwise comparisons. If it's ever DE in a pairwise comparison, it'll likely be DE in the LRT (the reverse isn't the case).

    Leave a comment:


  • frymor
    started a topic interactions with DESeq2 in a time-course analysis

    interactions with DESeq2 in a time-course analysis

    Dear all,

    we're having a time course data set of 8 TP (0, 16,24,30,48,72,90, and 100 hours).
    We would like to test for genes which are changing over time compared to the 0h (~ctrl).

    I have read the guide and a lot of posts here and on biostar.org. After understanding the way to create a design matrix for my data, I am still confused about the interactions in the design matrix.

    I have created the design matrix like that:

    Code:
    dds<-DESeqDataSetFromMatrix(countData=countTable, colData=phenotype, design= ~ replica + time )
    dds = DESeq(dds, test="LRT", reduced=~replica)
    Question 1:
    If I understand correctly, the results I get represents all genes with sig. diff. behaviour across all TP.
    But I get a lot of them. attached is [the plotMA of the counts].

    Code:
    > res
    log2 fold change (MLE): hours 100 vs 0 
    LRT p-value: '~ replica + hours' vs '~ replica' 
    DataFrame with 17558 rows and 6 columns
                   baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                  <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
    FBgn0085804  0.18052802    -1.18153049  7.162547  1.518897 0.9816482138           NA
    ...
    Do all these genes change over time across all Time-points?

    Question 2:
    Does this list include all genes which are changing between only two time points?
    (that might be the reason, why the list is so long.)

    Question 3:
    what will change in term of my question (and of course the list of genes I will get), if I change my full and reduced model to this:
    Code:
    dds<-DESeqDataSetFromMatrix(countData=countTable, colData=phenotype, design= ~ replica + time +replica:time )
    dds = DESeq(dds, test="LRT", reduced=~replica +replica:time)

    Thanks in advance

    Assa
    Last edited by frymor; 11-05-2015, 03:21 AM.

Latest Articles

Collapse

  • seqadmin
    Recent Advances in Sequencing Technologies
    by seqadmin







    Innovations in next-generation sequencing technologies and techniques are driving more precise and comprehensive exploration of complex biological systems. Current advancements include improved accessibility for long-read sequencing and significant progress in single-cell and 3D genomics. This article explores some of the most impactful developments in the field over the past year.

    Long-Read Sequencing
    Long-read sequencing has...
    12-02-2024, 01:49 PM
  • seqadmin
    Genetic Variation in Immunogenetics and Antibody Diversity
    by seqadmin



    The field of immunogenetics explores how genetic variations influence immune responses and susceptibility to disease. In a recent SEQanswers webinar, Oscar Rodriguez, Ph.D., Postdoctoral Researcher at the University of Louisville, and Ruben Martínez Barricarte, Ph.D., Assistant Professor of Medicine at Vanderbilt University, shared recent advancements in immunogenetics. This article discusses their research on genetic variation in antibody loci, antibody production processes,...
    11-06-2024, 07:24 PM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, 12-02-2024, 09:29 AM
0 responses
139 views
0 likes
Last Post seqadmin  
Started by seqadmin, 12-02-2024, 09:06 AM
0 responses
49 views
0 likes
Last Post seqadmin  
Started by seqadmin, 12-02-2024, 08:03 AM
0 responses
38 views
0 likes
Last Post seqadmin  
Started by seqadmin, 11-22-2024, 07:36 AM
0 responses
69 views
0 likes
Last Post seqadmin  
Working...
X