**dpryan** · 11-05-2015, 04:39 AM

1. Yup, you understood exactly. If you really want to be technical, what you're actually testing is whether including "hours" results in a better fit of the data...though the practical effect is asking for all genes changing over time. I should point out that you may not see all of these changes in direct pairwise comparisons (you'll probably see most of them though).

2. Yup. This can be thought of as a superset of the results from all pairwise comparisons. If it's ever DE in a pairwise comparison, it'll likely be DE in the LRT (the reverse isn't the case).
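
To make point 1 a bit more concrete, here is a minimal base-R sketch of what "does including hours give a better fit" means for a single gene. It is only an illustration under stated assumptions: the counts are made up, the four time points and three replicates are hypothetical, and a Poisson GLM stands in for the negative binomial GLM that DESeq2 actually fits.

```r
# Made-up counts for one gene: 3 replicates at each of 4 time points.
# Assumption: Poisson GLM as a simplified stand-in for DESeq2's negative binomial GLM.
hours   <- factor(rep(c("16", "24", "48", "90"), each = 3))
replica <- factor(rep(c("r1", "r2", "r3"), times = 4))
counts  <- c(20, 22, 19,  25, 27, 24,  40, 43, 38,  60, 66, 58)

full    <- glm(counts ~ replica + hours, family = poisson)
reduced <- glm(counts ~ replica,         family = poisson)

# LRT: how much worse does the fit get when 'hours' is dropped?
lrt_stat <- deviance(reduced) - deviance(full)
lrt_df   <- df.residual(reduced) - df.residual(full)  # number of 'hours' coefficients removed
lrt_p    <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

# Wald test for a single pairwise comparison (90h vs the reference level 16h)
wald_90_vs_16 <- summary(full)$coefficients["hours90", ]

print(c(stat = lrt_stat, df = lrt_df, p = lrt_p))
print(wald_90_vs_16)
```

The LRT gives one p-value per gene that covers all time points at once, whereas the Wald row is specific to one pair of time points.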

**frymor** · 11-06-2015, 01:26 AM

Originally posted by dpryan:

1. Yup, you understood exactly. If you really want to be technical, what you're actually testing is whether including "hours" results in a better fit of the data...though the practical effect is asking for all genes changing over time. I should point out that you may not see all of these changes in direct pairwise comparisons (you'll probably see most of them though).

I didn't expect to see all the genes when doing a single pair-wise comparison, but I did expect each pair-wise result to be a subset of the LRT result (or several subsets, if testing multiple time-point comparisons), as the LRT supposedly tests for all DE genes over time.

When checking the pair-wise comparisons using the results() function, would it be better to keep using the LRT, or to switch to the Wald test for more robust statistical results?

When comparing the two tests for a specific pair of time points, I can see a difference. I have read that the Wald test calculates the LFC shrinkage for the data, while the LRT takes multiple parameters into account.

Code:

```
> resTP16h_90hwald
log2 fold change (MLE): hours 90 vs 16
Wald test p-value: hours 90 vs 16
DataFrame with 17558 rows and 6 columns
               baseMean log2FoldChange     lfcSE       stat      pvalue        padj
              <numeric>      <numeric> <numeric>  <numeric>   <numeric>   <numeric>
FBgn0085804  0.18052802      -3.642448  6.899080 -0.5279614 0.597526110          NA
FBgn0267431 19.73070118      -2.155084  1.270286 -1.6965346 0.089784690 0.125377361
FBgn0039987  0.08559842      -2.183327  6.937402 -0.3147183 0.752975578          NA
FBgn0058182  0.49195220      -2.710627  5.724264 -0.4735329 0.635833037          NA
FBgn0267430 27.36264804      -4.362455  1.481378 -2.9448633 0.003230974 0.006781261
...                 ...            ...       ...        ...         ...         ...
> resTP16h_90h
log2 fold change (MLE): hours 90 vs 16
LRT p-value: '~ replica + hours' vs '~ replica'
DataFrame with 17558 rows and 6 columns
               baseMean log2FoldChange     lfcSE      stat       pvalue         padj
              <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
FBgn0085804  0.18052802      -3.642448  6.899080  1.518897 0.9816482138           NA
FBgn0267431 19.73070118      -2.155084  1.270286 17.423700 0.0148591771 0.0184819228
FBgn0039987  0.08559842      -2.183327  6.937402  0.610748 0.9989315237           NA
FBgn0058182  0.49195220      -2.710627  5.724264  3.607104 0.8237543782           NA
FBgn0267430 27.36264804      -4.362455  1.481378 25.744205 0.0005595056 0.0007857541
.
```

Are the two methods even comparable?

Originally posted by dpryan:

2. Yup. This can be thought of as a superset of the results from all pairwise comparisons. If it's ever DE in a pairwise comparison, it'll likely be DE in the LRT (the reverse isn't the case).

This is where I don't understand what happens. If a gene changes over time in the analysis across all time points, it must also change in at least one of the pair-wise comparisons, isn't that true?
So why is the reverse not always the case?

**frymor** · 11-12-2015, 02:23 AM

Originally posted by dpryan:

2. Yup. This can be thought of as a superset of the results from all pairwise comparisons. If it's ever DE in a pairwise comparison, it'll likely be DE in the LRT (the reverse isn't the case).

When calculating the pair-wise comparisons of the timepoints, does it make more sense to add the parameter test="Wald", or can I keep the LRT results?

I can see that there is a substantial difference in the number of DE genes with an adjusted p-value < 0.1:

Code:

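```r
# No test argument: the p-values come from the LRT ('~ replica + hours' vs '~ replica');
# the contrast only selects which log2 fold change is reported.
resTP16h_90h <- results(dds.filtered, contrast = c("hours", "90", "16"))
# Wald test for this one pair. The contrast order is reversed here (16 vs 90),
# which flips the sign of the log2 fold change but does not change the p-values.
resTP16h_90h.wald <- results(dds.filtered, test = "Wald", contrast = c("hours", "16", "90"))

> addmargins(table(wald.test = (resTP16h_90h.wald$padj < .1), LRT.test = (resTP16h_90h$padj < .1)))
         LRT.test
wald.test FALSE  TRUE   Sum
    FALSE   624  2958  3582
    TRUE      0  9725  9725
    Sum     624 12683 13307
```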

Am I correct in assuming that the LRT done in the first results() command covers not only the genes that differ between 90h and 16h, but also all genes that change at any of the time points in between (24h, 30h, 48h and 72h)?

**dpryan** · 11-12-2015, 04:19 AM

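For pair-wise comparisons you need a Wald test. For "Is there a time effect, regardless of when?" you need an LRT. So yes, your assumption is absolutely correct: the p-values from your first results() call are LRT p-values across all time points, and the contrast there only changes which log2 fold change is reported.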
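
In case it helps anyone landing here, a rough sketch of that workflow on simulated data. The count matrix, gene names, sample names and the three time points are all made up for illustration; only the `~ replica + hours` design and the `results()` arguments are taken from this thread.

```r
library(DESeq2)

set.seed(1)
n_genes <- 200
# Hypothetical layout: 3 replicates at each of 3 time points
coldata <- data.frame(
  replica = factor(rep(c("r1", "r2", "r3"), times = 3)),
  hours   = factor(rep(c("16", "48", "90"), each = 3)),
  row.names = paste0("s", 1:9)
)
# Simulated negative binomial counts with gene-specific means and dispersion 0.1
base_mu <- runif(n_genes, 50, 500)
counts <- matrix(rnbinom(n_genes * 9, mu = rep(base_mu, times = 9), size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata)))
# Make the first 20 genes 4x higher at 90h (columns 7 to 9)
counts[1:20, 7:9] <- counts[1:20, 7:9] * 4L
storage.mode(counts) <- "integer"

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ replica + hours)
# Fit once with the LRT: "is there any time effect, regardless of when?"
dds <- DESeq(dds, test = "LRT", reduced = ~ replica)

res_lrt          <- results(dds)                                     # LRT p-values
res_lrt_contrast <- results(dds, contrast = c("hours", "90", "16"))  # still LRT p-values
res_wald         <- results(dds, test = "Wald",
                            contrast = c("hours", "90", "16"))       # pairwise 90h vs 16h

summary(res_lrt)
summary(res_wald)
```

`res_lrt` and `res_lrt_contrast` share the same p-values because the LRT always tests the whole `hours` term; only the reported log2 fold change follows the contrast. `res_wald` is the table to use for the 90h vs 16h question.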


interactions with DESeq2 in a time-course analysis
