Hi All
I am currently working on a set of RNAseq data which contain 4 genotypes (A, B, C, D) and 5 time points. I'd like to compare gene expression across all time points between 4 genotypes. Differential expressed genes accross 5 time points will be identified based on p values. However, the LRT considered full data and gives one padj. I also tried the contrast function between genotype A and B, A and C for example. A similar question was asked previously by someone else @ https://www.biostars.org/p/119342/. Looks like LRT is for testing all levels at once EVEN contrast is between A and B or A and C or A and D.
below is my design and some codes:
design = ~ genotype + timepoint + genotype:timepoint
> dds_time <- DESeq(dds, test = "LRT", reduced = ~ genotype + timepoint)
> res_time <- results(dds_time)
#CONTRAST_RESULTS
> results_A_B <- results(dds_time, contrast=c("genotype","A","B"))
> results_A_C <- results(dds_time, contrast=c("genotype","A","C"))
After reading some other posts and manuals, looks like I have two options:
1, using the current analysis but recalculate p values using Wald test.
I did this:
> dds_wald <- nbinomWaldTest(dds_time, betaPrior=FALSE)
found results columns, replacing these
> results_A_B_wald <- results(dds_time, contrast=c("genotype","A","B"))
> results_A_C_wald <- results(dds_time, contrast=c("genotype","A","C"))
padj values for specific comparisons were given using this method.
2, rerun pair-wise analysis (6 comparisons) using LRT separately?
Which option is better and why?
Am I correct if I want to identify the significant differences between genotypes across all time points?
Another question is that is it possible to cluster the differential expressed genes (e.g. A vs B) and plot different clusters using DESeq2?
Thanks,
I am currently working on a set of RNAseq data which contain 4 genotypes (A, B, C, D) and 5 time points. I'd like to compare gene expression across all time points between 4 genotypes. Differential expressed genes accross 5 time points will be identified based on p values. However, the LRT considered full data and gives one padj. I also tried the contrast function between genotype A and B, A and C for example. A similar question was asked previously by someone else @ https://www.biostars.org/p/119342/. Looks like LRT is for testing all levels at once EVEN contrast is between A and B or A and C or A and D.
below is my design and some codes:
design = ~ genotype + timepoint + genotype:timepoint
> dds_time <- DESeq(dds, test = "LRT", reduced = ~ genotype + timepoint)
> res_time <- results(dds_time)
#CONTRAST_RESULTS
> results_A_B <- results(dds_time, contrast=c("genotype","A","B"))
> results_A_C <- results(dds_time, contrast=c("genotype","A","C"))
After reading some other posts and manuals, looks like I have two options:
1, using the current analysis but recalculate p values using Wald test.
I did this:
> dds_wald <- nbinomWaldTest(dds_time, betaPrior=FALSE)
found results columns, replacing these
> results_A_B_wald <- results(dds_time, contrast=c("genotype","A","B"))
> results_A_C_wald <- results(dds_time, contrast=c("genotype","A","C"))
padj values for specific comparisons were given using this method.
2, rerun pair-wise analysis (6 comparisons) using LRT separately?
Which option is better and why?
Am I correct if I want to identify the significant differences between genotypes across all time points?
Another question is that is it possible to cluster the differential expressed genes (e.g. A vs B) and plot different clusters using DESeq2?
Thanks,
Comment