Seqanswers Leaderboard Ad

**NicoBxl** · 05-12-2011, 10:17 PM

nobody knows ?

**Simon Anders** · 05-12-2011, 10:28 PM

DESeq and edgeR both offer ANODEV tests.

**NicoBxl** · 05-12-2011, 11:12 PM

what's the function to do this ?

I've a dataframe A with 20 samples ( 4 groups of 5 samples )

Code:

groups <- c(rep("A",5),rep("B",5),rep("C",5),rep("D",5))
cds <- newCountDataSet(A,groups)
cds <- estimateSizeFactors(cds)
cds <- estimateVarianceFunctions(cds)

res <- nbinomTest(cds,condA="A",condB="B") # two group test

**Simon Anders** · 05-13-2011, 01:29 AM

Sorry, I need to add this to the vignette. In your case, this should work:

Code:

groups <- c( rep("A",5), rep("B",5), rep("C",5), rep("D",5) )
cds <- newCountDataSet( A, groups )
cds <- estimateSizeFactors( cds )
cds <- estimateVarianceFunctions( cds, "pooled" )

fit0 <- nbinomFitGLM( cds, count ~ 1 )
fit1 <- nbinomFitGLM( cds, count ~ condition )
pvals <- nbinomGLMTest( fit1, fit0 )

Note that this is a good deal slower than the standard test.

If you have more than one factor, pass a data frame to 'newCountDataSet' instead of 'groups' and then, you can write down more interesting model formulae than the one in this example, using the data frame's column names instead of 'condition'.

Simon

**NicoBxl** · 05-13-2011, 01:31 AM

ok I'll try this.

Thanks Simon

**RSK** · 05-16-2011, 08:41 AM

Dear Simon,
Is there any place where I can find more information about performing ANODEV analysis using DESeq?
Alternatively, when do you think the vignette will include this information?
Thanks!
RSK

**chenhao392** · 05-16-2011, 02:50 PM

Dear Simon,

I am currently working with RNA-seq datasets in 5 conditions without replicates. It is so lucky for me to see this vignette, Howerer, I found that I can't use "pooled" methods in estimateVarianceFunctions. Or alternatively, I can't go through nbinomFitGLM with "blind" methods in estimateVarianceFunctions. Both error msgs is attached below. But in the nbinomFitGLM function description, it says:

Use this function to estimate coefficients and calculate deviance
from a GLM for each gene. The GLM uses the ‘nbkd.sf’ family, with
the dispersion estimate according to ‘getVarianceFunction(cds)’.
Note that this requires that the variance functions were estimated
with method "pooled" or "blind".

When I was trying to use "method=pooled" in estimateVarianceFunctions

Code:

Error in rowSums(sapply(tapply((1:ncol(counts))[replicated_sample], factor(conditions[replicated_sample]),  : 
  'x' must be an array of at least two dimensions

When I was using "method=blind" instead.

Code:

Error in nbinomFitGLM(cds, count ~ 1) : 
  No pooled variance function found. Have you called 'estimateVarianceFunctions' with 'method="pooled"'?

my code is:

Code:

library(DESeq);
list=read.table("../data/5_condition.txt",row.names=1,header=T);
conds <- c("CFGU","CFGW","CFHG","CFHH","CFHI");
cds <- newCountDataSet( list,conds);
cds <- estimateSizeFactors( cds );
cds <- estimateVarianceFunctions(cds, method='blind');
fit0 <- nbinomFitGLM( cds, count ~ 1 )
fit1 <- nbinomFitGLM( cds, count ~ condition )
result <- nbinomGLMTest( fit1, fit0 )

Could you tell me what is my mistake here?
Thanks in advance,

Hao Chen

Originally posted by Simon Anders View Post

Sorry, I need to add this to the vignette. In your case, this should work:

Code:

groups <- c( rep("A",5), rep("B",5), rep("C",5), rep("D",5) )
cds <- newCountDataSet( A, groups )
cds <- estimateSizeFactors( cds )
cds <- estimateVarianceFunctions( cds, "pooled" )

fit0 <- nbinomFitGLM( cds, count ~ 1 )
fit1 <- nbinomFitGLM( cds, count ~ condition )
pvals <- nbinomGLMTest( fit1, fit0 )

Note that this is a good deal slower than the standard test.

If you have more than one factor, pass a data frame to 'newCountDataSet' instead of 'groups' and then, you can write down more interesting model formulae than the one in this example, using the data frame's column names instead of 'condition'.

Simon

**yoyoq** · 11-08-2011, 01:56 PM

hi,

just for reference the function is now called
fitNbinomGLMs

**jaymiRNA** · 08-02-2012, 01:03 AM

Differential expression DESeq using GLM for timecourse experiment

Dear Members
I am currently working on NGS data and performed some differential expression analysis using DESeq. At the moment I am trying to find differentially expressed miRNAs using GLM functionality.
My experimental design consists of 4 samples (4stages of developing flower) with 2 biological replicates each. I would like to identify the differentially expressed miRNA during different developmental stages(Stage 1,2,3 and 4). However, I have some doubts regarding my final data interpretation and here is the code I used and the output.

My Dataset:
>Str(SSSA)
Dataframe: 662 obs. of 8 variables:
$ Flower1 : int 1 1 6 0 0 0 6709 0 3 0 ...
$ Flower1.1: int 1 1 1 0 0 0 2261 2 1 2 ...
$ Flower2 : int 0 4 49 0 0 0 7311 0 4 0 ...
$ Flower2.1: int 1 14 69 0 0 0 11211 0 1 0 ...
$ Flower3 : int 0 8 89 1 0 0 1978 0 0 1 ...
$ Flower3.1: int 0 4 72 0 0 0 5392 0 1 1 ...
$ Flower4 : int 0 1 67 0 0 0 4675 0 0 0 ...
$ Flower4.1: int 0 0 21 0 0 0 4629 0 1 5 ...

My code
groups <- c( rep("F1",2), rep("F2",2), rep("F3",2), rep("F4",2) )
cds <- newCountDataSet(SSSA , groups )
cds <- estimateSizeFactors( cds )
cds <- estimateDispersions( cds, "pooled" )
fit0 <- fitNbinomGLMs ( cds, count ~ 1 )
fit1 <- fitNbinomGLMs ( cds, count ~ groups )
pvals <- nbinomGLMTest( fit1, fit0 )
padjGLM <- p.adjust( pvals, method="BH" )
fit1.pval <- cbind(fit1, padjGLM)

My Result
> str(fit1.pval)
'data.frame': 662 obs. of 7 variables:
$ (Intercept): num 1.23 1.16 2.77 -32.46 NA ...
$ groupsF2 : num -3.17 1.07 2.17 -2.1 NA ...
$ groupsF3 : num -32.569 0.859 2.973 30.973 NA ...
$ groupsF4 : num -31.675 -2.008 2.835 -0.875 NA ...
$ deviance : num 0.746 1.895 2.041 0.234 NA ...
$ converged : logi TRUE TRUE TRUE TRUE NA NA ...
$ pval : num 0.28395 0.44052 0.00622 0.86603 NA ...

My doubts.....
1.In my case, is it better to set the conditions as a data frame instead of a vector- “groups <- c( rep("F1",2), rep("F2",2), rep("F3",2), rep("F4",2) )”?
2.In the code, I calculated the object fit0 setting the value 1 after the tilde -“fit0 <- fitNbinomGLMs ( cds, count ~ 1 )” but I am not very clear.. what count ~ 1 stand for/refers to ?

3.My final results( above) shows 7 variables(including the adjusted pvalue) in the data.frame and I am not clear in interpreting this data. In particular , by using ‘groups’ in calling fit 1 i.e. fit1 <- fitNbinomGLMs ( cds, count ~ groups ),I can’t understand if I really compared F1 against all other stages F2,F3,F4, as I intend or If I made any other comparisions.

Could anyone help me on this?

Jay

**jaymiRNA** · 08-03-2012, 12:04 AM

No body knows ?

**Simon Anders** · 08-06-2012, 05:41 AM

Please do not cross-post. The reply is in the other thread.

Topics	Statistics	Last Post
Study Reveals How Bacteria Defend Against Viral Attacks by seqadmin Started by seqadmin, 08-27-2024, 04:40 AM	0 responses 16 views 0 likes	Last Post by seqadmin 08-27-2024, 04:40 AM
New Single-Molecule Sequencing Platform Introduces Advanced Features for High-Throughput Genomics by seqadmin Started by seqadmin, 08-22-2024, 05:00 AM	0 responses 293 views 0 likes	Last Post by seqadmin 08-22-2024, 05:00 AM
New DNA Code Discovered Revealing Complex Gene Regulation Mechanisms by seqadmin Started by seqadmin, 08-21-2024, 10:49 AM	0 responses 135 views 0 likes	Last Post by seqadmin 08-21-2024, 10:49 AM
Epigenetic Clocks Derived from Retroelements Offer New Insights into Aging by seqadmin Started by seqadmin, 08-19-2024, 05:12 AM	0 responses 124 views 0 likes	Last Post by seqadmin 08-19-2024, 05:12 AM

Seqanswers Leaderboard Ad

Announcement

DE analysis with multiple groups

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News