Hello,
I would like to use a GLM to account for the bias introduced by two different protocols and obtain the true differentially expressed genes (DEG).
I have three conditions, say C1, C2 and C3.
Protocol 1 was used to obtain three replicates of C1 and three of C2.
Protocol 2 was used to obtain three replicates of C2 and three of C3.
I am interested in finding the DEG between C1 and C3, using C2 (measured with both protocols) to remove the protocol bias. See the table below.
           |  C1  |  C2  |  C3  |
Protocol 1 |  X   |  X   |      |
Protocol 2 |      |  X   |  X   |
These are my two vectors for the design matrix:
Protocol = p1 p1 p1 p1 p2 p1 p2 p1 p2 p2 p2 p2
Conditions= c1 c1 c1 c2 c2 c2 c2 c2 c2 c3 c3 c3
In R code we have:
Protocol = factor( c( rep('p1',3), rep(c("p1","p2"),3),rep('p2',3)) )
Conditions = factor(c(rep('c1',3),rep("c2",6),rep('c3',3)))
data.frame(Protocol,Conditions)
Protocol Conditions
1 p1 c1
2 p1 c1
3 p1 c1
4 p1 c2
5 p2 c2
6 p1 c2
7 p2 c2
8 p1 c2
9 p2 c2
10 p2 c3
11 p2 c3
12 p2 c3
design <- model.matrix(~Protocol+Conditions)
design
(Intercept) Protocolp2 Conditionsc2 Conditionsc3
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 1 0 1 0
5 1 1 1 0
6 1 0 1 0
7 1 1 1 0
8 1 0 1 0
9 1 1 1 0
10 1 1 0 1
11 1 1 0 1
12 1 1 0 1
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$Protocol
[1] "contr.treatment"
attr(,"contrasts")$Conditions
[1] "contr.treatment"
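As a sanity check on the design above, here is a minimal base-R sketch (it does not use edgeR) that tests whether the matrix has full column rank and reports which level each factor uses as its baseline. The variable names `design_rank`, `is_full_rank`, `ref_protocol` and `ref_condition` are my own. If I understand treatment contrasts correctly, the first level of each factor gets no column of its own and is absorbed into the intercept, but I would appreciate confirmation.

```r
# Rebuild the two factors exactly as in the post.
Protocol <- factor(c(rep("p1", 3), rep(c("p1", "p2"), 3), rep("p2", 3)))
Conditions <- factor(c(rep("c1", 3), rep("c2", 6), rep("c3", 3)))
design <- model.matrix(~ Protocol + Conditions)

# Full column rank means every coefficient in the model is estimable.
design_rank <- qr(design)$rank
is_full_rank <- design_rank == ncol(design)

# With contr.treatment, the first level of each factor is the baseline.
ref_protocol <- levels(Protocol)[1]
ref_condition <- levels(Conditions)[1]

print(is_full_rank)
print(c(ref_protocol, ref_condition))
```

If the rank were smaller than the number of columns, the protocol and condition effects could not be separated with this layout.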
After estimating the dispersions and running glmFit and glmLRT, I have:
...
lrt <- glmLRT(fit)
topTags(lrt)
Coefficient: Conditionsc3
...results here...
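To make the comparison explicit, this is a small base-R sketch of how I think contrast vectors line up with the design columns. The names `c3_vs_c1` and `c3_vs_c2` are hypothetical, and the `glmLRT` calls are left as comments because they need a fitted object; `coef` and `contrast` are arguments of edgeR's `glmLRT`, but whether these vectors are the right ones for my question is an assumption on my part.

```r
# Rebuild the design from the post so the column order is known.
Protocol <- factor(c(rep("p1", 3), rep(c("p1", "p2"), 3), rep("p2", 3)))
Conditions <- factor(c(rep("c1", 3), rep("c2", 6), rep("c3", 3)))
design <- model.matrix(~ Protocol + Conditions)

# One entry per design column, in column order:
# (Intercept), Protocolp2, Conditionsc2, Conditionsc3
c3_vs_c1 <- c(0, 0, 0, 1)    # c1 is the baseline, so Conditionsc3 alone
c3_vs_c2 <- c(0, 0, -1, 1)   # difference of the two condition coefficients
names(c3_vs_c1) <- colnames(design)
names(c3_vs_c2) <- colnames(design)

print(rbind(c3_vs_c1, c3_vs_c2))

# With an edgeR fit, these would be passed along the lines of:
# lrt <- glmLRT(fit, coef = "Conditionsc3")
# lrt <- glmLRT(fit, contrast = c3_vs_c1)
```

Since `glmLRT` tests the last design column when neither `coef` nor `contrast` is given, the plain `glmLRT(fit)` call should correspond to the first vector here.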
My questions are:
1 - Is my design correct using "model.matrix(~Protocol+Conditions)"? Where did Conditionsc1 go in the design matrix?
2 - Is the coefficient "Conditionsc3" the correct one for this analysis? How should the contrast be specified in the glmLRT function?
Any comments, tips or help are greatly appreciated.
Thank you very much,