Hello,
I have RNA-seq samples from 11 patients. For each patient,
the samples were taken from two different locations (Ileum and Caecum)
and had two different inflammation statuses (I and NI) at those locations.
However, the design is not balanced, as I do not have 4 samples for each patient.
Below is a description of the dataset:
Code:
Patient Location Status
1       Caecum   NI
1       Ileum    NI
3       Caecum   I
3       Ileum    NI
4       Ileum    I
5       Caecum   NI
5       Caecum   I
5       Caecum   NI
6       Ileum    I
6       Caecum   I
7       Caecum   NI
7       Ileum    I
8       Ileum    NI
8       Ileum    I
9       Ileum    NI
9       Ileum    I
9       Caecum   I
10      Ileum    NI
10      Caecum   NI
12      Ileum    NI
12      Caecum   NI
14      Ileum    NI
14      Ileum    I
14      Ileum    I
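To make the imbalance concrete, here is a minimal base R sketch that enters the table above and cross-tabulates it. The object names (`samples`, `cell_counts`) are my own choices for illustration and are not part of any package.

```r
# Sample sheet typed in from the table above (C = Caecum, I = Ileum for Location).
loc <- strsplit("C I C I I C C C I C C I I I I I C I C I C I I I", " ")[[1]]
sta <- strsplit("NI NI I NI I NI I NI I I NI I NI I NI I I NI NI NI NI NI I I", " ")[[1]]

samples <- data.frame(
  Patient  = factor(c(1, 1, 3, 3, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8,
                      9, 9, 9, 10, 10, 12, 12, 14, 14, 14)),
  Location = factor(loc, levels = c("C", "I"), labels = c("Caecum", "Ileum")),
  Status   = factor(sta, levels = c("I", "NI"))
)

# Number of samples in each Location x Status cell.
cell_counts <- table(samples$Location, samples$Status)
print(cell_counts)
# Caecum: I = 4, NI = 6; Ileum: I = 7, NI = 7

# Number of samples per patient (ranges from 1 to 3, never 4).
print(table(samples$Patient))
```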
I am interested in testing whether there is a difference between the Locations in general, between the Statuses in general,
in Status within a given Location, and whether there is an interaction between Status and Location.
Therefore, I used DESeq2 with the following design:
Code:
~ Patient + Location + Status + Location:Status
From what I understood, this will remove the variation due to Patient and test for
the effect of Location, Status and the interaction between the two.
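One check I can do without any counts is whether this design is estimable for my sample sheet. The sketch below uses only base R `model.matrix()`; the `samples` data frame is my own illustrative name, and the coefficient names it prints are base R names with default factor levels, which will not match the names DESeq2 reports.

```r
# Sample sheet typed in from the dataset description (C = Caecum, I = Ileum for Location).
loc <- strsplit("C I C I I C C C I C C I I I I I C I C I C I I I", " ")[[1]]
sta <- strsplit("NI NI I NI I NI I NI I I NI I NI I NI I I NI NI NI NI NI I I", " ")[[1]]
samples <- data.frame(
  Patient  = factor(c(1, 1, 3, 3, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8,
                      9, 9, 9, 10, 10, 12, 12, 14, 14, 14)),
  Location = factor(loc, levels = c("C", "I"), labels = c("Caecum", "Ileum")),
  Status   = factor(sta, levels = c("I", "NI"))
)

# Design matrix for the full model, including the interaction term.
mm <- model.matrix(~ Patient + Location + Status + Location:Status, data = samples)

n_coef      <- ncol(mm)      # intercept + 10 patient terms + 3 terms of interest
design_rank <- qr(mm)$rank   # equals n_coef when every coefficient is estimable

print(c(coefficients = n_coef, rank = design_rank))
print(tail(colnames(mm), 3))
# "LocationIleum" "StatusNI" "LocationIleum:StatusNI"
```

A rank equal to the number of columns only says the coefficients can be estimated; it says nothing about how much power the few informative pairs provide.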
To check the difference for Location in general, I extracted the results of the DESeq2 analysis
using contrast = c("Location", "Ileum", "Caecum"). For Status, I used contrast = c("Status", "NI", "I").
To test for the effect of Status given the Location, I used
contrast = list("LocationCaecum.StatusI", "LocationCaecum.StatusNI").
To check for the effect of Location on Status (i.e. whether inflamed Caecum shows an effect beyond that of
Caecum alone and inflammation alone), I used
contrast = list(c("StatusI", "LocationCaecum.StatusI"), c("StatusNI", "LocationCaecum.StatusNI"))
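As a way to sanity-check which coefficients each comparison involves, here is a hedged base R sketch that derives numeric contrast vectors by subtracting design-matrix rows of the four Location x Status cells. It assumes default treatment coding with Caecum and I as reference levels, so the coefficient names differ from the ones in my contrasts above; all object names are my own. I believe a numeric vector like this can be passed to `results(dds, contrast = ...)` in DESeq2, but only if `dds` was fitted with the same design and factor levels so that the coefficient order matches.

```r
# Sample sheet typed in from the dataset description (C = Caecum, I = Ileum for Location).
loc <- strsplit("C I C I I C C C I C C I I I I I C I C I C I I I", " ")[[1]]
sta <- strsplit("NI NI I NI I NI I NI I I NI I NI I NI I I NI NI NI NI NI I I", " ")[[1]]
samples <- data.frame(
  Patient  = factor(c(1, 1, 3, 3, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8,
                      9, 9, 9, 10, 10, 12, 12, 14, 14, 14)),
  Location = factor(loc, levels = c("C", "I"), labels = c("Caecum", "Ileum")),
  Status   = factor(sta, levels = c("I", "NI"))
)

design <- ~ Patient + Location + Status + Location:Status
mm <- model.matrix(design, data = samples)

# One design row per Location x Status cell, for a single reference patient
# (the patient terms cancel out in every difference taken below).
cells <- expand.grid(
  Patient  = factor("1", levels = levels(samples$Patient)),
  Location = levels(samples$Location),
  Status   = levels(samples$Status)
)
cell_mm <- model.matrix(design, data = cells)
rownames(cell_mm) <- paste(cells$Location, cells$Status, sep = "_")
stopifnot(identical(colnames(cell_mm), colnames(mm)))

# Status effect (NI vs I) within each location, and their difference (the interaction).
status_in_caecum   <- cell_mm["Caecum_NI", ] - cell_mm["Caecum_I", ]
status_in_ileum    <- cell_mm["Ileum_NI", ]  - cell_mm["Ileum_I", ]
interaction_effect <- status_in_ileum - status_in_caecum

print(names(status_in_caecum)[status_in_caecum != 0])      # "StatusNI"
print(names(status_in_ileum)[status_in_ileum != 0])        # "StatusNI" "LocationIleum:StatusNI"
print(names(interaction_effect)[interaction_effect != 0])  # "LocationIleum:StatusNI"
```

With this coding, the main-effect Status coefficient is the Status effect in the reference location only, not the Status effect "in general".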
Considering the high number of missing samples (incomplete blocks), does the design
formula I am using make sense? For instance, when comparing on Location, does the pairing have any value,
given that it will only work for the following 8 samples (or am I wrong)?
Code:
Sample Location Status
5      Caecum   NI
5      Caecum   I
8      Ileum    NI
8      Ileum    I
9      Ileum    NI
9      Ileum    I
14     Ileum    NI
14     Ileum    I
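To see how many patients actually contribute a within-patient comparison, a small base R sketch like the following counts them from the sample sheet. The names `samples`, `both_locations`, `both_statuses` and `status_pair_same_location` are my own for illustration.

```r
# Sample sheet typed in from the dataset description (C = Caecum, I = Ileum for Location).
loc <- strsplit("C I C I I C C C I C C I I I I I C I C I C I I I", " ")[[1]]
sta <- strsplit("NI NI I NI I NI I NI I I NI I NI I NI I I NI NI NI NI NI I I", " ")[[1]]
samples <- data.frame(
  Patient  = factor(c(1, 1, 3, 3, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8,
                      9, 9, 9, 10, 10, 12, 12, 14, 14, 14)),
  Location = factor(loc, levels = c("C", "I"), labels = c("Caecum", "Ileum")),
  Status   = factor(sta, levels = c("I", "NI"))
)

# Distinct locations / statuses observed per patient.
n_loc <- tapply(samples$Location, samples$Patient, function(x) length(unique(x)))
n_sta <- tapply(samples$Status,   samples$Patient, function(x) length(unique(x)))

both_locations <- names(n_loc)[n_loc == 2]  # patients sampled in Ileum and Caecum
both_statuses  <- names(n_sta)[n_sta == 2]  # patients with both I and NI samples

# Patient/location combinations that have both statuses (the 8-sample table above).
key <- interaction(samples$Patient, samples$Location, drop = TRUE)
n_sta_in_loc <- tapply(samples$Status, key, function(x) length(unique(x)))
status_pair_same_location <- names(n_sta_in_loc)[n_sta_in_loc == 2]

print(both_locations)             # "1" "3" "6" "7" "9" "10" "12"
print(both_statuses)              # "3" "5" "7" "8" "9" "14"
print(status_pair_same_location)
```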
And is it OK to use the whole dataset for these tests, or is it better to subset it first and
then do a DESeq2 analysis for each case separately? For example, select all NI samples, then compare them between Ileum and Caecum with the design below:
Code:
design = formula(~ Patient + Location)
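For the NI-only example, a rough base R sketch of what the subset would look like is given below. Since Status takes a single value within that subset, the sketch uses Location as the variable of interest; the object names are my own and no DESeq2 call is made here.

```r
# Sample sheet typed in from the dataset description (C = Caecum, I = Ileum for Location).
loc <- strsplit("C I C I I C C C I C C I I I I I C I C I C I I I", " ")[[1]]
sta <- strsplit("NI NI I NI I NI I NI I I NI I NI I NI I I NI NI NI NI NI I I", " ")[[1]]
samples <- data.frame(
  Patient  = factor(c(1, 1, 3, 3, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8,
                      9, 9, 9, 10, 10, 12, 12, 14, 14, 14)),
  Location = factor(loc, levels = c("C", "I"), labels = c("Caecum", "Ileum")),
  Status   = factor(sta, levels = c("I", "NI"))
)

# Keep only NI samples and drop patients that no longer have any sample.
ni <- droplevels(subset(samples, Status == "NI"))

# Which patients still provide a within-patient Ileum vs Caecum comparison?
tab <- table(ni$Patient, ni$Location)
paired_patients <- rownames(tab)[tab[, "Caecum"] > 0 & tab[, "Ileum"] > 0]

# Design matrix for the paired Location comparison within the NI subset.
mm_ni <- model.matrix(~ Patient + Location, data = ni)

print(nrow(ni))          # 13 NI samples
print(paired_patients)   # "1" "10" "12"
print(c(coefficients = ncol(mm_ni), rank = qr(mm_ni)$rank))
```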
Thank you very much in advance,
Youssef