Hi everyone,
I have a data set of nine different groups, each with three samples. These groups are being compared against each other to find differentially regulated genes. All in all I have 14 different comparisons.
I have tested one design matrix for all samples vs. a pair-wise approach, where only the samples of the two compared groups were loaded.
I was wondering which way makes more sense, since I'm getting different results when comparing the two.
My design matrix for all the samples, and two of the pair-wise designs (HP4 vs. CTRL4, or HP24 vs. basalHP), are listed in the code blocks further down.
When comparing the results, the full matrix design gives better (smaller) adjusted p-values than the pair-wise approach. As I expected, the log2 fold-changes are similar (but not identical, probably due to the different size factors).
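To illustrate why the size factors can differ between the two approaches, here is a minimal sketch of median-of-ratios normalization in plain Python. The counts are made up purely for illustration (only the sample names come from my data), and `size_factors` is a hypothetical helper, not a DESeq2 function. The point is that the per-gene reference (the geometric mean) is computed over whichever samples are loaded, so adding or removing groups shifts every size factor.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts (same gene order).
    Genes with a zero count in any loaded sample are skipped for the reference.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log geometric mean per gene across the loaded samples.
    log_ref = []
    for g in range(n_genes):
        vals = [counts[s][g] for s in samples]
        if all(v > 0 for v in vals):
            log_ref.append(sum(math.log(v) for v in vals) / len(vals))
        else:
            log_ref.append(None)

    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - log_ref[g]
            for g in range(n_genes)
            if log_ref[g] is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical counts for four genes; NOT taken from the real experiment.
counts = {
    "HP4_1": [100, 200, 50, 400],
    "CTRL4_1": [120, 180, 60, 380],
    "basalHP_1": [300, 90, 10, 800],
}

full = size_factors(counts)  # all samples loaded
pair = size_factors({s: counts[s] for s in ("HP4_1", "CTRL4_1")})  # pair only

for s in ("HP4_1", "CTRL4_1"):
    print(s, "full:", round(full[s], 3), "pair-wise:", round(pair[s], 3))
```

With different size factors the normalized counts change, which would be consistent with the small shifts in log2 fold-change between the two designs.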
A sample of one of the comparisons from the full matrix design, and the same samples from the pair-wise design, are listed in the result tables further down.
It is clear that there are far fewer DE miRNAs in the pair-wise comparison than in the full matrix design.
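One possible contributor to this difference is simply how many samples are available for estimating the variability. As a rough sketch, assuming a one-factor design with one coefficient per group, the residual degrees of freedom can be counted as samples minus groups. `residual_df` is a hypothetical helper for this arithmetic only; it is not something DESeq2 reports, and DESeq2 additionally shares information across genes.

```python
def residual_df(n_groups, reps_per_group):
    """Residual degrees of freedom for a one-factor design:
    number of samples minus number of fitted group means."""
    n_samples = n_groups * reps_per_group
    return n_samples - n_groups


full_df = residual_df(n_groups=9, reps_per_group=3)  # all nine groups loaded
pair_df = residual_df(n_groups=2, reps_per_group=3)  # only the compared pair

print("full design:", full_df, "residual df")
print("pair-wise design:", pair_df, "residual df")
```

With all nine groups loaded, the variability estimate rests on 18 residual degrees of freedom instead of 4, which would fit with the smaller adjusted p-values from the full matrix design.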
On the other hand, probably also due to the differences in the size factors, I am getting log2FC values even for miRNAs which have no reads at all in the compared samples.
I can see that the adjusted p-values here are NA, so these miRNAs can be neglected, but it still makes me wonder which of the two designs is better to continue with.
The full design matrix shows better p-values, but creates possible artefacts in the data set. The pair-wise design shows fewer significant results.
So, which of the two designs will give me more realistic results?
Thanks,
Assa
This is my design matrix for all the samples:
Code:
               conditionT
HP4_1          HP4
HP4_2          HP4
HP4_3          HP4
HP24_1         HP24
HP24_2         HP24
HP24_3         HP24
CR4w_1         CR4w4
CR4w_2         CR4w4
CR4w_3         CR4w4
CR4w24_1       CR4w24
CR4w24_2       CR4w24
CR4w24_3       CR4w24
CTRL4_1        CTRL4
CTRL4_2        CTRL4
CTRL4_3        CTRL4
CTRL24_1       CTRL24
CTRL24_2       CTRL24
CTRL24_3       CTRL24
basalCR4w_1    basalCR4w
basalCR4w_2    basalCR4w
basalCR4w_3    basalCR4w
basalCTRL_1    basalCTRL
basalCTRL_2    basalCTRL
basalCTRL_3    basalCTRL
basalHP_1      basalHP
basalHP_2      basalHP
basalHP_3      basalHP
Pair-wise design, HP4 vs. CTRL4:
Code:
           conditionT
HP4_1      HP4
HP4_2      HP4
HP4_3      HP4
CTRL4_1    CTRL4
CTRL4_2    CTRL4
CTRL4_3    CTRL4
Pair-wise design, HP24 vs. basalHP:
Code:
             conditionT
HP24_1       HP24
HP24_2       HP24
HP24_3       HP24
basalHP_1    basalHP
basalHP_2    basalHP
basalHP_3    basalHP
Here is a sample of one of the comparisons from the full matrix design:
Code:
miRNA              log2FoldChange    padj
mmu-miR-29a-3p      0.534368658      0.000259248
mmu-miR-26a-5p      0.378956528      0.000310647
mmu-miR-200a-3p     0.299780505      0.00060916
mmu-miR-29c-3p      0.433273797      0.00060916
mmu-miR-29b-3p      0.625200783      0.001034352
mmu-miR-30d-5p      0.253729371      0.00715
mmu-miR-30a-5p      0.289108972      0.00715
mmu-miR-26b-5p      0.287258966      0.009435688
mmu-miR-30a-3p      0.263099811      0.012596849
mmu-miR-200c-3p     0.480164731      0.016093411
mmu-miR-455-3p      0.734375756      0.016093411
mmu-miR-101a-3p     0.231741597      0.019381496
mmu-miR-101c        0.23216037       0.021264359
mmu-miR-30e-3p      0.276381941      0.026896293
mmu-miR-92b-3p      0.491022916      0.041665933
mmu-miR-99a-5p      0.332214684      0.049609316
mmu-miR-151-5p      0.259334395      0.08039887
mmu-miR-181c-5p    -0.226533316      0.08039887
mmu-miR-127-3p     -0.5404365        0.092739149
mmu-miR-182-5p      0.460856503      0.095664474
mmu-miR-30e-5p      0.212060214      0.095664474
The same samples from the pair-wise design:
Code:
miRNA             log2FoldChange    padj
mmu-miR-29a-3p     0.488112296      0.054110034
mmu-miR-29b-3p     0.531545957      0.080779499
mmu-miR-29c-3p     0.398383972      0.080779499
mmu-miR-451a      -0.515259487      0.080779499
mmu-miR-26a-5p     0.35141262       0.086831362
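To put numbers on the difference between the two result tables above, here is a small sketch that counts the miRNAs below a given adjusted p-value cutoff and checks which ones appear in both lists. The values are copied from the two tables as posted (which are only samples of the output); `parse` and `n_below` are hypothetical helper names.

```python
full_design = """\
mmu-miR-29a-3p 0.534368658 0.000259248
mmu-miR-26a-5p 0.378956528 0.000310647
mmu-miR-200a-3p 0.299780505 0.00060916
mmu-miR-29c-3p 0.433273797 0.00060916
mmu-miR-29b-3p 0.625200783 0.001034352
mmu-miR-30d-5p 0.253729371 0.00715
mmu-miR-30a-5p 0.289108972 0.00715
mmu-miR-26b-5p 0.287258966 0.009435688
mmu-miR-30a-3p 0.263099811 0.012596849
mmu-miR-200c-3p 0.480164731 0.016093411
mmu-miR-455-3p 0.734375756 0.016093411
mmu-miR-101a-3p 0.231741597 0.019381496
mmu-miR-101c 0.23216037 0.021264359
mmu-miR-30e-3p 0.276381941 0.026896293
mmu-miR-92b-3p 0.491022916 0.041665933
mmu-miR-99a-5p 0.332214684 0.049609316
mmu-miR-151-5p 0.259334395 0.08039887
mmu-miR-181c-5p -0.226533316 0.08039887
mmu-miR-127-3p -0.5404365 0.092739149
mmu-miR-182-5p 0.460856503 0.095664474
mmu-miR-30e-5p 0.212060214 0.095664474
"""

pair_wise = """\
mmu-miR-29a-3p 0.488112296 0.054110034
mmu-miR-29b-3p 0.531545957 0.080779499
mmu-miR-29c-3p 0.398383972 0.080779499
mmu-miR-451a -0.515259487 0.080779499
mmu-miR-26a-5p 0.35141262 0.086831362
"""


def parse(text):
    """Return {miRNA: (log2FoldChange, padj)} from whitespace-separated rows."""
    table = {}
    for line in text.strip().splitlines():
        name, lfc, padj = line.split()
        table[name] = (float(lfc), float(padj))
    return table


def n_below(table, alpha):
    """Number of miRNAs with padj strictly below alpha."""
    return sum(1 for _, padj in table.values() if padj < alpha)


full = parse(full_design)
pair = parse(pair_wise)

for alpha in (0.05, 0.1):
    print(f"padj < {alpha}: full = {n_below(full, alpha)}, pair-wise = {n_below(pair, alpha)}")

shared = sorted(set(full) & set(pair))
print("in both lists:", shared)
for name in shared:
    # Difference in log2 fold-change between the two designs for shared miRNAs.
    print(name, round(full[name][0] - pair[name][0], 3))
```

Only miRNAs that are in both lists can be compared directly; anything listed in just one of the two samples is left out of the fold-change comparison.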
On the other hand, probably also due to the differences in the size factors, I am getting log2FC values even for miRNAs which have no reads at all in the compared samples:
Code:
miRNA               baseMeanA    baseMeanB    baseMean       log2FoldChange    lfcSE          stat            pvalue         padj    HP24_1    HP24_2    HP24_3    CTRL24_1    CTRL24_2    CTRL24_3
mmu-miR-376b-5p     0            0            0.127918331    -0.000507796      0.046785551    -0.010853703    0.991340168    NA      0         0         0         0           0           0
mmu-miR-1968-3p     0            0            0.127682606    -0.000515398      0.046192878    -0.011157523    NA             NA      0         0         0         0           0           0
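A simple way to keep such rows out of a comparison is to drop miRNAs that have no reads in any of the compared samples. Below is a minimal sketch using the two rows from the table above; `has_reads` is a hypothetical helper. The second part rests on an assumption on my side: that baseMean is averaged over all 27 samples of the full design, which would explain a non-zero baseMean even though the six compared samples are all zero.

```python
# Rows copied from the table above; "counts" holds HP24_1..3 and CTRL24_1..3.
rows = [
    {"miRNA": "mmu-miR-376b-5p", "baseMean": 0.127918331,
     "log2FoldChange": -0.000507796, "counts": [0, 0, 0, 0, 0, 0]},
    {"miRNA": "mmu-miR-1968-3p", "baseMean": 0.127682606,
     "log2FoldChange": -0.000515398, "counts": [0, 0, 0, 0, 0, 0]},
]


def has_reads(row, min_total=1):
    """True if the compared samples together carry at least min_total reads."""
    return sum(row["counts"]) >= min_total


kept = [row for row in rows if has_reads(row)]
print("rows kept after filtering:", [row["miRNA"] for row in kept])

# Assumption: baseMean is the mean over all samples loaded in the full design.
N_SAMPLES_FULL = 27  # nine groups x three samples
for row in rows:
    implied_total = row["baseMean"] * N_SAMPLES_FULL
    print(row["miRNA"], "implied normalized reads in the other groups:",
          round(implied_total, 2))
```

Under that assumption the few reads behind these rows come from groups outside the comparison, so the tiny log2FC values say nothing about HP24 vs. CTRL24.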