I searched a lot of threads here and elsewhere without finding anything exactly like this.
I have two RNA-seq datasets from different dates and different platforms. Split between them are 4 groups (normal and 3 stages of cancer). This is the distribution:
Set 1
1x normal
2x stage 1
3x stage 2
7x stage 3
Set 2
2x normal
2x stage 1
2x stage 2
2x stage 3
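To make the layout concrete, here is a minimal sketch that expands the counts above into a per-sample sheet and tabulates it. The labels `set1`/`set2` and `stage1`..`stage3` are placeholder names chosen for illustration, not identifiers from the actual data.

```python
from collections import Counter

# Sample counts per (dataset, group), copied from the listing above.
# The label strings are hypothetical placeholders.
layout = {
    ("set1", "normal"): 1,
    ("set1", "stage1"): 2,
    ("set1", "stage2"): 3,
    ("set1", "stage3"): 7,
    ("set2", "normal"): 2,
    ("set2", "stage1"): 2,
    ("set2", "stage2"): 2,
    ("set2", "stage3"): 2,
}

# Expand into one (batch, group) row per sample: a minimal sample sheet.
samples = [(batch, group) for (batch, group), n in layout.items() for _ in range(n)]

per_batch = Counter(batch for batch, _ in samples)
per_group = Counter(group for _, group in samples)

# Check that every group occurs in every batch. If a group were missing
# from one batch, batch and group would be partly confounded.
groups_in_batch = {b: {g for bb, g in samples if bb == b} for b in per_batch}
crossed = all(groups_in_batch[b] == set(per_group) for b in per_batch)

print(dict(per_batch), dict(per_group), crossed)
```

The design is unbalanced (set 1 is dominated by stage 3, and there is a single normal sample in set 1), but each group does appear in both datasets.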
We want to combine the datasets and make comparisons between the groups for differential expression. So far, I've tried:
-Combine all FPKM values into 1 table
-Run ComBat on the table, specifying dataset as batch and stage as a covariate
-Skip voom(), since the values are not raw counts, and log2-transform the ComBat output for limma
-Run lmFit, contrasts.fit, and eBayes from limma on the converted output.
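For reference, the log2 step can be sketched as below. This is only an illustration of the transform, not the exact code that was run: the function name `log2_fpkm`, the offset of 1, and the check for negative values are all assumptions. The check is there because batch-adjusted values can fall below zero, where log2 is undefined.

```python
import math

def log2_fpkm(matrix, offset=1.0):
    """Return log2(value + offset) for each entry of a genes x samples table.

    `offset` is an assumed pseudo-count that keeps zero FPKM values finite.
    Raises ValueError if value + offset is not positive, which can happen
    when the input has already been batch-adjusted on the linear scale.
    """
    result = []
    for row in matrix:
        new_row = []
        for value in row:
            shifted = value + offset
            if shifted <= 0:
                raise ValueError(f"cannot take log2 of {value} + {offset}")
            new_row.append(math.log2(shifted))
        result.append(new_row)
    return result

# Toy table: one gene, three samples (made-up numbers).
toy = [[0.0, 1.0, 3.0]]
transformed = log2_fpkm(toy)
```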
My questions are:
1. Should I be using FPKM values or the raw counts for this, given the two datasets and need for batch removal?
2. What is the best way to run limma on the ComBat output without conversion through voom()?
3. Are there any other glaring problems with this approach?
Thanks!