I'm getting different p-values and FCs from DESeq2 (v1.34.0) by simply changing the order of the columns in the count table. I can't understand why. I made sure the cols of count table match the rows of colData in both.

The differences in this dataset are small ([-0.003, 0.003] for p-value, and [-0.002, 0.001] for FC). But in some other datasets, the difference of log2FC can be up to 0.8 (and the baseMean for that gene is 300), which is concerning.

I tried setting `betaPrior = FALSE` but still got the difference.

Column `Genotype` in colData is the only tested variable in the model. It contains `WT` and `Mutant` as factors, with `WT` as the 1st level.

Actually if I change column `Genotype` to character type and rerun DESeq2 (so DESeq2 will convert them to factors with the warning message), I get the same results independent of the col/row orders in the input. So it seems the difference is somehow related to the factorization of the columns in colData, but I don't know why, and which results I should trust.

data <- readRDS('data.rds')

dds1 <- DESeqDataSetFromMatrix(countData = data[['counts']], colData = data[['colData']], design = ~ Genotype)
dds1 <- dds1[rowSums(counts(dds1)) > 1, ]
dds1 <- DESeq(dds1, betaPrior = TRUE)
res1 <- as.data.frame(results(dds1, contrast = list("GenotypeMutant","GenotypeWT")))

# only reorder the inputs, but the columns of counts still match the rows of colData:
dds2 <- DESeqDataSetFromMatrix(countData = data[['counts']][,c(4:6,1:3)], colData = data[['colData']][c(4:6,1:3),], design = ~ Genotype)
dds2 <- dds2[rowSums(counts(dds2)) > 1, ]
dds2 <- DESeq(dds2, betaPrior = TRUE)
res2 <- as.data.frame(results(dds2, contrast = list("GenotypeMutant","GenotypeWT")))


baseMean log2FoldChange lfcSE stat pvalue
ENSMUSG00000069515 3043.921 1.142537 0.09271227 12.32347 6.771949e-35
ENSMUSG00000069515 4.605602e-31

baseMean log2FoldChange lfcSE stat pvalue
ENSMUSG00000069515 3043.921 1.142512 0.09272373 12.32168 6.923947e-35
ENSMUSG00000069515 4.708976e-31