Dear all,
We are currently thinking about replacing htseq-count with featureCounts from the Subread package. The run time is considerably shorter, and the raw counts are exactly the same.
The downstream analysis, however, seems to be affected. Running a DESeq2 analysis on the two outputs gives differing log2 fold changes [max( abs( htseq$log2fold - subread$log2fold ) ) != 0 ].
After several tries I identified the sort order of the gene IDs in the raw counts as the culprit. As soon as I sorted the output of htseq-count and featureCounts alphabetically by gene ID (which resulted in identical lists), DESeq2 computed identical log2 fold changes.
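One thing worth ruling out is that the comparison itself is positional: subtracting two result vectors row by row compares different genes whenever the two tables are ordered differently. Below is a minimal sketch of that effect in Python. The gene IDs and fold-change values are made up for illustration and are not DESeq2 output.

```python
# Hypothetical per-gene log2 fold changes from two runs.
# Same genes, same values, different row order.
htseq = [("geneA", 1.5), ("geneB", -0.7), ("geneC", 0.0)]
subread = [("geneC", 0.0), ("geneA", 1.5), ("geneB", -0.7)]


def max_abs_diff_by_position(a, b):
    """Compare row i of a with row i of b, ignoring gene IDs."""
    return max(abs(x[1] - y[1]) for x, y in zip(a, b))


def max_abs_diff_by_gene(a, b):
    """Compare values that belong to the same gene ID."""
    lookup_b = dict(b)
    if set(lookup_b) != {gene for gene, _ in a}:
        raise ValueError("the two tables do not contain the same gene IDs")
    return max(abs(value - lookup_b[gene]) for gene, value in a)


positional = max_abs_diff_by_position(htseq, subread)
by_gene = max_abs_diff_by_gene(htseq, subread)

print("positional comparison:", positional)  # non-zero: rows are misaligned
print("comparison by gene ID:", by_gene)     # zero: values agree per gene
```

If the difference disappears once the results are matched by gene ID, the fold changes themselves were never different and only the comparison was misaligned.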
Could anyone confirm these results?
Is this a bug or a feature of DESeq2? Could it be a consequence of the normalisation (relative log expression) that is applied?
Or am I simply doing something wrong?
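On the normalisation question, here is a simplified sketch of median-of-ratios size factors, the idea behind relative-log-expression normalisation. It is not DESeq2's actual code, and the count matrix is invented. It only illustrates that this kind of estimate depends on the set of genes, not on the order of the rows.

```python
import math
from statistics import median


def size_factors(rows):
    """Median-of-ratios size factors.

    rows: list of per-gene count lists, one entry per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(rows[0])
    usable = []
    for row in rows:
        if all(count > 0 for count in row):
            # log of the gene's geometric mean across samples
            log_geo_mean = sum(math.log(count) for count in row) / n_samples
            usable.append((row, log_geo_mean))

    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - log_gm for row, log_gm in usable]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical counts: 5 genes (rows) x 3 samples (columns).
counts = [
    [10, 20, 15],
    [100, 210, 160],
    [5, 0, 7],
    [50, 90, 80],
    [30, 65, 40],
]

sf_original = size_factors(counts)
sf_reordered = size_factors(list(reversed(counts)))

print(sf_original)
print(sf_original == sf_reordered)
```

Because the median is taken over the same set of per-gene ratios in both cases, reordering the genes leaves the size factors unchanged in this sketch.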
Thanks a lot