Hi,
I have a set of DE genes from an RNAseq study in a non-model species. I performed GO enrichment analyses using topGO and GOseq.
topGO is nice because it offers the parent-child algorithm, which takes into account the hierarchical relationship between GO terms. GOseq is nice because it corrects for the length bias in RNAseq.
Of course, I would like to combine both advantages. Ideally, I would just run topGO and then apply a length bias correction to the calculated p values.
I found a paper by Gao et al. (http://bioinformatics.oxfordjournals.../27/5/662.long) in which they essentially describe how to do this. At least that is what I understood, but to be fair, I am neither a bioinformatician nor a statistician.
The problem is, the R scripts mentioned in the paper are no longer available.
So my question is: Is there a simple way to correct for the RNAseq length bias by just correcting the p values returned by topGO's Fisher tests? I imagine simply multiplying each p value by a correction factor based on the average transcript length in the respective GO term.
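To make clear what I mean by length bias, here is a minimal Python sketch with made-up toy data (the function name, the gene lengths, and the DE flags are all hypothetical, not from my study). It sorts genes by transcript length, splits them into equal-sized bins, and reports the fraction of DE genes per bin. As far as I understand, GOseq fits a smooth weighting function to this kind of relationship rather than using bins, so this is only an illustration of the bias, not a correction.

```python
from statistics import mean


def de_fraction_by_length_bin(lengths, is_de, n_bins=4):
    """Return (mean length, fraction DE) for each length-sorted bin of genes."""
    # Pair each gene's length with its DE flag and sort by length.
    genes = sorted(zip(lengths, is_de))
    n = len(genes)
    result = []
    for b in range(n_bins):
        # Slice out the b-th bin; bins are as equal in size as possible.
        chunk = genes[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        bin_mean_length = mean(length for length, _ in chunk)
        bin_de_fraction = sum(flag for _, flag in chunk) / len(chunk)
        result.append((bin_mean_length, bin_de_fraction))
    return result


# Toy data (hypothetical): transcript lengths in bp and 0/1 DE flags.
lengths = [300, 450, 600, 800, 1200, 1500, 2500, 4000]
is_de = [0, 0, 0, 1, 0, 1, 1, 1]

bins = de_fraction_by_length_bin(lengths, is_de, n_bins=2)
for bin_mean_length, bin_de_fraction in bins:
    print(bin_mean_length, bin_de_fraction)
```

If the DE fraction rises with bin length, as in this toy example, GO terms that happen to contain long transcripts would look enriched for that reason alone, which is the effect I would like to remove from the topGO p values.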
I hope some of you might have an idea on that.
Thanks,
Lukas