Hi,
I am using DESeq v1.12.0 to compare count data for ~6,000 exons in ~200 genes. These genes were found to be significantly differentially expressed (FDR level) in a prior gene-centric DESeq analysis of 20 diseased and 20 normal human RNA-Seq samples. To get the exon counts that DESeq needs as input, I used two Python scripts written by Simon Anders: dexseq_prepare_annotation.py and dexseq_count.py.
In DESeq, everything went smoothly until the estimateDispersions step. Initially, I tried the "pooled-CR" method for this function, but got the error shown in the first code block at the end of this post.
Given this message, I tried calling estimateDispersions in two different ways (second code block).
When I use plotDispEsts on the cds.pooledCR.local and cds.pooled objects to plot the mean of normalized counts against the dispersion for all included exons, I obtain the attached plots. For the cds.pooled object, the warning shown in the third code block was displayed.
The 1,472 y values mentioned in the warning are dispersion estimates that are < 0. They correspond to the points at the bottom of the plot obtained with cds.pooledCR.local (there, the values were very small, but > 0).
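As a possible illustration of how dispersion estimates can end up below 0, here is a minimal base R sketch. It assumes that a per-exon estimate is obtained by a moment-based formula of the form (variance - mean) / mean^2 with equal size factors, which is a simplification and not necessarily what estimateDispersions does internally. The counts are simulated, and all variable names are hypothetical.

```r
# Illustration only: simulated counts, not the data from this post.
set.seed(1)
n_exons   <- 1000   # hypothetical number of exons
n_samples <- 20     # hypothetical number of samples in one condition

# Pure Poisson counts, so the true dispersion is 0 for every exon.
mu     <- 50
counts <- matrix(rpois(n_exons * n_samples, lambda = mu), nrow = n_exons)

# Moment-based dispersion estimate per exon (assumes equal size factors):
# alpha = (sample variance - sample mean) / mean^2
m     <- rowMeans(counts)
v     <- apply(counts, 1, var)
alpha <- (v - m) / m^2

# Whenever the sample variance falls below the sample mean, alpha is negative.
# Plotting such values with log = "y" drops them with a warning.
n_nonpositive <- sum(alpha <= 0)
print(n_nonpositive)
print(range(alpha))
```

Under this assumption, a negative value would only mean that the observed variance of an exon is smaller than expected from Poisson noise alone, which can happen by chance with few samples or low counts.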
Sorry for the long introduction. My two questions are:
1) I thought DESeq makes sure the dispersion values are all above 0. How should I interpret the negative dispersion values obtained when the "method" option for estimateDispersions is "pooled"?
2) Which options for the estimateDispersions function are safest to use with exon data?
Thank you for your help!
Alexandra
Code:
> cds <- estimateDispersions(cds, method = "pooled-CR")
Error in parametricDispersionFit(means, disps) :
  Parametric dispersion fit failed. Try a local fit and/or a pooled estimation. (See '?estimateDispersions')
Code:
cds.pooledCR.local <- estimateDispersions(cds, method = "pooled-CR", fitType = "local")
cds.pooled <- estimateDispersions(cds, method = "pooled")
Code:
Warning message:
In xy.coords(x, y, xlabel, ylabel, log) :
  1472 y values <= 0 omitted from logarithmic plot