Welcome!
I am struggling with an error I get when calling plotDEXSeq for a giant gene (titin). My data are rich (high sequencing depth, multiple biological replicates), and I have not had this problem when calling plotDEXSeq on other genes.
The exact message:
> plotDEXSeq( allResults, "ENSMUSG00000051747", fitExpToVar="strain", norCounts=TRUE, legend=TRUE, cex.axis=1.2, cex=1.3, lwd=2 )
NULL
Warning message:
In plotDEXSeq(allResults, "ENSMUSG00000051747", fitExpToVar = "strain", :
glm fit failed for gene ENSMUSG00000051747
When do such problems typically arise? Is it a matter of the size of the gene? Or is there too much dispersion in the data for the exons of this gene?
When I look at the head of the results for this gene, I see this:
> allResults[grep("ENSMUSG00000051747",row.names(allResults)),]
LRT p-value: full vs reduced
DataFrame with 391 rows and 13 columns
groupID featureID exonBaseMean dispersion stat pvalue padj KO WT log2fold_WT_KO genomicData
<character> <character> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <GRanges>
ENSMUSG00000051747:E001 ENSMUSG00000051747 E001 20.08333 0.0017529976 0.912106068 3.395562e-01 NA NA NA NA 2:-:[76703980, 76703983]
ENSMUSG00000051747:E002 ENSMUSG00000051747 E002 57104.16667 0.0017028615 7.378929235 6.599245e-03 0.095686201 NA NA NA 2:-:[76703984, 76705030]
ENSMUSG00000051747:E003 ENSMUSG00000051747 E003 26387.08333 0.0006423373 19.660709503 9.248351e-06 0.005462552 NA NA NA 2:-:[76705031, 76705329]
ENSMUSG00000051747:E004 ENSMUSG00000051747 E004 19630.75000 0.0012588978 0.124517839 7.241853e-01 0.874623888 NA NA NA 2:-:[76705670, 76705972]
ENSMUSG00000051747:E005 ENSMUSG00000051747 E005 12821.16667 0.0013093168 0.005632633 9.401742e-01 0.980093291 NA NA NA 2:-:[76706440, 76706593]
Why are there NAs for KO and WT? Could this be the reason the GLM fit fails?
I'd be very thankful for any suggestions.