Hi, I am trying to find splicing differences between two tumor and two normal samples (so only one condition factor, with levels N and T). I prepared the data following your vignette and everything went well, but in the dxr data frame I get after the last step (below), almost half of the rows have NA for the exon usage coefficients. When I look at the corresponding genes, the exons are well covered (one example below). So my question is: why are there so many NAs? Could anyone suggest possible explanations that I could check?
Many thanks.
######
My workflow:
dxd = DEXSeqDataSetFromHTSeq(countFiles, sampleData=sampleTable, design= ~ sample + exon + condition:exon, flattenedfile=flattenedFile)
dxd = estimateSizeFactors(dxd)
dxd = estimateDispersions(dxd)
dxd = testForDEU(dxd)
dxd = estimateExonFoldChanges(dxd, fitExpToVar="condition")
dxr = DEXSeqResults(dxd)
example:
head(dxr)
LRT p-value: full vs reduced
DataFrame with 6 rows and 13 columns
groupID featureID exonBaseMean dispersion stat pvalue padj R S log2fold_R_S genomicData countData transcripts
<character> <character> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <GRanges> <matrix> <list>
ENSG00000000003:E001 ENSG00000000003 E001 259.50 0.005893988 0.7015350362 0.4022684 1 NA NA NA chrX:-:[99883667, 99884983] 199 201 471 ... ########
ENSG00000000003:E002 ENSG00000000003 E002 112.00 0.001052854 1.7011996938 0.1921312 1 NA NA NA chrX:-:[99885756, 99885863] 98 96 183 ... ########
ENSG00000000003:E003 ENSG00000000003 E003 92.75 0.005572201 0.0007320773 0.9784143 1 NA NA NA chrX:-:[99887482, 99887537] 97 76 145 ... ########
ENSG00000000003:E004 ENSG00000000003 E004 71.50 0.010739013 0.0385784467 0.8442862 1 NA NA NA chrX:-:[99887538, 99887565] 76 64 112 ... ########
ENSG00000000003:E005 ENSG00000000003 E005 81.00 0.007045578 0.7014820356 0.4022862 1 NA NA NA chrX:-:[99888402, 99888438] 90 61 131 ... ########
ENSG00000000003:E006 ENSG00000000003 E006 111.75 0.002231303 1.9566454114 0.1618725 1 NA NA NA chrX:-:[99888439, 99888536] 117 80 192 ... ########
exon counts:
head(counts(dxd))
                     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
ENSG00000000003:E001  199  201  471  167  855  737 1587  511
ENSG00000000003:E002   98   96  183   71  956  842 1875  607
ENSG00000000003:E003   97   76  145   53  957  862 1913  625
ENSG00000000003:E004   76   64  112   34  978  874 1946  644
ENSG00000000003:E005   90   61  131   42  964  877 1927  636
ENSG00000000003:E006  117   80  192   58  937  858 1866  620