I just started learning RNA-seq data analysis, following Nature Protocols 7, 562–578 (2012).
I ran CuffDiff after TopHat, but the genes with the most reads are reported with FPKM 0.
I tried all the normalization options in CuffDiff, as well as --max-bundle-frags 2000000, but the genes with the largest number of mapped reads still get FPKM 0. Is there some upper bound, similar to the way microarray analysis excludes saturated spots?
I have added an example below.
In sample A the gene has FPKM 0, presumably because it has the largest number of reads, yet the same gene in sample B has FPKM 3789. Is that because its read count falls within the confidence zone in sample B? Please help me get FPKM values for these highly abundant genes. How can I get the FPKM for PM_0405A60 in the example below?
Thanks,
Both tables below are sorted by number of reads in descending order.
I counted the reads with htseq-count.
Sample A
Gene | # reads | FPKM
PYYM_0405060 | 1447079 | 0
PYYM_1351050 | 675926 | 0
PYYM_1007060 | 559162 | 0
PYYM_1209040 | 421148 | 29315.7
Sample B
Gene | # reads | FPKM
PYYM_0501060 | 1633262 | 0
PYYM_1351050 | 757339 | 0
PYYM_0405060 | 552553 | 3789.85
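For a rough sanity check of what the FPKM should be, the standard definition (fragments per kilobase of feature per million mapped fragments) can be applied by hand to the htseq-count numbers. The sketch below is only an approximation under stated assumptions: the feature length is taken as the full locus span PyYM_04_v1:214849-216007 (treated as 1-based inclusive, ignoring any introns), the htseq-count value is used as the fragment count, and `total_mapped` is a hypothetical placeholder because the library size is not given in this post. It will not reproduce CuffDiff's own model-based estimate.

```python
def fpkm(fragments, length_bp, total_mapped):
    """Fragments per kilobase of feature per million mapped fragments."""
    if length_bp <= 0 or total_mapped <= 0:
        raise ValueError("length_bp and total_mapped must be positive")
    # fragments / (length_bp / 1e3) / (total_mapped / 1e6)
    return fragments * 1e9 / (length_bp * total_mapped)


# Locus span from the post; assumed 1-based inclusive and intron-free.
length_bp = 216007 - 214849 + 1

# htseq-count value for the gene in sample A (from the table above).
count_sample_a = 1447079

# HYPOTHETICAL library size: replace with the real number of mapped
# fragments for the sample (not stated in this post).
total_mapped = 20_000_000

print(f"length = {length_bp} bp")
print(f"approx. FPKM = {fpkm(count_sample_a, length_bp, total_mapped):.1f}")
```

Any realistic library size gives a value far above 0 for this gene, which suggests that the 0 is not a genuine expression estimate.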
CuffDiff --verbose printed the following; the locus of PM_0405A60 is PyYM_04_v1:214849-216007
Inspecting bundle PyYM_04_v1:206968-210685 with 5956 reads
Inspecting bundle PyYM_04_v1:211950-212458 with 368 reads
Inspecting bundle PyYM_04_v1:213051-213787 with 2864 reads
Inspecting bundle PyYM_04_v1:214849-216007 with 943663 reads
Inspecting bundle PyYM_04_v1:218737-223186 with 746 reads
Inspecting bundle PyYM_04_v1:223748-224267 with 889 reads
Inspecting bundle PyYM_04_v1:225620-226957 with 399 reads
Inspecting bundle PyYM_04_v1:227958-238759 with 10900 reads
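To find which bundles are unusually large without reading the log by eye, the verbose output can be parsed with a short script. This is a minimal sketch that assumes every relevant line has exactly the form shown above ("Inspecting bundle <contig>:<start>-<end> with <N> reads"); the function name `parse_bundles` is just an illustrative helper, not part of Cufflinks.

```python
import re

# Matches lines like:
#   Inspecting bundle PyYM_04_v1:214849-216007 with 943663 reads
BUNDLE_RE = re.compile(r"Inspecting bundle (\S+):(\d+)-(\d+) with (\d+) reads")


def parse_bundles(log_text):
    """Return a list of (contig, start, end, reads) tuples found in the log."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in BUNDLE_RE.finditer(log_text)
    ]


log_text = """\
Inspecting bundle PyYM_04_v1:206968-210685 with 5956 reads
Inspecting bundle PyYM_04_v1:211950-212458 with 368 reads
Inspecting bundle PyYM_04_v1:213051-213787 with 2864 reads
Inspecting bundle PyYM_04_v1:214849-216007 with 943663 reads
Inspecting bundle PyYM_04_v1:218737-223186 with 746 reads
Inspecting bundle PyYM_04_v1:223748-224267 with 889 reads
Inspecting bundle PyYM_04_v1:225620-226957 with 399 reads
Inspecting bundle PyYM_04_v1:227958-238759 with 10900 reads
"""

bundles = parse_bundles(log_text)

# Sort by read count, largest first, to see which loci dominate.
for contig, start, end, reads in sorted(bundles, key=lambda b: b[3], reverse=True):
    print(f"{contig}:{start}-{end}\t{reads}")

largest = max(bundles, key=lambda b: b[3])
```

On this excerpt the bundle covering PyYM_04_v1:214849-216007 holds far more reads than any of its neighbours.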