Dear All
I am new to RNA-seq data analysis. I am currently analyzing human NGS samples (single-end reads) in a disease-versus-control comparison. I have 10 BAM files (biological replicates) from TopHat, each about 4 GB.
I followed the tophat-cufflinks-cuffcompare-cuffdiff pipeline (using the hg19 reference) to find genes that are differentially expressed between the experimental and control conditions.
I had no problem getting assembled results from Cufflinks for each sample, but I am stuck at the final cuffdiff step. The problem appears to be insufficient memory, as I keep getting bad-alloc errors in the shell.
So I wonder whether I can simply take the FPKM values from each sample's Cufflinks genes.fpkm_tracking file as the gene expression values and use traditional statistical methods (e.g. multiple t-tests, SAM analysis) to identify differentially expressed genes between the two groups?
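To make the idea concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under several assumptions: the function names (`read_fpkm`, `welch_t`, `compare_groups`) are made up for this post, the files are read by their `tracking_id` and `FPKM` columns, the IDs are assumed to be comparable across samples (which may not hold if each sample was assembled separately without a shared annotation), and the log2(FPKM + 1) transform is my own choice. It only computes Welch's t statistic and degrees of freedom per gene; p-values and multiple-testing correction would still have to come from a statistics package.

```python
import csv
import math
from statistics import mean, variance


def read_fpkm(path):
    """Return {tracking_id: FPKM} from one Cufflinks genes.fpkm_tracking file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["tracking_id"]: float(row["FPKM"]) for row in reader}


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two groups."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    if va + vb == 0:
        # No variation in either group: the statistic is undefined.
        return float("nan"), float("nan")
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def compare_groups(disease_tables, control_tables):
    """Per-gene Welch t on log2(FPKM + 1), for genes present in every sample."""
    tables = disease_tables + control_tables
    shared = set(tables[0]).intersection(*tables[1:])
    results = {}
    for gene in sorted(shared):
        d = [math.log2(t[gene] + 1) for t in disease_tables]
        c = [math.log2(t[gene] + 1) for t in control_tables]
        results[gene] = welch_t(d, c)
    return results
```

Each table passed to `compare_groups` would be the result of `read_fpkm` on one sample's file, with the disease and control samples given as two separate lists.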
Thanks in advance