Dear all,
I want to find significantly differentially expressed (SDE) known genes between two conditions (Tumor vs Normal), with 8 replicates per condition.
The pipeline is as follows:
1. tophat, run 16 times (once per sample):
Code:
tophat2 -p 12 -G /path/to/Ensembl/Homo_sapiens.GRCh37.69.gtf -o $outcache /path/to/hg19/bowtie2index/Ensembl/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome /path/to/_1.merged.fastq /path/to/_2.merged.fastq
2. cuffdiff (v2.0.2), run once:
Code:
cuffdiff -p 12 -L ADD-Tumor,ADD-Normal -o /path/to/AllADD-Tumor-vs-Normal -b /path/to/hg19/bowtie2index/Ensembl/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome.fa /path/to/Ensembl/Homo_sapiens.GRCh37.69.gtf /tumor/1/accepted_hits.bam,..,/tumor/n/accepted_hits.bam /normal/1/accepted_hits.bam,..,/normal/n/accepted_hits.bam
The cuffdiff results show no SDE genes at all (no q-value < 5%; every q-value equals 1), which is very surprising biologically.
As for the p-values, only 2 genes have p < 0.01 and 16 genes have p < 0.05, which is very few.
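To put these counts in perspective, here is a minimal sketch, assuming the q-values come from a Benjamini-Hochberg style adjustment (my understanding, not verified against the cuffdiff source). The total of 10,000 tested genes (`n_tested`) and the individual p-values are hypothetical placeholders, chosen only to match the counts quoted above (2 genes below 0.01, 16 below 0.05).

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical inputs: the real number of tested genes is not given in this post.
n_tested = 10000
p_values = [0.005] * 2 + [0.03] * 14 + [0.9] * (n_tested - 16)

q_values = benjamini_hochberg(p_values)
expected_by_chance = 0.05 * n_tested  # genes expected below p = 0.05 under a uniform null

print("smallest q-value:", min(q_values))
print("genes expected below p = 0.05 by chance:", expected_by_chance)
```

Under these assumptions no gene gets anywhere near q < 0.05. With thousands of tests, far more than 16 genes would fall below p = 0.05 by chance alone, which would suggest that the p-values themselves are skewed toward 1 rather than the multiple-testing correction being too strict.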
Looking at a "positive" control, the SPP1 gene, here are its numbers:
- in gene_exp.diff:
Code:
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
ENSG00000118785 ENSG00000118785 SPP1 4:88896818-88904562 ADD-Tumor ADD-Normal OK 124.125 1.81571 -6.09512 0.158608 0.873978 1 no
Note that the status is OK and the log2 fold change is large (-6.1), but the p-value and q-value are poor.
- in genes.read_group_tracking:
Code:
tracking_id      condition   replicate  raw_frags  internal_scaled_frags  external_scaled_frags  FPKM      effective_length  status
ENSG00000118785  ADD-Tumor   1          30290      25617.2                25752.3                314.011   -                 OK
ENSG00000118785  ADD-Tumor   0          5660       5864.01                5894.94                72.4812   -                 OK
ENSG00000118785  ADD-Tumor   2          3096       3420.64                3438.68                42.9502   -                 OK
ENSG00000118785  ADD-Tumor   3          5706       2969.05                2984.71                36.866    -                 OK
ENSG00000118785  ADD-Tumor   4          32526      30257                  30416.6                369.57    -                 OK
ENSG00000118785  ADD-Tumor   5          594        803.644                807.883                10.3996   -                 OK
ENSG00000118785  ADD-Tumor   6          6095       7016.68                7053.69                87.6907   -                 OK
ENSG00000118785  ADD-Tumor   7          6180       5622.82                5652.48                68.0664   -                 OK
ENSG00000118785  ADD-Normal  1          165        94.5041                91.8415                1.10194   -                 OK
ENSG00000118785  ADD-Normal  0          12         18.5537                18.031                 0.277918  -                 OK
ENSG00000118785  ADD-Normal  2          33         32.5715                31.6538                0.489188  -                 OK
ENSG00000118785  ADD-Normal  3          155        182.537                177.394                2.15486   -                 OK
ENSG00000118785  ADD-Normal  4          243        230.643                224.144                2.70818   -                 OK
ENSG00000118785  ADD-Normal  5          117        142.523                138.508                1.93188   -                 OK
ENSG00000118785  ADD-Normal  6          712        466.519                453.375                5.51449   -                 OK
ENSG00000118785  ADD-Normal  7          69         63.7359                61.9401                0.743175  -                 OK
Note that, visually, there is a clear difference between the ADD-Tumor and ADD-Normal groups, e.g. in FPKM or raw fragment counts.
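As a rough sanity check outside cuffdiff, the per-replicate FPKM values from the genes.read_group_tracking table above can be compared directly. The sketch below is not cuffdiff's test: it swaps in an exact two-sided Wilcoxon rank-sum test, computed by enumerating all ways of splitting the 16 replicates into two groups of 8, and it assumes there are no tied values. The function name `exact_rank_sum_p` is my own.

```python
from itertools import combinations
from math import log2

# FPKM values for SPP1 (ENSG00000118785), copied from genes.read_group_tracking.
tumor_fpkm = [314.011, 72.4812, 42.9502, 36.866, 369.57, 10.3996, 87.6907, 68.0664]
normal_fpkm = [1.10194, 0.277918, 0.489188, 2.15486, 2.70818, 1.93188, 5.51449, 0.743175]


def exact_rank_sum_p(group_a, group_b):
    """Two-sided exact Wilcoxon rank-sum p-value; assumes no tied values."""
    pooled = sorted(group_a + group_b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n_a, n_total = len(group_a), len(pooled)
    expected = n_a * (n_total + 1) / 2  # mean rank sum of group_a under the null
    observed_dev = abs(sum(rank[v] for v in group_a) - expected)
    hits = total = 0
    # Enumerate every possible assignment of ranks to group_a.
    for ranks in combinations(range(1, n_total + 1), n_a):
        total += 1
        if abs(sum(ranks) - expected) >= observed_dev:
            hits += 1
    return hits / total


mean_tumor = sum(tumor_fpkm) / len(tumor_fpkm)
mean_normal = sum(normal_fpkm) / len(normal_fpkm)
log2_fc = log2(mean_normal / mean_tumor)  # same direction as cuffdiff: sample_2 / sample_1
p_exact = exact_rank_sum_p(tumor_fpkm, normal_fpkm)

print(f"log2(Normal/Tumor) of mean FPKM: {log2_fc:.2f}")
print(f"exact rank-sum p-value: {p_exact:.6f}")
```

Every tumor replicate has a higher FPKM than every normal replicate, so this rank-based test gives p = 2/12870 (about 0.00016). That says nothing about how cuffdiff models variance; it only confirms that the replicates themselves separate cleanly.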
Why is the SPP1 gene not called significant, by either p-value or q-value?
How can the parameters be tuned to make the test less stringent?
I'm thinking of adding:
-u/--multi-read-correct
-c/--min-alignment-count 10
-F/--min-outlier-p 0.05
-N/--upper-quartile-norm (instead of --geometric-norm)
--emit-count-tables (for tracking reasons)
--max-frag-multihits 10
Many thanks for your help!
Happy new year!!