New differential testing of cuffdiff/cufflinks since 1.3.0

aguffanti replied

09-12-2013, 05:46 AM
Cufflinks variance, gene expression level and differential expression

Hello! Which kind of sequencing was that: fragment or paired-end? It makes a big difference.

You have to look at the *.count_tracking files and examine the count variance values.

There are instances (especially with fragment sequencing) in which the count variance is far greater than the counts. I would actually like to know from Cole whether this is somewhat expected.

Here is an example (see the values for q2), with the associated gene expression reported below. It looks like an FPKM of 0 is calculated for q2 because of the enormous variance. Am I right?

tracking_id q1_count q1_count_variance q1_count_uncertainty_var q1_count_dispersion_var q1_status q2_count q2_count_variance q2_count_uncertainty_var q2_count_dispersion_var q2_status

GAPDH 308519 40753000 0 308519 OK 207559 10265500000 0 10265500000 OK

test_id gene locus status MCF7 MCFS MCF7+MCFS log2(fold_change) test_stat p_value q_value significant

NM_001256799 GAPDH chr12:6643584-6647537 OK 11.25 0.00 11.25 =-inf nan 0.00005 0.0142079 yes
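
To make the comparison between count and count variance concrete, here is a minimal Python sketch that uses only the two GAPDH counts and variances quoted above. The helper name `dispersion_summary` is hypothetical (it is not part of Cufflinks), and reading a variance/count ratio near 1 as "Poisson-like" is a general rule of thumb, not a statement about how Cuffdiff models the data.

```python
import math

# Counts and count variances copied from the GAPDH values quoted above.
gapdh = {
    "q1": {"count": 308519, "count_variance": 40753000},
    "q2": {"count": 207559, "count_variance": 10265500000},
}


def dispersion_summary(count, variance):
    """Return (variance / count, sqrt(variance) / count) for one condition.

    Hypothetical helper for illustration only. The first value is about 1 for
    Poisson-like counts; the second is the standard deviation as a fraction
    of the count.
    """
    if count <= 0:
        return (math.nan, math.nan)
    return (variance / count, math.sqrt(variance) / count)


summary = {
    cond: dispersion_summary(v["count"], v["count_variance"])
    for cond, v in gapdh.items()
}

for cond, (ratio, cv) in summary.items():
    print(f"{cond}: variance/count = {ratio:.1f}, std.dev./count = {cv:.3f}")
```

With these numbers the standard deviation is roughly 2% of the count for q1 but close to half of the count for q2, which is one way to quantify how much noisier the q2 estimate is.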

I am running the following software versions on color-space reads (so I have to stick with the old bowtie):

bowtie version 1.0.0
TopHat v2.0.9
cufflinks v2.1.1

Thanks in advance for any feedback,

Alessandro
jp. replied

07-24-2013, 09:40 PM
Hi,
Could someone please check my attached cuffdiff results? I have a problem with large error bars and low FPKM values. Is there something wrong with the sequencing, or is it a Cufflinks problem?
Second, only a few genes have an FPKM of 0 in one of the samples (not in both); for example, control 3840.28 FPKM and treated 0 FPKM, with Log2_fold_change=Inf (attached file: 0_FPKM.png). The expression bar plot indicates a high fold change. Can I ignore this gene, even though it is reported as significant?
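
As a rough illustration of why a zero FPKM in one sample produces an infinite log2 fold change, here is a small Python sketch. It assumes the fold change is computed as log2(value_2 / value_1); the function name `log2_fold_change` and the explicit handling of zeros are choices made for this example, not a description of Cuffdiff's internals.

```python
import math


def log2_fold_change(value_1, value_2):
    """Return log2(value_2 / value_1), handling zeros explicitly.

    Assumed convention for this sketch: value_1 is the denominator.
    """
    if value_1 == 0 and value_2 == 0:
        return math.nan   # 0/0 is undefined
    if value_1 == 0:
        return math.inf   # expression appears only in the second sample
    if value_2 == 0:
        return -math.inf  # expression appears only in the first sample
    return math.log2(value_2 / value_1)


# FPKM values quoted in the post: 3840.28 in one sample, 0 in the other.
print(log2_fold_change(3840.28, 0.0))
print(log2_fold_change(0.0, 3840.28))
```

The sign of the infinity depends only on which sample is treated as the denominator, so an Inf or -inf value says that one estimate is exactly zero, not how large the real difference is.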

Thank you very much

Last edited by jp.; 07-24-2013, 10:49 PM. Reason: adding info
jwaage replied

08-27-2012, 02:47 AM
Hi all, I too have some similar problems. I'm running the newest cuffdiff (2.0.2) on a two-sample, two-replicate Illumina run using

cuffdiff --mask-file xxx.gtf --upper-quartile-norm

genes.diff and isoforms.diff are fine (although not many are called DE), but splicing.diff and promoters.diff only have NOTEST, LOWDATA or FAIL.

Any suggestions? I've tried running with -c 1, but to no avail.
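
One quick way to see how the tests are distributed across OK, NOTEST, LOWDATA and FAIL is to tally the status column of each .diff file. The sketch below assumes a tab-separated file with a `status` column (as in the headers quoted in this thread); the function name `tally_status` and the three example rows are made up purely for illustration.

```python
import csv
import io
from collections import Counter


def tally_status(handle):
    """Count how often each value occurs in the 'status' column."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["status"] for row in reader)


# Hypothetical three-row excerpt, invented for this example only.
example = (
    "test_id\tstatus\n"
    "TSS1\tNOTEST\n"
    "TSS2\tLOWDATA\n"
    "TSS3\tNOTEST\n"
)

counts = tally_status(io.StringIO(example))
for status, n in counts.most_common():
    print(status, n)
```

For a real file, `tally_status(open("splicing.diff"))` would be the equivalent call, assuming the file is in the working directory.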

All the best,
Johannes Waage
Uni. of Copenhagen
billstevens replied

08-09-2012, 04:40 PM
Originally posted by MeixiaZhao

Hello,
I used the latest version, Cufflinks 2.0.2. Compared with my previous results from 1.3.0, the results are unbelievable:
Gene level: I got 340 significant genes with cuffdiff2 and 2692 with cuffdiff1.3;
Isoform level: I got 272 significant isoforms with cuffdiff2 and 11380 with cuffdiff1.3;
Each of my samples is a single replicate.
Is there any way to judge the results? Or which version should I use for later analysis?
Thanks a lot!

I've spoken to Cole about this extensively, and he says the new version is just much more accurate. It is an absolute shock going from one to the other.
MeixiaZhao replied

07-31-2012, 03:48 PM
Hello,
I used the latest version, Cufflinks 2.0.2. Compared with my previous results from 1.3.0, the results are unbelievable:
Gene level: I got 340 significant genes with cuffdiff2 and 2692 with cuffdiff1.3;
Isoform level: I got 272 significant isoforms with cuffdiff2 and 11380 with cuffdiff1.3;
Each of my samples is a single replicate.
Is there any way to judge the results? Or which version should I use for later analysis?
Thanks a lot!
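
One way to judge two runs against each other is to compare which IDs each run calls significant, not only how many. The sketch below assumes tab-separated output with `test_id` and `significant` columns, where significant rows are marked `yes` (as in the example row quoted elsewhere in this thread). The function name `significant_ids` and the gene names are hypothetical.

```python
import csv
import io


def significant_ids(handle):
    """Return the set of test_id values whose 'significant' column is 'yes'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["test_id"] for row in reader if row["significant"] == "yes"}


# Hypothetical excerpts standing in for the output of the two versions.
old_run = "test_id\tsignificant\ngeneA\tyes\ngeneB\tyes\ngeneC\tno\n"
new_run = "test_id\tsignificant\ngeneA\tyes\ngeneB\tno\ngeneC\tno\n"

sig_old = significant_ids(io.StringIO(old_run))
sig_new = significant_ids(io.StringIO(new_run))

shared = sig_old & sig_new    # called significant by both versions
only_old = sig_old - sig_new  # lost in the newer version
only_new = sig_new - sig_old  # gained in the newer version

print(len(shared), len(only_old), len(only_new))
```

If nearly all of the newer calls fall inside the older set, the newer version is behaving as a stricter filter on the same signal; a large `only_new` set would suggest the two versions disagree more fundamentally.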
billstevens replied

07-02-2012, 01:50 PM
Originally posted by Cole Trapnell

That's excellent - thanks to you and others in this thread for the helpful feedback. The official release of 2.0.1 is imminent.

Hi Cole,

Thanks for all the notes. However, I'm getting the same issue here. I have 46,000 reads that test, and 40,000 that have NOTEST. I am using the -b option, but not -c 0, and not the multi-read correction or frag bias. I'm using Cufflinks 2.0.1 and TopHat 1.4. Was the shortcoming with regard to -b addressed in the newest Cufflinks?
sdodson replied

06-19-2012, 11:02 AM
Hi,

I'm a novice in RNA-seq analysis, and I have a question regarding Cuffdiff...

We generated our RNA-seq data using the Illumina HiSeq2000, and have been following the data analysis protocol for TopHat, Cufflinks, etc. from Nature Protocols. We are currently using TopHat v1.4.1 and Cufflinks v1.3.0. Our Cuffdiff output shows 10371 of 27080 genes are NOTEST. In an effort to decrease this value, can we re-run our data through the newest version of Cuffdiff without first re-running it through the newest version of Cufflinks? Or is there something else we can change to minimize the number of NOTEST genes (such as using -c 0)?

Thanks!
Cole Trapnell replied

06-15-2012, 10:22 AM
Originally posted by pinin4fjords

I also confirm that the numbers produced using the new version at #32 look more sensible (using -b and min-outlier-p).

That's excellent - thanks to you and others in this thread for the helpful feedback. The official release of 2.0.1 is imminent.
Cole Trapnell replied

06-15-2012, 10:21 AM
Originally posted by gesdy

Hi Cole,
I used the cuffdiff version you posted, and now it works.
I repeated the same analysis with the new version and with 1.3 (cuffdiff).
I got 43 genes called significant with cuffdiff2 and 940 with cuffdiff1.3.
I have just one replicate for each condition.
Do you think that's normal?
Here is my command line:

cuffdiff -b genome.fa -p 10 -u genes.gtf

Any suggestions?
Thank you very much!
Mat

Hi Mat,

That sounds fair to me - Cuffdiff 2 is far more conservative at the gene level than was 1.3 (as detailed on the site), and with a single replicate per sample, even more so. This isn't too surprising to me.
pinin4fjords replied

06-15-2012, 01:18 AM
I also confirm that the numbers produced using the new version at #32 look more sensible (using -b and min-outlier-p).
gesdy replied

06-13-2012, 11:50 AM
Hi Cole,
I used the cuffdiff version you posted, and now it works.
I repeated the same analysis with the new version and with 1.3 (cuffdiff).
I got 43 genes called significant with cuffdiff2 and 940 with cuffdiff1.3.
I have just one replicate for each condition.
Do you think that's normal?
Here is my command line:

cuffdiff -b genome.fa -p 10 -u genes.gtf

Any suggestions?
Thank you very much!
Mat
Cole Trapnell replied

06-12-2012, 06:30 AM
Hmm, we see that sometimes when one or more of the libraries have extremely low sequencing yield, for example. Can you show me a csDendro plot with replicates=T? It's important to verify that the replicates for each sample really segregate together. It might also be helpful to see a dispersion plot for each condition. If you're worried about revealing sample names you can email it to [email protected].
glados replied

06-12-2012, 06:13 AM
Cole, can you please take a look at this boxplot of one of the genes. I don't think the error bars are due to variances in my samples.
Attached Files

expressionbarplot1.pdf (12.3 KB, 116 views)
NicoBxl replied

06-11-2012, 04:39 AM
I have exactly the same problem with the -b parameter.
pinin4fjords replied

06-11-2012, 04:33 AM
Originally posted by Cole Trapnell

We are testing a pre-release that fixes the issues reported so far, at least in our hands. Can you try:

http://cufflinks.cbcb.umd.edu/downloads/cufflinks-2.0.1.Linux_x86_64.tar.gz

(No mac build or source yet - we were only able to reproduce this with our pre-compiled linux binary anyways)

Okay, will do. Do you still recommend the min-outlier-p option? Have corrections been made to Cuffdiff only, or is it necessary to re-run the whole Cufflinks sequence?