  • cuffdiff 2.0: figure this one out...

    I've been running the new release of cuffdiff for the last day or two to see how it does. It certainly produces more conservative gene lists, and it seems to be better at throwing out genes that might be DE on average across replicates but have expression that seems erratic. I love the new output files that allow you to "dig in" a little more than before to see what's going on.

    So that's what I'm doing and I found something strange.

    I've run a 3 vs 3 wt vs mutant test with cuffdiff. Viewing the scatter plot of condition 2 vs condition 1, the overall plot looks good. There are several genes that stand out pretty far from the main body of the scatter, with FPKMs > 100 in one condition or the other, but they are not called significant. So I wanted to take a look at those. The first one was "Snora64" (I'm working with mouse).

    I tracked it down in several of the cuffdiff outputs:

    gene_exp.diff:
    Code:
    XLOC_009930	XLOC_009930	Snora64	chr17:24857007-24858872	wt	ko	OK	52.6625	172.027	1.70779	-0.171531	0.863806	0.999999	no
    isoform_exp.diff:
    Code:
    uc008ayc.1	XLOC_009930	Snora64	chr17:24857007-24858872	wt	ko	OK	52.6625	172.027	1.70779	-0.171531	0.863806	0.999999	no
    tss_group_exp.diff:
    Code:
    TSS13588	XLOC_009930	Snora64	chr17:24857007-24858872	wt	ko	OK	52.6625	172.027	1.70779	-0.171531	0.863806	0.999999	no
    Each of these files reports that this gene has an FPKM of 52.6625 in the wt condition and 172.027 in the ko condition. I figured there must be some wacky variance in this gene across the replicates in each condition, so I checked the new file genes.read_groups_tracking to see how the gene is expressed and how many reads it received in each replicate. This is where I get a little confused.
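As a quick sanity check on the rows above, here is a minimal Python sketch that recomputes the log2 fold change from the two reported FPKMs. It assumes the tenth column of the .diff rows is log2(ko / wt); the variable names are my own and not anything cuffdiff defines.

```python
import math

# FPKM values copied from the Snora64 row in gene_exp.diff.
wt_fpkm = 52.6625
ko_fpkm = 172.027

# Assumption: the tenth column of the row is log2(value_2 / value_1),
# i.e. log2(ko / wt) for this comparison.
reported_log2_fc = 1.70779
recomputed_log2_fc = math.log2(ko_fpkm / wt_fpkm)

print(f"recomputed: {recomputed_log2_fc:.5f}  reported: {reported_log2_fc}")
```

If the two numbers agree, the fold change in the .diff files is at least internally consistent with the FPKMs printed next to it, so whatever is odd here is in the FPKM estimates themselves.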

    genes.read_groups_tracking:
    Code:
    XLOC_009930	ko	0	2.00145	1.45042	1.50285	0.822565	-	OK	Snora64
    XLOC_009930	ko	1	0	0	0	0	-	OK	Snora64
    XLOC_009930	ko	2	1.00112	0.95579	0.99034	0.542051	-	OK	Snora64
    XLOC_009930	wt	0	0	0	0	0	-	OK	Snora64
    XLOC_009930	wt	1	1	1.14625	1.11053	0.607837	-	OK	Snora64
    XLOC_009930	wt	2	0	0	0	0	-	OK	Snora64
    This file shows the FPKM of this gene for each of the replicates in both conditions in the 7th column (one left of the '-' column). Those FPKM values are all less than 1. So why is the expression reported to be so high in every other file? Other genes with comparable expression in gene_exp.diff or genes.fpkm_tracking, when looked up in this file, match up pretty well. Based on what the coverage looks like across this gene's locus, I'd believe the values in this file over what is reported in the other files.
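To put a number on the mismatch, here is a rough sketch that averages the 7th-column FPKMs per condition from the six rows above. A plain mean over replicates is my own simplification and is not claimed to be how cuffdiff derives its condition-level FPKM; the function name and the column index are assumptions based on the excerpt as pasted.

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the genes.read_groups_tracking excerpt (tab-separated).
ROWS = """\
XLOC_009930\tko\t0\t2.00145\t1.45042\t1.50285\t0.822565\t-\tOK\tSnora64
XLOC_009930\tko\t1\t0\t0\t0\t0\t-\tOK\tSnora64
XLOC_009930\tko\t2\t1.00112\t0.95579\t0.99034\t0.542051\t-\tOK\tSnora64
XLOC_009930\twt\t0\t0\t0\t0\t0\t-\tOK\tSnora64
XLOC_009930\twt\t1\t1\t1.14625\t1.11053\t0.607837\t-\tOK\tSnora64
XLOC_009930\twt\t2\t0\t0\t0\t0\t-\tOK\tSnora64
"""

CONDITION_COL = 1  # second column: condition label (wt / ko)
FPKM_COL = 6       # seventh column, zero-based index 6: per-replicate FPKM


def mean_fpkm_by_condition(text):
    """Average the per-replicate FPKM values for each condition."""
    per_condition = defaultdict(list)
    for line in text.strip().splitlines():
        fields = line.split("\t")
        per_condition[fields[CONDITION_COL]].append(float(fields[FPKM_COL]))
    return {cond: mean(values) for cond, values in per_condition.items()}


means = mean_fpkm_by_condition(ROWS)
for cond, value in sorted(means.items()):
    print(f"{cond}: mean replicate FPKM = {value:.4f}")
```

Both per-condition means come out well under 1, against 52.6625 and 172.027 in the .diff files.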

    There are actually several of these mismatched expression values in my output, and most of them are the same type of gene (short, single-exon genes in intergenic regions of other genes). It's distracting to get odd expression values in the output like this. So why does it happen, and why is the more "correct" expression reported in genes.read_groups_tracking while a different, much higher expression level is reported in the differential expression output files? I'm sure nobody can answer that one except Cole, but I think it's good to report odd findings like this.
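For anyone who wants to hunt for the same pattern in their own output, here is a hedged sketch of how genes like this could be flagged once both sets of values are loaded. The function name, the 10-fold cutoff, and the pseudocount are arbitrary choices of mine, and the "GeneA" entry is a made-up control for illustration, not a value from my data.

```python
def flag_mismatches(diff_fpkm, replicate_mean_fpkm, max_ratio=10.0, pseudocount=0.01):
    """Return entries whose .diff FPKM is more than max_ratio-fold away
    from the mean per-replicate FPKM (in either direction)."""
    flagged = []
    for key, reported in diff_fpkm.items():
        observed = replicate_mean_fpkm.get(key)
        if observed is None:
            continue  # no per-replicate values available for this entry
        # The pseudocount avoids division by zero for unexpressed genes.
        ratio = (reported + pseudocount) / (observed + pseudocount)
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            flagged.append((key, reported, observed))
    return flagged


# Keys are (gene, condition). Snora64 values come from the files quoted above;
# GeneA is a hypothetical, made-up control where the two sources agree.
diff_fpkm = {
    ("Snora64", "wt"): 52.6625,
    ("Snora64", "ko"): 172.027,
    ("GeneA", "wt"): 10.0,
}
replicate_mean_fpkm = {
    ("Snora64", "wt"): 0.607837 / 3,
    ("Snora64", "ko"): (0.822565 + 0.542051) / 3,
    ("GeneA", "wt"): 9.5,
}

for key, reported, observed in flag_mismatches(diff_fpkm, replicate_mean_fpkm):
    print(key, f"diff={reported:.3f}", f"replicate mean={observed:.3f}")
```

With these inputs only the two Snora64 entries are flagged. The cutoff would need tuning on real data, since some gap between the two files may be expected depending on how the condition-level FPKM is estimated.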
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */
