Cufflinks/cuffdiff for matched pair samples and other questions

azalea14

Junior Member

Join Date: Jan 2014

Posts: 1
#1

01-13-2014, 08:27 AM

Hi all, I am quite new to RNA-seq and bioinformatics in general, so I apologize if these questions are poorly phrased or rather naïve.

(1) I have paired RNA-seq data (normal vs. cancer) for several patients and have begun DEG analysis through the tophat-->cufflinks-->cuffdiff pipeline. I’m wondering if this is even a viable pathway, as a thread regarding an earlier version of cuffdiff expressed some doubts about its suitability for paired designs: http://seqanswers.com/forums/showthread.php?t=7108. Does the latest version of cufflinks/cuffdiff address these problems, or should I consider a different approach?

(2) This is probably a dumb question, but what is wrong with just importing cufflinks’ FPKMs for all my samples into Excel and running a paired t-test on them to determine differential expression? As you can tell, I am not well-versed in the statistics of this...

(3) Assuming I can use cuffdiff, I’ve been encountering issues with the FPKMs it produces, which are all much, much larger than those given by cufflinks for the samples I’ve tried. I know inconsistencies like these have been caused by different default settings between cufflinks/cuffdiff in the past, but several threads have mentioned that the latest version (v2.1.1) should have this fixed. Am I doing something wrong? Below is my code and sample output.

For cufflinks:

Code:

cufflinks -G reference.gtf patient_normal.bam

In genes.fpkm_tracking:

Code:

gene_id locus FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
gene X chr1:123596-123889 0.04095203 0.0225433 0.0751442 OK

For cuffdiff:

Code:

cuffdiff reference.gtf patient_normal.bam patient_tumor.bam

In genes_exp.diff (value_1 should refer to FPKM for normal sample):

Code:

gene_id locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
gene X chr1:123596-123889 q1 q2 NOTEST 16.7791 17.0898 0.0264722 1 1 no

Thank you for all your help!
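
One way to check whether the mismatch in (3) is a uniform scaling rather than a gene-specific problem is to compute the cuffdiff/cufflinks FPKM ratio per gene. The sketch below is only an illustration: it assumes both files are tab-delimited with a header row containing the column names shown in the excerpts above (`gene_id`, `FPKM`, `value_1`), and the helper names `read_column` and `fpkm_ratios` are hypothetical. The toy input is the single gene quoted in this post.

```python
import csv
import io


def read_column(handle, key_col, value_col):
    """Map key_col -> float(value_col) for a tab-delimited table with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key_col]: float(row[value_col]) for row in reader}


def fpkm_ratios(cufflinks_fpkm, cuffdiff_fpkm):
    """Return cuffdiff/cufflinks FPKM ratios for genes present in both tables.

    Genes with a cufflinks FPKM of zero are skipped to avoid division by zero.
    """
    ratios = {}
    for gene, fpkm in cufflinks_fpkm.items():
        if gene in cuffdiff_fpkm and fpkm > 0:
            ratios[gene] = cuffdiff_fpkm[gene] / fpkm
    return ratios


# Toy stand-ins for genes.fpkm_tracking and genes_exp.diff (assumed layout),
# holding only the one gene quoted in the post.
tracking = io.StringIO(
    "gene_id\tlocus\tFPKM\n"
    "gene X\tchr1:123596-123889\t0.04095203\n"
)
diff = io.StringIO(
    "gene_id\tlocus\tvalue_1\tvalue_2\n"
    "gene X\tchr1:123596-123889\t16.7791\t17.0898\n"
)

cufflinks_fpkm = read_column(tracking, "gene_id", "FPKM")
cuffdiff_fpkm = read_column(diff, "gene_id", "value_1")
ratios = fpkm_ratios(cufflinks_fpkm, cuffdiff_fpkm)
print(ratios)
```

With real files, `open(path, newline="")` would replace the `io.StringIO` objects. If the ratio is roughly the same for most genes, a difference in normalization between the two runs is a more likely explanation than a problem with individual genes.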
dpryan

Devon Ryan

Join Date: Jul 2011

Posts: 3478
#2

01-13-2014, 02:37 PM

(1) Don't use cuffdiff for that sort of analysis; it only handles pairwise group comparisons and time-series designs, not matched-pair designs. Look into DESeq2, edgeR, or limma, all of which can handle your design.
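
As a rough illustration of what "handling the design" means: those packages are R libraries, and a matched-pair design is usually expressed there by including the patient as a blocking factor next to the condition (something along the lines of `~ patient + condition`). The Python sketch below is not their API; it only builds the corresponding indicator matrix for hypothetical patients `P1` to `P3`, and the function name `paired_design` is made up for this example.

```python
def paired_design(samples):
    """Build a design matrix with an intercept, patient indicators, and a tumor indicator.

    samples: list of (patient, condition) tuples, condition being "normal" or "tumor".
    The first patient (in sorted order) and "normal" act as the reference levels.
    """
    patients = sorted({patient for patient, _ in samples})
    columns = (
        ["intercept"]
        + [f"patient_{p}" for p in patients[1:]]
        + ["condition_tumor"]
    )
    rows = []
    for patient, condition in samples:
        row = [1]  # intercept
        row += [1 if patient == p else 0 for p in patients[1:]]  # patient block
        row.append(1 if condition == "tumor" else 0)  # effect of interest
        rows.append(row)
    return columns, rows


# Hypothetical sample sheet: three patients, each with a normal and a tumor sample.
samples = [
    ("P1", "normal"), ("P1", "tumor"),
    ("P2", "normal"), ("P2", "tumor"),
    ("P3", "normal"), ("P3", "tumor"),
]
columns, rows = paired_design(samples)
print(columns)
for sample, row in zip(samples, rows):
    print(sample, row)
```

The patient columns absorb each patient's baseline expression, so the `condition_tumor` coefficient reflects the within-patient tumor versus normal difference.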

(2) Are FPKM values normally distributed? Having looked at only a few, I wouldn't assume so. Also, unless you have quite a few samples, you'll benefit from sharing information between genes, which the packages mentioned above do.
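
To make the concern in (2) concrete, here is a minimal sketch of a per-gene paired t-test, written with the Python standard library only. The FPKM values are made up for illustration (three hypothetical patients, each with an exact twofold increase in the tumor sample), and the `+ 1` pseudocount in the log transform is an arbitrary choice, not something the thread prescribes.

```python
import math
import statistics


def paired_t(normal, tumor):
    """Return (t statistic, degrees of freedom) for a paired t-test on tumor - normal."""
    diffs = [t - n for n, t in zip(normal, tumor)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Made-up FPKMs for one gene in three hypothetical patients.
normal_fpkm = [10.0, 20.0, 40.0]
tumor_fpkm = [20.0, 40.0, 80.0]

# Same data on a log2 scale, with a pseudocount of 1 (an assumption).
log_normal = [math.log2(x + 1) for x in normal_fpkm]
log_tumor = [math.log2(x + 1) for x in tumor_fpkm]

t_raw, df = paired_t(normal_fpkm, tumor_fpkm)
t_log, _ = paired_t(log_normal, log_tumor)
print(f"raw scale: t = {t_raw:.2f}, df = {df}")
print(f"log scale: t = {t_log:.2f}, df = {df}")
```

The same consistent twofold change gives a much weaker statistic on the raw scale than on the log scale, so the result depends heavily on a distributional choice. With only two degrees of freedom, the per-gene variance estimate is also very noisy, which is the situation where borrowing information across genes helps.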