  • IBseq
    replied
    Hi all, I had the same problem and was told to run cuffdiff WITHOUT the -N option (upper-quartile normalization).

    Hope it helps.
    ib


  • mmanrique
    replied
    Hi,

    We had the same problem and tried the new Cufflinks version 2.0.2. The values from Cufflinks and Cuffdiff now seem to be the same (I still have to check this more carefully).

    These are the commands I used:

    Code:
    cufflinks -o ./Sample001_cufflinks_out_No_N_2.0.2 -u -g ../genes.gtf -p 2 --total-hits-norm ../Sample_001_accepted_hits.bam
    Code:
    cuffdiff -o ./COMPARISON1_SAMPLE1_SAMPLE1BIS_cuffdiff_out/ -L SAMPLE1,SAMPLE1BIS -p 2 -u -v --emit-count-tables --total-hits-norm ../Sample001_cufflinks_out/transcripts.gtf ../Sample_001_accepted_hits.sam ../Sample_001_bis_accepted_hits.sam
    I know it's weird to use cuffdiff to compare one sample to itself but I had no other choice...

    HTH

    Marina

    EDIT: Though the FPKM values from Cufflinks and Cuffdiff are now more similar, I still get unreasonably high FPKM values, especially for very short genes (around 37 nt, regulatory RNAs I guess). Searching for an explanation, I found this thread (http://seqanswers.com/forums/showthread.php?t=20702), which is worth reading: Cole Trapnell gives a good explanation of why small genes can get extremely high FPKM values.
    Last edited by mmanrique; 08-04-2012, 07:25 AM.
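
To illustrate why gene length matters so much here, the sketch below uses the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments). It is not Cufflinks' internal calculation, which works with an effective length and other corrections; the function name `fpkm` and all counts are hypothetical and chosen only for illustration.

```python
def fpkm(fragments, length_nt, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (length_nt * total_mapped_fragments)

# Hypothetical library: 20 million mapped fragments, 50 of them on each gene.
TOTAL_MAPPED = 20_000_000
FRAGMENTS = 50

short_gene = fpkm(FRAGMENTS, 37, TOTAL_MAPPED)    # a 37 nt gene, as in the post
long_gene = fpkm(FRAGMENTS, 2000, TOTAL_MAPPED)   # a 2 kb gene for comparison

print(f"37 nt gene:   FPKM = {short_gene:.2f}")
print(f"2000 nt gene: FPKM = {long_gene:.2f}")
print(f"ratio: {short_gene / long_gene:.1f}x")
```

With the same number of fragments, the FPKM scales inversely with length, so the 37 nt gene comes out roughly 54 times higher than the 2 kb gene.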


  • sudders
    replied
    Did you ever find a solution to this? We have run into the same problem.

    Our pipeline is as follows:
    We map reads with tophat for each sample.
    We run cufflinks on each sample to generate a transcriptome assembly.

    The command looks something like:
    Code:
    cufflinks --label tax-Pre-R5 \
              --num-threads 4 \
              --library-type fr-secondstrand \
              --frag-bias-correct /ifs/mirror/genomes/bowtie/hg19.fa \
              --multi-read-correct \
              --upper-quartile-norm \
              /ifs/projects/proj004/rnaseq4/tax-Pre-R5.accepted.bam
    We then run Cuffmerge and Cuffcompare to generate merged gene sets.

    We also run cuffdiff to test for differences.

    Our cuffdiff commands look like:

    Code:
    cuffdiff --output-dir abinitio.cuffdiff.dir \
             --library-type fr-secondstrand \
             --upper-quartile-norm \
             --frag-bias-correct /ifs/mirror/genomes/bowtie/hg19.fa \
             --multi-read-correct \
             --verbose \
             --num-threads 16 \
             --labels Prostate-Pre-agg,Prostate-Post-agg,tax-Pre-agg,tax-Post-agg \
             --FDR 0.050000 \
             abinitio.gtf \
             Prostate-Pre-R7.accepted.bam,Prostate-Pre-R1.accepted.bam,Prostate-Pre-R4.accepted.bam,Prostate-Pre-R2.accepted.bam,Prostate-Pre-R8.accepted.bam,Prostate-Pre-R5.accepted.bam,Prostate-Pre-R3.accepted.bam,Prostate-Pre-R6.accepted.bam \
             Prostate-Post-R7.accepted.bam,Prostate-Post-R8.accepted.bam,Prostate-Post-R6.accepted.bam,Prostate-Post-R3.accepted.bam,Prostate-Post-R5.accepted.bam,Prostate-Post-R2.accepted.bam,Prostate-Post-R4.accepted.bam,Prostate-Post-R1.accepted.bam \
             tax-Pre-R1.accepted.bam,tax-Pre-R3.accepted.bam,tax-Pre-R2.accepted.bam,tax-Pre-R6.accepted.bam,tax-Pre-R4.accepted.bam,tax-Pre-R5.accepted.bam \
             tax-Post-R6.accepted.bam,tax-Post-R1.accepted.bam,tax-Post-R4.accepted.bam,tax-Post-R5.accepted.bam,tax-Post-R2.accepted.bam,tax-Post-R3.accepted.bam
    If we compare the FPKMs coming out of cuffcompare and cuffdiff, they are not even within two or three orders of magnitude of each other: the cuffcompare FPKMs are in the millions or tens of millions, while the cuffdiff values are in the more sensible range of 0 to several hundred.

    We're using cufflinks 1.3.1.


  • peromhc
    replied
    Also, I just realized that log10(1630419.4581286784) is about 6.2, which is at least in the same range as the cuffdiff value of about 10.5. I wonder if the difference is that simple.
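
A quick way to check the log10 idea is to do the arithmetic directly. The sketch below uses only the two values quoted in this thread; the variable names are made up for illustration.

```python
import math

cufflinks_fpkm = 1630419.4581286784  # from transcripts.gtf in the original post
cuffdiff_fpkm = 10.5437              # from gene_exp.diff in the original post

log_fpkm = math.log10(cufflinks_fpkm)
gap = cuffdiff_fpkm - log_fpkm
ratio = cufflinks_fpkm / cuffdiff_fpkm

print(f"log10(cufflinks FPKM) = {log_fpkm:.4f}")
print(f"cuffdiff FPKM - log10 = {gap:.4f}")
print(f"raw ratio             = {ratio:.0f}")
```

The log10 value comes out near 6.2, so a plain log10 transform alone does not turn one number into the other; the raw values differ by a factor of more than 150,000.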


  • polyatail
    replied
    Just at first glance, in your cufflinks run you specify two different parameters that will affect the FPKM calculation.
    Code:
    --upper-quartile-norm --max-mle-iterations 20000
    I would try changing --max-mle-iterations to match cuffdiff, disabling quartile normalization, and running the biological replicates through cufflinks separately to see whether the difference persists. Then I would try cufflinks with the merged BAMs. Internally, the same code does the quantification in both cufflinks and cuffdiff.

    Also, I noticed you're looking in transcripts.gtf for cufflinks and gene_exp.diff for cuffdiff. It would be better to look in isoforms.fpkm_tracking for both cufflinks and cuffdiff, as gene_exp.diff lists quantification at the locus level while transcripts.gtf is at the isoform level.
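
One way to do the isoform-level comparison suggested above is to load both isoforms.fpkm_tracking tables and join them on the transcript ID. This is only a sketch: it assumes both files are tab-delimited with a `tracking_id` column, that the cufflinks table has an `FPKM` column, and that the cuffdiff table has one `<label>_FPKM` column per condition (for example `social_FPKM`). Check the headers of your own files. The helper names and the toy tables are hypothetical.

```python
import csv
import io

def read_fpkm(handle, fpkm_column):
    """Map tracking_id -> FPKM from a tab-delimited *.fpkm_tracking table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["tracking_id"]: float(row[fpkm_column]) for row in reader}

def compare(cufflinks_fpkm, cuffdiff_fpkm):
    """Return (id, cufflinks FPKM, cuffdiff FPKM) for IDs present in both tables."""
    shared = sorted(set(cufflinks_fpkm) & set(cuffdiff_fpkm))
    return [(tid, cufflinks_fpkm[tid], cuffdiff_fpkm[tid]) for tid in shared]

# Toy tables standing in for the real files (values are made up).
cufflinks_table = "tracking_id\tlength\tFPKM\ntxA\t1977\t12.0\ntxB\t800\t3.5\n"
cuffdiff_table = (
    "tracking_id\tlength\tsocial_FPKM\tsolitary_FPKM\n"
    "txA\t1977\t10.0\t8.0\n"
    "txC\t500\t1.0\t2.0\n"
)

rows = compare(
    read_fpkm(io.StringIO(cufflinks_table), "FPKM"),
    read_fpkm(io.StringIO(cuffdiff_table), "social_FPKM"),
)
for tid, a, b in rows:
    print(f"{tid}\tcufflinks={a}\tcuffdiff={b}")
```

For real data, replace the `io.StringIO(...)` handles with `open(path, newline="")` on the two tracking files.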


  • peromhc
    replied
    I should note that 'social.bam' is just the product of samtools merge over all the individuals in the social treatment. Those BAM files are listed individually in Cuffdiff to indicate that they are biological replicates.

    So, in essence, the FPKM from social.bam from cufflinks should be the average value from all the individuals in that group.
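
As a rough check on that expectation, the sketch below applies the textbook FPKM formula (not Cufflinks' full model) to a pooled library. Under that simplification, the pooled FPKM equals the mean of the per-replicate FPKMs weighted by library size, which matches the simple average only when the replicates have similar depth. The counts are hypothetical; the length of 1977 is taken from the locus span quoted in this thread purely for illustration.

```python
def fpkm(fragments, length_nt, total_mapped):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (length_nt * total_mapped)

LENGTH_NT = 1977

# Hypothetical replicates: (fragments on the gene, total mapped fragments).
replicates = [(100, 10_000_000), (300, 40_000_000)]

per_sample = [fpkm(c, LENGTH_NT, n) for c, n in replicates]
simple_mean = sum(per_sample) / len(per_sample)

total_mapped = sum(n for _, n in replicates)
pooled = fpkm(sum(c for c, _ in replicates), LENGTH_NT, total_mapped)

# Mean of per-sample FPKMs, weighted by each library's size.
weighted_mean = sum(f * n for f, (_, n) in zip(per_sample, replicates)) / total_mapped

print(f"per-sample FPKMs:  {[round(f, 3) for f in per_sample]}")
print(f"simple mean:       {simple_mean:.3f}")
print(f"pooled (merged):   {pooled:.3f}")
print(f"depth-weighted:    {weighted_mean:.3f}")
```

Either way, the averaging effect is small compared with a gap of several orders of magnitude.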


  • peromhc
    started a topic cufflinks FPKM >>> Cuffdiff FPKM

    cufflinks FPKM >>> Cuffdiff FPKM

    I cannot understand why the FPKM estimated in cufflinks is SO much larger than that in cuffdiff:

    Cufflinks
    Code:
    cufflinks -p8 -m320 -u -o /media/hd/working/tuco/17Jan12socialcuff -L social \
    --upper-quartile-norm --max-mle-iterations 20000 \
    /media/hd/working/tuco/b2.social/social.bam
    
    grep 'comp14388_c0_seq1' transcripts.gtf
    
    comp14388_c0_seq1; FPKM "1630419.4581286784";
    I merged the .gtf files from each cufflinks run, and fed that to cufflinks
    I have 5 biological reps for each group

    Cuffdiff
    Code:
    mkdir /media/hd/working/tuco/17Jan.cuffdiff
    cd /media/hd/working/tuco/17Jan.cuffdiff
    
    cuffdiff -p8 -L social,solitary -N -u \
    --max-mle-iterations 10000 /media/hd/working/tuco/17Jan12cuffcompare/*gtf \
    /media/hd/working/tuco/b2.bams/406A.bam,\
    /media/hd/working/tuco/b2.bams/4262.bam,\
    /media/hd/working/tuco/b2.bams/2354.bam,\
    /media/hd/working/tuco/b2.bams/4241.bam,\
    /media/hd/working/tuco/b2.bams/401C.bam \
    /media/hd/working/tuco/b2.bams/6236.bam,\
    /media/hd/working/tuco/b2.bams/2226.bam,\
    /media/hd/working/tuco/b2.bams/5B5C.bam,\
    /media/hd/working/tuco/b2.bams/255D.bam,\
    /media/hd/working/tuco/b2.bams/4572.bam
    
    grep 'comp14388_c0_seq1' gene_exp.diff
    
    comp14388_c0_seq1:0-1977	social	solitary	10.5437	8.08172

    ok... 1630419.4581286784 >>> 10.5437 Why??
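
For pulling FPKM values out of transcripts.gtf programmatically rather than with grep, a minimal sketch follows. It assumes the usual GTF attribute layout of `key "value";` pairs in the last column. The example attribute string is reconstructed for illustration: the `gene_id` value is hypothetical, and only the transcript name and FPKM come from the post above.

```python
import re

# Matches one GTF attribute of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')

def gtf_attributes(attribute_field):
    """Parse the attribute column of a GTF line into a dict of strings."""
    return dict(ATTR_RE.findall(attribute_field))

# Reconstructed example; gene_id is a placeholder, not from the thread.
attribute_field = (
    'gene_id "CUFF.1"; transcript_id "comp14388_c0_seq1"; '
    'FPKM "1630419.4581286784";'
)

attrs = gtf_attributes(attribute_field)
fpkm_value = float(attrs["FPKM"])
print(attrs["transcript_id"], fpkm_value)
```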
