
  • Inconsistency between cufflinks and cuffdiff

    I am using cufflinks and cuffdiff to do some RNA-Seq analysis.

    1. The commands are:

    Cufflinks command:
    cufflinks --no-update-check -p 4 -G /Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf -b /Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -o cufflinks accepted_hits_C1.bam &

    cuffdiff command:

    cuffdiff -o cuffdiff_output -p 8 -L C1,C2 -b /Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -u /Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf accepted_hits_C1.bam accepted_hits_C2.bam

    2. The results are (one gene as an example):

    1) cufflinks results:

    SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - 3.8779 3.48702 4.27041 OK

    2) cuffdiff results:

    SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - 35.3147 18.8787 51.6994 OK 37.1116 19.9766 54.2858 OK

    The FPKM values for the same gene, SDF4, differ between the cufflinks result (3.8779) and the cuffdiff result (35.3147 for C1).
    Many other genes also have different FPKMs in the cufflinks and cuffdiff outputs.


    My understanding is that they should be consistent. Do you know why they differ?
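For anyone who wants to quantify the discrepancy across all genes rather than one example, here is a minimal Python sketch. It assumes both tools write a tab-separated genes.fpkm_tracking table whose columns match the rows quoted above, and that cuffdiff names its per-sample columns after the -L labels (C1_FPKM and so on). The header names and the helpers read_fpkm, compare_fpkm and as_tsv are assumptions made for illustration, so check them against your own files. The demo data is just the SDF4 row from this post.

```python
import csv
import io

# Assumed column layout, inferred from the rows quoted in this thread.
SHARED_COLUMNS = ["tracking_id", "class_code", "nearest_ref_id", "gene_id",
                  "gene_short_name", "tss_id", "locus", "length", "coverage"]
CUFFLINKS_HEADER = SHARED_COLUMNS + ["FPKM", "FPKM_conf_lo", "FPKM_conf_hi",
                                     "FPKM_status"]
CUFFDIFF_HEADER = SHARED_COLUMNS + ["C1_FPKM", "C1_conf_lo", "C1_conf_hi", "C1_status",
                                    "C2_FPKM", "C2_conf_lo", "C2_conf_hi", "C2_status"]


def read_fpkm(handle, fpkm_column):
    """Return {tracking_id: FPKM} from a tab-separated tracking table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["tracking_id"]: float(row[fpkm_column]) for row in reader}


def compare_fpkm(cufflinks_fpkm, cuffdiff_fpkm):
    """Return (gene, cufflinks FPKM, cuffdiff FPKM, ratio) for shared genes."""
    rows = []
    for gene in sorted(cufflinks_fpkm.keys() & cuffdiff_fpkm.keys()):
        a, b = cufflinks_fpkm[gene], cuffdiff_fpkm[gene]
        ratio = b / a if a > 0 else float("nan")  # avoid division by zero
        rows.append((gene, a, b, ratio))
    return rows


def as_tsv(header, row):
    """Build a tiny in-memory table so the demo needs no files on disk."""
    return "\t".join(header) + "\n" + "\t".join(row) + "\n"


# Demo rows copied from the post above (whitespace-separated there).
SDF4_CUFFLINKS = ("SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - "
                  "3.8779 3.48702 4.27041 OK").split()
SDF4_CUFFDIFF = ("SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - "
                 "35.3147 18.8787 51.6994 OK 37.1116 19.9766 54.2858 OK").split()

cufflinks_fpkm = read_fpkm(io.StringIO(as_tsv(CUFFLINKS_HEADER, SDF4_CUFFLINKS)), "FPKM")
cuffdiff_fpkm = read_fpkm(io.StringIO(as_tsv(CUFFDIFF_HEADER, SDF4_CUFFDIFF)), "C1_FPKM")
comparison = compare_fpkm(cufflinks_fpkm, cuffdiff_fpkm)

for gene, a, b, ratio in comparison:
    print(f"{gene}\tcufflinks={a}\tcuffdiff={b}\tratio={ratio:.2f}")
```

With real data, replace the io.StringIO objects with open() calls on the two genes.fpkm_tracking files. If the ratio is roughly constant across genes, the difference looks like a global scaling. If it varies from gene to gene, the cause is probably something else.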


    My second question concerns the gene locus boundaries:

    1) cufflinks result:

    The gene locus boundaries match the RefSeq gene boundaries. All the gene loci are correct.

    2) cuffdiff result:

    For some genes, the locus boundary is larger than the actual RefSeq gene boundary.

    For example,

    AAGAB - - AAGAB AAGAB TSS3153 chr15:67493366-67547074 - - 20.5908 9.4824 29.4023 OK 16.6396 9.07309 24.1494 OK

    Cuffdiff reports the locus as chr15:67493366-67547074, but the UCSC Genome Browser lists the following loci:

    AAGAB at chr15:67493013-67547074 - (NM_024666) alpha- and gamma-adaptin-binding protein p34 isoform 1
    AAGAB at chr15:67493013-67547536 - (NM_001271885) alpha- and gamma-adaptin-binding protein p34 isoform 2
    AAGAB at chr15:67493013-67547074 - (NM_001271886) alpha- and gamma-adaptin-binding protein p34 isoform 2

    Can you tell me why?

    Thanks,

  • #2
    Can nobody answer this question? My question is why the FPKMs for the same sample differ between cufflinks and cuffdiff. I think they should be the same, since they come from the same sample.

    My other question is why the locus boundary in the cuffdiff output file is not the actual RefSeq gene boundary. It is larger than the actual boundary.



    • #3
      Originally posted by jxu666
      Can nobody answer this question? My question is why the FPKMs for the same sample differ between cufflinks and cuffdiff. I think they should be the same, since they come from the same sample.

      My other question is why the locus boundary in the cuffdiff output file is not the actual RefSeq gene boundary. It is larger than the actual boundary.

      Hi, I have the exact same problem (using GRCh38.gtf):

      About half of the gene lengths reported by Cuffdiff differ from (and are wrong compared with) the ones given by Cufflinks, Ensembl and RefSeq.

      To summarize, 31707 genes have the wrong length: the median gene size difference is +82.7 kb (!) and the average gene size difference is +176 kb. So yes, the gene boundaries are larger than the real ones.

      At the same time, 32528 genes show the exact same gene coordinates.

      How is this possible? I'm quite baffled. I'm pretty sure this leads to incorrect FPKM calculations, since reads from "other genes" are included in those 31.7k genes.

      Is there a solution for this? I can't find anything about it so far.
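In case it helps others run the same check, here is a rough Python sketch of the kind of comparison summarized above. It is not the script used for those numbers. It derives each gene's span from the GTF (minimum start to maximum end over all features sharing a gene_id) and compares that span length with the locus column of a tracking file. The helper names and the toy GENE_A / GENE_B records are hypothetical and exist only to make the example self-contained. The sketch also does not reconcile 0-based and 1-based coordinate conventions, so do not read much into differences of 1 bp.

```python
import re
import statistics

LOCUS_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def parse_locus(locus):
    """Split a locus string such as 'chr15:67493366-67547074' into its parts."""
    match = LOCUS_RE.match(locus)
    if match is None:
        raise ValueError(f"unrecognized locus: {locus!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


def gtf_gene_spans(lines):
    """Return {gene_id: (chrom, min_start, max_end)} from GTF lines.

    Simplification: if a gene_id appears on several chromosomes, only the
    first chromosome seen is kept.
    """
    spans = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue
        gene = match.group(1)
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        if gene not in spans:
            spans[gene] = (chrom, start, end)
        elif spans[gene][0] == chrom:
            _, old_start, old_end = spans[gene]
            spans[gene] = (chrom, min(old_start, start), max(old_end, end))
    return spans


def span_differences(tracking_loci, gtf_spans):
    """Return {gene: locus length minus GTF span length} for shared genes."""
    diffs = {}
    for gene, locus in tracking_loci.items():
        if gene not in gtf_spans:
            continue
        _, locus_start, locus_end = parse_locus(locus)
        _, gtf_start, gtf_end = gtf_spans[gene]
        diffs[gene] = (locus_end - locus_start) - (gtf_end - gtf_start)
    return diffs


# Toy data (hypothetical genes, not real annotation).
TOY_GTF = [
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "A.1";',
    'chr1\ttoy\texon\t300\t400\t.\t+\t.\tgene_id "GENE_A"; transcript_id "A.1";',
    'chr1\ttoy\texon\t1000\t1100\t.\t+\t.\tgene_id "GENE_B"; transcript_id "B.1";',
]
TOY_LOCI = {"GENE_A": "chr1:50-500", "GENE_B": "chr1:1000-1100"}

differences = span_differences(TOY_LOCI, gtf_gene_spans(TOY_GTF))
changed = {gene: d for gene, d in differences.items() if d != 0}
print("genes with a different span:", len(changed))
print("median difference (bp):", statistics.median(differences.values()))
```

On real data, pass an open GTF file handle to gtf_gene_spans and build the locus dictionary from the tracking_id and locus columns of genes.fpkm_tracking.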
