  • sohnic
    Junior Member
    • Mar 2011
    • 6

    Cufflinks: comparing analyses with and without a GTF reference file

    Hi,
    I have many questions about Cufflinks output. Here is one of them:
    First I used TopHat to map my RNA-seq reads (100 bp) and obtain an accepted_hits.bam file.
    Then I ran Cufflinks in two ways:
    - simply: cufflinks accepted_hits.bam
    - with a GTF file, which is the current annotation of my genome (eucalyptus):
    cufflinks -g annotGenome.gtf accepted_hits.bam
    Note that I used the -g option and not the -G option.

    One example result:
    - without reference GTF:
    one gene / one isoform: 12110-17714
    The first part of this isoform (12-16530) has the same intron/exon structure as the isoforms assembled with the reference.
    Then I have a last exon, 16530-17714.

    - with reference GTF:
    one gene / two isoforms
    + transcript 1: 12024-17350 = exactly the transcript from the reference
    full_read_support "no";
    The region corresponding to the last exon of the no-reference run is now split into:
    16530 - 16561
    16597 - 17350
    That matches my reference, but in my run this intron is covered by reads. No read is split across it. A few reads begin at position 16595. I've checked: no read ends at 16561.
    I think this RNA doesn't exist in my transcriptome.
    + transcript 2: 12024-17714
    full_read_support "yes";
    last exon: 16530-17714, the same exon as in the no-reference version
    Why does transcript 2 contain the 12024-12109 portion, which is not covered by RNA-seq reads (although the reference, transcript 1, begins with this sequence)?
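
The check described above (does any read split across the 16561/16597 intron?) can be scripted. The following is a minimal sketch, assuming the alignments are available as SAM text lines; the function names and the two toy alignments are invented for illustration and are not taken from the actual accepted_hits.bam. It relies only on the SAM convention that a spliced alignment carries an `N` operation in its CIGAR string.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_alignment(pos, cigar):
    """Return (last_exon_base, first_exon_base) pairs for every N gap.

    pos is the 1-based leftmost mapped position (SAM column 4).
    """
    introns = []
    ref = pos  # next reference position to be consumed
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Reference bases ref .. ref+length-1 are skipped (the intron).
            introns.append((ref - 1, ref + length))
            ref += length
        elif op in "MD=X":
            ref += length
        # I, S, H and P do not consume reference bases.
    return introns


def count_junction_reads(sam_lines, donor_end, acceptor_start):
    """Count alignments spliced exactly from donor_end to acceptor_start."""
    count = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        pos, cigar = int(fields[3]), fields[5]
        if cigar == "*":  # unmapped or no CIGAR available
            continue
        if (donor_end, acceptor_start) in introns_from_alignment(pos, cigar):
            count += 1
    return count


# Toy alignments (hypothetical, for illustration only).
toy_sam = [
    "\t".join(["read1", "0", "scaffold_x", "16540", "50", "22M35N78M",
               "*", "0", "0", "*", "*"]),
    "\t".join(["read2", "0", "scaffold_x", "16595", "50", "100M",
               "*", "0", "0", "*", "*"]),
]

print(count_junction_reads(toy_sam, 16561, 16597))
```

On real data the lines could come from `samtools view` restricted to the locus; a count of zero would agree with the observation that no read supports the reference intron.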

    For the two isoforms, I have FPKM values (4 for transcript 1, which doesn't seem to exist in my transcriptome, and 13 for transcript 2). How does Cufflinks assign those values?
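
To compare the two runs side by side, it can help to pull the per-transcript attributes out of each transcripts.gtf. Below is a minimal sketch, assuming the usual nine-column GTF layout with `key "value";` attribute pairs. The sequence name `scaffold_x` and the IDs `CUFF.1.1` and `CUFF.1.2` are placeholders; the coordinates, FPKM values and `full_read_support` flags mirror the numbers quoted above.

```python
import re

# GTF attribute pairs look like: key "value";
ATTR = re.compile(r'(\S+) "([^"]*)";')


def parse_gtf_line(line):
    """Parse one GTF line into a dict, or return None if it is malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    seqname, _source, feature, start, end, _score, strand, _frame, attrs = fields
    record = {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "strand": strand,
    }
    record.update(dict(ATTR.findall(attrs)))
    return record


def transcript_summary(gtf_lines):
    """List (transcript_id, start, end, FPKM, full_read_support) per transcript."""
    rows = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        rec = parse_gtf_line(line)
        if rec and rec["feature"] == "transcript":
            rows.append((
                rec["transcript_id"],
                rec["start"],
                rec["end"],
                float(rec["FPKM"]),
                rec.get("full_read_support"),  # None if the attribute is absent
            ))
    return rows


# Toy lines with placeholder names (not real output).
toy_gtf = [
    'scaffold_x\tCufflinks\ttranscript\t12024\t17350\t1\t+\t.\t'
    'gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "4.0"; full_read_support "no";',
    'scaffold_x\tCufflinks\ttranscript\t12024\t17714\t1\t+\t.\t'
    'gene_id "CUFF.1"; transcript_id "CUFF.1.2"; FPKM "13.0"; full_read_support "yes";',
]

for row in transcript_summary(toy_gtf):
    print(row)
```

Running the same summary on the output of both runs gives a compact table of which transcripts each run reports and with what abundance.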

    In the run without the GTF reference, I get FPKM=36, which is about double the total from the run with the reference (13+4=17), although the mapping file is the same.
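
For orientation, here is the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments) as a small sketch. All numbers are hypothetical, and this is not Cufflinks' actual estimator, which fits isoform abundances statistically. The sketch only shows which quantities the value depends on: the fragments assigned to a transcript, the transcript length, and the normalization total. It does not claim to explain the 36 versus 17 difference.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
one_isoform = fpkm(1000, 2000, 10_000_000)

# The same 1000 fragments split between two isoforms of equal length:
# the per-isoform values add up to the single-isoform value.
split_sum = fpkm(700, 2000, 10_000_000) + fpkm(300, 2000, 10_000_000)

# Doubling the normalization total halves the value for the same fragments.
larger_total = fpkm(1000, 2000, 20_000_000)

print(one_isoform, split_sum, larger_total)
```

With equal lengths and the same normalization total, splitting reads between isoforms leaves the summed FPKM unchanged, so a twofold difference between runs points toward a change in length, assigned fragments, or normalization rather than the split alone.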

    Finally, note that these transcripts are located on the forward strand of the genome, and that there is nothing in the GTF or the Cufflinks results on the opposite strand at this location.

    Many thanks for your suggestions,

    S.
  • Stefano Manzini
    Junior Member
    • Dec 2013
    • 3

    #2
    Hello sohnic,

    I have no answer to your question, but I would like to ask you one, because you're posting about the very issue I am wondering about.

    Can you give me some hints about the difference between using and not using a local reference when running Cufflinks?
    I would like to understand what Cufflinks uses to assemble the information contained in the target.bam file when you don't provide a local reference. I'd also like to know whether providing one will make Cufflinks name the transcripts with gene names instead of CUFF.65279 and the like.

    Thank you for your help.

    • ahalfpen
      Junior Member
      • Oct 2016
      • 1

      #3
      My experience with cufflinks

      Cufflinks uses the mapped reads and the reference genome to build a parsimonious assembly (a minimum spanning tree-esque structure) that explains the reads. As I understand it, the logic of the algorithm is based on the Burset and Guigó paper (Burset and Guigó, "Evaluation of Gene Structure Prediction Programs", 1996) and their observations about the high false-positive rates of gene-finding programs that are guided by annotation. This understanding comes from the past year I have spent working on similar issues.
