  • Cufflinks gtf files and fpkms

    Hi!

    I'm working with Cufflinks and I need help urgently!

    According to the documentation of Cufflinks, the -g <reference_annotation.(gtf/gff)> option does the following:

    Tells Cufflinks to use the supplied reference annotation (GFF) to guide RABT assembly. Reference transcripts will be tiled with faux-reads to provide additional information in assembly. Output will include all reference transcripts as well as any novel genes and isoforms that are assembled.

    Therefore, we should expect all genes from the reference GTF file (57,559 annotated genes) to appear in the output files, and these files should have at least 57,000 lines (one for each gene).

    However, I've got a gene FPKM file with 89,063 entries, 32,000 of which are annotated genes and the rest are potential new genes...

    The only explanation I can find is that, contrary to what the documentation says, genes from the reference GTF file that are not found in the SAM file (because they are not expressed, I guess) are being ignored. But then, why do genes with an FPKM of 0 appear?
    Maybe they are expressed, but after normalization the FPKM value is so low that Cufflinks rounds it to 0?

    In addition, there is another problem with the results of Cufflinks:
    Cufflinks generates two files: transcripts.gtf and genes.fpkm_tracking.
    The first one contains all the assembled genes and isoforms, while the second one contains FPKM values per gene, without isoforms.
    I would expect each gene to appear only once in the second file, but sometimes a gene appears several times with different FPKM values, and I'm not able to find a criterion to decide which entry is the better one...
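
    To make the two observations concrete, here is a minimal sketch of how the repeated gene names and the FPKM = 0 entries could be listed. It assumes genes.fpkm_tracking is tab-separated with a header containing the tracking_id, gene_short_name, locus and FPKM columns, and that unnamed (novel) genes carry "-" as their gene_short_name. The sample rows and the helper names are made up for illustration; they are not real output.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature genes.fpkm_tracking content (values are invented).
HEADER = ["tracking_id", "class_code", "nearest_ref_id", "gene_id",
          "gene_short_name", "tss_id", "locus", "length", "coverage",
          "FPKM", "FPKM_conf_lo", "FPKM_conf_hi", "FPKM_status"]
ROWS = [
    ["CUFF.1", "-", "-", "CUFF.1", "GENEA", "TSS1", "chr1:100-500", "-", "-", "12.5", "10", "15", "OK"],
    ["CUFF.2", "-", "-", "CUFF.2", "GENEA", "TSS2", "chr1:900-1500", "-", "-", "3.2", "2", "4", "OK"],
    ["CUFF.3", "-", "-", "CUFF.3", "GENEB", "TSS3", "chr2:50-400", "-", "-", "0", "0", "0", "OK"],
    ["CUFF.4", "-", "-", "CUFF.4", "-", "TSS4", "chr3:10-200", "-", "-", "7.1", "6", "8", "OK"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"


def read_tracking(handle):
    """Read a tab-separated tracking file into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def repeated_gene_names(rows):
    """Map each gene_short_name that occurs more than once to its entries."""
    by_name = defaultdict(list)
    for row in rows:
        name = row["gene_short_name"]
        if name != "-":  # assumption: "-" marks genes without a reference name
            by_name[name].append((row["tracking_id"], row["locus"], float(row["FPKM"])))
    return {name: entries for name, entries in by_name.items() if len(entries) > 1}


def count_zero_fpkm(rows):
    """Count entries whose reported FPKM is exactly 0."""
    return sum(1 for row in rows if float(row["FPKM"]) == 0.0)


rows = read_tracking(io.StringIO(SAMPLE))
repeated = repeated_gene_names(rows)
zero_count = count_zero_fpkm(rows)

for name, entries in repeated.items():
    print(name, entries)
print("entries with FPKM 0:", zero_count)
```

    On a real file, the io.StringIO(SAMPLE) argument would be replaced with an opened genes.fpkm_tracking handle. Printing the locus next to each repeated name shows whether the duplicate entries sit at different positions.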

    Any idea?

    Thanks and best regards!
