  • Sparx
    Junior Member
    • Oct 2010
    • 7

    A little help with cufflinks

    Hi!
    I hope these questions won't be too stupid, as I'm pretty new to the bioinformatics field. If they have already been answered, please point me in the right direction, as I haven't found the answers.
    The project I'm working on needs expression values for transcripts, genes or proteins, and I am supposed to look into high-throughput sequencing data, mainly RNA-Seq in this case.

    So I have set up bowtie->tophat->cufflinks to extract expression values for transcripts: I put a fastq file in at one end and get a couple of output files out at the other.

    I use the basic commands
    Code:
    $ tophat d_mel_genome_fb5_22/d_melanogaster_fb5_22 SRR034813.fastq
    and
    Code:
    $ cufflinks ../tophat_out/accepted_hits.bam
    1) Is this a good enough approach? Are there any flags that are commonly used to get better (more accurate) results? Is there anything I should think about in general?

    2) When I get my output files (genes.expr, transcripts.expr, transcripts.gtf), how do I map the internal cufflinks gene IDs to real IDs found at, for example, Ensembl or other more general sources?

    3) When I find, for example, an already processed .bam file in the GEO database and try to run it through cufflinks, it spits out mainly errors. I assume there is more than one type of .bam format. How do I convert it (if possible) to something cufflinks accepts?

    4) Does cufflinks accept other types of processed files, like wiggle or .bed files, and convert them to expression values for transcripts? Are there other programs that do, or do these files not contain that type of information? The main question here is whether there is a way to skip a couple of steps and lower the risk of making errors in the procedure.

    Any help is appreciated, and I hope my newbieness doesn't shine through all that much. Thanks
  • mgogol
    Senior Member
    • Mar 2008
    • 197

    #2
    If you give the cufflinks command a gtf file, it will calculate FPKMs for genes and transcripts with the names from the gtf file. If you don't, it tries to define the transcriptome itself. You can also do cuffcompare with an annotation gtf file and the transcripts.gtf file generated by cufflinks to see how different they are.
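
    As a rough sketch of the two options above, assuming a reference annotation saved as annotation.gtf (a hypothetical file name; use whatever annotation matches your genome build) and the TopHat output path from the first post. The -G flag of cufflinks and the -r flag of cuffcompare are the documented options for supplying a reference GTF, but check them against the manual for your installed version. The script only prints the commands, so you can inspect them before running anything.

```shell
# Hypothetical file names: adjust to your own setup.
ANNOTATION="annotation.gtf"             # assumed reference annotation GTF
BAM="../tophat_out/accepted_hits.bam"   # TopHat alignments from the first post

# Option 1: quantify against the reference annotation only (-G),
# so the FPKMs are reported with the gene/transcript names from the GTF.
QUANT_CMD="cufflinks -G $ANNOTATION $BAM"

# Option 2: after a run without -G, compare the assembled
# transcripts.gtf against the reference annotation (-r).
COMPARE_CMD="cuffcompare -r $ANNOTATION transcripts.gtf"

# Print the commands for inspection.
echo "$QUANT_CMD"
echo "$COMPARE_CMD"

# Uncomment to run them (requires cufflinks and cuffcompare on PATH):
# $QUANT_CMD
# $COMPARE_CMD
```

    Note that both cufflinks runs write their output into the current directory by default, so it is safer to run the -G quantification and the assembly without -G in separate directories.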


    • Sparx
      Junior Member
      • Oct 2010
      • 7

      #3
      Originally posted by mgogol
      If you give the cufflinks command a gtf file, it will calculate FPKMs for genes and transcripts with the names from the gtf file. If you don't, it tries to define the transcriptome itself. You can also do cuffcompare with an annotation gtf file and the transcripts.gtf file generated by cufflinks to see how different they are.
      Thanks for the info!

      On my 4th question: is there a pipeline that does what I want there? I haven't been able to find one, but maybe someone here who has also been interested in expression values for specific genes has.
