  • Cufflinks timing out - computing power required?

    I am analysing human transcriptome data (Illumina) via the Tophat -> Cufflinks pipeline (v2.0.2) using iGenomes references. My dataset comprises 14 patients and 6 controls, so I have 2 "conditions" to analyse with 14 and 6 biological replicates respectively.

    Until now I have been bypassing the full cufflinks protocol and just running cuffdiff providing a GTF, as follows:

    PHP Code:
    cuffdiff -p 8 -o ./cuffdiff_out -b genome.fa genes.gtf P1.bam,P2.bam,P3.bam,P4.bam,P5.bam,P6.bam,P7.bam,P8.bam,P9.bam,P10.bam,P11.bam,P12.bam,P13.bam,P14.bam C1.bam,C2.bam,C3.bam,C4.bam,C5.bam,C6.bam
    This operation runs across 8 cores of our server (4GB per core) in 11-12h.

    However, I have been trying to run the full cufflinks -> cuffmerge -> cuffdiff protocol (as per the Nature Protocols publication) but have not yet been able to complete the entire process. My IT support team have been very helpful, but the final cuffdiff job requires HUGE amounts of computing power and time. I wonder what other people's experience of this is, or whether I am doing something wrong.

    I have successfully run these operations:-

    Cufflinks for each BAM file:
    PHP Code:
    cufflinks -p 8 -o ./output_dir -b genome.fa -g genes.gtf P1.bam
    Then create the assemblies.txt file:
    PHP Code:
    ./path/to/P1.bam
    ./path/to/P2.bam
    ...
    etc 
    Cuffmerge (this took 1h):
    PHP Code:
    cuffmerge -p 8 -o ./cuffmerge_out -g genes.gtf -s genome.fa assemblies.txt
    Cuffdiff:
    PHP Code:
    cuffdiff -p 8 -o ./cuffdiff_out -b genome.fa -u merged.gtf P1.bam,P2.bam,P3.bam,P4.bam,P5.bam,P6.bam,P7.bam,P8.bam,P9.bam,P10.bam,P11.bam,P12.bam,P13.bam,P14.bam C1.bam,C2.bam,C3.bam,C4.bam,C5.bam,C6.bam
    The last time I tried to run the cuffdiff step I was allocated 160GB across 8 cores for 5 days. The job timed out at the "Testing for differential expression and regulation in locus" step. It also only ever used ~30GB of the 160GB allocated.
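
For anyone reproducing the steps above, the long comma-separated replicate lists can be built in a loop rather than typed by hand. The following is only a dry-run sketch: it prints the commands instead of running them. It assumes the BAM files are named P1.bam to P14.bam and C1.bam to C6.bam as in the commands above. The `join_bams` helper and the per-sample `./cufflinks_out/<sample>` output directories are illustrative names, not part of the original workflow.

```shell
#!/bin/sh
# Dry run: print the cufflinks/cuffdiff commands instead of executing them.

# Build "P1.bam,P2.bam,...": $1 = filename prefix, $2 = number of replicates.
# (join_bams is a hypothetical helper, not a Cufflinks tool.)
join_bams() {
    list=""
    i=1
    while [ "$i" -le "$2" ]; do
        list="${list}${1}${i}.bam,"
        i=$((i + 1))
    done
    printf '%s' "${list%,}"   # strip the trailing comma
}

patients=$(join_bams P 14)
controls=$(join_bams C 6)

# One cufflinks command per BAM file, each with its own output directory
# (the directory layout is an assumption made for this sketch).
for bam in $(printf '%s,%s' "$patients" "$controls" | tr ',' ' '); do
    echo "cufflinks -p 8 -o ./cufflinks_out/${bam%.bam} -b genome.fa -g genes.gtf $bam"
done

# Final cuffdiff command: one comma-separated list per condition.
cuffdiff_cmd="cuffdiff -p 8 -o ./cuffdiff_out -b genome.fa -u merged.gtf $patients $controls"
echo "$cuffdiff_cmd"
```

Once the printed commands look right, the `echo` wrappers can be dropped to run them for real.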

    Can anyone offer any advice or suggestions, or even let me know how much computing power and time they use for their runs?

    Much appreciated
    Helen

  • #2
    Is this an issue just with the newest version of cufflinks (v2.0.2), or did it also occur with older versions?

    • #3
      Hi hlwright,

      I am having the same problem. Could you please tell me how you solved it?

      Thanks!
