  • Tophat v2.0.4 memory usage

    I have a question about memory use by the new version of Tophat (v2.0.4). We ran the previous versions of Tophat on our cluster over 8 threads with 2 GB RAM per thread (16 GB total) with no problems. In fact, the systems admin thinks it was only actually using 10 GB despite 16 GB being available. Since we upgraded to the new version we have found that Tophat runs out of memory during the stage where the BAM files are merged. We've tried several solutions, including increasing RAM to 4 GB per thread over 8 threads, but the only solution which has worked is running over 4 threads and requesting 8 GB of memory per thread (i.e. 32 GB total). Whilst this solves our problem, it is a heavy memory request to make of the cluster for our jobs. We are also not convinced that Tophat is using all the memory we have requested.

    I wonder if anyone has experienced this kind of problem with the new version, or can offer any tips or suggestions which may help. Also, does anyone know why the new version of Tophat is so memory heavy, more so than the older versions?

    For info, the error message we were getting was:

    PHP Code:
            [FAILED]
    Error: [Errno 12] Cannot allocate memory
    Found 123034 junctions from happy spliced reads 
    Also for info, we are mapping 50bp single end Illumina reads to the human genome using the iGenomes reference files. The use of Bowtie1 or Bowtie2 within Tophat doesn't make any difference - both run out of memory with the same error message.

    Thanks, Helen
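
On the question of whether a job really uses the memory requested for it, one rough check is to run the command under a small wrapper and read back the peak resident memory that the operating system recorded. The sketch below is only an illustration, not something from the Tophat documentation: the helper name `peak_child_rss_mb` is hypothetical, and it assumes a Unix-like system where Python's `resource` module is available. Note that `RUSAGE_CHILDREN` reports the peak of the largest single child process that has been waited for, not the sum across all processes running at once, so for a pipeline that launches several helper programs it is likely to understate the total footprint.

```python
import resource
import subprocess
import sys


def peak_child_rss_mb(cmd):
    """Run cmd to completion; return (exit code, peak RSS in MB).

    The peak is taken from RUSAGE_CHILDREN, i.e. the largest resident
    set size among child processes that have been waited for so far.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Demo with a trivial child process; substitute the real command list.
    code, mb = peak_child_rss_mb([sys.executable, "-c", "x = bytearray(50_000_000)"])
    print(f"exit code {code}, peak child RSS about {mb:.1f} MB")
```

Comparing this figure with the scheduler's own accounting for the same job would show whether the two agree.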

  • #2
    I have also experienced this problem, as have others here:

    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


    I've "upgraded" back to 1.4.1 after seeing these issues myself. It appears that, with the -G option, Tophat 1.4.x uses the same strategy as Tophat 2: it maps to the transcriptome first, if one is available.
