  • rjsb
    Junior Member
    • Jan 2009
    • 1

    System requirements for a Linux computer for off-machine assembly/analysis

    Dear all,

    I (we) would like to assemble 454 reads (50-100 Mb, possibly 200-400 Mb) on a computer separate from the actual FLX sequencing machine. For this, Roche proposes a 64-bit dual-processor (dual x86 CPU) computer with 8 GB RAM running Linux (brochure, October 2008).

    1. Is this requirement still valid?
    2. Should I get a quad-core computer with 8 or 16 GB of memory?
    3. Are there other people running the Roche 454 software off-machine?

    thanks,

    Richard
  • cariaso
    Member
    • Jan 2008
    • 31

    #2
    Requirements seem to have changed

    BioTeam has recently been setting up a decent-sized off-rig analysis cluster for a client. The 454 software comes with a script, valTool.sh, which reports whether your rig is large enough. I was quite surprised to see that it wanted 16 GB on the master node, since we were following the same guidelines you mentioned. We had plenty of extra nodes, but all were configured with 8 GB of RAM. During initial testing on one of these 8 GB machines it was thrashing hard. Long before base calling ever finished we found some MPI environment variables which allowed the work to run across the cluster quite quickly, but I'd be very wary of a single-node analysis rig with only 8 GB.

    Other notable requirements include:
    • Master: Linux kernel >= 2.6.9-34, SMP, 64-bit
    • Disk space accessible from master: >= 1 TB available
    • Compute nodes: >= 4 GB RAM and the same CPU/arch/OS specs as the head/master node
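For anyone who wants a quick sanity check before installing anything, here is a minimal sketch that compares a host against the numbers quoted in this post (16 GB on the master, 4 GB on compute nodes, 1 TB of disk reachable from the master, 64-bit). It is not valTool.sh and makes no claim about what that script actually tests; the function names `check_host` and `probe_local` are made up for illustration, and treating "64 in the machine string" as the 64-bit test is an assumption.

```python
import os
import platform
import shutil

# Thresholds quoted in the post above. How valTool.sh itself checks them
# is unknown; this is only an illustrative stand-in.
MASTER_MIN_RAM_GB = 16
NODE_MIN_RAM_GB = 4
MASTER_MIN_DISK_TB = 1


def check_host(ram_gb, disk_free_tb, machine, role="master"):
    """Return a list of problems for one host; an empty list means it passes."""
    problems = []
    min_ram = MASTER_MIN_RAM_GB if role == "master" else NODE_MIN_RAM_GB
    if ram_gb < min_ram:
        problems.append(f"RAM {ram_gb:.1f} GB is below {min_ram} GB")
    # Only the master needs the large shared disk in the list above.
    if role == "master" and disk_free_tb < MASTER_MIN_DISK_TB:
        problems.append(f"free disk {disk_free_tb:.2f} TB is below {MASTER_MIN_DISK_TB} TB")
    # Assumption: a 64-bit build reports a machine string containing "64".
    if "64" not in machine:
        problems.append(f"machine '{machine}' does not look 64-bit")
    return problems


def probe_local(path="/"):
    """Read RAM, free disk and architecture of the local Linux host."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(path).free
    return ram_bytes / 1024**3, free_bytes / 1024**4, platform.machine()


if __name__ == "__main__":
    ram_gb, disk_tb, machine = probe_local()
    for line in check_host(ram_gb, disk_tb, machine) or ["looks OK for a master node"]:
        print(line)
```

Run it on the intended master as-is, or call `check_host(..., role="node")` with the figures for each compute node. The kernel version is left out on purpose, since comparing distribution kernel strings reliably needs more care than a short sketch allows.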



    Last edited by cariaso; 02-12-2009, 11:44 AM. Reason: base calling, not assembly


    • Tom Bair
      Member
      • Oct 2008
      • 28

      #3
      Last time I ran top while using gsMapper or gsAssembler, it was only using 1 core. The image analysis/base caller for Titanium is MPI/multi-core aware, but I don't think the other tools are, so the only thing that will help you there is additional memory. On the other hand, we are using an 8-core, 32 GB machine for image analysis/base calling, and it takes ~14 hours per full plate. You may want to take that into consideration if that is in your plans.

      Slow to post, so adding:

      I think cariaso is talking about base calling, not assembly. FLX is easy either way; it is just Titanium that taxes everything.
      Last edited by Tom Bair; 02-12-2009, 09:20 AM. Reason: Slow to post


      • cariaso
        Member
        • Jan 2008
        • 31

        #4
        True, I did mean base calling. Corrected. It seems I've been doing too many assemblies this week.


        • cdwan
          Junior Member
          • Apr 2009
          • 6

          #5
          runAssembly run times

          I didn't see any examples of run times for various sizes of assembly, so I thought I would post some here. Apologies if this isn't the right place.

          We're running Roche's "runAssembly" wrapper, version 2.0.00.20

          The interesting discovery that prompts this post is the "-large" flag. If you provide this flag to runAssembly, it "shortcuts some of the computationally expensive tasks" in the algorithm.

          Here are some runtimes, for single threads running on dedicated x86_64 Linux machines with 8 GB of RAM.

          1 data directory: 9.5M "seeds". 15 min; 9 min with -large flag.
          2 data directories: 14M "seeds". 31 min; 21 min with -large flag.
          3 data directories: 23M "seeds". 85 min; 21 min with -large flag.
          4 data directories: 31M "seeds". Still running; 30 min with -large flag.
          ...
          10 data directories: 78M "seeds". Killed; 42 min with -large flag.

          These are sequences from a prokaryote. Your mileage may vary.
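To make the effect of the flag easier to see, here is a small sketch that turns the runtimes listed above into speedup factors and minutes per million seeds. The numbers are copied from this post; the runs that never finished are stored as `None`, and the helper names `speedup` and `minutes_per_million_seeds` are just illustrative.

```python
# (data directories, seeds in millions, minutes without -large, minutes with -large)
# None marks a default run that did not finish ("still running" / "killed").
RUNS = [
    (1, 9.5, 15, 9),
    (2, 14, 31, 21),
    (3, 23, 85, 21),
    (4, 31, None, 30),
    (10, 78, None, 42),
]


def speedup(default_min, large_min):
    """Ratio of default runtime to -large runtime, or None if unknown."""
    if default_min is None:
        return None
    return default_min / large_min


def minutes_per_million_seeds(seeds_m, minutes):
    """Wall-clock minutes spent per million seeds."""
    return minutes / seeds_m


if __name__ == "__main__":
    for dirs, seeds_m, default_min, large_min in RUNS:
        s = speedup(default_min, large_min)
        s_text = f"{s:.2f}x" if s is not None else "n/a"
        rate = minutes_per_million_seeds(seeds_m, large_min)
        print(f"{dirs:>2} dirs, {seeds_m:>4}M seeds: speedup {s_text}, "
              f"{rate:.2f} min per M seeds with -large")
```

With only five data points from one prokaryote this is no basis for extrapolation, but it does show the gap widening as the input grows, while the cost per million seeds with the flag stays roughly flat or falls.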


          • erimar77
            Junior Member
            • Apr 2009
            • 2

            #6
            Originally posted by cdwan:
            10 data directories: 78M "seeds". killed. 42 min with LARGE flag.
            What do you mean by "killed"? Did the software fail? I've had the Newbler assembler fail with large amounts of data as well.


            • hlu
              Member
              • Jan 2009
              • 32

              #7
              De novo assembly of a large genome, 50 Mb to 100 Mb, is in the insect range, beyond fungal genomes.

              It requires lots of memory. A machine with 8 GB of memory is not enough.

              We are using a 4-core, 32 GB, 64-bit machine. It works for GS Assembly of fungal genomes, but insect assembly is tough. A fungal runAssembly of 1 run on this machine takes only 1 or 2 hours, but an insect assembly I did earlier on 35 FLX runs took about 10 days to finish.

              The -large flag for GS Assembly helps with speed. Still, I would prefer a beefy machine with huge memory, as much memory as possible.

              Assembly is a memory-hog computation.


              • cdwan
                Junior Member
                • Apr 2009
                • 6

                #8
                Originally posted by erimar77:
                What do you mean by "killed"? Did the software fail? I've had newbler assembler fail with large amounts of data as well.
                We have no idea whether it would have succeeded eventually or not. It seemed to be progressing - slowly - through the all vs. all comparison stage. We ran out of time to mess with it.
