  • qqcandy
    Member
    • Sep 2008
    • 15

    novoalign multi-threading or para-processing?

    I have a question about parallel processing with novoalign:

    If we have 1 node with 8 cores and 8 lanes of Illumina data to process, there are two ways to run it:

    1. assign 1 lane of data to each core, so that the 8 lanes are processed simultaneously and independently on the 8 cores;

    2. run the 8 lanes of data sequentially using multi-threaded novoalign (-c 8);

    Which one is faster, if there is a difference?

    Thanks in advance!
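
For concreteness, the two options can be sketched as a small Python driver. This is only an illustration: `-c` is the thread option mentioned above, while `-d` (index) and `-f` (reads file) are assumed here and should be checked against the novoalign documentation. The file names and all four helper functions are hypothetical.

```python
import subprocess
from typing import List


def per_lane_commands(index: str, lanes: List[str]) -> List[List[str]]:
    """Option 1: one single-threaded novoalign command per lane."""
    # -d and -f are assumed flags; verify them against your novoalign version.
    return [["novoalign", "-c", "1", "-d", index, "-f", lane] for lane in lanes]


def threaded_commands(index: str, lanes: List[str], threads: int = 8) -> List[List[str]]:
    """Option 2: one multi-threaded novoalign command per lane."""
    return [["novoalign", "-c", str(threads), "-d", index, "-f", lane] for lane in lanes]


def run_concurrently(commands: List[List[str]], outputs: List[str]) -> List[int]:
    """Start every command at once, each writing stdout to its own file."""
    started = []
    for cmd, out in zip(commands, outputs):
        handle = open(out, "w")
        started.append((subprocess.Popen(cmd, stdout=handle), handle))
    codes = []
    for proc, handle in started:
        codes.append(proc.wait())  # block until this process has finished
        handle.close()
    return codes


def run_sequentially(commands: List[List[str]], outputs: List[str]) -> List[int]:
    """Run the commands one after another, each writing stdout to its own file."""
    codes = []
    for cmd, out in zip(commands, outputs):
        with open(out, "w") as handle:
            codes.append(subprocess.run(cmd, stdout=handle).returncode)
    return codes
```

Option 1 would pair `per_lane_commands` with `run_concurrently`; Option 2 would pair `threaded_commands` with `run_sequentially`. Either way there is one output file per lane.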
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    Use the threading. I have found that it works well, and it lets you avoid a merging step.

    • Torst
      Senior Member
      • Apr 2008
      • 275

      #3
      Originally posted by qqcandy
      1. assign 1 lane of data to each core, so that the 8 lanes are processed simultaneously and independently on the 8 cores;
      2. run the 8 lanes of data sequentially using multi-threaded novoalign (-c 8);
      Which one is faster, if there is a difference?
      I would choose Option 2. You will have 8 cores processing a single data set. This will minimize RAM and disk I/O. Option 1 would require 8x as much RAM and cause lots of disk I/O switching between inputs.

      • qqcandy
        Member
        • Sep 2008
        • 15

        #4
        Thanks a lot! I think we have a consensus for Option 2.

        • sparks
          Senior Member
          • Mar 2008
          • 126

          #5
          Just to clarify: Option 1 wouldn't use 8x the RAM, because the index sits in a shared memory segment that is common to all instances of Novoalign. Performance would be similar.
          Colin

          • lh3
            Senior Member
            • Feb 2008
            • 686

            #6
            @sparks

            Is that a feature of novoalign or a feature of Linux? On LSF, we have to request enough memory for each job; otherwise LSF will kill the job for exceeding its memory limit.

            • sparks
              Senior Member
              • Mar 2008
              • 126

              #7
              Linux allows a file to be mapped into memory with the mmap() function, and multiple programs can mmap() the same file and share it. Novoalign mmap()s the index, so if multiple copies of Novoalign are running against the same index, only one copy of the index is held in memory.
              I don't know about LSF.
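
A minimal sketch of that sharing, using Python's `mmap` module rather than anything from Novoalign itself. The file name `toy.index` and the helper functions are hypothetical. Novoalign presumably only reads its index; the write below is there purely to make the sharing visible, since bytes written through one mapping show up in the other mapping of the same file.

```python
import mmap
import os
import tempfile


def make_index(path: str, size: int = 4096) -> None:
    """Create a zero-filled stand-in for an index file."""
    with open(path, "wb") as fh:
        fh.write(b"\0" * size)


def map_index(path: str):
    """Map the whole file; the mapping is shared by default."""
    fh = open(path, "r+b")
    return fh, mmap.mmap(fh.fileno(), 0)


def demo() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.index")
        make_index(path)
        # Two independent mappings of one file, standing in for two programs.
        fh_a, view_a = map_index(path)
        fh_b, view_b = map_index(path)
        try:
            view_a[0:4] = b"ACGT"        # write through the first mapping
            return bytes(view_b[0:4])    # read it back through the second
        finally:
            view_a.close()
            view_b.close()
            fh_a.close()
            fh_b.close()
```

Calling `demo()` returns the bytes as seen through the second mapping. Two separate processes mapping the same file would be backed by the same pages in the same way.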

              • nilshomer
                Nils Homer
                • Nov 2008
                • 1283

                #8
                Also check out shm.h, which allows processes to share memory by attaching a shared memory segment to their own address space.
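
As a rough analogue only: Python's `multiprocessing.shared_memory` (Python 3.8+) exposes named shared memory segments. It is not the System V interface declared in shm.h, but the attach-by-name pattern is similar. The function `share_bytes` is hypothetical, and for brevity both handles live in one process here; a second process would attach using the same name.

```python
from multiprocessing import shared_memory


def share_bytes(payload: bytes) -> bytes:
    """Write payload into a new segment, then read it via a second handle."""
    # payload must be non-empty: a segment cannot have size 0.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload
        # Attach to the existing segment by name, as another process would.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # release the segment once nobody needs it
```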
