
  • novoalign multi-threading or para-processing?

    I have a question about para-processing of novoalign:

    If we have 1 node with 8 cores and 8 lanes of Illumina data to process, there are two ways to run it:

    1. assign 1 lane of data to each node, so that 8 lanes will be processed simultaneously on 8 nodes independently;

    2. run the 8 lanes of data sequentially using the multi-threading novoalign (-c 8);

    Which one is faster, if there is a difference?

    Thanks in advance !
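
For readers who want to try both approaches, here is a minimal sketch of how the two options might be scripted. The file names (`genome.nix`, `lane1.fq`, and so on) are hypothetical, and the `-d`, `-f`, and `-o SAM` options are assumed from Novoalign's usual command line; only `-c 8` comes from this thread. Check the flags against your installed version before relying on them.

```python
import subprocess

def lane_command(index, reads, threads):
    """Build one Novoalign command line (flags other than -c are assumptions)."""
    return ["novoalign", "-d", index, "-f", reads, "-c", str(threads), "-o", "SAM"]

def run_parallel(index, lanes):
    """Option 1: one single-threaded process per lane, all started at once."""
    running = []
    for reads in lanes:
        out = open(reads + ".sam", "w")
        proc = subprocess.Popen(lane_command(index, reads, 1), stdout=out)
        running.append((proc, out))
    codes = []
    for proc, out in running:
        codes.append(proc.wait())
        out.close()
    return codes

def run_sequential(index, lanes, threads=8):
    """Option 2: one lane at a time, each using all the threads."""
    codes = []
    for reads in lanes:
        with open(reads + ".sam", "w") as out:
            result = subprocess.run(lane_command(index, reads, threads), stdout=out)
            codes.append(result.returncode)
    return codes

# Hypothetical input names for the 8 lanes.
lanes = [f"lane{i}.fq" for i in range(1, 9)]
```

Timing `run_parallel("genome.nix", lanes)` against `run_sequential("genome.nix", lanes)` on the actual hardware is the most direct way to answer the question for a given data set.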

  • #2
    Use the threading. I have found that it works well, and it lets you avoid a merging step.



    • #3
      Originally posted by qqcandy
      1. assign 1 lane of data to each node, so that 8 lanes will be processed simultaneously on 8 nodes independently;
      2. run the 8 lanes of data sequentially using the multi-threading novoalign (-c 8);
      Which one is faster, if there is a difference?
      I would choose Option 2. You will have 8 cores processing a single data set. This will minimize RAM and disk I/O. Option 1 would require 8x as much RAM and cause lots of disk I/O switching between inputs.



      • #4
        Thanks a lot! I think we've got a consensus for option 2.



        • #5
          Just to clarify: option 1 wouldn't use 8x the RAM, because the index is held in a shared memory segment that is common to all instances of Novoalign. Performance would be similar.
          Colin



          • #6
            @sparks

            Is it a feature of novoalign, or a feature of Linux? On LSF, we have to request enough memory for each job; otherwise LSF will kill the job due to memory limit.



            • #7
              Linux allows files to be mapped into memory with the mmap() function, and multiple programs can mmap() the same file and share it. Novoalign mmap()s the index, so if multiple copies of Novoalign are running and using the same index, only one copy of the index is in memory.
              I don't know about LSF.
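
As a rough illustration of the file-mapping behaviour described above (not Novoalign's own code), the sketch below maps one file twice using Python's `mmap` module. The temporary `.idx` file is a hypothetical stand-in for an index. Both mappings are shared and file-backed, so a write through one is visible through the other without re-reading the file; separate processes mapping the same file share the cached pages in the same way.

```python
import mmap
import os
import tempfile

# Hypothetical stand-in for an index file; a real index is far larger.
fd, index_path = tempfile.mkstemp(suffix=".idx")
os.write(fd, b"ACGT" * 1024)
os.close(fd)

def map_file(path, writable=False):
    """Map the whole file as a shared, file-backed mapping."""
    mode = "r+b" if writable else "rb"
    access = mmap.ACCESS_WRITE if writable else mmap.ACCESS_READ
    with open(path, mode) as f:
        # Length 0 maps the entire file; the mapping outlives the file object.
        return mmap.mmap(f.fileno(), 0, access=access)

writer = map_file(index_path, writable=True)
reader = map_file(index_path)

# Change the first four bytes through one mapping ...
writer[0:4] = b"TTTT"
# ... and read them back through the other mapping.
seen = bytes(reader[0:4])

writer.close()
reader.close()
os.remove(index_path)
```

Whether a batch scheduler counts those shared pages once or once per job depends on how it accounts for memory, which this sketch does not address.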



              • #8
                Also check out shm.h, which allows processes to share memory by attaching a shared memory segment to their own address space.
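
The attach-a-segment idea can be sketched in Python as well. Note the substitution: this uses `multiprocessing.shared_memory` (Python 3.8+), which is built on POSIX shared memory on Unix and not on the System V calls declared in shm.h, so it illustrates the concept only. The segment size and contents are made up for the example.

```python
from multiprocessing import shared_memory

# Create a small named segment and put some bytes in it.
segment = shared_memory.SharedMemory(create=True, size=16)
segment.buf[:4] = b"ACGT"

# A second handle attaches to the same segment by name, which is what
# another process would do if it were given segment.name.
attached = shared_memory.SharedMemory(name=segment.name)
seen = bytes(attached.buf[:4])

# Detach both handles, then remove the segment itself.
attached.close()
segment.close()
segment.unlink()
```

In a real setup the name would be passed to a separate process, which would attach with `SharedMemory(name=...)` in the same way.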
