  • novoalign multi-threading or para-processing?

    I have a question about parallel processing with novoalign:

    If we have 1 node with 8 cores and 8 lanes of Illumina data to process, there are two ways to run it:

    1. assign 1 lane of data to each core, so that the 8 lanes are processed simultaneously and independently on the 8 cores;

    2. run the 8 lanes of data sequentially using multi-threaded novoalign (-c 8);

    Which one is faster, if there is a difference?

    Thanks in advance!

  • #2
    Use the threading. I have found that it works well, and you also avoid a merging step.



    • #3
      Originally posted by qqcandy
      1. assign 1 lane of data to each core, so that the 8 lanes are processed simultaneously and independently on the 8 cores;
      2. run the 8 lanes of data sequentially using multi-threaded novoalign (-c 8);
      Which one is faster, if there is a difference?
      I would choose Option 2. You will have 8 cores processing a single data set. This will minimize RAM and disk I/O. Option 1 would require 8x as much RAM and cause lots of disk I/O switching between inputs.



      • #4
        Thanks a lot! I think we've got a consensus for option 2.



        • #5
          Just to clarify: option 1 wouldn't use 8x the RAM, as the index is in a shared memory segment and common to all instances of Novoalign. Performance would be similar.
          Colin
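
For reference, the two options discussed above can be written out as command lines. This is only a sketch: the `-c` thread option comes from the original question, while the `-d` (index), `-f` (reads) and `-o SAM` options, the file names, and both function names are assumptions made for illustration. The code builds the commands and prints them; it does not run novoalign.

```python
def option1_commands(index, lanes):
    """Option 1: one single-threaded novoalign process per lane, all started at once."""
    return [
        ["novoalign", "-c", "1", "-d", index, "-f", lane, "-o", "SAM"]
        for lane in lanes
    ]


def option2_commands(index, lanes, cores=8):
    """Option 2: one multi-threaded novoalign process per lane, run one after another."""
    return [
        ["novoalign", "-c", str(cores), "-d", index, "-f", lane, "-o", "SAM"]
        for lane in lanes
    ]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the shape of the commands.
    index = "genome.ndx"
    lanes = [f"lane{i}.fastq" for i in range(1, 9)]

    print("Option 1 (run all 8 concurrently):")
    for cmd in option1_commands(index, lanes):
        print("  " + " ".join(cmd))

    print("Option 2 (run one at a time):")
    for cmd in option2_commands(index, lanes):
        print("  " + " ".join(cmd))
```

Either way every command points at the same index file, which is why the memory cost of option 1 depends on whether the index is shared between processes.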



          • #6
            @sparks

            Is that a feature of novoalign or a feature of Linux? On LSF, we have to request enough memory for each job; otherwise LSF kills the job for exceeding its memory limit.



            • #7
              Linux allows files to be mapped into memory with the mmap() function, and multiple programs can mmap() and share the same file. Novoalign mmap()s the index, so if multiple copies of Novoalign are running and using the same index, only one copy of the index is in memory.
              I don't know about LSF.
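
The mmap() behaviour described above can be tried out with a small script. This is a minimal sketch, not Novoalign's actual code: the "index" is a throwaway temporary file, and `make_fake_index`, `map_in_children` and the 16-byte header are hypothetical names and values chosen for the example. Several independent processes map the same file read-only; the kernel can back all of those mappings with the same cached pages, although the script itself only shows that each process sees the same data and does not measure memory use.

```python
import mmap
import os
import subprocess
import sys
import tempfile

# Script run by each child process: map the file read-only and print its first 16 bytes.
CHILD = r"""
import mmap, sys
with open(sys.argv[1], "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        sys.stdout.write(m[:16].decode("ascii"))
"""

HEADER = b"FAKEINDEX0000001"  # 16 bytes, hypothetical content


def make_fake_index(size=1 << 20):
    """Write a stand-in for an index file (header plus zero padding) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".idx")
    with os.fdopen(fd, "wb") as f:
        f.write(HEADER + bytes(size - len(HEADER)))
    return path


def map_in_children(path, n_children=4):
    """Start n_children processes that each mmap() the same file; return what each one read."""
    procs = [
        subprocess.Popen([sys.executable, "-c", CHILD, path], stdout=subprocess.PIPE)
        for _ in range(n_children)
    ]
    return [p.communicate()[0].decode("ascii") for p in procs]


if __name__ == "__main__":
    path = make_fake_index()
    try:
        print(map_in_children(path))
    finally:
        os.remove(path)
```

Because the mapping is file-backed and read-only, no process needs a private copy of the data, which is the property the post relies on.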



              • #8
                Also check out shm.h, which allows processes to share memory by attaching a shared memory segment to their own address space.
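
A rough illustration of attaching to a named shared memory segment follows. Note the substitution: shm.h is the C interface, whereas this sketch uses Python's `multiprocessing.shared_memory` module (Python 3.8+), which on Unix is built on POSIX shared memory rather than the System V calls. The function name `share_bytes` is made up for the example, and for simplicity both handles live in one process; a second process would attach in the same way using the segment's name.

```python
from multiprocessing import shared_memory


def share_bytes(payload):
    """Write payload through one handle and read it back through a second
    handle attached to the same segment by name. payload must be non-empty."""
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload
        # Attach to the existing segment by name, as another process would.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # remove the segment once nobody needs it


if __name__ == "__main__":
    print(share_bytes(b"shared index data"))
```

The data is written once and is visible through every handle attached to the segment, without being copied per handle.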
