  • issue changing predetermined K values on SOAPdenovo2

    I am currently using SOAPdenovo2 on a supercomputer with the SLURM queuing system to perform a de novo assembly from paired-end FASTQ files of genomic DNA reads.

    When I use SOAPdenovo-63mer or SOAPdenovo-127mer, I don't have any problems: assemblies with k=63 and k=127 each finish in a little over 16 hours on a node with 64 threads and 240 GB of memory, when I submit the following script.sh to the queue:

    #!/bin/sh
    #SBATCH --nodes=1
    #SBATCH --ntasks=64
    #SBATCH --mem=240000
    #SBATCH --time=3-00:00:00
    #SBATCH -e error_log.txt
    #SBATCH -o output_log.txt

    module load soapdenovo2
    SOAPdenovo-63mer all -s config_file.txt -o assemblies/k63_ -R -p 64
    SOAPdenovo-127mer all -s config_file.txt -o assemblies/k127_ -R -p 64

    The trouble starts when I try a k value other than the predetermined k=63 and k=127 using the -K parameter; for example, if I try to run an assembly with k=89 with this command:

    SOAPdenovo-127mer all -s config_file.txt -K89 -o assemblies/k89_ -R -p 64
    The execution fails, and when I check error_log.txt I get this line:

    slurmstepd: error: Detected 1 oom-kill event(s) in StepId=2323585.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
    So, I guess this is a memory issue... but this is not happening with the predefined values of k=63 and k=127... Why does SOAPdenovo2 increase the memory requirements when I use other k values, and how can I overcome this issue?
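
The choice between the two binaries for an arbitrary k can be sketched as a small shell helper. This is only an illustration, assuming the limits given in the SOAPdenovo2 usage text (k between 13 and 63 for SOAPdenovo-63mer, up to 127 for SOAPdenovo-127mer, and k odd); soap_binary_for_k is a hypothetical name, not part of SOAPdenovo2.

```shell
#!/bin/sh
# soap_binary_for_k K
# Hypothetical helper: prints the SOAPdenovo2 binary that accepts the given k.
# Assumes the documented ranges: 13..63 for the 63mer build, up to 127 for the 127mer build.
soap_binary_for_k() {
    k=$1
    case $k in
        ''|*[!0-9]*) echo "k must be a positive integer: $k" >&2; return 1 ;;
    esac
    if [ $((k % 2)) -eq 0 ]; then
        echo "k must be odd: $k" >&2
        return 1
    fi
    if [ "$k" -lt 13 ] || [ "$k" -gt 127 ]; then
        echo "k out of range (13..127): $k" >&2
        return 1
    fi
    if [ "$k" -le 63 ]; then
        echo "SOAPdenovo-63mer"
    else
        echo "SOAPdenovo-127mer"
    fi
}

# Possible use inside the job script (flags copied from the post above):
# for k in 63 89 127; do
#     "$(soap_binary_for_k "$k")" all -s config_file.txt -K "$k" -o "assemblies/k${k}_" -R -p 64
# done
```

For k=89 the helper prints SOAPdenovo-127mer, which matches the command used in the post.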

  • #2
    ampsevilla it is likely that the out-of-memory (OOM) error is due to SOAPdenovo2 requiring more memory to assemble the genome with a larger k-mer size.

    When you choose a k-mer size of 89, the memory requirements for the assembly process are increased, which is causing the OOM error. This is because increasing the k-mer size also increases the complexity of the assembly, which requires more memory to store the assembly graph and related data structures.

    To overcome this issue, you can try increasing the amount of memory allocated to the job in the SLURM script. You can also try reducing the number of threads used in the assembly process. This may help reduce the memory requirements for the assembly and avoid the OOM error.

    Additionally, you can try reducing the size of the input data by filtering out low-quality reads or using a subset of the data for the assembly. This may also help reduce the memory requirements for the assembly process.

    Finally, you can consider using a different de novo assembly tool that is better suited for larger k-mer sizes and has lower memory requirements. Some popular alternatives to SOAPdenovo2 include SPAdes, ABySS, and IDBA-UD.

    • #3
      GenomicSeq first of all, I sincerely appreciate your quick response.

      Originally posted by GenomicSeq
      ampsevilla it is likely that the out-of-memory (OOM) error is due to SOAPdenovo2 requiring more memory to assemble the genome with a larger k-mer size.

      When you choose a k-mer size of 89, the memory requirements for the assembly process are increased, which is causing the OOM error. This is because increasing the k-mer size also increases the complexity of the assembly, which requires more memory to store the assembly graph and related data structures.
      I don't understand why this is happening, because with the predetermined k=127 SOAPdenovo2 works perfectly, and k=89 is considerably smaller.

      I'll definitely try reducing the number of threads as you suggest; maybe it will help. Unfortunately, I can't reduce the size of the input data because the reads are already filtered. The problem is that the genome we want to assemble is very large and complex.

      We are also trying other tools such as SPAdes and ABySS, but we have had some trouble with them too. We'll try IDBA-UD. Thank you so much for the advice!

      • #4
        ampsevilla that is odd...

        Now I'm wondering if it's something else. Let me know what you find and if you're able to fix it!

        • #5
          GenomicSeq I've tried reducing the number of threads and running only the pregraph step instead of all, and I gave the job 247 GB of memory and a 3-day time limit, but I still got the same error message:
          Some of your processes may have been killed by the cgroup out-of-memory handler.
          I'm stuck with this issue.
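
When the same OOM message persists regardless of settings, one thing worth comparing is the memory the scheduler recorded as requested against the peak the job actually reached before it was killed. The sketch below assumes job accounting is enabled so that Slurm's sacct command returns ReqMem and MaxRSS, and that both are reported as whole numbers with a K, M, G or T suffix; to_mb is a hypothetical helper for putting the two values in the same unit.

```shell
#!/bin/sh
# to_mb VALUE
# Hypothetical helper: converts a Slurm-style size such as 10485760K, 240000M or 10G
# to whole megabytes. Assumes integer values; decimal values are not handled.
to_mb() {
    value=${1%[nc]}            # older Slurm versions append n (per node) or c (per core) to ReqMem
    number=${value%[KMGT]}
    case $value in
        *K) echo $((number / 1024)) ;;
        *M) echo "$number" ;;
        *G) echo $((number * 1024)) ;;
        *T) echo $((number * 1024 * 1024)) ;;
        *)  echo "unrecognised size: $1" >&2; return 1 ;;
    esac
}

# Possible use on a login node, with the job ID taken from the error message above:
# sacct -j 2323585 --format=JobID,ReqMem,MaxRSS,State --parsable2 --noheader |
# while IFS='|' read -r jobid reqmem maxrss state; do
#     [ -n "$maxrss" ] || continue
#     echo "$jobid requested $(to_mb "$reqmem") MB, peaked at $(to_mb "$maxrss") MB ($state)"
# done
```

A peak far below the requested amount on a job that was still killed for running out of memory would suggest that the limit actually enforced differs from the one requested.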

          • #6
            ampsevilla sorry, I wish I had some more advice to give. I'm a little lost. I'll try and ask some friends that are more savvy with this kind of work and get back to you once I hear their opinions.

            • #7
              GenomicSeq It finally worked: recent cluster configuration changes were limiting the available memory to below the limit I had specified. Thank you so much for your assistance! 😄

              • #8
                ampsevilla that's great! So what exactly did you have to change? I wish I could have been more help on this.

                • #9
                  Originally posted by GenomicSeq
                  ampsevilla that's great! So what exactly did you have to change? I wish I could have been more help on this.
                  Same code; the problem was related to the header. #SBATCH --mem 240000 should give me 240 GB of RAM, but for reasons related to a cluster reconfiguration, the memory limit for all jobs had been temporarily capped at 10 GB, and it was driving me crazy.

                  Anyway, you have been helpful and I really appreciate it. Thank you so much!
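
A job script can guard against this kind of mismatch by checking the enforced limit before starting a long assembly. The following is a minimal sketch, not something from the thread: mem_limit_ok is a hypothetical helper, and the file that holds the limit is an assumption that varies between clusters (cgroup v1 exposes memory.limit_in_bytes, cgroup v2 exposes memory.max, and the directory depends on how Slurm is configured).

```shell
#!/bin/sh
# mem_limit_ok REQUIRED_MB LIMIT_FILE
# Hypothetical helper: returns 0 if the limit stored in LIMIT_FILE covers REQUIRED_MB.
# LIMIT_FILE is expected to contain a byte count, or the word "max" (cgroup v2, unlimited).
# Returns 1 if the limit is too low and 2 if the file cannot be read.
mem_limit_ok() {
    required_mb=$1
    limit_file=$2
    limit=$(cat "$limit_file" 2>/dev/null) || return 2
    if [ "$limit" = "max" ]; then
        return 0
    fi
    limit_mb=$((limit / 1024 / 1024))
    [ "$limit_mb" -ge "$required_mb" ]
}

# Possible use at the top of script.sh; the path below is a placeholder, not a real location:
# if ! mem_limit_ok 240000 /path/to/job/cgroup/memory.max; then
#     echo "Enforced memory limit is below the 240000 MB requested; aborting." >&2
#     exit 1
# fi
```

With a limit file containing the 10 GB cap described above, the check fails immediately instead of after hours of assembly.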
