  • Subread segmentation faults on Scientific Linux

    Hi

    I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With large FASTQ files (>35M reads), Subread crashes partway through with a segmentation fault.

    It seems to be memory-related, as smaller files work fine. However, increasing the memory available to the process to 31 GB makes no difference.

    Has anyone seen this before?

  • #2
    Memory does not sound like the culprit here. Are you hitting some other limit, such as a storage quota, temporary space, or the wall time assigned to the job?
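
One way to see the limits a job actually runs under is to print them from inside the job script itself. The following is only a minimal sketch, assuming a POSIX-style shell on the compute node; the function name `report_job_limits` is made up for illustration and is not part of Subread or any scheduler.

```shell
# Sketch: print the resource limits and scratch space visible to the current job.
# report_job_limits is a hypothetical helper name, not an existing tool.
report_job_limits() {
    # Use the first argument if given, otherwise $TMPDIR, otherwise /tmp.
    scratch_dir="${1:-${TMPDIR:-/tmp}}"

    echo "== per-process limits (ulimit -a) =="
    # Shell builtin: lists stack size, virtual memory, open files, and so on.
    ulimit -a

    echo "== free space in $scratch_dir =="
    # Free space on the file system that holds the scratch directory.
    df -h "$scratch_dir"
}

# Example call from inside a job script:
# report_job_limits "$TMPDIR"
```

Running this at the top of the job script records the limits in the job log, so they can be compared between a run that works and one that crashes.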



    • #3
      Unfortunately not. The disk space assigned to the node is 10 TB, the walltime is 5 days (the job typically takes 2-3 hours), and the temp space is nominally "unlimited", although in practice it is about 4 TB.



      • #4
        Dr. Shi (the author of Subread) participates here, so we may hear something enlightening from him. I would have thought that once the genome index has been read into memory, the memory requirement is more or less satisfied, unless Subread works differently from other aligners (I don't use Subread).



        • #5
          Originally posted by abeggs
          Hi

          I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With large FASTQ files (>35M reads), Subread crashes partway through with a segmentation fault.

          It seems to be memory-related, as smaller files work fine. However, increasing the memory available to the process to 31 GB makes no difference.

          Has anyone seen this before?

          Could you please provide the screen output and also your commands? Subread has no problem processing more than 35 million reads.



          • #6
            Hi,

            The command I use for subjunc is:
            subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
            And the output is:

            SUBREAD v1.5.0-p1

            //============================= subjunc setting ==============================\\
            || ||
            || Function : Read alignment + Junction/Fusion detection (RNA-Seq) ||
            || Threads : 8 ||
            || Input file 1 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Input file 2 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Output file : /gpfs/projects/s-beggsa01/P141115-N-DW-28-2673271 ... ||
            || Index name : /scratch/beggsa_909907.bb2torque.bb2.cluster/hg19 ||
            || Phred offset : 33 ||
            || ||
            || All subreads : 14 ||
            || Min read1 votes : 1 ||
            || Min read2 votes : 1 ||
            || Max fragment size : 600 ||
            || Min fragment size : 50 ||
            || ||
            || Allowed mismatch : 3 bases ||
            || Max indels : 5 ||
            || # of Best mapping : 1 ||
            || Unique mapping : no ||
            || Hamming distance : no ||
            || Quality scores : no ||
            || ||
            \\===================== http://subread.sourceforge.net/ ======================//

            //====================== Running (31-Mar-2016 15:30:00) ======================\\
            || ||
            || The input file contains base space reads. ||
            || The range of Phred scores observed in the data is [2,36] ||
            || Load the 1-th index block... ||
            || Map fragments... ||
            || 0% completed, 0.3 mins elapsed, rate=3.7k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 5% completed, 11 mins elapsed, rate=3.7k fragments per second ||
            || 5% completed, 11 mins elapsed, rate=3.8k fragments per second ||
            || 6% completed, 11 mins elapsed, rate=3.9k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.0k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.1k fragments per second ||
            || 7% completed, 12 mins elapsed, rate=4.2k fragments per second ||
            || 7% completed, 13 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 7% completed, 13 mins elapsed, rate=4.4k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 25 mins elapsed, rate=4.1k fragments per second ||
            || 14% completed, 25 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 15% completed, 26 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 15% completed, 27 mins elapsed, rate=4.3k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 20% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Map fragments... ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            ./SubReadRNAPipeline-highmem: line 43: 19380 Segmentation fault subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
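
The subjunc call above expands `$1`, `$2` and `$TMPDIR` unquoted, and the pipeline script does not record the exit status. A hedged sketch of the same call with quoted variables and an explicit status check follows. The subjunc flags are copied unchanged from the command in this post; the function name `run_subjunc` and the `DRY_RUN` switch are hypothetical additions for illustration. A shell reports a process killed by a segmentation fault (signal 11) as exit status 139, that is, 128 + 11.

```shell
# Sketch: same subjunc invocation as in the post, with quoted arguments.
# run_subjunc and DRY_RUN are hypothetical names used only for this example.
run_subjunc() {
    outdir="$1"    # output directory ($1 in the original script)
    sample="$2"    # sample prefix   ($2 in the original script)
    scratch="$3"   # scratch directory ($TMPDIR in the original script)

    # Build the argument list once so it can be printed or executed.
    set -- subjunc \
        -i "$scratch/hg19" \
        -r "$scratch/${sample}_R1.fastq.gz" \
        -R "$scratch/${sample}_R2.fastq.gz" \
        -o "$outdir/${sample}.bam" \
        -T 8 --gzFASTQinput --allJunctions

    # With DRY_RUN=1, print one argument per line instead of running anything.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$@"
        return 0
    fi

    status=0
    "$@" || status=$?
    if [ "$status" -eq 139 ]; then
        echo "subjunc was killed by SIGSEGV (exit status 139)" >&2
    fi
    return "$status"
}

# Example call from the pipeline script:
# run_subjunc "$1" "$2" "$TMPDIR"
```

Quoting does not address the crash itself; it only rules out path-expansion surprises and makes the failure mode explicit in the job log.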



            • #7
              Thanks for providing the info. Could you also send us the FASTQ files so that we can reproduce the problem and find out what went wrong?
              Best,
              Wei



              • #8
                I have solved it!

                I recompiled Subread from source instead of using the precompiled binaries, and it worked fine.

                Our HPC cluster runs Scientific Linux, and something about that environment seems to cause problems with the precompiled binaries.
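
For anyone wanting to try the same fix, a rebuild on the cluster might look roughly like the sketch below. It rests on assumptions that are not stated in this thread: that the source tarball unpacks into a single top-level directory containing `src/Makefile.Linux`, and that the build places the executables in a `bin` directory, which is how I understand the Subread source package to be laid out. Check the README shipped with your version. The function name `build_subread` and the file names in the usage comment are placeholders.

```shell
# Sketch: unpack a Subread source tarball and build it on the local machine.
# build_subread is a hypothetical helper; the layout (src/Makefile.Linux, bin/)
# is an assumption about the source package, not something confirmed in the thread.
build_subread() {
    tarball="$1"   # path to the source tarball (placeholder name)
    workdir="$2"   # directory to unpack and build in

    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi

    mkdir -p "$workdir" || return 1
    # Drop the top-level directory so that $workdir/src exists directly (GNU tar).
    tar -xzf "$tarball" -C "$workdir" --strip-components=1 || return 1

    # Compile with the Linux makefile, in a subshell so the caller's cwd is unchanged.
    ( cd "$workdir/src" && make -f Makefile.Linux ) || return 1

    # Report where the freshly built aligner is expected to be.
    echo "$workdir/bin/subjunc"
}

# Example (file and directory names are placeholders):
# build_subread "$HOME/subread-source.tar.gz" "$HOME/subread-build"
```

Building on the cluster links the program against the system's own compiler and libraries, which may be why the locally compiled binary behaved differently from the precompiled one.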
