  • Subread segmentation faults on Scientific Linux

    Hi

    I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With big FASTQ files (>35M reads), Subread falls over halfway through with a segmentation fault.

    It seems to be memory related, as smaller files work fine. Increasing the memory available to the process to 31 GB makes no difference.

    Has anyone seen this before?

  • #2
    Memory does not sound like the culprit here. Are you running into some other limit, say storage (quota), tmp space, or the time assigned to the job?
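    One way to rule these limits out is to print what the job itself actually sees on the compute node. The following is a minimal sketch, assuming a POSIX-style shell; the function name `report_limits` is made up for illustration, and site-specific tools such as `quota` are left out because their availability varies.

```shell
# Print the per-process limits and scratch space visible to the current job.
# (report_limits is a hypothetical helper name, not part of Subread.)
report_limits() {
    scratch="${TMPDIR:-/tmp}"
    echo "max virtual memory (kB): $(ulimit -v)"
    echo "stack size (kB):         $(ulimit -s)"
    echo "core file size:          $(ulimit -c)"
    echo "scratch dir:             $scratch"
    # -P forces the portable one-line-per-filesystem format; column 4 is free space.
    df -Pk "$scratch" | awk 'NR==2 {print "scratch free (kB):       " $4}'
}
```

    Calling `report_limits` at the top of the job script, before the aligner starts, records the values in the job log so they can be compared between runs that succeed and runs that crash.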

    • #3
      Unfortunately not. The disk space assigned to the node is 10 TB, the walltime is 5 days (the job typically takes 2-3 hours), and the temp space is "unlimited", although in practice it is about 4 TB.

      • #4
        Dr. Shi (author of Subread) participates here, and we may hear something enlightening from him. I would have thought that once the genome index is read into memory, the memory requirement should be more or less satisfied, unless Subread works differently from other aligners (I don't use Subread).

        • #5
          Originally posted by abeggs:
          Hi

          I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With big FASTQ files (>35M reads), Subread falls over halfway through with a segmentation fault.

          It seems to be memory related, as smaller files work fine. Increasing the memory available to the process to 31 GB makes no difference.

          Has anyone seen this before?

          Could you please provide the screen output and also your commands? Subread has no problem processing more than 35 million reads.

          • #6
            Hi,

            The command I use for subjunc is:
            subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
            And the output is:

            ==========     _____ _    _ ____  _____  ______          _____
            =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
              =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
                ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
                  ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
            ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
            v1.5.0-p1

            //============================= subjunc setting ==============================\\
            || ||
            || Function : Read alignment + Junction/Fusion detection (RNA-Seq) ||
            || Threads : 8 ||
            || Input file 1 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Input file 2 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Output file : /gpfs/projects/s-beggsa01/P141115-N-DW-28-2673271 ... ||
            || Index name : /scratch/beggsa_909907.bb2torque.bb2.cluster/hg19 ||
            || Phred offset : 33 ||
            || ||
            || All subreads : 14 ||
            || Min read1 votes : 1 ||
            || Min read2 votes : 1 ||
            || Max fragment size : 600 ||
            || Min fragment size : 50 ||
            || ||
            || Allowed mismatch : 3 bases ||
            || Max indels : 5 ||
            || # of Best mapping : 1 ||
            || Unique mapping : no ||
            || Hamming distance : no ||
            || Quality scores : no ||
            || ||
            \\===================== http://subread.sourceforge.net/ ======================//

            //====================== Running (31-Mar-2016 15:30:00) ======================\\
            || ||
            || The input file contains base space reads. ||
            || The range of Phred scores observed in the data is [2,36] ||
            || Load the 1-th index block... ||
            || Map fragments... ||
            || 0% completed, 0.3 mins elapsed, rate=3.7k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 5% completed, 11 mins elapsed, rate=3.7k fragments per second ||
            || 5% completed, 11 mins elapsed, rate=3.8k fragments per second ||
            || 6% completed, 11 mins elapsed, rate=3.9k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.0k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.1k fragments per second ||
            || 7% completed, 12 mins elapsed, rate=4.2k fragments per second ||
            || 7% completed, 13 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 7% completed, 13 mins elapsed, rate=4.4k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 25 mins elapsed, rate=4.1k fragments per second ||
            || 14% completed, 25 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 15% completed, 26 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 15% completed, 27 mins elapsed, rate=4.3k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 20% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Map fragments... ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            ./SubReadRNAPipeline-highmem: line 43: 19380 Segmentation fault subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
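            The final line only reports that the process received SIGSEGV. A generic way to get more detail, not specific to Subread, is to enable core dumps for the crashing command and inspect the core file afterwards. The sketch below assumes the cluster permits core files; the wrapper name `run_with_core` is hypothetical.

```shell
# Run a command with core dumps enabled and report whether it died from SIGSEGV.
# (run_with_core is a hypothetical wrapper name, not part of Subread.)
run_with_core() {
    (
        # Raise the soft core-file limit for this subshell only; warn if not allowed.
        ulimit -c unlimited 2>/dev/null || echo "warning: could not raise core limit" >&2
        "$@"
    )
    status=$?
    # A shell reports death by signal N as exit status 128 + N; SIGSEGV is 11.
    if [ "$status" -eq 139 ]; then
        echo "command died with SIGSEGV (exit status 139)" >&2
    fi
    return "$status"
}
```

            In the pipeline script this would wrap the existing call, for example `run_with_core subjunc -i $TMPDIR/hg19 ...` with the same arguments as above. If a core file is written (its name and location depend on the system configuration), `gdb "$(command -v subjunc)" <corefile>` followed by `bt` prints a backtrace that can be passed on to the developers.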

            • #7
              Thanks for providing the info. Can you also send us the fastq files so that we can reproduce the problem and find out what went wrong?
              Best,
              Wei

              • #8
                I have solved it!

                I recompiled from source instead of using the binaries and it worked fine.

                We run a Scientific Linux HPC cluster, and something about that environment seems to cause problems with the precompiled binaries.
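                For anyone hitting the same problem, the rebuild can be sketched roughly as follows. This assumes the source tarball has already been downloaded from the Subread SourceForge page and that it unpacks to a single directory containing `src/Makefile.Linux`; the function name `build_subread` and the tarball name in the usage note are illustrative, not taken from this thread.

```shell
# Unpack a Subread source tarball and build it with the Linux makefile.
# (build_subread is a hypothetical helper name; the layout is an assumption.)
build_subread() {
    tarball="$1"
    [ -f "$tarball" ] || { echo "source tarball not found: $tarball" >&2; return 1; }
    workdir=$(mktemp -d) || return 1
    tar -xzf "$tarball" -C "$workdir" || return 1
    # Locate the src directory one level below the unpacked top-level directory.
    srcdir=""
    for d in "$workdir"/*/src; do
        if [ -d "$d" ]; then srcdir="$d"; break; fi
    done
    [ -n "$srcdir" ] || { echo "no src directory found in $tarball" >&2; return 1; }
    make -C "$srcdir" -f Makefile.Linux || return 1
    # Assumed location of the freshly built executables.
    echo "${srcdir%/src}/bin"
}
```

                A call such as `bindir=$(build_subread subread-1.5.0-p1-source.tar.gz)` would then give a directory to put at the front of `PATH` in the job script, so that the locally compiled `subjunc` is used instead of the precompiled one.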
