  • abeggs
    Junior Member
    • Dec 2014
    • 4

    Subread segmentation faults on Scientific Linux

    Hi

    I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With large FASTQ files (>35M reads), Subread falls over halfway through with a segmentation fault.

    It seems to be memory related, as smaller files work okay. Increasing the memory available to the process to 31 GB seems to make no difference.

    Has anyone seen this before?
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    Memory does not sound like the culprit here. Are you running into some other limit, say storage (quota), tmp space, or the time assigned to the job?
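A quick way to rule out the limits mentioned above is to print them from inside the job itself. This is only a sketch: `report_limits` is a hypothetical helper name, `OUT_DIR` is an assumed variable standing in for the output directory, and `TMPDIR` is assumed to be set by the scheduler as in the original poster's job.

```shell
# Print the limits that commonly stop long-running alignment jobs.
# report_limits and OUT_DIR are hypothetical names used for illustration.
report_limits() {
    tmp_dir="${TMPDIR:-/tmp}"
    out_dir="${OUT_DIR:-$PWD}"

    echo "== Per-process limits (ulimit -a) =="
    ulimit -a

    echo "== Free space in temp dir: ${tmp_dir} =="
    df -h "${tmp_dir}"

    echo "== Free space in output dir: ${out_dir} =="
    df -h "${out_dir}"
}

report_limits
```

Calling this at the top of the job script records the limits the job actually ran under, which may differ from what the login node reports.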

    • abeggs
      Junior Member
      • Dec 2014
      • 4

      #3
      Unfortunately not. The disk space assigned to the node is 10 TB, the walltime is 5 days (the job typically takes 2-3 hours), and the temp space is "unlimited", although in practice it is about 4 TB.

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Dr. Shi (author of Subread) participates here, and we may hear something enlightening from him. But I would have thought that once the genome index is read into memory, that requirement should be more or less satisfied, unless Subread works differently from other aligners (I don't use Subread).

        • shi
          Wei Shi
          • Feb 2010
          • 236

          #5
          Could you please provide the screen output and also your commands? Subread has no problem processing more than 35 million reads.

          • abeggs
            Junior Member
            • Dec 2014
            • 4

            #6
            Hi,

            The command I use for subjunc is:
            subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
            And the output is:

            ==========     _____ _    _ ____  _____  ______          _____
            =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
              =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
                ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
                  ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
            ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
            v1.5.0-p1

            //============================= subjunc setting ==============================\\
            || ||
            || Function : Read alignment + Junction/Fusion detection (RNA-Seq) ||
            || Threads : 8 ||
            || Input file 1 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Input file 2 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Output file : /gpfs/projects/s-beggsa01/P141115-N-DW-28-2673271 ... ||
            || Index name : /scratch/beggsa_909907.bb2torque.bb2.cluster/hg19 ||
            || Phred offset : 33 ||
            || ||
            || All subreads : 14 ||
            || Min read1 votes : 1 ||
            || Min read2 votes : 1 ||
            || Max fragment size : 600 ||
            || Min fragment size : 50 ||
            || ||
            || Allowed mismatch : 3 bases ||
            || Max indels : 5 ||
            || # of Best mapping : 1 ||
            || Unique mapping : no ||
            || Hamming distance : no ||
            || Quality scores : no ||
            || ||
            \\===================== http://subread.sourceforge.net/ ======================//

            //====================== Running (31-Mar-2016 15:30:00) ======================\\
            || ||
            || The input file contains base space reads. ||
            || The range of Phred scores observed in the data is [2,36] ||
            || Load the 1-th index block... ||
            || Map fragments... ||
            || 0% completed, 0.3 mins elapsed, rate=3.7k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 5% completed, 11 mins elapsed, rate=3.7k fragments per second ||
            || 5% completed, 11 mins elapsed, rate=3.8k fragments per second ||
            || 6% completed, 11 mins elapsed, rate=3.9k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.0k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.1k fragments per second ||
            || 7% completed, 12 mins elapsed, rate=4.2k fragments per second ||
            || 7% completed, 13 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 7% completed, 13 mins elapsed, rate=4.4k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 25 mins elapsed, rate=4.1k fragments per second ||
            || 14% completed, 25 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 15% completed, 26 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 15% completed, 27 mins elapsed, rate=4.3k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 20% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Map fragments... ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            ./SubReadRNAPipeline-highmem: line 43: 19380 Segmentation fault subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
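The last line of the log comes from the shell, not from subjunc: a process killed by a signal returns an exit status of 128 plus the signal number, so a segmentation fault (SIGSEGV, signal 11 on Linux) shows up as status 139. A minimal sketch of a wrapper that makes this explicit in a pipeline script follows; `run_and_report` is a hypothetical helper name, not part of Subread.

```shell
# Run a command and translate a signal-induced exit status into a message.
# run_and_report is a hypothetical helper, not something Subread provides.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        signal=$((status - 128))
        echo "command killed by signal ${signal} (exit status ${status})" >&2
        if [ "$signal" -eq 11 ]; then
            echo "signal 11 is SIGSEGV: segmentation fault" >&2
        fi
    fi
    return "$status"
}

# Usage inside the pipeline script (same command as in the post above):
# run_and_report subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz \
#     -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
```

Because the wrapper returns the original status, the calling script can still decide whether to retry, clean up, or abort.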

            • shi
              Wei Shi
              • Feb 2010
              • 236

              #7
              Thanks for providing the info. Can you also send us the fastq files so that we can reproduce the problem and find out what went wrong?
              Best,
              Wei

              • abeggs
                Junior Member
                • Dec 2014
                • 4

                #8
                I have solved it!

                I recompiled from source instead of using the binaries and it worked fine.

                We have a Scientific Linux HPC cluster, and it seems something about that environment causes problems with the precompiled binaries.
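For anyone hitting the same problem, the rebuild can be sketched roughly as below. The exact steps used in this thread were not posted, so this is an assumption based on the usual layout of the Subread source package (makefiles in `src/`, with `Makefile.Linux` for Linux builds, and executables written to `bin/`). `build_subread` is a hypothetical helper name, and both paths are supplied by the caller.

```shell
# Build Subread from a source tarball on the machine that will run it.
# build_subread is a hypothetical helper; the tarball and build directory
# are arguments, and the src/ and bin/ layout is an assumption (see above).
build_subread() {
    tarball=$1
    build_dir=$2

    mkdir -p "$build_dir" || return 1

    # Unpack the sources directly into build_dir, dropping the top-level folder.
    tar -xzf "$tarball" -C "$build_dir" --strip-components=1 || return 1

    # Compile in a subshell so the caller's working directory is unchanged.
    ( cd "$build_dir/src" && make -f Makefile.Linux ) || return 1

    echo "binaries are in $build_dir/bin"
}
```

Compiling on the cluster links the tools against the system's own libraries, which is one plausible reason a local build works where a binary built elsewhere crashes.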
