  • bwa sampe segmentation fault

    Hi all,
    I am using bwa-0.5.9 on Illumina output.

    I have reads from 2 samples of the same organism,
    so I have 2 databases.

    When I ran BWA on the first, everything went well; the SAM file is 3.3 GB.

    On the second, I got a segmentation fault in the last step, when converting to *.sam.
    It managed to write 2.2 GB and then segfaulted.

    My steps are:
    bwa index -a is database.fasta

    bwa aln database.fasta read_1.fastq > database_aln_sa_1.sai
    bwa aln database.fasta read_2.fastq > database_aln_sa_2.sai

    bwa sampe database.fasta database_aln_sa_1.sai database_aln_sa_2.sai read_1.fastq read_2.fastq > database_aln.sam

    When I run bwa samse on each end separately, I get 2 SAM files, each 1.6 GB in size, as expected.


    My questions are:
    How can I solve the segmentation fault?
    Or can I merge the 2 SAM files produced separately into one SAM without losing any data?

    Thanks in advance..

  • #2
    Dear papori,

    My best solution to BWA segmentation faults has always been to change the CPU and increase the RAM. Otherwise, you can try splitting your original paired FASTQ files in half and running the alignment again on each half.

    Then merge the resulting (paired) BAM files with Picard Tools MergeSamFiles.

    Good luck!
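
A minimal Python sketch of the splitting step suggested above, assuming 4-line (unwrapped) FASTQ records and that both mate files list their reads in the same order. The function names, the `chunk` prefix, and the output file naming are hypothetical and not part of bwa or Picard.

```python
from itertools import islice, zip_longest


def read_fastq_records(lines):
    """Yield 4-line FASTQ records; assumes sequences are not line-wrapped."""
    lines = iter(lines)
    while True:
        record = list(islice(lines, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def chunk_pairs(lines_1, lines_2, pairs_per_chunk):
    """Yield (mate_1_lines, mate_2_lines) chunks with the pairs kept in step."""
    chunk_1, chunk_2 = [], []
    pairs_in_chunk = 0
    records = zip_longest(read_fastq_records(lines_1), read_fastq_records(lines_2))
    for rec_1, rec_2 in records:
        # A missing mate means the two files are out of sync; stop early.
        if rec_1 is None or rec_2 is None:
            raise ValueError("mate files hold different numbers of reads")
        chunk_1.extend(rec_1)
        chunk_2.extend(rec_2)
        pairs_in_chunk += 1
        if pairs_in_chunk == pairs_per_chunk:
            yield chunk_1, chunk_2
            chunk_1, chunk_2 = [], []
            pairs_in_chunk = 0
    if chunk_1:
        yield chunk_1, chunk_2


def split_paired_fastq(path_1, path_2, pairs_per_chunk, prefix="chunk"):
    """Write numbered chunk files and return their (mate_1, mate_2) paths."""
    written = []
    with open(path_1) as in_1, open(path_2) as in_2:
        chunks = chunk_pairs(in_1, in_2, pairs_per_chunk)
        for i, (lines_1, lines_2) in enumerate(chunks):
            names = (f"{prefix}_{i}_1.fastq", f"{prefix}_{i}_2.fastq")
            for name, chunk in zip(names, (lines_1, lines_2)):
                with open(name, "w") as out:
                    out.writelines(chunk)
            written.append(names)
    return written
```

Calling `split_paired_fastq("read_1.fastq", "read_2.fastq", pairs_per_chunk=250000)` would write `chunk_0_1.fastq`, `chunk_0_2.fastq`, and so on. Each pair of chunk files can then go through `bwa aln` and `bwa sampe` on its own before the outputs are merged.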


    • #3
      I'm running into a similar problem. I've got about 1 million paired-end reads (~500,000 pairs) that I'm aligning to about 6,000 transcript sequences. bwa aln works fine for both of them, but bwa sampe exits immediately with a segmentation fault, both with and without the -P option.

      I'm using the following format:
      bwa sampe -f part_1_part_2.sam transcripts.fa part_1.aln part_2.aln part_1.fq part_2.fq
      The only feedback I get is:
      Segmentation fault (core dumped)
      I'm working on a machine with 128 GB of memory, so I really can't imagine it's a memory issue. I've used bwa sampe successfully with similar data (500,000 pairs and 28,000 transcripts), but not with this particular data set.

      It's frustrating, to say the least.


      • #4
        I'm having the same trouble with bwa 0.6.2. It's really frustrating.

        I have a machine with Intel Core i5 and 8 GB RAM.


        • #5
          Did you guys ever find a solution to your segmentation fault problems?


          • #6
            In our case the problem went away when we reduced the size of the headers in the FASTA file. We were storing metadata in the headers, which sometimes grew pretty long (hundreds of characters). Storing the metadata in a table and using only the indexes as FASTA headers fixed it.
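
The header fix described above could look roughly like the following Python sketch. It assumes a plain-text FASTA file small enough to hold in memory; the function names, the `seq` prefix, and the tab-separated table layout are hypothetical choices, not anything bwa requires.

```python
def shorten_fasta_headers(lines, prefix="seq"):
    """Replace each FASTA header with a short ID.

    Returns (rewritten_lines, table), where table is a list of
    (short_id, original_header) tuples in file order.
    """
    rewritten, table = [], []
    for line in lines:
        if line.startswith(">"):
            short_id = f"{prefix}{len(table) + 1}"
            # Keep the full original header (minus '>') for the lookup table.
            table.append((short_id, line[1:].rstrip("\n")))
            rewritten.append(f">{short_id}\n")
        else:
            rewritten.append(line)
    return rewritten, table


def rewrite_fasta(in_path, out_path, table_path, prefix="seq"):
    """Write the short-header FASTA plus a tab-separated ID-to-header table."""
    with open(in_path) as src:
        rewritten, table = shorten_fasta_headers(src, prefix)
    with open(out_path, "w") as dst:
        dst.writelines(rewritten)
    with open(table_path, "w") as tab:
        for short_id, header in table:
            tab.write(f"{short_id}\t{header}\n")
```

Since the sequence names change, the `bwa index` step would need to be rerun on the rewritten FASTA before aligning again.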
