**dara** · 07-17-2009, 10:18 AM

I am also experiencing this issue: bwa samse generates a segmentation fault with a reference genome the size of the human reference and about 30 million reads.

Any help would be appreciated. Thanks.

**luisczul** · 07-21-2009, 07:03 AM

I am having the same error, as early as converting the human genome FASTA file to binary FASTA (.bfa) format with the fasta2bfa command.

**nilshomer** · 07-21-2009, 01:12 PM

Originally posted by luisczul:

> I am having the same error, as early as converting the human genome FASTA file to binary FASTA (.bfa) format with the fasta2bfa command.

Are you running out of RAM?

**xguo** · 07-24-2009, 11:21 AM

The error I got is not related to memory, since I have even tried it on a machine with 512 GB of memory. I suspect that the conversion from SOLiD csfasta/quality format to FASTQ format may have a problem. Using bwa samse -n 2 ..., I can get a simplified alignment output. There are some weird records such as:

>-"8$ 2 1865904808
chr10 -90253347 0
chr10 -50629021 0

It seems that part of a quality string is mistaken for a new read record, and that record was aligned to the genome millions of times. Most of the other reads look fine, with output like:

>test:1279_470_1023 1 1
chr22 +42910109 0
>test:1279_470_1108 1 1
chr18 -43820923 0
>test:1279_470_1122 0 0

A segmentation fault occurs if I use bwa samse -n -1 to disable outputting multiple hits.

Any help is greatly appreciated.

Xiang

**nilshomer** · 07-24-2009, 12:03 PM

Try PMing Heng Li (lh3), who is the author of bwa. If you are in a bind, there are other SOLiD aligners, such as my own BFAST.

**xguo** · 07-27-2009, 07:46 AM

missing value in phred Ascii representation

It seems that the solid2fastq.pl script doesn't handle missing quality values. It generates -" for a phred score of -1. Does anyone know how to convert a score of -1 to ASCII?

thanks
Xiang

**nilshomer** · 07-27-2009, 09:35 AM

Originally posted by xguo:

> It seems that the solid2fastq.pl script doesn't handle missing quality values. It generates -" for a phred score of -1. Does anyone know how to convert a score of -1 to ASCII?
>
> thanks
> Xiang

Are missing quality values listed as blanks for you? I will update the code accordingly. If you have a blank quality score, you could give it a phred score of 1 to indicate that the color call should not be trusted, or a maximum value of 255 to indicate that the uncalled color should be trusted. Tailor it to your situation. Feel free to PM me to get your issues resolved.

**xguo** · 07-27-2009, 09:45 AM

The missing quality is encoded as -1 in the QV file generated by the SOLiD platform. The solid2fastq.pl script treated it as two values, so the read and quality fields in the resulting FASTQ had different lengths. I changed -1 to 0, and everything is fine now.

thanks
Xiang
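
For anyone hitting the same thing, the length mismatch described above can be avoided with a few lines. This is only a minimal sketch in Python, not the actual solid2fastq.pl logic: the function name `qv_line_to_fastq` and its `floor` and `offset` parameters are made up for illustration, and it assumes the QV file holds whitespace-separated integers with -1 for a missing call and that the target is Phred+33 FASTQ. It covers only the quality encoding, not the rest of the csfasta conversion.

```python
def qv_line_to_fastq(qv_line, floor=0, offset=33):
    """Convert one line of whitespace-separated QVs to a FASTQ quality string.

    Hypothetical helper: `floor` is the value substituted for missing (-1)
    calls, `offset` is the ASCII offset (33 for Phred+33).
    """
    # Split on whitespace so "-1" stays one token instead of "-" and "1".
    values = [int(token) for token in qv_line.split()]
    # Raise negative values to `floor` and cap at the last printable ASCII char.
    clamped = [min(max(value, floor), 126 - offset) for value in values]
    # Exactly one output character per input value.
    return "".join(chr(value + offset) for value in clamped)


if __name__ == "__main__":
    print(qv_line_to_fastq("20 -1 5"))           # missing call mapped to 0
    print(qv_line_to_fastq("20 -1 5", floor=1))  # missing call mapped to 1
```

With `floor=0` a missing call becomes `!`, and with `floor=1` it becomes `"`. Either way each QV yields exactly one character, so the quality string has as many characters as there are values on the line.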

**nilshomer** · 07-27-2009, 11:15 AM

Originally posted by xguo:

> The missing quality is encoded as -1 in the QV file generated by the SOLiD platform. The solid2fastq.pl script treated it as two values, so the read and quality fields in the resulting FASTQ had different lengths. I changed -1 to 0, and everything is fine now.
>
> thanks
> Xiang

I have changed this in BFAST's solid2fastq.pl script (which is now implemented in C for efficiency). I will release it in an upcoming update, but let me know if you want it earlier.

**GenoMax** · 07-31-2009, 10:31 AM

I would like to add that I am observing the same problem: bwa samse seg faults. The dataset is human Illumina reads (~500 million). BWA converted about 220 million reads before the seg fault. The machine I am running this on has 32 GB of RAM, and the process was using only about 2.3 GB.

-- hk

**totalnew** · 07-31-2009, 01:15 PM

I would like to build a color-space index with bwa. The input FASTA should be in nucleotide space, so I use the following command to index the whole human genome:

>bwa index -c human.fasta

But a segmentation fault occurs every time, like this:

[bwa_index] Pack nucleotide FASTA... 60.48 sec
[bwa_index] Convert nucleotide PAC to color PAC... 31.13 sec
[bwa_index] Reverse the packed sequence... 16.62 sec
[bwa_index] Construct BWT for the packed sequence...
Segmentation fault

Can anyone tell me why that happens?

thanks

**baohua100** · 08-05-2009, 04:18 PM

bwa sampe ../../genome/genome.fa aln_sa1.sai aln_sa2.sai 4_1.fq 4_2.fq > pairs.sam

This also ends with: [1]+ Segmentation fault

**zxl124** · 08-24-2009, 11:30 AM

Same problem here.

I tried the method mentioned above, changing -1 to 0 in the qual file. Now seq and qual have the same length in the FASTQ file, but I still get the same segmentation fault, with the same symptoms as above: "bwa samse -n 2" produces output, and it contains some strange read names that are actually parts of quality strings.

Could anyone help fix that?
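
One way to narrow this down is to check the FASTQ itself before handing it to bwa. The following is a rough Python sketch under stated assumptions, not part of bwa or BFAST: `check_fastq` is a hypothetical helper, and it assumes strict 4-line records in which the sequence and quality lines have equal length. It reports records whose header or separator line looks wrong or whose lengths differ, which may help locate the point where a quality string starts being read as a read name.

```python
import sys


def check_fastq(lines):
    """Yield (record_number, problem) for each malformed 4-line FASTQ record."""
    record = []
    count = 0
    for line in lines:
        record.append(line.rstrip("\r\n"))
        if len(record) < 4:
            continue
        count += 1
        name, seq, plus, qual = record
        record = []
        if not name.startswith("@"):
            yield (count, "header does not start with '@': %r" % name)
        if not plus.startswith("+"):
            yield (count, "third line does not start with '+': %r" % plus)
        if len(seq) != len(qual):
            yield (count, "sequence length %d != quality length %d"
                   % (len(seq), len(qual)))
    # Leftover lines mean the file stopped in the middle of a record.
    if record:
        yield (count + 1, "truncated record (%d of 4 lines)" % len(record))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_fastq.py reads.fastq
    with open(sys.argv[1]) as handle:
        for number, problem in check_fastq(handle):
            print("record %d: %s" % (number, problem))
```

The check streams the file one record at a time, so it should cope with tens of millions of reads. A trailing blank line will be reported as a truncated record.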

**fpruzius** · 09-17-2009, 02:08 AM

Probable solution for segfault

I had the same problem when converting the alignment files to SAM format. I have a solution that works for me.

I used BWA version 0.5.1.

I'm not convinced that changing the quality value from -1 to 0 helps, because the quality values are log values and zero is not a log value. So I change every -1 and 0 in the quality files to 1.

I have written my own FASTQ conversion script in C and tested it: no segmentation faults with 'bwa samse'.
However, when I used the Perl script on the same data, I got segmentation faults.

The C script can create multiple smaller FASTQ files, because we align on a large cluster.

The C script is also 10 to 20 times faster than the Perl script.

csfastaToFastq.tar.gz

Just run 'make' in the extracted folder.

bwa samse segmentation fault
