**nickloman** · 07-13-2010, 08:17 AM

Isn't BWA called as follows:

bwa aln <database> <FASTQ file>

a la

http://bio-bwa.sourceforge.net/bwa.shtml

Edit: Actually I can see you did that already. Did you run bwa index on the mouse database first?

**raela** · 07-13-2010, 08:33 AM

Also, if you did index it, are you sure you indexed correctly? I had the same issue until I made another index for the genome, then it ran fine. If you want to test if it's the reads, only grab one chromosome, index it, and see if it aligns then.
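A minimal sketch of that single-chromosome test, assuming the chromosome is in chr1.fa and the reads are single-end in reads.fastq (both file names are placeholders, not from this thread). The BWA variable is an illustrative addition so the commands can be dry-run with BWA=echo before anything is actually run.

```shell
#!/bin/sh
# Sketch: index one chromosome and align reads to it with bwa aln + samse.
# BWA can be overridden (e.g. BWA=echo) to print the commands instead of running them.
BWA=${BWA:-bwa}

align_one_chromosome() {
    ref=$1      # FASTA for a single chromosome (placeholder name)
    reads=$2    # FASTQ file of single-end reads (placeholder name)
    out=$3      # prefix for the .sai and .sam outputs

    "$BWA" index "$ref" &&
    "$BWA" aln "$ref" "$reads" > "$out.sai" &&
    "$BWA" samse "$ref" "$out.sai" "$reads" > "$out.sam"
}

# Run only when both placeholder input files are present in the current directory.
if [ -f chr1.fa ] && [ -f reads.fastq ]; then
    align_one_chromosome chr1.fa reads.fastq chr1_test
fi
```

If this runs cleanly on one chromosome but the whole-genome run still crashes, the reads are probably fine and the index is the place to look.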

**stubrown** · 07-13-2010, 09:50 AM

OK, working with the index of just one chromosome was a very helpful idea. That worked fine. So I guess my real problem is how to index an entire genome that comes as per-chromosome *.fa files (i.e. the Mouse mm9 genome from UCSC), or, once they are indexed, how to point to all of them as a database for the 'bwa aln' command.

I did this by running the 'bwa index' command inside of a 'foreach' loop for each of my *.fa files. I got a lot of index files (.pac, .rpac, etc), but perhaps this is the source of my segmentation fault error.

**raela** · 07-13-2010, 10:19 AM

That would most likely be the issue! You need to combine the chromosomes into one file. On a *nix machine, say each file is named chr##.fa (I know it's this way for the horse: chr1, chr2, chr3, ... chrX). You would do
cat chr*.fa > genome.fa
This puts the contents of all the files into the new file genome.fa. Then you index, but you probably want to use -a bwtsw. I believe not including that flag was my error. So, you'd do
bwa index -p prefix -a bwtsw genome.fa
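The two steps above can be wrapped into one small function. This is only a sketch: the function name, the BWA override variable, and the header count check are illustrative additions, and it assumes the chromosome files follow the chr*.fa naming described above.

```shell
#!/bin/sh
# Sketch: merge per-chromosome FASTA files into one reference, then index it once.
# BWA can be overridden (e.g. BWA=echo) to print the index command instead of running it.
BWA=${BWA:-bwa}

build_genome_index() {
    dir=$1      # directory holding the chr*.fa files (assumed naming)
    prefix=$2   # index prefix passed to -p

    # The glob expands in lexical order (chr1, chr10, ..., chr2, ...).
    cat "$dir"/chr*.fa > "$dir/genome.fa" || return 1

    # One '>' header per sequence: a quick check that every chromosome made it in.
    echo "sequences in genome.fa: $(grep -c '^>' "$dir/genome.fa")"

    "$BWA" index -p "$prefix" -a bwtsw "$dir/genome.fa"
}
```

For example, `build_genome_index . mm9` would write genome.fa in the current directory and build index files named with the mm9 prefix. That same prefix is then what gets passed to 'bwa aln' as the database.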

**history_of_robots** · 07-03-2011, 07:39 PM

You are suggesting '-a bwtsw'. The BWA manual (http://bio-bwa.sourceforge.net/bwa.shtml) says: "BWA-SW can also be used to align ~100bp reads, but it is slower than the short-read algorithm." and "On low-error short queries, BWA-SW is slower and less accurate than the first algorithm [IS], but on long queries, it is better". So '-a is' seems to be what is required when indexing a genome for subsequent short-read alignment. However, the same manual page says: 'IS is moderately fast, but does not work with database larger than 2GB'. A complete genome can be larger than that (for example, mm9.fa is 2.5GB). So I am wondering whether the chromosomes should be indexed separately. Unfortunately, in that case it seems BWA would have to be run on each chromosome separately. Or is there another way to use IS on the whole genome?

**raela** · 07-03-2011, 08:24 PM

No, what I said is correct. You are thinking of "bwa bwasw" as an alternative to aln + samse/sampe. The -a bwtsw option selects the indexing algorithm meant for large genomes. BWTSW vs. IS has nothing to do with read size, just genome size.
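To make the "genome size, not read size" point concrete, here is a rough sketch that picks the -a value from the size of the reference. The 2 GB cut-off is the IS limit quoted from the manual earlier in this thread. Treating it as 2,000,000,000 bytes and using the FASTA file's byte count as a stand-in for the database size are both assumptions, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: choose the 'bwa index -a' algorithm from the reference size.
# Assumption: the FASTA byte count approximates the database size, and
# "2GB" is taken as 2,000,000,000 bytes.
IS_LIMIT_BYTES=2000000000

pick_index_algorithm() {
    size_bytes=$1
    if [ "$size_bytes" -gt "$IS_LIMIT_BYTES" ]; then
        echo bwtsw   # large reference: IS is documented not to work here
    else
        echo is      # small reference: IS is usable
    fi
}

# Size of a file in bytes (tr strips the padding some wc versions add).
fasta_size_bytes() {
    wc -c < "$1" | tr -d ' '
}
```

With a combined genome.fa this could be used as `bwa index -a "$(pick_index_algorithm "$(fasta_size_bytes genome.fa)")" genome.fa`. Read length does not enter into the decision anywhere.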

**history_of_robots** · 07-03-2011, 09:00 PM

Oh cool, you are absolutely right! That's the right option to index a genome for short read alignment. Thanks very much.

**niti217** · 12-28-2011, 12:07 PM

I am having a similar problem. Any help would be greatly appreciated.

I am trying to index Homo_sapiens.GRCh37 ...fa file using the command

bwa index -p myGenome -a bwtsw /directory/myGenome.fa

but it keeps giving me the following error

[bwa_index] Pack FASTA... 56.76 sec
[bwa_index] Reverse the packed sequence... Segmentation fault

Can someone please suggest a possible fix? Thank you.

**houkto** · 12-31-2011, 03:11 PM

I saw a similar error on another forum (http://biostar.stackexchange.com/que...entation-fault). It turned out to be a hardware issue, i.e. less than 2G of memory.
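A quick way to check for that before starting a long indexing run, sketched for Linux only. It reads MemTotal from /proc/meminfo. The 2 GB threshold is simply the figure mentioned above, not an official BWA requirement, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: warn before indexing if the machine has under 2 GB of RAM.
# The 2 GB figure is the one reported in this thread, not an official BWA requirement.
MIN_MEM_KB=$((2 * 1024 * 1024))

# Total physical memory in kB from a Linux /proc/meminfo-style file.
total_mem_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeeds when the given kB value is at least the 2 GB threshold.
has_enough_memory() {
    [ "$1" -ge "$MIN_MEM_KB" ]
}

if [ -r /proc/meminfo ]; then
    mem_kb=$(total_mem_kb)
    if has_enough_memory "$mem_kb"; then
        echo "MemTotal ${mem_kb} kB: at or above the 2 GB mark"
    else
        echo "MemTotal ${mem_kb} kB: below 2 GB, bwa index may crash" >&2
    fi
fi
```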

N.B happy new year

**niti217** · 01-03-2012, 10:13 AM

Thanks so much for your help. I really appreciate it.
