**Bukowski** · 09-23-2011, 03:38 PM

Originally posted by kursuni

Hi, I'm very new to bioinformatics and I have 2 FASTQ files to align. I'm trying to align these sequences with BWA, but it doesn't let me do it.
It says:
[bwa_index] fail to open file '/datasets/SRR035022_1.filt.fastq'. Abort!
Aborted

Do I have to use FASTA format with BWA, or is something wrong elsewhere?

Thanks in advance...

bwa 'index' is the clue. You run bwa index on the reference genome, then align your FASTQ files to it.

Your reference, which is what you index, is a FASTA file.

Your reads are in FASTQ format.

You posted the error, but not the command you ran, so it's hard to tell what usage generated the error. Failing to open the file could also mean you're just not pointing bwa to the right location, but even so it seems like you might have skipped a step in your workflow.
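
The order described in this reply (index the FASTA reference, then align the FASTQ reads against it) can be sketched as a small shell function. This is a minimal sketch, assuming a bwa 0.5.x-style single-end workflow (`index`, `aln`, `samse`); the function name `align_single_end`, the `BWA` override variable and the output prefix are made up for illustration and do not come from the thread.

```shell
#!/usr/bin/env bash
# Sketch: index a FASTA reference once, then align single-end FASTQ reads.
# BWA may be set to a specific binary, e.g. BWA=./bwa (hypothetical variable).

align_single_end() {
    local ref=$1 reads=$2 out_prefix=$3
    local bwa=${BWA:-bwa}

    # Step 1: index the reference FASTA (needed only once per reference).
    # bwtsw suits large genomes such as human. The index files are written
    # next to the reference as $ref.bwt, $ref.sa, and so on.
    if [ ! -e "$ref.bwt" ]; then
        "$bwa" index -a bwtsw "$ref" || return 1
    fi

    # Step 2: align the FASTQ reads; the result is a binary .sai file.
    "$bwa" aln "$ref" "$reads" > "$out_prefix.sai" || return 1

    # Step 3: convert the .sai alignments to SAM (single-end reads).
    "$bwa" samse "$ref" "$out_prefix.sai" "$reads" > "$out_prefix.sam"
}
```

A call would look like `align_single_end /path/to/chr1.fa /path/to/reads.fastq /path/to/output/aln`, where all three paths are placeholders.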

**haojam** · 09-24-2011, 01:15 AM

SeqAnswers Group

Mr/Mrs.,
I have downloaded and built the BWA aligner (with make), but when I go into the bwa-0.5.9 folder I see this:

COPYING bamlite.c bwa bwase.o bwt_gen bwtaln.o bwtio.c bwtsw2_aux.o bwtsw2_main.o kseq.h main.c solid2fastq.pl utils.o
ChangeLog bamlite.h bwa.1 bwaseqio.c bwt_lite.c bwtgap.c bwtio.o bwtsw2_chain.c cs2nt.c ksort.h main.h stdaln.c
Makefile bamlite.o bwape.c bwaseqio.o bwt_lite.h bwtgap.h bwtmisc.c bwtsw2_chain.o cs2nt.o kstring.c main.o stdaln.h
NEWS bntseq.c bwape.o bwt.c bwt_lite.o bwtgap.o bwtmisc.o bwtsw2_core.c is.c kstring.h qualfa2fq.pl stdaln.o
README bntseq.h bwase.c bwt.h bwtaln.c bwtindex.c bwtsw2.h bwtsw2_core.o is.o kstring.o simple_dp.c utils.c
SRR038263_1.sai bntseq.o bwase.h bwt.o bwtaln.h bwtindex.o bwtsw2_aux.c bwtsw2_main.c khash.h kvec.h simple_dp.o utils.h

How should I start aligning SRR038263.fastq to the reference sequence (hg18, FASTA format)? I would be glad of your support.

Regards,
Momo

**kursuni** · 09-24-2011, 09:05 AM

Originally posted by Bukowski

bwa 'index' is the clue. You run bwa index on the reference genome, then align your FASTQ files to it.

Your reference, which is what you index, is a FASTA file.

Your reads are in FASTQ format.

You posted the error, but not the command you ran, so it's hard to tell what usage generated the error. Failing to open the file could also mean you're just not pointing bwa to the right location, but even so it seems like you might have skipped a step in your workflow.

Thank you for your reply.

Since I'm still learning, I only just realized that I need to use a FASTA file as the reference to index and a FASTQ file for my reads.

I guess I'm pointing bwa to the right location, and the problem is my lack of knowledge about DNA sequencing and about using these programs in a Linux environment.

However, is there any source I can download a FASTA file from?

**haojam** · 09-24-2011, 08:30 PM

Hi,

When I run BWA to align the SRR FASTQ reads to the chr1 FASTA sequence, I get the error below. Could you please help me out?

Regards,

[bwa-0.5.9]$ ./bwa aln /home/DATA/chr1.fa /home/DATA/SRA/SRA012240/SRX017837/SRR038263_1.fastq > SRR038263_1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwt_restore_bwt] fail to open file '/home/DATA/chr1.fa.bwt'. Abort!

**nilshomer** · 09-25-2011, 08:06 AM

Index your reference first!
./bwa index -a bwtsw /home/DATA/chr1.fa

**kursuni** · 09-26-2011, 01:53 PM

I indexed my reference first:

$ bwa index -p indexed_chr1 -a is -c ~/fasta/chr1.fa
[bwa_index] Pack nucleotide FASTA... 4.97 sec
[bwa_index] Convert nucleotide PAC to color PAC... 1.48 sec
[bwa_index] Reverse the packed sequence... 1.57 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 110.54 seconds elapse.
[bwa_index] Construct BWT for the reverse packed sequence...
[bwa_index] 110.08 seconds elapse.
[bwa_index] Update BWT... 1.03 sec
[bwa_index] Update reverse BWT... 1.06 sec
[bwa_index] Construct SA from BWT and Occ... 49.80 sec
[bwa_index] Construct SA from reverse BWT and Occ... 50.04 sec

Then I tried to align using the command below, but it gives an error while opening the FASTQ file:

$ bwa aln ~/fasta/chr1.fa ~/datasets/SRR035022_1.filt.fastq > aln_sa.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_seq_open] fail to open file '/home/ukursuncu/datasets/SRR035022_1.filt.fastq'. Abort!
Aborted

What am I doing wrong?

**sdvie** · 09-26-2011, 11:54 PM

kursuni, just a general remark, I would always put the complete path to your reference file, read file and output file, just to make sure, like:

./bwa aln /path/to/folder/with/fasta/chr1.fa /path/to/folder/with/datasets/SRR035022_1.filt.fastq > /path/to/output/aln_sa.sai

If you are not so familiar with the command line yet, you can also look up the file names in your graphical file browser and copy/paste them into your command. That way, usually, the whole path is copied.

cheers,
Sophia
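
Both "fail to open file" errors in this thread come down to bwa not finding something at the path it was given: the reads file in one case, the `.bwt` index file in the other. A pre-flight check along the following lines can catch that before a long run. This is only a sketch; the function name `check_bwa_inputs` is hypothetical. It assumes the index prefix is the value passed to `bwa aln`: the reference path by default, or whatever was given to `bwa index -p` (for example `indexed_chr1` in the command above).

```shell
#!/usr/bin/env bash
# Sketch: check that the reads file and the index for a given prefix are
# readable before calling `bwa aln <prefix> <reads>`.

check_bwa_inputs() {
    local prefix=$1 reads=$2 status=0

    # The reads file must exist and be readable at exactly this path.
    if [ ! -r "$reads" ]; then
        echo "cannot read reads file: $reads" >&2
        status=1
    fi

    # bwa aln restores the BWT from <prefix>.bwt, so that file must exist.
    if [ ! -r "$prefix.bwt" ]; then
        echo "no index for prefix: $prefix (expected $prefix.bwt)" >&2
        status=1
    fi

    return "$status"
}
```

For example, `check_bwa_inputs /path/to/folder/with/fasta/chr1.fa /path/to/folder/with/datasets/SRR035022_1.filt.fastq && echo ok` prints `ok` only if both files are where the alignment command expects them.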

**haojam** · 09-28-2011, 08:24 PM

Hello,

I would like to index the hg18 reference sequence (FASTA) so that I can align FASTQ reads to it. The FASTQ sample sequences are SOLiD reads. Could you please assist me? Below is the error message I get from BWA when indexing the reference hg18.fasta.

[haojamrocky@melon bwa-0.5.9]$ ./bwa index -a bwtsw -c /home/haojamrocky/DATA/hg18chr/hg18.fasta
[bwa_index] Pack nucleotide FASTA... [bns_fasta2bntseq] zero length sequence. Abort!

Regards,
HR

**haojam** · 09-28-2011, 09:16 PM

Hello,

Before indexing hg18.fa, I concatenated chr1 to chrY, including chrM, into hg18.fa using the command below. When I index the chr1 FASTA on its own, it runs properly. Is the error caused by this command?

To cat all sequences together into one single FASTA record:
$ cat chr*.fa | sed -e "/^>/d" >> hg18.fa

Regards,
HR

**sdvie** · 09-28-2011, 11:26 PM

Originally posted by haojam

Hello,

To cat all sequences together into one single FASTA record:
$ cat chr*.fa | sed -e "/^>/d" >> hg18.fa

Regards,
HR

Did you check what hg18.fa looks like after this and what size it has?
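
One way to do that check is sketched below. The `sed -e "/^>/d"` step in the quoted command deletes every line starting with `>`, so the merged file ends up with no FASTA header lines at all. That may well be what the indexer is objecting to, though this is an assumption and not something confirmed in the thread. Note also that `>>` appends, so re-running the command makes the file grow. The helper names `count_fasta_records` and `merge_fasta` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: inspect a merged FASTA file and build one that keeps its headers.

# Print the number of FASTA records, i.e. lines starting with '>'.
# grep exits with 1 when it counts zero matches; that is not treated as
# an error here, but a real failure (exit status 2) still is.
count_fasta_records() {
    grep -c '^>' -- "$1" || [ $? -eq 1 ]
}

# Concatenate per-chromosome FASTA files into one multi-record FASTA,
# keeping each header. '>' overwrites the output instead of appending.
merge_fasta() {
    local out=$1
    shift
    cat -- "$@" > "$out"
}
```

With these, `merge_fasta hg18.fa chr*.fa` followed by `count_fasta_records hg18.fa` and `ls -l hg18.fa` shows how many records the merged file has and how large it is. A count of 0 means the headers were stripped.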

**haojam** · 09-30-2011, 01:43 AM

Hello,

Does BWA support SRR.....fastq.bz2 files when aligning to the human genome reference sequence?

Regards,
HR

**ulz_peter** · 09-30-2011, 01:57 AM

Hi kursuni,

you may have a look here:

Exome sequencing analysis manual - SEQanswers

http://seqanswers.com/forums/showthread.php?t=14038

that might help you

**kursuni** · 10-03-2011, 11:21 AM

Originally posted by ulz_peter

Hi kursuni,

you may have a look here:

Exome sequencing analysis manual - SEQanswers

http://seqanswers.com/forums/showthread.php?t=14038

that might help you

Dear ulz_peter, it was very helpful. Thank you very much.
Since I'm very new to bioinformatics, I need a lot of help and am trying to learn through books, papers and internet resources such as this forum. Learning bioinformatics takes time, but since my thesis is on bioinformatics, I need to do this as fast as I can because of the time limit, so I really appreciate any help.
If you can suggest any other papers or documents, I would appreciate that as well.

Thanks again.
Best Regards..
Ugur

BWA & FASTQ or FASTA
