  • BWA & FASTQ or FASTA

    Hi, I'm very new to bioinformatics and I have two FASTQ files to align. I'm trying to align these sequences with BWA, but it doesn't let me.
    It says:
    [bwa_index] fail to open file '/datasets/SRR035022_1.filt.fastq'. Abort!
    Aborted

    Do I have to use FASTA format with BWA, or is something else wrong?

    Thanks in advance...

  • #2
    Originally posted by kursuni
    Hi, I'm very new to bioinformatics and I have two FASTQ files to align. I'm trying to align these sequences with BWA, but it doesn't let me.
    It says:
    [bwa_index] fail to open file '/datasets/SRR035022_1.filt.fastq'. Abort!
    Aborted

    Do I have to use FASTA format with BWA, or is something else wrong?

    Thanks in advance...

    The 'index' in bwa_index is the clue. You index the reference genome with bwa, then align your FASTQ files to it.

    Your reference (the file you index) is a FASTA file.

    Your reads are in FASTQ format.

    You posted the error but not the command you ran, so it's hard to tell what usage produced it. Failing to open the file could also mean you're simply not pointing bwa to the right location, but even so, it seems you may have skipped a step ahead in your workflow.
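
The order described above (index the FASTA reference once, then align the FASTQ reads against it) can be sketched as a small shell function. This is only a sketch: the name `print_bwa_workflow` and the file names are made up for illustration, and the function prints the commands instead of running them so they can be checked first. The `samse` step for single-end reads is not mentioned in this thread and is included as an assumption about the usual next step.

```shell
# Hypothetical helper: print the three BWA steps for single-end reads.
# Nothing is executed; all paths are placeholders supplied by the caller.
print_bwa_workflow() {
    ref=$1      # reference genome in FASTA format
    reads=$2    # reads in FASTQ format
    out=$3      # basename for the output files

    # Step 1: build the index from the FASTA reference (done once).
    printf 'bwa index -a bwtsw %s\n' "$ref"
    # Step 2: align the FASTQ reads against the indexed reference.
    printf 'bwa aln %s %s > %s.sai\n' "$ref" "$reads" "$out"
    # Step 3 (assumption, not from the thread): convert the .sai file to SAM.
    printf 'bwa samse %s %s.sai %s > %s.sam\n' "$ref" "$out" "$reads" "$out"
}
```

For example, `print_bwa_workflow chr1.fa reads.fastq aln` prints one command per step, with the FASTA file appearing in the index step and the FASTQ file only in the later steps.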


    • #3
      Dear SeqAnswers group,

      I have installed and compiled the BWA aligner, but when I go into the bwa-0.5.9 folder I see this:

      COPYING bamlite.c bwa bwase.o bwt_gen bwtaln.o bwtio.c bwtsw2_aux.o bwtsw2_main.o kseq.h main.c solid2fastq.pl utils.o
      ChangeLog bamlite.h bwa.1 bwaseqio.c bwt_lite.c bwtgap.c bwtio.o bwtsw2_chain.c cs2nt.c ksort.h main.h stdaln.c
      Makefile bamlite.o bwape.c bwaseqio.o bwt_lite.h bwtgap.h bwtmisc.c bwtsw2_chain.o cs2nt.o kstring.c main.o stdaln.h
      NEWS bntseq.c bwape.o bwt.c bwt_lite.o bwtgap.o bwtmisc.o bwtsw2_core.c is.c kstring.h qualfa2fq.pl stdaln.o
      README bntseq.h bwase.c bwt.h bwtaln.c bwtindex.c bwtsw2.h bwtsw2_core.o is.o kstring.o simple_dp.c utils.c
      SRR038263_1.sai bntseq.o bwase.h bwt.o bwtaln.h bwtindex.o bwtsw2_aux.c bwtsw2_main.c khash.h kvec.h simple_dp.o utils.h

      How should I start aligning SRR038263.fastq against the reference sequence (hg18, FASTA format)? I would be grateful for your help.

      Regards,
      Momo


      • #4
        Originally posted by Bukowski
        The 'index' in bwa_index is the clue. You index the reference genome with bwa, then align your FASTQ files to it.

        Your reference (the file you index) is a FASTA file.

        Your reads are in FASTQ format.

        You posted the error but not the command you ran, so it's hard to tell what usage produced it. Failing to open the file could also mean you're simply not pointing bwa to the right location, but even so, it seems you may have skipped a step ahead in your workflow.

        Thank you for your reply.

        Since I'm still learning, I have only just realized that I need to use a FASTA file as the reference to index and FASTQ files as my reads.

        I think I'm pointing bwa to the right location, and the problem is my lack of knowledge about DNA sequencing and about using these programs in a Linux environment.

        However, is there a source I can download a FASTA file from?


        • #5
          Hi,

          When I run BWA to align FASTQ SRR reads against the chr1 FASTA sequence, I get the error below. Could you please help me out?

          Regards,

          [bwa-0.5.9]$ ./bwa aln /home/DATA/chr1.fa /home/DATA/SRA/SRA012240/SRX017837/SRR038263_1.fastq > SRR038263_1.sai
          [bwa_aln] 17bp reads: max_diff = 2
          [bwa_aln] 38bp reads: max_diff = 3
          [bwa_aln] 64bp reads: max_diff = 4
          [bwa_aln] 93bp reads: max_diff = 5
          [bwa_aln] 124bp reads: max_diff = 6
          [bwa_aln] 157bp reads: max_diff = 7
          [bwa_aln] 190bp reads: max_diff = 8
          [bwa_aln] 225bp reads: max_diff = 9
          [bwt_restore_bwt] fail to open file '/home/DATA/chr1.fa.bwt'. Abort!


          • #6
            Index your reference first!
            ./bwa index -a bwtsw /home/DATA/chr1.fa
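
The `fail to open file '/home/DATA/chr1.fa.bwt'` message in #5 means `bwa aln` could not find the index files that `bwa index` writes next to the reference. A quick check before aligning might look like the sketch below. The function name `bwa_index_missing` is hypothetical, and the list of suffixes is an assumption: only `.bwt` is confirmed by the error message in this thread, and different BWA versions may write a different set of files.

```shell
# Hypothetical helper: list the index files that are missing for a prefix.
# Prints one missing path per line; prints nothing when all are present.
bwa_index_missing() {
    prefix=$1
    # Assumed suffix list; only .bwt is confirmed by the error in this thread.
    for suffix in amb ann pac bwt sa; do
        [ -f "$prefix.$suffix" ] || printf '%s\n' "$prefix.$suffix"
    done
}

# Usage sketch (not run here): index only when something is missing.
#   if [ -n "$(bwa_index_missing /home/DATA/chr1.fa)" ]; then
#       ./bwa index -a bwtsw /home/DATA/chr1.fa
#   fi
```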


            • #7
              I indexed my reference first:

              $ bwa index -p indexed_chr1 -a is -c ~/fasta/chr1.fa
              [bwa_index] Pack nucleotide FASTA... 4.97 sec
              [bwa_index] Convert nucleotide PAC to color PAC... 1.48 sec
              [bwa_index] Reverse the packed sequence... 1.57 sec
              [bwa_index] Construct BWT for the packed sequence...
              [bwa_index] 110.54 seconds elapse.
              [bwa_index] Construct BWT for the reverse packed sequence...
              [bwa_index] 110.08 seconds elapse.
              [bwa_index] Update BWT... 1.03 sec
              [bwa_index] Update reverse BWT... 1.06 sec
              [bwa_index] Construct SA from BWT and Occ... 49.80 sec
              [bwa_index] Construct SA from reverse BWT and Occ... 50.04 sec


              Then I tried to align using the command below, but it gives an error while opening the FASTQ file:

              $ bwa aln ~/fasta/chr1.fa ~/datasets/SRR035022_1.filt.fastq > aln_sa.sai
              [bwa_aln] 17bp reads: max_diff = 2
              [bwa_aln] 38bp reads: max_diff = 3
              [bwa_aln] 64bp reads: max_diff = 4
              [bwa_aln] 93bp reads: max_diff = 5
              [bwa_aln] 124bp reads: max_diff = 6
              [bwa_aln] 157bp reads: max_diff = 7
              [bwa_aln] 190bp reads: max_diff = 8
              [bwa_aln] 225bp reads: max_diff = 9
              [bwa_seq_open] fail to open file '/home/ukursuncu/datasets/SRR035022_1.filt.fastq'. Abort!
              Aborted

              What could I be doing wrong?
              Last edited by kursuni; 09-26-2011, 01:56 PM.


              • #8
                kursuni, just a general remark: I would always give the complete path to your reference file, read file and output file, just to make sure, like:

                ./bwa aln /path/to/folder/with/fasta/chr1.fa /path/to/folder/with/datasets/SRR035022_1.filt.fastq > /path/to/output/aln_sa.sai

                If you are not so familiar with the command line yet, you can also look up the file names in your graphical file browser and copy/paste them into your command. That way, usually, the whole path is copied.

                cheers,
                Sophia
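
The full-path advice above can be checked from the shell before running bwa. Note that the first post's error names `/datasets/SRR035022_1.filt.fastq` while the error in #7 names `/home/ukursuncu/datasets/SRR035022_1.filt.fastq`; these are different directories, so one possibility is that the file exists in only one of them. The sketch below uses a hypothetical helper name, `abs_readable_path`, and only tests that the file is readable. Separately, the index in #7 was built with `-p indexed_chr1`, so the first argument to `bwa aln` may need to be `indexed_chr1` rather than `~/fasta/chr1.fa`; that is an observation about the commands shown, not something confirmed in the thread.

```shell
# Hypothetical helper: print the absolute path of a file if it is readable,
# otherwise print a message on stderr and return 1.
abs_readable_path() {
    file=$1
    if [ ! -r "$file" ]; then
        printf 'cannot read: %s\n' "$file" >&2
        return 1
    fi
    # Resolve the directory part with cd/pwd, then re-attach the file name.
    dir=$(cd "$(dirname "$file")" && pwd) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$file")"
}

# Usage sketch (not run here):
#   reads=$(abs_readable_path ~/datasets/SRR035022_1.filt.fastq) || exit 1
#   echo "$reads"
```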


                • #9
                  Hello,

                  I would like to build an index for aligning FASTQ sequences to a FASTA reference.

                  [haojamrocky@melon bwa-0.5.9]$ ./bwa index -a bwtsw -c /home/haojamrocky/DATA/hg18chr/hg18.fasta
                  [bwa_index] Pack nucleotide FASTA... [bns_fasta2bntseq] zero length sequence. Abort!


                  • #10
                    Hello,

                    I would like to build an index for aligning FASTQ sequences to the hg18 FASTA reference sequence. The FASTQ sample sequences are SOLiD reads. Could you please assist me? Below is the error message I get when running BWA to index the reference hg18.fasta.

                    [haojamrocky@melon bwa-0.5.9]$ ./bwa index -a bwtsw -c /home/haojamrocky/DATA/hg18chr/hg18.fasta
                    [bwa_index] Pack nucleotide FASTA... [bns_fasta2bntseq] zero length sequence. Abort!

                    Regards,
                    HR


                    • #11
                      Hello,

                      Before indexing hg18.fa, I combined chr1 to chrY, including chrM, into hg18.fa using the command below. When I index the single chr1 FASTA file, it runs properly. Is the error due to this command?

                      To cat all sequences together into one single FASTA record:
                      $ cat chr*.fa | sed -e "/^>/d" >> hg18.fa

                      Regards,
                      HR


                      • #12
                        Originally posted by haojam
                        Hello,

                        To cat all sequences together into one single FASTA record:
                        $ cat chr*.fa | sed -e "/^>/d" >> hg18.fa

                        Regards,
                        HR

                        Did you check what hg18.fa looks like after this and how large it is?
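
One way to answer that question is to count the header lines. The `sed -e "/^>/d"` in the quoted command deletes every line that starts with `>`, so the resulting file would contain no FASTA headers at all, which may be what BWA is complaining about. Also note that the command writes `hg18.fa` while the indexing command in #10 reads `hg18.fasta`. The sketch below uses a hypothetical helper name, `count_fasta_records`, and assumes each per-chromosome file starts with a `>` header.

```shell
# Hypothetical helper: count FASTA records by counting '>' header lines.
count_fasta_records() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so ignore the exit status and keep only the printed count.
    grep -c '^>' "$1" || true
}

# Usage sketch (not run here): concatenate the chromosome files while
# keeping their headers, then confirm the record count and the file size.
#   cat chr*.fa > hg18.fa
#   count_fasta_records hg18.fa
#   ls -l hg18.fa
```

A count of 0 would indicate that the headers were stripped; keeping them gives one record per chromosome instead of one merged record.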


                        • #13
                          Hello,

                          Does BWA support SRR.....fastq.bz2 files for aligning against the human genome reference sequence?

                          Regards,
                          HR


                          • #14
                            Hi kursuni,

                            you may have a look here:
                            Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                            that might help you


                            • #15
                              Originally posted by ulz_peter
                              Hi kursuni,

                              you may have a look here:
                              Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                              that might help you

                              Dear ulz_peter, that was very helpful. Thank you very much.
                              Since I'm very new to bioinformatics, I need a lot of help, and I'm trying to learn through books, papers and internet resources such as this forum. Learning bioinformatics takes time, but because my thesis is on bioinformatics and I have a deadline, I need to get this done as quickly as I can. I would therefore really appreciate any help with this.
                              If you can suggest any other papers or documents, I would appreciate that as well.

                              Thanks again.
                              Best Regards..
                              Ugur
