  • BWA & FASTQ or FASTA

    Hi, I'm very new to bioinformatics and I have 2 FASTQ files to align. I'm trying to align these sequences with BWA, but it doesn't let me do it.
    It says:
    [bwa_index] fail to open file '/datasets/SRR035022_1.filt.fastq'. Abort!
    Aborted

    Do I have to use FASTA format with BWA, or do I have something wrong elsewhere?

    Thanks in advance.

  • #2
    In reply to kursuni:
    bwa 'index' is the clue. You bwa index the reference genome, then align your fastq files to it.

    Your reference, the file you index, is a FASTA file.

    Your reads are in FASTQ format.

    You posted the error, but not the command you ran, so it's hard to tell what usage generated the error. Failing to open the file could also mean you're just not pointing bwa to the right location, but even so it seems like you might have skipped a step ahead in your workflow.
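
A minimal sketch of the order of operations described above (index the reference, then align the reads), assuming a single-end run with bwa 0.5.x. The helper names `require_file` and `align_single_end` and all paths are hypothetical, and `bwa samse` is not mentioned in the thread; it is added here only to show how the `.sai` output is usually turned into SAM.

```shell
#!/bin/sh
# Sketch only: index a reference FASTA, then align single-end FASTQ reads.
# The function names and paths are hypothetical, not part of bwa.

# Return 0 if the file exists and is readable; otherwise complain and return 1.
require_file() {
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 1
    fi
}

# Usage: align_single_end REF.fa READS.fastq OUT_PREFIX
align_single_end() {
    ref=$1; reads=$2; out=$3
    # Check both inputs before bwa is called at all.
    require_file "$ref"   || return 1
    require_file "$reads" || return 1

    # Step 1: index the reference FASTA (needed once per reference).
    bwa index -a bwtsw "$ref" || return 1
    # Step 2: find the suffix-array coordinates of the FASTQ reads.
    bwa aln "$ref" "$reads" > "$out.sai" || return 1
    # Step 3: generate SAM alignments from the .sai file.
    bwa samse "$ref" "$out.sai" "$reads" > "$out.sam"
}

# Example (not run here):
# align_single_end /path/to/chr1.fa /path/to/reads.fastq /path/to/out
```

Checking the inputs up front separates "file not found" problems from genuine bwa errors. For paired-end data the last step would differ; this sketch does not cover that case.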

    • #3
      SeqAnswers Group

      Mr/Mrs.,
      I have installed and built (make) the BWA aligner, but when I go into the bwa-0.5.9 folder I see this:

      COPYING bamlite.c bwa bwase.o bwt_gen bwtaln.o bwtio.c bwtsw2_aux.o bwtsw2_main.o kseq.h main.c solid2fastq.pl utils.o
      ChangeLog bamlite.h bwa.1 bwaseqio.c bwt_lite.c bwtgap.c bwtio.o bwtsw2_chain.c cs2nt.c ksort.h main.h stdaln.c
      Makefile bamlite.o bwape.c bwaseqio.o bwt_lite.h bwtgap.h bwtmisc.c bwtsw2_chain.o cs2nt.o kstring.c main.o stdaln.h
      NEWS bntseq.c bwape.o bwt.c bwt_lite.o bwtgap.o bwtmisc.o bwtsw2_core.c is.c kstring.h qualfa2fq.pl stdaln.o
      README bntseq.h bwase.c bwt.h bwtaln.c bwtindex.c bwtsw2.h bwtsw2_core.o is.o kstring.o simple_dp.c utils.c
      SRR038263_1.sai bntseq.o bwase.h bwt.o bwtaln.h bwtindex.o bwtsw2_aux.c bwtsw2_main.c khash.h kvec.h simple_dp.o utils.h

      How should I start aligning SRR038263.fastq against the reference sequence (hg18, FASTA format)? I would be glad of your support.

      Regards,
      Momo

      • #4
        In reply to Bukowski:
        Thank you for your reply.

        Since I'm still learning, I only just realized that I need to use a FASTA file as my reference and FASTQ files as my reads.

        I think I'm pointing bwa to the right location; the problem is my lack of knowledge about DNA sequencing and about using these programs in a Linux environment.

        However, is there any source that I can download a FASTA file from?

        • #5
          Hi,

          When I run BWA to align SRR FASTQ reads against the chr1 FASTA sequence, I get the error below. Could you please help me out?

          Regards,

          [bwa-0.5.9]$ ./bwa aln /home/DATA/chr1.fa /home/DATA/SRA/SRA012240/SRX017837/SRR038263_1.fastq > SRR038263_1.sai
          [bwa_aln] 17bp reads: max_diff = 2
          [bwa_aln] 38bp reads: max_diff = 3
          [bwa_aln] 64bp reads: max_diff = 4
          [bwa_aln] 93bp reads: max_diff = 5
          [bwa_aln] 124bp reads: max_diff = 6
          [bwa_aln] 157bp reads: max_diff = 7
          [bwa_aln] 190bp reads: max_diff = 8
          [bwa_aln] 225bp reads: max_diff = 9
          [bwt_restore_bwt] fail to open file '/home/DATA/chr1.fa.bwt'. Abort!

          • #6
            Index your reference first!
            ./bwa index -a bwtsw /home/DATA/chr1.fa
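
A hedged sketch of a pre-flight check for the situation above: `bwa aln` aborted because `chr1.fa.bwt` was missing, so it can help to test for index files before aligning. `has_bwa_index` is a hypothetical helper, not a bwa subcommand, and it only checks the `.bwt` and `.sa` files; the full set of index files depends on the bwa version. If the index was built with `-p some_prefix`, that prefix (not the FASTA path) is what would be passed here and to `bwa aln`.

```shell
#!/bin/sh
# Sketch only: check for bwa index files before calling `bwa aln`.
# `has_bwa_index` is a hypothetical helper, not a bwa subcommand.

# Usage: has_bwa_index PREFIX
# PREFIX is the reference path, or the value given to `bwa index -p`.
has_bwa_index() {
    for ext in bwt sa; do
        if [ ! -r "$1.$ext" ]; then
            echo "missing index file: $1.$ext" >&2
            return 1
        fi
    done
    return 0
}

# Example (not run here):
# has_bwa_index /home/DATA/chr1.fa || ./bwa index -a bwtsw /home/DATA/chr1.fa
# ./bwa aln /home/DATA/chr1.fa /path/to/reads.fastq > reads.sai
```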

            • #7
              I indexed my reference first:

              $ bwa index -p indexed_chr1 -a is -c ~/fasta/chr1.fa
              [bwa_index] Pack nucleotide FASTA... 4.97 sec
              [bwa_index] Convert nucleotide PAC to color PAC... 1.48 sec
              [bwa_index] Reverse the packed sequence... 1.57 sec
              [bwa_index] Construct BWT for the packed sequence...
              [bwa_index] 110.54 seconds elapse.
              [bwa_index] Construct BWT for the reverse packed sequence...
              [bwa_index] 110.08 seconds elapse.
              [bwa_index] Update BWT... 1.03 sec
              [bwa_index] Update reverse BWT... 1.06 sec
              [bwa_index] Construct SA from BWT and Occ... 49.80 sec
              [bwa_index] Construct SA from reverse BWT and Occ... 50.04 sec


              Then I tried to align using the command below, but it gives an error when opening the FASTQ file:

              $ bwa aln ~/fasta/chr1.fa ~/datasets/SRR035022_1.filt.fastq > aln_sa.sai
              [bwa_aln] 17bp reads: max_diff = 2
              [bwa_aln] 38bp reads: max_diff = 3
              [bwa_aln] 64bp reads: max_diff = 4
              [bwa_aln] 93bp reads: max_diff = 5
              [bwa_aln] 124bp reads: max_diff = 6
              [bwa_aln] 157bp reads: max_diff = 7
              [bwa_aln] 190bp reads: max_diff = 8
              [bwa_aln] 225bp reads: max_diff = 9
              [bwa_seq_open] fail to open file '/home/ukursuncu/datasets/SRR035022_1.filt.fastq'. Abort!
              Aborted

              What am I doing wrong?
              Last edited by kursuni; 09-26-2011, 01:56 PM.

              • #8
                kursuni, just a general remark: I would always give the complete path to your reference file, read file and output file, just to make sure, like this:

                ./bwa aln /path/to/folder/with/fasta/chr1.fa /path/to/folder/with/datasets/SRR035022_1.filt.fastq > /path/to/output/aln_sa.sai

                If you are not so familiar with the command line yet, you can also look up the file names in your graphical file browser and copy/paste them into your command. That way, usually, the whole path is copied.

                cheers,
                Sophia
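
A small sketch of the full-path advice above, for anyone working purely on the command line. `abs_path` is a hypothetical helper that prints the absolute path of a file and fails if the file cannot be read. Note that the first error in this thread named `/datasets/SRR035022_1.filt.fastq`, while the later command used `~/datasets/...`, which expanded to `/home/ukursuncu/datasets/...`; these are different directories, so it may be worth checking which one actually holds the file.

```shell
#!/bin/sh
# Sketch only: resolve a possibly relative path to an absolute one and
# confirm the file is readable before handing it to bwa.
# `abs_path` is a hypothetical helper.

# Usage: abs_path FILE   (prints the absolute path; fails if unreadable)
abs_path() {
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 1
    fi
    # Resolve the containing directory, then re-attach the file name.
    dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
    echo "$dir/$(basename "$1")"
}

# Example (not run here):
# reads=$(abs_path SRR035022_1.filt.fastq) || exit 1
# ./bwa aln /path/to/folder/with/fasta/chr1.fa "$reads" > /path/to/output/aln_sa.sai
```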

                • #9
                  Hello,

                  I would like to index a FASTA reference so that I can align FASTQ sequences to it:

                  [haojamrocky@melon bwa-0.5.9]$ ./bwa index -a bwtsw -c /home/haojamrocky/DATA/hg18chr/hg18.fasta
                  [bwa_index] Pack nucleotide FASTA... [bns_fasta2bntseq] zero length sequence. Abort!

                  • #10
                    Hello,

                    I would like to index the hg18 FASTA reference sequence so that I can align FASTQ sequences to it. The FASTQ sample sequences are SOLiD reads. Could you please assist me? Below is the error message I get when running BWA to index the reference hg18.fasta.

                    [haojamrocky@melon bwa-0.5.9]$ ./bwa index -a bwtsw -c /home/haojamrocky/DATA/hg18chr/hg18.fasta
                    [bwa_index] Pack nucleotide FASTA... [bns_fasta2bntseq] zero length sequence. Abort!

                    Regards,
                    HR
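
The `zero length sequence` message suggests that bwa met a record with no bases in it, although the thread does not confirm the exact cause. A minimal diagnostic sketch, assuming a plain-text FASTA file: `fasta_lengths` (a hypothetical helper) prints each record name with its sequence length, so empty records stand out.

```shell
#!/bin/sh
# Sketch only: print "name<TAB>length" for every record in a FASTA file.
# `fasta_lengths` is a hypothetical helper.

# Usage: fasta_lengths FILE.fa
fasta_lengths() {
    awk '
        # A header line closes the previous record and starts a new one.
        /^>/ { if (seen) print name "\t" len
               seen = 1; name = substr($0, 2); len = 0; next }
        # Sequence line: drop a trailing carriage return, then count bases.
        { sub(/\r$/, ""); len += length($0) }
        END  { if (seen) print name "\t" len }
    ' "$1"
}

# Example (not run here): list only the records with no sequence.
# fasta_lengths /home/haojamrocky/DATA/hg18chr/hg18.fasta | awk -F'\t' '$2 == 0'
```

If the command prints nothing at all, the file contains no `>` header lines, which is a different problem from an empty record.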

                    • #11
                      Hello,

                      Before indexing hg18.fa, I put chr1 through chrY, including chrM, into hg18.fa using the command below. When I index a single chr1 FASTA file, it runs properly. Is the error caused by this command?

                      To cat all sequence together into one single fasta record:
                      $ cat chr*.fa | sed -e "/^>/d" >> hg18.fa

                      Regards,
                      HR

                      • #12
                        In reply to haojam:
                        Did you check what hg18.fa looks like after this, and how large it is?
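
A sketch of that check, plus one possible alternative. The `sed -e "/^>/d"` step deletes every line that starts with `>`, so the resulting hg18.fa has no header lines at all; that is a plausible (but unconfirmed here) reason for the indexing failure. Both helper names below are hypothetical. `concat_fasta` simply concatenates the per-chromosome files and keeps each `>` header, giving a multi-record FASTA rather than one single record.

```shell
#!/bin/sh
# Sketch only: inspect a combined FASTA and rebuild it with headers kept.
# Both helpers are hypothetical.

# Print the number of header lines (records) in a FASTA file.
count_fasta_records() {
    awk '/^>/ { n++ } END { print n + 0 }' "$1"
}

# Usage: concat_fasta OUT.fa IN1.fa [IN2.fa ...]
# Plain concatenation: every input keeps its own ">" header line.
concat_fasta() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Example (not run here):
# ls -lh hg18.fa                  # size on disk
# head -n 1 hg18.fa               # should start with ">"
# count_fasta_records hg18.fa     # 0 means every header was stripped
# concat_fasta hg18.fa chr*.fa    # overwrites hg18.fa instead of appending
```

Writing with `>` rather than `>>` also avoids appending to a leftover hg18.fa from an earlier attempt.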

                        • #13
                          Hello,

                          Does BWA support SRR.....fastq.bz2 files for aligning against the human genome reference sequence?

                          Regards,
                          HR

                          • #14
                            Hi kursuni,

                            you may have a look here:
                            Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                            that might help you

                            • #15
                              In reply to ulz_peter:
                              Dear ulz_peter, it was very helpful. Thank you very much.
                              Since I'm very new to bioinformatics, I need a lot of help, and I am trying to learn from books, papers and internet resources such as this forum. Learning bioinformatics takes time, but since my thesis is on bioinformatics and my time is limited, I need to do this as quickly as I can. I would therefore really appreciate any help on this.
                              If you can suggest any other papers or documents, I would appreciate that as well.

                              Thanks again.
                              Best regards,
                              Ugur
