  • niti217
    Member
    • Dec 2011
    • 10

    #16
    I am having a similar problem. I have spent the past 5 hours debugging this error, but in vain.
    I would really appreciate any help in this regard.

    I am trying to index the Homo_sapiens.GRCh37 ...fa file using the command:

    bwa index -a bwtsw /directory/filename.fa

    but it keeps giving me the following error:

    [bwa_index] Pack FASTA... 56.76 sec
    [bwa_index] Reverse the packed sequence... Segmentation fault

    Can someone please suggest a possible fix? Thanks.

    • slengyel
      Junior Member
      • Dec 2012
      • 7

      #17
      BWA aln error: can't locate index

      Greetings,

      I'm trying to align my raw paired-end Illumina reads to my best ABySS contigs.fa.

      The commands I used to index and align are as follows:

      /home/stephen/Programs/BWA/bwa-0.7.3a/bwa index -p contigs.fa -a bwtsw /DATA/ANALYSIS/stephen/k62/contigs.fa

      /home/stephen/Programs/BWA/bwa-0.7.3a/bwa aln /DATA/ANALYSIS/stephen/k62/contigs.fa /DATA/RAW_DATA/$1.read1.gz -t 4 >/DATA/ANALYSIS/stephen/$1.read1.sai

      I repeat the second command for the read2.gz file.

      The indexing appears to go smoothly, i.e. the proper outputs are there. However, when I run the bwa aln command, the following occurs:

      [bwa_aln] 17bp reads: max_diff = 2
      [bwa_aln] 38bp reads: max_diff = 3
      [bwa_aln] 64bp reads: max_diff = 4
      [bwa_aln] 93bp reads: max_diff = 5
      [bwa_aln] 124bp reads: max_diff = 6
      [bwa_aln] 157bp reads: max_diff = 7
      [bwa_aln] 190bp reads: max_diff = 8
      [bwa_aln] 225bp reads: max_diff = 9
      [bwa_aln] fail to locate the index
      [main] Version: 0.7.3a-r367

      I'm running the commands in the same directory as the index outputs. Why would bwa not be able to find the indices? There seems to be no parameter to tell bwa aln where the index files are located.

      The end goal is to obtain the metric outputs from CollectInsertSizeMetrics after further Picard tools conversions, in order to verify the insert size and standard deviation values required for the ALLPATHS-LG input files.

      All help is appreciated, and thanks in advance.

      • mastal
        Senior Member
        • Mar 2009
        • 666

        #18
        BWA & FASTQ or FASTA

        Hi,

        You don't say how long your Illumina reads are.

        bwa aln (BWA-backtrack) is designed for reads up to about 100 bp, so this could be the problem.

        • slengyel
          Junior Member
          • Dec 2012
          • 7

          #19
          This particular data set has a read length of 90. Two others I was going to attempt later have read lengths of 140.

          • mastal
            Senior Member
            • Mar 2009
            • 666

            #20
            BWA & FASTQ or FASTA

            OK, so it looks like the read length shouldn't be the problem.

            Is the bwa index in the same directory as the contigs.fa file? If not, that may be why you get the 'fail to locate the index' error message.
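
A quick way to check this before running bwa aln, as a minimal sketch: bwa looks for the index files at the prefix given on the command line, so the helper below tests whether the index files exist for that prefix. The name check_bwa_index is made up for illustration and is not part of bwa; the suffix list assumes a standard bwa index run that writes .amb, .ann, .bwt, .pac and .sa files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bwa): report which index files are
# missing for a given index prefix. Assumes the usual five suffixes
# written by 'bwa index'.
check_bwa_index() {
    local prefix="$1" missing=0 ext
    for ext in amb ann bwt pac sa; do
        if [ ! -f "${prefix}.${ext}" ]; then
            echo "missing: ${prefix}.${ext}" >&2
            missing=1
        fi
    done
    # Exit status 0 if every file was found, 1 otherwise.
    return "$missing"
}
```

For the commands in post #17, `check_bwa_index /DATA/ANALYSIS/stephen/k62/contigs.fa` would presumably report missing files, because `-p contigs.fa` writes the index into the directory where bwa index was run. Run from that directory, `check_bwa_index contigs.fa` should pass, and `contigs.fa` is then the prefix to hand to bwa aln.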

            • slengyel
              Junior Member
              • Dec 2012
              • 7

              #21
              The bwa index is not in the same directory as the reference contigs.fa. I'll make a copy of it in the directory where the indices are. I will update with results.

              Thank you for your reply.

              • slengyel
                Junior Member
                • Dec 2012
                • 7

                #22
                That seems to have done the trick. Thanks again. It's always good to gain a different perspective. Like many, I am new to bioinformatics and assembly.

                • rob123king
                  Junior Member
                  • Feb 2013
                  • 9

                  #23
                  BWA index non-recognition issue

                  Hi,

                  I have indexed my tomato genome thus:
                  bwa index -a bwtsw -p S_lycopersicum S_lycopersicum.fa

                  which does not produce any errors.

                  The reads are 101 bp at 50x coverage.

                  All files, i.e.
                  S_lycopersicum.fa, S_lycopersicum.amb, S_lycopersicum.ann, S_lycopersicum.bwt, S_lycopersicum.pac, S_lycopersicum.sa,
                  are located in the same directory, but when I run this:
                  bwa mem -pPM -t 10 S_lycopersicum.fa LIB2975_LDI2549_R1L78merged.fastq LIB2975_LDI2549_R2L78merged.fastq > IB2975_LDI2549.sam

                  I get this error:
                  [E::bwa_idx_load] fail to locate the index files

                  Any thoughts on what I'm doing wrong??
                  Thanks

                  • ulz_peter
                    Senior Member
                    • Feb 2010
                    • 219

                    #24
                    With the -p parameter you specify the prefix of the index files. In the bwa mem command you need to give exactly the same prefix as you specified with the -p argument in the index step.

                    Hence: drop the .fa suffix of S_lycopersicum.fa.

                    It should look like this:
                    bwa mem -pPM -t 10 S_lycopersicum LIB2975_LDI2549_R1L78merged.fastq LIB2975_LDI2549_R2L78merged.fastq > IB2975_LDI2549.sam
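
One way to avoid this kind of mismatch, as a minimal sketch: keep the prefix in a single shell variable and use it for both steps. The reference file name comes from the posts above; print_bwa_commands and the read and output names passed to it are made up for illustration, and the function only prints the commands rather than running bwa.

```shell
#!/usr/bin/env bash
# Keep the index prefix in one variable so that 'bwa index -p' and
# 'bwa mem' always refer to the same set of index files.
ref_fasta="S_lycopersicum.fa"
idx_prefix="${ref_fasta%.fa}"   # strips the trailing .fa

# Hypothetical dry-run helper: prints the two commands instead of
# running them. Arguments: <reads_1> <reads_2> <output_sam>.
print_bwa_commands() {
    echo "bwa index -a bwtsw -p ${idx_prefix} ${ref_fasta}"
    echo "bwa mem -t 10 ${idx_prefix} $1 $2 > $3"
}
```

If -p is left out of the index step, bwa index uses the FASTA file name itself as the prefix, in which case the `.fa` name is what bwa mem expects.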

                    • rob123king
                      Junior Member
                      • Feb 2013
                      • 9

                      #25
                      Yes, that was it. Love ya, thanks.

                      • rob123king
                        Junior Member
                        • Feb 2013
                        • 9

                        #26
                        It's running, but I'm getting this at the command prompt.
                        Has it not used both files? Does this output look normal?

                        [W::main_mem] when '-p' is in use, the second query file will be ignored.
                        [M::main_mem] read 990100 sequences (100000100 bp)...
                        [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (7, 1, 5, 5)
                        [M::mem_pestat] skip orientation FF as there are not enough pairs
                        [M::mem_pestat] skip orientation FR as there are not enough pairs
                        [M::mem_pestat] skip orientation RF as there are not enough pairs
                        [M::mem_pestat] skip orientation RR as there are not enough pairs
                        [M::worker2@3] performed mate-SW for 0 reads
                        [M::worker2@7] performed mate-SW for 0 reads
                        [M::worker2@8] performed mate-SW for 0 reads
                        [M::worker2@2] performed mate-SW for 0 reads
                        [M::worker2@9] performed mate-SW for 0 reads
                        [M::worker2@0] performed mate-SW for 0 reads
                        [M::worker2@5] performed mate-SW for 0 reads

                        • ulz_peter
                          Senior Member
                          • Feb 2010
                          • 219

                          #27
                          Is there any reason you specified the -pPM parameters?

                          If you've got two FASTQ files, skip the -p option. And I'm not sure about the -P option; for a first shot, give it a try like this:

                          bwa mem -M -t 10 S_lycopersicum LIB2975_LDI2549_R1L78merged.fastq LIB2975_LDI2549_R2L78merged.fastq > IB2975_LDI2549.sam

                          • rob123king
                            Junior Member
                            • Feb 2013
                            • 9

                            #28
                            I was looking at the manual and -pP seemed like good options, with -M for marking duplicates with Picard later. Yeah, I'll give that a go and keep it simple.
                            Thanks

                            • lh3
                              Senior Member
                              • Feb 2008
                              • 686

                              #29
                              If you have two FASTQ files, one for each end, don't use -p. If you want to perform paired-end alignment, don't use -P. These options are for special use cases. Usually you don't need them. Use -M, though.

                              • rob123king
                                Junior Member
                                • Feb 2013
                                • 9

                                #30
                                All sorted now, thanks very much. Completing with samtools mpileup now.
