  • bwa_indexing_error

    I am trying to map Illumina reads to the human genome with bwa.
    However, I am stuck at the very first step.

    I prepared a concatenated human genome sequence from the UCSC files.
    The random, Un, and hap files were removed beforehand.

    When I entered the following command, I got an error message.

    $ bwa index -a bwtsw hg19.fa
    [bwa_index] fail to open 'hg19.fa'. Abort!

    When I used only chromosome 1 instead of all chromosomes, the program did not stop.
    So I guess bwa itself works on my PC, and I suspect the problem is file size.
    However, hg19.fa is only 2.9 GB.
    In addition, everyone else reporting online seems able to use hg19.fa.

    Could you please help me?

  • #2
    Is hg19.fa actually there?


    • #3
      Yes.

      I can actually view it with the "more" command.


      • #4
        Originally posted by angelpie
        Yes.

        I can actually view it with the "more" command.
        Mmm... I'm looking at the bwa code. You get that error when the gzopen call on your hg19.fa fails. gzopen returns 0 (actually NULL) if the file could not be opened, if there was insufficient memory to allocate the gzFile state...

        Program: bwa (alignment via Burrows-Wheeler transformation)
        Version: 0.5.8-4 (r1544)

        d
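The error message only says that the open failed, not why. One way to narrow it down is to ask the operating system for the error code. The sketch below is a hypothetical helper (`explain_open_failure` is not part of bwa) that tries to open a file for reading and reports the errno name. It uses Python's built-in `open` rather than zlib's `gzopen`, so it only approximates what bwa does and may not reproduce a failure that is specific to the bwa build.

```python
import errno
import os
import sys


def explain_open_failure(path):
    """Try to open `path` for reading.

    Returns None if the file opens and one byte can be read,
    otherwise a short string naming the OS-level error.
    """
    try:
        with open(path, "rb") as handle:
            handle.read(1)  # touch the first byte to confirm it is readable
    except OSError as exc:
        if exc.errno is None:
            return str(exc)  # no errno available, fall back to the message
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        return f"{name}: {os.strerror(exc.errno)}"
    return None


if __name__ == "__main__":
    # Usage: python check_open.py hg19.fa chr1.fa
    for fasta in sys.argv[1:]:
        reason = explain_open_failure(fasta)
        print(fasta, "->", reason or "opened fine")
```

Running it on both hg19.fa and chr1.fa would show whether the two files behave differently at the operating-system level, independent of bwa.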


        • #5
          I use bwa-0.5.8c on Ubuntu 10.10.

          You mentioned gzopen.
          Does that mean bwa_index reads gzipped files?

          In fact, I tried a gzipped file once.
          In that case, bwa_index did not fail.

          However, I usually use uncompressed files,
          because every example I found online used uncompressed files.


          • #6
            Originally posted by angelpie
            I use bwa-0.5.8c on Ubuntu 10.10.

            You mentioned gzopen.
            Does that mean bwa_index reads gzipped files?
            Not necessarily. The reader is smart enough to read an uncompressed stream as well. I usually index uncompressed fasta files. Now I know I can use gzipped genomes, though, thank you :-)
            Sorry about your genome problem; I don't know what else to suggest or test.
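The idea that one reader can handle both compressed and uncompressed input can be sketched as follows. This is not bwa's code: it is a Python approximation that checks the two-byte gzip signature and picks the matching opener. The names `open_fasta` and `count_sequences` are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_fasta(path):
    """Return a binary file object that yields uncompressed bytes,
    whether `path` is plain text or gzip-compressed."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    return gzip.open(path, "rb") if is_gzip else open(path, "rb")


def count_sequences(path):
    """Count FASTA header lines (those starting with '>')."""
    with open_fasta(path) as handle:
        return sum(1 for line in handle if line.startswith(b">"))


if __name__ == "__main__":
    # Self-contained demo on throwaway files; no real genome is needed.
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "toy.fa")
        packed = plain + ".gz"
        records = b">chr1\nACGT\n>chr2\nGGCC\n"
        with open(plain, "wb") as out:
            out.write(records)
        with gzip.open(packed, "wb") as out:
            out.write(records)
        print(count_sequences(plain), count_sequences(packed))
```

Detecting the format from the file contents rather than from the `.gz` extension means a renamed file is still read correctly.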


            • #7
              @angelpie Can you list all the files in the directory where you have your fasta file?
              -drd


              • #8
                Originally posted by drio
                @angelpie Can you list all the files in the directory where you have your fasta file?
                My genome sequence file, hg19.fa, was prepared following
                http://chagall.med.cornell.edu/NGSco...eparations.pdf.

                I have tried this in several directories.
                In one of them,
                the files are chr1.fa - chr22.fa, chrX.fa, chrY.fa, chrM.fa, chromFa.tar,
                and hg19.fa.
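For reference, building a combined file from the per-chromosome files listed above can be sketched like this. It is an illustration only, not the procedure from the linked PDF: the function name `concatenate_chromosomes`, the chromosome order (1-22, X, Y, M), and the assumption that each chr*.fa ends with a newline are all assumptions made here.

```python
import os
import shutil
import tempfile

# Assumed order: autosomes 1-22, then X, Y, M (random/Un/hap files left out).
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y", "M"]


def concatenate_chromosomes(source_dir, output_path, names=CHROMOSOMES):
    """Append chr<name>.fa files from source_dir into output_path.

    Returns the list of expected files that were missing and skipped,
    so a silently incomplete genome is easy to spot.
    """
    missing = []
    with open(output_path, "wb") as out:
        for name in names:
            part = os.path.join(source_dir, f"chr{name}.fa")
            if not os.path.isfile(part):
                missing.append(part)
                continue
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # stream, do not load into RAM
    return missing


if __name__ == "__main__":
    # Demo on tiny stand-in files instead of real chromosomes.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("1", "2", "X"):
            with open(os.path.join(tmp, f"chr{name}.fa"), "wb") as fh:
                fh.write(f">chr{name}\nACGT\n".encode())
        combined = os.path.join(tmp, "hg19.fa")
        skipped = concatenate_chromosomes(tmp, combined, names=["1", "2", "X", "Y"])
        print("skipped:", [os.path.basename(p) for p in skipped])
        print("size:", os.path.getsize(combined), "bytes")
```

Checking the returned list and the final file size against the sum of the parts is a quick way to confirm that the combined file is complete.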


                • #9
                  I have the same problem.
                  Is there a link to get the full hg19 directly, without assembling it from all the chromosomes?
                  I also use Ubuntu, on a 32-bit machine.


                  • #10
                    I successfully indexed hg19, built by concatenating the separate chromosomes, on an Ubuntu 8.04 system with 8 GB of RAM. I think 32-bit machines are going to be fairly limited when indexing or aligning to hg19, because the maximum amount of RAM that a single process can use is around 2.5 GB.

                    Suggestion - try a 64-bit machine.
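A quick way to check the 32-bit versus 64-bit question, and whether a file is larger than a signed 32-bit offset can address, is sketched below. Two caveats: `pointer_bits` reports the word size of the Python build that runs it, which may differ from the kernel or from the bwa binary, and the 2 GiB offset limit is offered here only as one possible explanation that this thread does not confirm. All function names are hypothetical.

```python
import os
import struct

MAX_SIGNED_32 = 2**31 - 1  # largest value a signed 32-bit integer can hold


def pointer_bits():
    """Width in bits of a native pointer for this Python build (32 or 64)."""
    return struct.calcsize("P") * 8


def exceeds_32bit_offset(size_bytes):
    """True if a file of this size cannot be addressed with a signed 32-bit offset."""
    return size_bytes > MAX_SIGNED_32


def describe(path):
    """Collect the word size and file-size facts for one file."""
    size = os.path.getsize(path)
    return {
        "pointer_bits": pointer_bits(),
        "size_bytes": size,
        "over_32bit_offset": exceeds_32bit_offset(size),
    }


if __name__ == "__main__":
    import sys
    # Usage: python check_size.py hg19.fa chr1.fa
    for fasta in sys.argv[1:]:
        print(fasta, describe(fasta))
```

A file of about 2.9 GB is above this limit while a single chromosome is well below it, which would be consistent with chromosome 1 working and the full genome failing, but other causes are not ruled out.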


                    • #11
                      Dear colindaven,

                      Thank you for your suggestion.
                      However, I use Ubuntu 10.10 on a PC with a Xeon W3503 and 12 GB of memory.

                      I also suspect that my trouble comes from something specific to my PC's environment.
                      Unfortunately, I cannot try another PC.


                      • #12
                        In that case you could try a virtual machine image, or another aligner.

                        Although I can sometimes get results from bwa, it does tend to exhaust the memory and thereby make the server unusable.
