  • Bowtie2 mystery problem: empty fasta file

    Hi,

    Until recently I was using Bowtie2 with no problems. Today I am trying to build a new index, e.g.

    bowtie2-build test.fna test

    The FASTA file contains some plasmid sequence, but the build kept giving me the error below after 'stalling' for a few minutes:


    **********************************************
    fchr[A]: 0
    fchr[C]: 51813
    fchr[G]: 65001
    fchr[T]: 84678
    fchr[$]: 126810
    Exiting Ebwt::buildToDisk()
    Returning from initFromVector
    Wrote 4236774 bytes to primary EBWT file: p.1.bt2
    Wrote 31708 bytes to secondary EBWT file: p.2.bt2
    Re-opening _in1 and _in2 as input streams
    Returning from Ebwt constructor
    Headers:
    len: 126810
    bwtLen: 126811
    sz: 31703
    bwtSz: 31703
    lineRate: 6
    offRate: 4
    offMask: 0xfffffff0
    ftabChars: 10
    eftabLen: 20
    eftabSz: 80
    ftabLen: 1048577
    ftabSz: 4194308
    offsLen: 7926
    offsSz: 31704
    lineSz: 64
    sideSz: 64
    sideBwtSz: 48
    sideBwtLen: 192
    numSides: 661
    numLines: 661
    ebwtTotLen: 42304
    ebwtTotSz: 42304
    color: 0
    reverse: 0
    Total time for call to driver() for forward index: 00:00:00
    Warning: Empty fasta file: 'test.fna'
    Warning: All fasta inputs were empty
    Total time for backward call to driver() for mirror index: 00:00:41
    Error: Encountered internal Bowtie 2 exception (#1)
    Command: bowtie2-build --wrapper basic-0 test.fna p
    Deleting "p.3.bt2" file written during aborted indexing attempt.
    Deleting "p.4.bt2" file written during aborted indexing attempt.
    Deleting "p.1.bt2" file written during aborted indexing attempt.
    Deleting "p.2.bt2" file written during aborted indexing attempt.

    ******************************************************

    I experimented with the file size, and irrespective of the actual sequence used, my limit is 126811 bases.
    Any longer and the program gives the 'empty fasta file' error. Does anyone have any clue what is going on? Previously I have had no problems with much bigger files on the same machine (80 GB RAM).

    Thanks,

    S
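When testing a size limit like the one described above, it helps to know exactly how many bases each test file holds. The following is a minimal sketch, not part of the original post: the function name `fasta_lengths` and the demo records are hypothetical. It counts every sequence character, including any ambiguous bases, so its total may differ from the `len:` value that bowtie2-build reports.

```python
import io


def fasta_lengths(handle):
    """Yield (header, sequence_length) for each record in a FASTA stream."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length
            name, length = line[1:], 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


# Hypothetical in-memory demo; for a real file use open("test.fna") instead.
demo = io.StringIO(">plasmid_a\nACGT\nAC\n>plasmid_b\nGGG\n")
lengths = list(fasta_lengths(demo))
total = sum(n for _, n in lengths)
print(lengths, total)
```

Running this on truncated copies of the reference gives the exact base count for each file that does or does not trigger the error.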

  • #2
    Try redownloading bowtie2.


    • #3
      Originally posted by dpryan
      Try redownloading bowtie2.

      Yes, that was my next move. Thanks.


      • #4
        I just installed the latest version, 2.2.4 from source.

        I still get exactly the same file size error... so this appears to really be another file size bug in bowtie2?

        Does anyone know how I can fix this? I really need to build an index with this (pseudo) genome.

        Btw, bowtie1 gives the same error, as does every other version I have tried. With a file a few tens of bp smaller it works, and with one a few hundred bp larger it works again, but at the size I need, 204308 bp, no dice.

        Thanks for any help.. getting desperate.

        S.


        • #5
          OK, I thought of a solution: I've just duplicated the whole sequence in tandem and it worked. I only want to use the reference to filter out reads (keeping the unmapped reads), so this should work just as well.

          S.
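The tandem-duplication workaround described above can be sketched roughly as follows. This is an illustration only, not the poster's actual script: the helper names `read_fasta` and `duplicate_in_tandem` are hypothetical, and the sketch assumes each record's sequence is doubled individually (the post does not say whether the duplication was per record or for the file as a whole).

```python
import io


def read_fasta(handle):
    """Yield (header_line, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def duplicate_in_tandem(src, dst, width=60):
    """Write each record with its sequence repeated twice, wrapped at `width`."""
    for header, seq in read_fasta(src):
        doubled = seq + seq  # assumption: duplication is done per record
        dst.write(header + "\n")
        for i in range(0, len(doubled), width):
            dst.write(doubled[i:i + width] + "\n")


# Hypothetical in-memory demo; real use would pass open file handles.
src = io.StringIO(">p1\nACGT\nAC\n")
dst = io.StringIO()
duplicate_in_tandem(src, dst, width=8)
result = dst.getvalue()
print(result)
```

Because every position now exists twice, alignment coordinates and mapping qualities against such a reference are not meaningful; it only suits the filtering use described in the post, where the unmapped reads are what gets kept.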


          • #6
            Can you post the reference that's giving you problems somewhere? Then one of us can try to reproduce it and see if this actually is a bug or is just some local computer problem.


            • #7
              ok, I'll post it next chance I get. Thanks.


              • #8
                Hi, my file is 203 KB and the forum limit is 19.5 KB, so I can't upload it.


                • #9
                  Google Drive, Dropbox; there are plenty of options.


                  • #10
                    OK, here's the reference file. I just tried it again and got the same error:

                    Getting block 8 of 8
                    Reserving size (38308) for bucket
                    Calculating Z arrays
                    Calculating Z arrays time: 00:00:00
                    Entering block accumulator loop:
                    10%
                    20%
                    30%
                    40%
                    50%
                    60%
                    70%
                    80%
                    90%
                    100%
                    Block accumulator loop time: 00:00:00
                    Sorting block of length 18959
                    (Using difference cover)
                    Sorting block time: 00:00:00
                    Returning block of 18960
                    Exited Ebwt loop
                    fchr[A]: 0
                    fchr[C]: 83161
                    fchr[G]: 104600
                    fchr[T]: 136354
                    fchr[$]: 204308
                    Exiting Ebwt::buildToDisk()
                    Returning from initFromVector
                    Wrote 4262634 bytes to primary EBWT file: plasmids.1.bt2
                    Wrote 51084 bytes to secondary EBWT file: plasmids.2.bt2
                    Re-opening _in1 and _in2 as input streams
                    Returning from Ebwt constructor
                    Headers:
                    len: 204308
                    bwtLen: 204309
                    sz: 51077
                    bwtSz: 51078
                    lineRate: 6
                    offRate: 4
                    offMask: 0xfffffff0
                    ftabChars: 10
                    eftabLen: 20
                    eftabSz: 80
                    ftabLen: 1048577
                    ftabSz: 4194308
                    offsLen: 12770
                    offsSz: 51080
                    lineSz: 64
                    sideSz: 64
                    sideBwtSz: 48
                    sideBwtLen: 192
                    numSides: 1065
                    numLines: 1065
                    ebwtTotLen: 68160
                    ebwtTotSz: 68160
                    color: 0
                    reverse: 0
                    Total time for call to driver() for forward index: 00:00:00
                    Warning: Empty fasta file: 'plasmids.fna'
                    Warning: All fasta inputs were empty
                    Total time for backward call to driver() for mirror index: 00:00:00
                    Error: Encountered internal Bowtie 2 exception (#1)
                    Command: bowtie2-build --wrapper basic-0 plasmids.fna plasmids
                    Deleting "plasmids.3.bt2" file written during aborted indexing attempt.
                    Deleting "plasmids.4.bt2" file written during aborted indexing attempt.
                    Deleting "plasmids.1.bt2" file written during aborted indexing attempt.
                    Deleting "plasmids.2.bt2" file written during aborted indexing attempt.

                    • #11
                      I just had to gzip it for the forum.


                      • #12
                        The command was simply:

                        bowtie2-build plasmids.fna plasmids

                        (after gunzipping the file, obviously)


                        • #13
                          This is a local computer problem. I had no problem creating the indices.


                          • #14
                            Ok, thanks.

                            But I can't see why I get this problem. My Debian machine has plenty of memory and no other problems, and there is no indication in the log of what is going wrong. Bowtie2 starts building, then just seems to stop at the point where it decides the input is empty. It's very frustrating. And why would it only do it for certain file sizes?


                            • #15
                              Are you running out of space on that partition? Aside from that, I haven't a clue.
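One quick way to check the free-space question above is with Python's standard library. This is a minimal sketch: the function name `free_space_report` is hypothetical, and `"."` is assumed to be the directory where the index files are written.

```python
import shutil


def free_space_report(path="."):
    """Return total/free bytes and percent free for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    percent_free = 100.0 * usage.free / usage.total if usage.total else 0.0
    return {
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "percent_free": percent_free,
    }


# Assumption: the index is being built in the current working directory.
report = free_space_report(".")
print(f"{report['free_bytes'] / 2**30:.2f} GiB free "
      f"({report['percent_free']:.1f}% of partition)")
```

The logs in this thread show the primary index file at roughly 4 MB, so only a nearly full partition (or quota) would be expected to interfere; this check rules that case in or out.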
