  • javijevi
    Member
    • Jan 2010
    • 38

    BFAST error in FindMatchesInIndexSet function

    Hi all,

    I successfully completed the first steps of the BFAST pipeline, including index creation, but got the error copied below when running the 'bfast match' step with the following command on a test FASTQ file with 9 reads:

    bfast match -f reference_genome.fa -A 1 -r test.fastq -i 1 -I 2-10 1> matches.bmf 2> match.log &

    Contents of match.log:
    (...)
    Searching index file 1/1 (index #1, bin #1) complete...
    Found 4 matches.
    Found matches for 4 reads.
    Copying unmatched reads for secondary index search.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.

    Any idea?

    Thanks in advance.
  • javijevi
    Member
    • Jan 2010
    • 38

    #2
    Just to note: I mistakenly pasted the last two lines of the output twice in my post above.

    • nilshomer
      Nils Homer
      • Nov 2008
      • 1283

      #3
      Any reason why you want to use secondary indexes? I would recommend using all the indexes in the primary search (no secondary indexes).

      This may be a bug (with the secondary search). Please submit your report to [email protected] so we can resolve the issue quickly.

      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        I have found the bug and fixed it in the latest source code, available via Git. Let me know if you have any problems :)

        • javijevi
          Member
          • Jan 2010
          • 38

          #5
          Originally posted by nilshomer:
          Any reason why you want to use secondary indexes? I would recommend using all the indexes in the primary search (no secondary indexes).
          In the BFAST book, you can find the following: 'If you wish to have a secondary set of indexes, which are used if no matches are found in the main set of indexes, use the -I option'. So I thought it would be more efficient not to use a mismatch-tolerant index, e.g., 1110111110011111, for reads that had already been mapped using an all-matches index, that is, 11111111111111.

          Obviously, I missed something important here, since the index-based search algorithm is complex for a biologist, so I will blindly follow your recommendation not to use secondary indexes.
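
For readers unfamiliar with the mask notation used in this post, here is a minimal Python sketch of how a spaced-seed mask can tolerate a mismatch. It is only an illustration and not BFAST's actual code: the function name `masked_key`, the toy sequences, and the 16-position all-ones mask are assumptions made for the example. The idea is that only positions marked '1' contribute to the lookup key, so a difference at a '0' position does not change the key.

```python
def masked_key(window: str, mask: str) -> str:
    """Keep only the bases at positions where the mask has a '1'."""
    if len(window) != len(mask):
        raise ValueError("window and mask must have the same length")
    return "".join(base for base, bit in zip(window, mask) if bit == "1")


# Mismatch-tolerant mask quoted in the post (16 positions, three of them '0').
TOLERANT_MASK = "1110111110011111"
# Hypothetical all-ones mask of the same length, used here for comparison only.
EXACT_MASK = "1" * len(TOLERANT_MASK)

# Toy sequences (assumptions): the read differs from the reference only at
# 0-based position 3, which is a '0' position in TOLERANT_MASK.
reference = "ACGTACGTACGTACGT"
read = "ACGAACGTACGTACGT"

same_with_tolerant = masked_key(reference, TOLERANT_MASK) == masked_key(read, TOLERANT_MASK)
same_with_exact = masked_key(reference, EXACT_MASK) == masked_key(read, EXACT_MASK)

print(same_with_tolerant)  # True: the mismatch falls on an ignored position
print(same_with_exact)     # False: every position is compared
```

In this toy case the read is found through the tolerant mask but not through the all-ones mask, which is the sense in which a mask with zeros "allows" mismatches.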

          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            I have spent a lot of time thinking about the indexing strategy, and I would follow the one in section 7.1, where we use 10 "main" indexes and no secondary indexes.

            I apologize for the confusion; I kept the secondary-index option available for flexibility.
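
The difference between the two strategies discussed in this thread can be sketched in a few lines of Python. This is a toy model under stated assumptions, not BFAST's implementation: the `search` function, the lookup tables, and the positions are all hypothetical. It only encodes the rule quoted from the BFAST book, that secondary indexes are consulted when the main set finds no matches, and compares it with putting every index in the main set.

```python
from typing import Callable, Dict, Iterable, List, Sequence

# Hypothetical index type: maps a read to a list of candidate positions.
Index = Callable[[str], List[int]]


def search(
    reads: Iterable[str],
    main: Sequence[Index],
    secondary: Sequence[Index] = (),
) -> Dict[str, List[int]]:
    """Query all main indexes; fall back to secondary ones only if nothing was found."""
    hits: Dict[str, List[int]] = {}
    for read in reads:
        found = [pos for index in main for pos in index(read)]
        if not found:
            found = [pos for index in secondary for pos in index(read)]
        hits[read] = sorted(set(found))
    return hits


# Toy lookup tables (assumptions for illustration only).
EXACT = {"ACGT": [0]}        # matches the whole read
TOLERANT = {"ACG": [0, 8]}   # ignores the last base of the read


def exact_index(read: str) -> List[int]:
    return EXACT.get(read, [])


def tolerant_index(read: str) -> List[int]:
    return TOLERANT.get(read[:3], [])


reads = ["ACGT", "ACGA"]
two_tier = search(reads, main=[exact_index], secondary=[tolerant_index])
all_main = search(reads, main=[exact_index, tolerant_index])

print(two_tier)  # {'ACGT': [0], 'ACGA': [0, 8]}
print(all_main)  # {'ACGT': [0, 8], 'ACGA': [0, 8]}
```

In this toy model, the two-tier search skips the tolerant index for the read that already matched exactly, so it reports fewer candidate positions for that read. Whether this is the reasoning behind the recommendation above is not stated in the thread.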
