Announcement

Collapse
No announcement yet.

BBMap (aligner for DNA/RNAseq) is now open-source and available for download.

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Hi Brian Bushnell

    I recently installed BBMap_38.98.tar.gz from SourceForge on a machine running Ubuntu 20.04.4 LTS.

    Per the usage guide, I checked my current version of Java with "java -Xmx90m -version" which returned the following:
    openjdk version "11.0.15" 2022-04-19
    OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
    OpenJDK 64-Bit Server VM (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1, mixed mode, sharing)

    I tried running reformat on a sam file with a command like:
    reformat.sh in=in.sam out=out.sam forcetrimleft=10

    And got the following error:
    java -ea -Xms300m -cp /data/usr/croy/tools/bbmap/current/ jgi.ReformatReads in=in.sam out=out.sam forcetrimleft=10
    Executing jgi.ReformatReads [in=in.sam, out=out.sam, forcetrimleft=10]

    Input is being processed as unpaired
    Waiting on header to be read from a sam file.
    Exception in thread "main" java.lang.AssertionError: H; 22M123H
    FS10001085:123:BRL95608-3321:1:1101:7680:1000 2195 chr3 50608025 18 22M123H = 50607939 -98 AGTGGTTTTCACTGACAGCGTG F,:FFFFFFFF::FFF:F,F,F NM:i:0 MD:Z:22 MC:Z:93M21D52M AS:i:22 XS:i:0 SA:Z:chr3,50608053,-,17S128M,60,1;
    at shared.TrimRead.trimReadWithMatchFast(TrimRead.java:625)
    at shared.TrimRead.trimReadWithMatch(TrimRead.java:684)
    at shared.TrimRead.trimByAmount(TrimRead.java:279)
    at shared.TrimRead.trimToPosition(TrimRead.java:240)
    at jgi.ReformatReads.process(ReformatReads.java:919)
    at jgi.ReformatReads.main(ReformatReads.java:53)

    The sam file was aligned to hg38 using a recent version of BWA MEM (v0.7.17-r1188). The CIGAR string seems to follow the SAM specs as far as I can tell, so I'm not sure where things are going wrong.

    Please let me know if you need any more information.

    Best,

    Charles

    Comment


    • I think you found a bug ...
      When you look at the given line 625 in the file ./current/shared/TrimRead.java, it evaluates the CIGAR string and asserts that it ends with a match (M or =). This should be the case because the trimReadWithMatchFast() function is only invoked for 100% match or unmapped reads:

      Code:
      char c=sl.cigar.charAt(sl.cigar.length()-1);
      assert(c=='M' || c=='=') : c+"; "+sl.cigar+"\n"+sl;
      But in your case, you have the hard-trimmed right part, so c is a H and the assertion fails, even though the function is rightfully invoked, since your remaining 22 bases are all matching.

      Comment


      • Originally posted by Thias View Post
        I think you found a bug ...
        When you look at the given line 625 in the file ./current/shared/TrimRead.java, it evaluates the CIGAR string and asserts that it ends with a match (M or =). This should be the case because the trimReadWithMatchFast() function is only invoked for 100% match or unmapped reads:

        Code:
        char c=sl.cigar.charAt(sl.cigar.length()-1);
        assert(c=='M' || c=='=') : c+"; "+sl.cigar+"\n"+sl;
        But in your case, you have the hard-trimmed right part, so c is a H and the assertion fails, even though the function is rightfully invoked, since your remaining 22 bases are all matching.
        Yes, that is exactly correct; good analysis. I need to add support for H symbols, which I did not expect to encounter in that function.

        Comment

        Working...
        X