
  • bbmap aborts after mapping some reads

    Hello Brian,

    we are using bbmap to see how far it is possible to quantify gene expression by mapping Illumina RNA-seq reads to the genome of a closely related species, e.g. mapping chimpanzee reads to human or, as in this example, macaque reads.

    To this end, we generated macaque Illumina SE reads using flux-simulator and mapped them to
    hg38 and, for comparison, also to Mmul8, downloaded from Ensembl (wget

    Everything mapped fine to hg38, but not to Mmul8.

    Exception in thread "Thread-12" java.lang.AssertionError
    at align2.BBIndex.extendScore(
    at align2.BBIndex.slowWalk3(
    at align2.BBIndex.find(
    at align2.BBIndex.find(
    at align2.BBIndex.findAdvanced(
    at align2.AbstractMapThread.quickMap(
    at align2.BBMapThread.processRead(

    I tried to run on one thread, increased memory to 101G, removed small contigs of <100kb ... but the error message remains the same.

    We are running a Debian system with java version "1.8.0_181" and have BBMap version 38.02 -- the detailed error output is in the attached file.

    The false mapping rates of bbmap are so much better than those of STAR and GSNAP that we definitely want to use bbmap for our paper. We are nearly done with all other species (marmoset, gorilla, chimpanzee and orangutan) and the simulations ran through; the only missing piece is the mapping to Mmul8.

    Any help would be greatly appreciated.

    Best, Ines


    • bbmap for demultiplexing dual barcodes.

      If possible, I need to demultiplex using dual indexes.

      For example (dual barcode in bold):

      #R1 read

      #R2 read

      There are 16 possibilities in the file I am working on.

      The first four nts are the barcode; for the example above that would be:

      But you would need both reads to tell you that it's GACT-CTGA and not something else.
      What would the command look like for this? Does this demux script do the dual barcoding?
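The pairing logic described above (first four nts of R1 plus first four nts of R2 form the key) can be sketched in plain Python. This is only an illustration of the idea, not a BBTools command: the function names `read_fastq` and `demux_dual`, the barcode length default, and the toy reads are all assumptions made for the example. The barcodes are left in the reads rather than trimmed.

```python
import io
from collections import defaultdict
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        block = list(islice(handle, 4))
        if len(block) < 4:
            return
        header, seq, _plus, qual = (line.rstrip("\n") for line in block)
        yield header, seq, qual


def demux_dual(r1_handle, r2_handle, barcode_len=4):
    """Group read pairs by the combined key 'R1prefix-R2prefix'.

    Assumes R1 and R2 list the same reads in the same order.
    """
    bins = defaultdict(list)
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        key = f"{rec1[1][:barcode_len]}-{rec2[1][:barcode_len]}"
        bins[key].append((rec1, rec2))
    return bins


# Toy, made-up reads: two pairs with different R1 barcodes and the same R2 barcode.
r1 = io.StringIO(
    "@read1/1\nGACTAAAACCCC\n+\nIIIIIIIIIIII\n"
    "@read2/1\nTTGAGGGGTTTT\n+\nIIIIIIIIIIII\n"
)
r2 = io.StringIO(
    "@read1/2\nCTGATTTTGGGG\n+\nIIIIIIIIIIII\n"
    "@read2/2\nCTGACCCCAAAA\n+\nIIIIIIIIIIII\n"
)

bins = demux_dual(r1, r2)
for key, pairs in sorted(bins.items()):
    print(key, len(pairs))
```

For real data the two `io.StringIO` objects would be replaced by opened R1 and R2 files, and each bin would be written to its own pair of output files instead of being held in memory.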


      • ref input for BBMap and paired ends

        I am sorry if this question is very basic, but I am getting a low percentage of reads mapping to the reference genome: only about 36% of reads mapped. Any clue why this is the case?

        I am using a genome in scaffolds as the reference, with paired-end reads...


        • Originally posted by juanita
          I am sorry if this question is very basic, but I am getting a low percentage of reads mapping to the reference genome: only about 36% of reads mapped. Any clue why this is the case?

          I am using a genome in scaffolds as the reference, with paired-end reads...
          Have you trimmed adapters away from the reads? Short fragments will create reads that are part genomic and part adapter, and these may not map. You could use the related BBMap tool sendsketch to get a sense of what is in your reads (after trimming). When we do genotyping of samples, many samples turn out to be contaminated, and sendsketch can help figure out what is in there. You can input the entire fastq file to sendsketch, or go to read mode and get a result on a per-read basis.

          You can also grab 100 reads, turn them into fasta format and do blastn with them (if online, use the blastn rather than the megablast option) and see read by read what is in there.

          Other possibilities: your sample is not closely related to the reference, the reference is incomplete and missing regions, or the reference lacks high-copy content such as mtDNA or chloroplast sequence and many reads come from those.
          Providing nextRAD genotyping and PacBio sequencing services.
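The "grab 100 reads and turn them into fasta" step from the reply above could look roughly like this in Python. It is a minimal sketch: the function name `fastq_head_to_fasta` and the toy input are made up for the example, and it takes the first N reads rather than a random sample, which may not be representative if the file is sorted in some way.

```python
import io
from itertools import islice


def fastq_head_to_fasta(fastq_handle, fasta_handle, n_reads=100):
    """Write the first n_reads FASTQ records as FASTA; return how many were written."""
    written = 0
    while written < n_reads:
        block = list(islice(fastq_handle, 4))
        if len(block) < 4:
            break  # end of input (or a truncated final record)
        header = block[0].rstrip("\n")
        seq = block[1].rstrip("\n")
        # Drop the leading '@' of the FASTQ header and use '>' instead.
        fasta_handle.write(f">{header[1:]}\n{seq}\n")
        written += 1
    return written


# Toy, made-up input with two reads.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
out = io.StringIO()
count = fastq_head_to_fasta(demo, out, n_reads=100)
print(out.getvalue(), end="")
```

With real files, `demo` and `out` would be opened file handles, and the resulting FASTA can be pasted into the blastn web form.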


          • How to use usejni with latest version (

            I just installed the latest version of the BBTools (38.26), and I can't seem to get the usejni flag to work. The java .so file compiles fine, but then I get error messages like this when I run, e.g., with usejni=t:
            Native library can not be found in java.library.path.
            I found this in the changelog:
            Removed JNI path flag from BBMerge, BBMap, and RQCFilter shell scripts.
            and this in docs/compiling.txt:
            3) C code. This was developed by Jonathan Rood to accelerate BBMap, BBMerge, and Dedupe, but is currently disabled.
            And sure enough, it is commented out in the code:
                    #local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
                    local CMD="java $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
            If I revert to the previous version of the CMD, with the java.library.path set, then the command runs with the C code fine.

            Why was this disabled? Does this affect previous analyses that used this C code? Or was this purely a performance issue?

            Sorry if I've missed this posted somewhere else, and thanks in advance for any help.



            • usejni and compiled C code in BBTools

              I just installed the latest version of the BBTools (38.26), and I notice that the C code provided by the usejni=t flag for some tools has been deprecated / disabled.

              I found this in the changelog:
              Removed JNI path flag from BBMerge, BBMap, and RQCFilter shell scripts.
              and this in docs/compiling.txt:
              3) C code. This was developed by Jonathan Rood to accelerate BBMap, BBMerge, and Dedupe, but is currently disabled.
              Sure enough, it is commented out in the code:
                      #local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
                      local CMD="java $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
              If I revert to the previous version of the CMD, with the java.library.path set, then the command runs with the compiled C code just fine.

              Why was this disabled? Does this affect previous analyses that used this C code? That is, does the C code contain an error that means usejni=t in previous versions will produce different output than the java-only code? Or was this purely a performance or compatibility issue, or something else?

              Sorry if I've missed this already posted somewhere, and thanks in advance for any help.



                • Hi Brian & all,
                  I'm using BBMap 38.26 with a very large reference genome, and one of the chromosomes in this genome is long enough to break the BBMap reference-building step.

                  Here is the fasta index of this reference:
                  Chr01 301019445 7 60 61
                  Chr02 163962470 306036450 60 61
                  Chr03 261511374 472731635 60 61
                  Chr04 215701946 738601539 60 61
                  Chr05 217274494 957898525 60 61
                  Chr06 219521584 1178794268 60 61
                  Chr07 222112641 1401974553 60 61
                  Chr08 153299543 1627789079 60 61
                  Chr09 238794889 1783643622 60 61
                  Chr10 205736368 2026418433 60 61
                  Chr11 220335243 2235583748 60 61
                  Chr12 229934170 2459591253 60 61
                  Chr00 714758103 2693357667 60 61
                  You can see that the longest chromosome exceeds 536670912, which causes a problem like this:
                  bbmap-38.26/ ref=ref.fasta rebuild=t usemodulo=t -Xmx60g
                  java -ea -Xmx60g -cp /home/sn/software/bbmap-38.26/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=/home/yangjy/16T4/Genome/GEN181516HEB/_db/Capsicum.annuum.L_Zunla-1_Release_2.0.fasta rebuild=t usemodulo=t -Xmx60g
                  Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, ref=/home/yangjy/16T4/Genome/GEN181516HEB/_db/Capsicum.annuum.L_Zunla-1_Release_2.0.fasta, rebuild=t, usemodulo=t, -Xmx60g]
                  Version 38.26

                  No output file.
                  Writing reference.
                  Executing dna.FastaToChromArrays2 [/home/yangjy/16T4/Genome/GEN181516HEB/_db/Capsicum.annuum.L_Zunla-1_Release_2.0.fasta, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]

                  Set genScaffoldInfo=true
                  Writing chunk 1
                  Writing chunk 2
                  Writing chunk 3
                  Writing chunk 4
                  Writing chunk 5
                  Writing chunk 6
                  Exception in thread "main" java.lang.AssertionError: 714758103, 8000, 7999, 536670912
                  at dna.FastaToChromArrays2.makeNextChrom(
                  at dna.FastaToChromArrays2.makeChroms(
                  at dna.FastaToChromArrays2.main2(
                  at align2.RefToIndex.makeIndex(
                  at align2.BBMap.setup(
                  at align2.AbstractMapper.<init>(
                  at align2.BBMap.<init>(
                  at align2.BBMap.main(
                  I'm pretty sure the 'maxlen' argument of dna.FastaToChromArrays2 does not fit my situation, but I'm not sure how to fix this.

                  Has anyone dealt with this kind of thing before? Any suggestions or discussion would help! >_<
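To see which sequences would trip the limit before starting an index build, the fasta index can be checked against the `maxlen` value shown in the log. The sketch below uses the index lines and the `maxlen=536670912` value exactly as posted above; the function name `sequences_over_limit` is made up for the example. It only identifies the offending sequences and does not claim to be a fix for the assertion.

```python
# maxlen value copied from the FastaToChromArrays2 log line in the post.
MAXLEN = 536670912

# Fasta index (.fai) lines copied from the post: name, length, offset, ...
FAI_TEXT = """\
Chr01 301019445 7 60 61
Chr02 163962470 306036450 60 61
Chr03 261511374 472731635 60 61
Chr04 215701946 738601539 60 61
Chr05 217274494 957898525 60 61
Chr06 219521584 1178794268 60 61
Chr07 222112641 1401974553 60 61
Chr08 153299543 1627789079 60 61
Chr09 238794889 1783643622 60 61
Chr10 205736368 2026418433 60 61
Chr11 220335243 2235583748 60 61
Chr12 229934170 2459591253 60 61
Chr00 714758103 2693357667 60 61
"""


def sequences_over_limit(fai_lines, maxlen):
    """Return (name, length) for every index entry longer than maxlen."""
    over = []
    for line in fai_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, length = fields[0], int(fields[1])
        if length > maxlen:
            over.append((name, length))
    return over


over = sequences_over_limit(FAI_TEXT.splitlines(), MAXLEN)
for name, length in over:
    print(f"{name}: length {length} exceeds maxlen {MAXLEN}")
```

Here only Chr00 is reported, and its length (714758103) is the first number in the AssertionError message. For a real index, pass an opened `.fai` file handle instead of `FAI_TEXT.splitlines()`.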


                  • I see that you are assigning 60G of RAM. Have you tried to assign more and see if it helps?


                      • Originally posted by GenoMax
                        I see that you are assigning 60G of RAM. Have you tried to assign more and see if it helps?
                        Thanks for your reply. In my tests I tried raising the Java memory limit from 20G all the way to 400G, but neither the "maxlen" argument of dna.FastaToChromArrays2 nor the error message changed.


                        • @1989sn1027: Brian has not been participating on SA for the last few months. You could try creating a ticket at SourceForge and see if he responds to this report.


                          • Originally posted by GenoMax
                            @1989sn1027: Brian has not been participating on SA for the last few months. You could try creating a ticket at SourceForge and see if he responds to this report.
                            Thanks for the pointer. I'll give that a shot.


                            • Hi Brian, could you please answer my questions posted here at your convenience?

                              Thanks in advance.


                              • problem with output

                                Hi, I am having trouble running bbwrap. Also, how can I get a file that tells me the percentage of reads that mapped and did not map?

                                This is what I am running:

                                cd /space/home/aguilar/Ofav_temp/Trim
                                /space/home/aguilar/Programs/bbmap/ t=40 in=\
                                S1_F_paired_1.fq,S10_F_paired_1.fq,S11_F_paired_1.fq,S12_F_paired_1.fq,S13_F_paired_1.fq,S14_F_paired_1.fq,S15_F_paired_1.fq,S16_F_paired_1.fq,S17_F_paired_1.fq,S18_F_paired_1.fq,S19_F_paired_1.fq,S2_F_paired_1.fq,S20_F_paired_1.fq,S21_F_paired_1.fq,S22_F_paired_1.fq,S23_F_paired_1.fq,S24_F_paired_1.fq,S25_F_paired_1.fq,S26_F_paired_1.fq,S27_F_paired_1.fq,S28_F_paired_1.fq,S29_F_paired_1.fq,S3_F_paired_1.fq,S30_F_paired_1.fq,S31_F_paired_1.fq,S32_F_paired_1.fq,S33_F_paired_1.fq,S34_F_paired_1.fq,S35_F_paired_1.fq,S36_F_paired_1.fq,S37_F_paired_1.fq,S38_F_paired_1.fq,S39_F_paired_1.fq,S4_F_paired_1.fq,S40_F_paired_1.fq,S41_F_paired_1.fq,S42_F_paired_1.fq,S43_F_paired_1.fq,S44_F_paired_1.fq,S45_F_paired_1.fq,S46_F_paired_1.fq,S47_F_paired_1.fq,S48_F_paired_1.fq,S5_F_paired_1.fq,S6_F_paired_1.fq,S7_F_paired_1.fq,S8_F_paired_1.fq,S9_F_paired_1.fq \
                                in2=S1_R_paired_2.fq,S10_R_paired_2.fq,S11_R_paired_2.fq,S12_R_paired_2.fq,S13_R_paired_2.fq,S14_R_paired_2.fq,S15_R_paired_2.fq,S16_R_paired_2.fq,S17_R_paired_2.fq,S18_R_paired_2.fq,S19_R_paired_2.fq,S2_R_paired_2.fq,S20_R_paired_2.fq,S21_R_paired_2.fq,S22_R_paired_2.fq,S23_R_paired_2.fq,S24_R_paired_2.fq,S25_R_paired_2.fq,S26_R_paired_2.fq,S27_R_paired_2.fq,S28_R_paired_2.fq,S29_R_paired_2.fq,S3_R_paired_2.fq,S30_R_paired_2.fq,S31_R_paired_2.fq,S32_R_paired_2.fq,S33_R_paired_2.fq,S34_R_paired_2.fq,S35_R_paired_2.fq,S36_R_paired_2.fq,S37_R_paired_2.fq,S38_R_paired_2.fq,S39_R_paired_2.fq,S4_R_paired_2.fq,S40_R_paired_2.fq,S41_R_paired_2.fq,S42_R_paired_2.fq,S43_R_paired_2.fq,S44_R_paired_2.fq,S45_R_paired_2.fq,S46_R_paired_2.fq,S47_R_paired_2.fq,S48_R_paired_2.fq,S5_R_paired_2.fq,S6_R_paired_2.fq,S7_R_paired_2.fq,S8_R_paired_2.fq,S9_R_paired_2.fq \
                                ref=/space/home/aguilar/Ofav_temp/Genomes/Orbicella_faveolata_v2_scaffolds.fa \
                                outu=/space/home/aguilar/Ofav_temp/bbmap/ReadsUnm.R1.fastq.gz \
                                outu2=/space/home/aguilar/Ofav_temp/bbmap/ReadsUnmR2.fastq.gz \
                                outm=/space/home/aguilar/Ofav_temp/bbmap/ReadsMappedR1.fastq.gz \
                                outm2=/space/home/aguilar/Ofav_temp/bbmap/ReadsMappedR2.fastq.gz \

                                And I am getting this message:

                                Retaining first best site only for ambiguous mappings.
                                No output file.
                                Exception in thread "main" java.lang.AssertionError: ASCII encoding for quality (currently ASCII-33) appears to be wrong.
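The assertion above says the assumed ASCII-33 quality offset looks wrong for at least one input file. One way to check each FASTQ file is to look at the range of quality characters: Phred+33 files normally contain characters below ';' (ASCII 59), while Phred+64 files do not and usually go above 'J' (ASCII 74). The sketch below applies that common heuristic; the function name `guess_quality_offset` and the thresholds are assumptions for illustration, not BBTools' own detection logic.

```python
import io
from itertools import islice


def guess_quality_offset(handle, max_reads=10000):
    """Guess 33 or 64 from the quality lines of a FASTQ stream; None if unclear."""
    lo, hi = None, None
    for i, line in enumerate(islice(handle, max_reads * 4)):
        if i % 4 != 3:
            continue  # only the 4th line of each record holds qualities
        codes = [ord(c) for c in line.rstrip("\n")]
        if not codes:
            continue
        lo = min(codes) if lo is None else min(lo, min(codes))
        hi = max(codes) if hi is None else max(hi, max(codes))
    if lo is None:
        return None  # no quality lines seen
    if lo < 59:
        return 33    # characters this low cannot occur with a +64 offset
    if hi > 74:
        return 64    # no low characters and values above 'J' suggest +64
    return None      # ambiguous range


# Toy, made-up records: one in each encoding.
phred33 = io.StringIO("@a\nACGTA\n+\nIIII#\n")
phred64 = io.StringIO("@b\nACGTACG\n+\nhhhhfff\n")
print(guess_quality_offset(phred33), guess_quality_offset(phred64))
```

Running this over each of the 48 input pairs should show whether some files use a different encoding than the rest. If a file turns out to be Phred+64, BBTools has a `qin` parameter for declaring the input offset; whether that resolves this particular run is not confirmed in the thread.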


