**ellybelly** · 08-23-2018, 06:44 AM

bbmap aborts after mapping some reads

Hello Brian,

We are using BBMap to see how far it is possible to quantify gene expression by mapping Illumina RNA-seq reads to the genome of a closely related species, e.g. mapping chimpanzee reads to human or, as in this example, macaque reads.

To this end, we generated macaque Illumina SE reads using flux-simulator and mapped them to hg38; for comparison, we also tried Mmul8, downloaded from Ensembl (wget ftp://ftp.ensembl.org/pub/release-92...toplevel.fa.gz).

Everything mapped fine to hg38, but not to Mmul8.

Exception in thread "Thread-12" java.lang.AssertionError
at align2.BBIndex.extendScore(BBIndex.java:2612)
at align2.BBIndex.slowWalk3(BBIndex.java:1389)
at align2.BBIndex.find(BBIndex.java:777)
at align2.BBIndex.find(BBIndex.java:623)
at align2.BBIndex.findAdvanced(BBIndex.java:400)
at align2.AbstractMapThread.quickMap(AbstractMapThread.java:750)
at align2.BBMapThread.processRead(BBMapThread.java:408)
at align2.AbstractMapThread.run(AbstractMapThread.java:508)

I tried running on one thread, increasing memory to 101G, and removing small contigs of <100kb, but the error message remains the same.

We are running a Debian system with java version "1.8.0_181" and have BBMap version 38.02 -- the detailed error output is in the attached file.

The false mapping rates of BBMap are so much better than those of STAR and GSNAP that we definitely want to use BBMap for our paper. We are nearly done with all other species (marmoset, gorilla, chimpanzee and orangutan), and the simulations ran through; the only missing piece is the mapping to Mmul8.

Any help would be greatly appreciated.

Best, Ines

Attached Files

Mmul1.701837.txt (4.0 KB, 127 views)

**raw937** · 09-07-2018, 09:24 AM

bbmap for demultiplexing dual barcodes.

Hello,
I would like to demultiplex using dual indexes, if possible.

For example (the dual barcode is the first four nucleotides of each read):

#R1 read
@SOLEXA1_0069_FC:3:1:1673:948#ACAGTG/1
GACTAACCGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGTTAGCCGTCGGGCAGTATACTGTTCGG
+
BMMQNTWSWWb_____b_bb__________Y_________YYYYY[[[Y[__________XXRWXVVVVTYYYYYT

#R2 read
@SOLEXA1_0069_FC:3:1:1673:948#ACAGTG/2
CTGAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATCTCACGACACGAGCTGACGACAGCCATGCAGCACCTGT
+
ghgaggfghhhhhhhhhhghhhhhhhhhhfhhhghfWffch[hhgahhedffddR[^W^Zc^_cac[Wb]^W^

Here are the 16 possible combinations in the file I am working on:
TCAG-TCAG
CTGA-CTGA
TCAG-GACT
GACT-GACT
AGTC-AGTC
GACT-TCAG
GACT-AGTC
GACT-CTGA
TCAG-CTGA
AGTC-TCAG
AGTC-GACT
CTGA-AGTC
CTGA-GACT
AGTC-CTGA
TCAG-AGTC
CTGA-TCAG

The first four nts of each read are the barcode, so the example above would go to:
GACT-CTGA_R1.fq
GACT-CTGA_R2.fq

But you would need both reads to tell that it's GACT-CTGA and not something else.
What would the command look like for this? Does the demux script handle dual barcoding?
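
To make the request concrete, here is a minimal Python sketch of the binning logic described above. It is not a BBTools command and says nothing about what the demux script supports. It assumes the barcode is the first four nucleotides of each read, as stated in the post, and that R1 and R2 records come in the same order. The function names and the toy reads (shortened versions of the example reads, with made-up quality strings) are hypothetical.

```python
from collections import defaultdict

BARCODE_LEN = 4  # the post says the first four nucleotides are the barcode


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines."""
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header, seq, qual


def demultiplex(r1_records, r2_records, allowed_pairs):
    """Group read pairs by the barcode pair found at the start of R1 and R2."""
    bins = defaultdict(list)
    for rec1, rec2 in zip(r1_records, r2_records):
        # Both reads are needed to build the key, e.g. "GACT-CTGA".
        key = f"{rec1[1][:BARCODE_LEN]}-{rec2[1][:BARCODE_LEN]}"
        if key not in allowed_pairs:
            key = "unmatched"
        bins[key].append((rec1, rec2))
    return dict(bins)


# Toy input: shortened reads, hypothetical quality strings.
r1 = ["@read1/1", "GACTAACCGG", "+", "IIIIIIIIII"]
r2 = ["@read1/2", "CTGAAGGGTT", "+", "IIIIIIIIII"]
pairs = {"GACT-CTGA", "TCAG-TCAG"}

bins = demultiplex(parse_fastq(r1), parse_fastq(r2), pairs)
print(sorted(bins))
```

Each key would then map to a pair of output files such as GACT-CTGA_R1.fq and GACT-CTGA_R2.fq. The sketch leaves the barcode bases in the reads; trimming them would be a separate step.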

**juanita** · 09-25-2018, 08:01 AM

ref input for BBMap and paired ends

I am sorry if this question is very basic, but I am getting a low percentage of reads mapping to the reference genome: only about 36% of reads mapped. Any clue why this is the case?

As the reference I am using a genome in scaffolds, and the reads are paired-end...

**SNPsaurus** · 09-25-2018, 09:47 AM

Originally posted by juanita

I am sorry if this question is very basic, but I am getting a low percentage of reads mapping to the reference genome: only about 36% of reads mapped. Any clue why this is the case?

As the reference I am using a genome in scaffolds, and the reads are paired-end...

Have you trimmed adapters from the reads? (Short fragments will create reads that are part genomic and part adapter, and these may not map.) You could use the related BBTools tool sendsketch to get a sense of what is in your reads (after trimming). When we genotype samples, many of them contain contaminating species, so sendsketch can help figure out what is in there. You can input the entire fastq file to sendsketch, or switch to per-read mode and get a result on a per-read basis.

You can also grab 100 reads, turn them into fasta format and do blastn with them (if online use the blastn rather than megablast option) and see read by read what is in there.
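
As a rough illustration of the "grab 100 reads and turn them into fasta" step, the following Python sketch converts the first few FASTQ records to FASTA text. It assumes plain four-line FASTQ records; the function name and the toy records are hypothetical.

```python
from itertools import islice


def fastq_to_fasta(fastq_lines, max_reads=100):
    """Return FASTA text for the first max_reads four-line FASTQ records."""
    fasta = []
    # Group the line iterator into consecutive chunks of four lines.
    records = zip(*[iter(fastq_lines)] * 4)
    for header, seq, _plus, _qual in islice(records, max_reads):
        fasta.append(">" + header.strip()[1:])  # replace leading '@' with '>'
        fasta.append(seq.strip())
    return "\n".join(fasta) + "\n"


# Toy input with two made-up records; only the first is kept.
toy = ["@r1\n", "ACGT\n", "+\n", "IIII\n",
       "@r2\n", "TTGA\n", "+\n", "IIII\n"]
print(fastq_to_fasta(toy, max_reads=1), end="")
```

The resulting text can be pasted into a blastn search form or written to a file first.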

Other possibilities: your sample is not closely related to the reference; the reference is incomplete and missing regions; or the reference lacks high-copy content such as mtDNA or chloroplast, and many reads come from those.

**csmiller** · 10-10-2018, 01:40 PM

How to use usejni with latest version (

I just installed the latest version of the BBTools (38.26), and I can't seem to get the usejni flag to work. The java .so file compiles fine, but then I get error messages like this when I run, e.g., bbmap.sh with usejni=t:

Code:

Native library can not be found in java.library.path.

I found this in the changelog:

Removed JNI path flag from BBMerge, BBMap, and RQCFilter shell scripts.

and this in docs/compiling.txt:

3) C code. This was developed by Jonathan Rood to accelerate BBMap, BBMerge, and Dedupe, but is currently disabled.

And sure enough, it is commented out in the bbmap.sh code:

Code:

        #local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
        local CMD="java $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"

If I revert to the previous version of the CMD, with the java.library.path set, then the command runs with the C code fine.

Why was this disabled? Does this affect previous analyses that used this C code? Or was this purely a performance issue?

Sorry if I've missed this posted somewhere else, and thanks in advance for any help.

Chris

**csmiller** · 10-11-2018, 03:48 AM

usejni and compiled C code in BBTools

I just installed the latest version of the BBTools (38.26), and I notice that the C code enabled by the usejni=t flag for some tools has been deprecated / disabled.

I found this in the changelog:

Removed JNI path flag from BBMerge, BBMap, and RQCFilter shell scripts.

and this in docs/compiling.txt:

3) C code. This was developed by Jonathan Rood to accelerate BBMap, BBMerge, and Dedupe, but is currently disabled.

Sure enough, it is commented out in the bbmap.sh code:

Code:

        #local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
        local CMD="java $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"

If I revert to the previous version of the CMD, with the java.library.path set, then the command runs with the compiled C code just fine.

Why was this disabled? Does this affect previous analyses that used this C code? That is, does the C code contain an error that means usejni=t in previous versions will produce different output than the java-only code? Or was this purely a performance or compatibility issue, or something else?

Sorry if I've missed this already posted somewhere, and thanks in advance for any help.

Chris

**1989sn1027** · 10-24-2018, 07:15 PM

Hi Brian & all,
I'm using BBMap 38.26 with a very large reference genome, and one of the sequences in this genome is long enough to break the BBMap reference-building step.

Here is the fasta index of this reference:

Chr01 301019445 7 60 61
Chr02 163962470 306036450 60 61
Chr03 261511374 472731635 60 61
Chr04 215701946 738601539 60 61
Chr05 217274494 957898525 60 61
Chr06 219521584 1178794268 60 61
Chr07 222112641 1401974553 60 61
Chr08 153299543 1627789079 60 61
Chr09 238794889 1783643622 60 61
Chr10 205736368 2026418433 60 61
Chr11 220335243 2235583748 60 61
Chr12 229934170 2459591253 60 61
Chr00 714758103 2693357667 60 61

As you can see, the longest sequence (Chr00) exceeds 536670912, which causes a problem like this:

bbmap-38.26/bbmap.sh ref=ref.fasta rebuild=t usemodulo=t -Xmx60g

java -ea -Xmx60g -cp /home/sn/software/bbmap-38.26/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=/home/yangjy/16T4/Genome/GEN181516HEB/_db/Capsicum.annuum.L_Zunla-1_Release_2.0.fasta rebuild=t usemodulo=t -Xmx60g
Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, ref=/home/yangjy/16T4/Genome/GEN181516HEB/_db/Capsicum.annuum.L_Zunla-1_Release_2.0.fasta, rebuild=t, usemodulo=t, -Xmx60g]
Version 38.26

No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [/home/yangjy/16T4/Genome/GEN181516HEB/_db/Capsicum.annuum.L_Zunla-1_Release_2.0.fasta, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]

Set genScaffoldInfo=true
Writing chunk 1
Writing chunk 2
Writing chunk 3
Writing chunk 4
Writing chunk 5
Writing chunk 6
Exception in thread "main" java.lang.AssertionError: 714758103, 8000, 7999, 536670912
at dna.FastaToChromArrays2.makeNextChrom(FastaToChromArrays2.java:440)
at dna.FastaToChromArrays2.makeChroms(FastaToChromArrays2.java:343)
at dna.FastaToChromArrays2.main2(FastaToChromArrays2.java:151)
at align2.RefToIndex.makeIndex(RefToIndex.java:147)
at align2.BBMap.setup(BBMap.java:278)
at align2.AbstractMapper.<init>(AbstractMapper.java:57)
at align2.BBMap.<init>(BBMap.java:43)
at align2.BBMap.main(BBMap.java:31)

I'm pretty sure the 'maxlen' argument of dna.FastaToChromArrays2 does not fit my situation, but I'm not sure how to fix this.

Has anyone dealt with this kind of thing before? Any suggestions or discussion would be a help! >_<
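
For reference, a small Python sketch can list which sequences in a fasta index exceed the maxlen value printed in the log. It only reports lengths; it does not change how BBMap builds its index, and whether maxlen is really the cause remains the poster's assumption. The function name is hypothetical, and the sample text is two lines copied from the index above.

```python
MAXLEN = 536670912  # value of maxlen= printed in the log above


def oversized_sequences(fai_text, maxlen=MAXLEN):
    """Return (name, length) for every .fai entry longer than maxlen."""
    hits = []
    for line in fai_text.splitlines():
        fields = line.split()  # works for tab- or space-separated columns
        if len(fields) < 2:
            continue
        name, length = fields[0], int(fields[1])
        if length > maxlen:
            hits.append((name, length))
    return hits


fai = """Chr01 301019445 7 60 61
Chr00 714758103 2693357667 60 61"""

for name, length in oversized_sequences(fai):
    print(name, length, "exceeds", MAXLEN)
```

On the index shown above, only Chr00 (714758103 bases) is longer than the limit, which matches the first number in the AssertionError.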

**GenoMax** · 10-25-2018, 03:02 AM

I see that you are assigning 60G of RAM. Have you tried to assign more and see if it helps?

**Sarah Muller** · 10-26-2018, 06:00 AM

That's great news. I will download it ASAP. Thanks for sharing!

**1989sn1027** · 10-28-2018, 05:51 PM

Originally posted by GenoMax

I see that you are assigning 60G of RAM. Have you tried to assign more and see if it helps?

Thanks for your reply. In my tests I raised the Java RAM upper limit from 20G all the way to 400G, yet the "maxlen" argument of dna.FastaToChromArrays2 did not change, and neither did the error message.

**GenoMax** · 10-29-2018, 03:57 AM

@1989sn1027: Brian has not been participating on SA for the last few months. You could try creating a ticket on SourceForge and see if he responds to this report.

**1989sn1027** · 10-29-2018, 05:28 PM

Originally posted by GenoMax

@1989sn1027: Brian has not been participating on SA for the last few months. You could try creating a ticket on SourceForge and see if he responds to this report.

Thanks for the pointer. I'll give that a shot.

**zeam** · 12-06-2018, 05:38 AM

Hi Brian, could you please answer my questions posted here at your convenience? http://seqanswers.com/forums/showthread.php?t=85967

Thanks in advance.

**catagui** · 12-13-2018, 10:47 AM

problem with output

Hi, I am having trouble running bbwrap. Also, how can I get a file that tells me the percentage of reads that were mapped and unmapped?

This is what I am running:

#!/bin/bash
cd /space/home/aguilar/Ofav_temp/Trim
/space/home/aguilar/Programs/bbmap/bbwrap.sh t=40 in=\
S1_F_paired_1.fq,S10_F_paired_1.fq,S11_F_paired_1.fq,S12_F_paired_1.fq,S13_F_paired_1.fq,S14_F_paired_1.fq,S15_F_paired_1.fq,S16_F_paired_1.fq,S17_F_paired_1.fq,S18_F_paired_1.fq,S19_F_paired_1.fq,S2_F_paired_1.fq,S20_F_paired_1.fq,S21_F_paired_1.fq,S22_F_paired_1.fq,S23_F_paired_1.fq,S24_F_paired_1.fq,S25_F_paired_1.fq,S26_F_paired_1.fq,S27_F_paired_1.fq,S28_F_paired_1.fq,S29_F_paired_1.fq,S3_F_paired_1.fq,S30_F_paired_1.fq,S31_F_paired_1.fq,S32_F_paired_1.fq,S33_F_paired_1.fq,S34_F_paired_1.fq,S35_F_paired_1.fq,S36_F_paired_1.fq,S37_F_paired_1.fq,S38_F_paired_1.fq,S39_F_paired_1.fq,S4_F_paired_1.fq,S40_F_paired_1.fq,S41_F_paired_1.fq,S42_F_paired_1.fq,S43_F_paired_1.fq,S44_F_paired_1.fq,S45_F_paired_1.fq,S46_F_paired_1.fq,S47_F_paired_1.fq,S48_F_paired_1.fq,S5_F_paired_1.fq,S6_F_paired_1.fq,S7_F_paired_1.fq,S8_F_paired_1.fq,S9_F_paired_1.fq \
in2=S1_R_paired_2.fq,S10_R_paired_2.fq,S11_R_paired_2.fq,S12_R_paired_2.fq,S13_R_paired_2.fq,S14_R_paired_2.fq,S15_R_paired_2.fq,S16_R_paired_2.fq,S17_R_paired_2.fq,S18_R_paired_2.fq,S19_R_paired_2.fq,S2_R_paired_2.fq,S20_R_paired_2.fq,S21_R_paired_2.fq,S22_R_paired_2.fq,S23_R_paired_2.fq,S24_R_paired_2.fq,S25_R_paired_2.fq,S26_R_paired_2.fq,S27_R_paired_2.fq,S28_R_paired_2.fq,S29_R_paired_2.fq,S3_R_paired_2.fq,S30_R_paired_2.fq,S31_R_paired_2.fq,S32_R_paired_2.fq,S33_R_paired_2.fq,S34_R_paired_2.fq,S35_R_paired_2.fq,S36_R_paired_2.fq,S37_R_paired_2.fq,S38_R_paired_2.fq,S39_R_paired_2.fq,S4_R_paired_2.fq,S40_R_paired_2.fq,S41_R_paired_2.fq,S42_R_paired_2.fq,S43_R_paired_2.fq,S44_R_paired_2.fq,S45_R_paired_2.fq,S46_R_paired_2.fq,S47_R_paired_2.fq,S48_R_paired_2.fq,S5_R_paired_2.fq,S6_R_paired_2.fq,S7_R_paired_2.fq,S8_R_paired_2.fq,S9_R_paired_2.fq \
ref=/space/home/aguilar/Ofav_temp/Genomes/Orbicella_faveolata_v2_scaffolds.fa \
outu=/space/home/aguilar/Ofav_temp/bbmap/ReadsUnm.R1.fastq.gz \
outu2=/space/home/aguilar/Ofav_temp/bbmap/ReadsUnmR2.fastq.gz \
outm=/space/home/aguilar/Ofav_temp/bbmap/ReadsMappedR1.fastq.gz \
outm2=/space/home/aguilar/Ofav_temp/bbmap/ReadsMappedR2.fastq.gz \

And I am getting this message:

Retaining first best site only for ambiguous mappings.
No output file.
Exception in thread "main" java.lang.AssertionError: ASCII encoding for quality (currently ASCII-33) appears to be wrong.
[followed by a line of unprintable binary characters]

Thanks
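
The unreadable characters after the assertion hint, though this is only a guess, that one of the inputs is compressed or otherwise not plain text. A quick Python sketch for checking this is given below: one helper looks for the gzip magic bytes at the start of a file, and the other applies the common heuristic for telling Phred+33 from Phred+64 quality strings. Both function names are hypothetical, and the thresholds are the usual rule of thumb, not something taken from BBMap.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def looks_gzipped(first_bytes):
    """Return True if the bytes start with the gzip magic number."""
    return first_bytes[:2] == GZIP_MAGIC


def guess_quality_offset(quality_lines):
    """Guess 33 or 64 from quality strings; return None if undecided."""
    codes = [ord(c) for line in quality_lines for c in line.strip()]
    if not codes:
        return None
    if min(codes) < 59:   # characters below ';' only occur with offset 33
        return 33
    if max(codes) > 74:   # characters above 'J' point to offset 64
        return 64
    return None


# Quality string prefix taken from the R1 example earlier in the thread.
print(guess_quality_offset(["BMMQNTWSWWb_____"]))
print(looks_gzipped(b"\x1f\x8b\x08\x00"))
```

To test a real file, read its first two bytes in binary mode (for example `open(path, "rb").read(2)`) and pass them to `looks_gzipped`.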
