Seqanswers Leaderboard Ad

**darencard** · 04-04-2019, 09:50 AM

Divide by 0 error in randomreads.sh

I am having an issue with randomreads.sh that I cannot make sense of myself.

I am using this tool to try to extract a random subset of a genome. Most tools subset by selecting some proportion of sequences, but I want to randomly sample pieces of randomly-sampled sequences. So read simulators seem to be the better option for this.

In this case, I'm trying to sample a (giant!) salamander genome from NCBI. For now I just have some arbitrary length/number settings, as follows:

randomreads.sh ref=GCA_002915635.2_ASM291563v2_genomic.fna out=test.fq reads=100 minlength=50000 maxlength=500000 seed=5 banns=t adderrors=f

As the command shows, I do not want variants or errors added in at all; the sequences should be identical to the reference genome.

Here is the output I'm getting, which indicates some sort of 'divide by 0' error. Hopefully someone can help me diagnose and overcome this issue.

Executing align2.RandomReads3 [build=1, ref=GCA_002915635.2_ASM291563v2_genomic.fna, out=test.fq, reads=100, minlength=50000, maxlength=500000, seed=5, banns=t, adderrors=f]

Writing reference.
Executing dna.FastaToChromArrays2 [GCA_002915635.2_ASM291563v2_genomic.fna, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=500, nodisk=false]

Set genScaffoldInfo=true
Exception in thread "main" java.lang.ArithmeticException: / by zero
at align2.RandomReads3.fillRandomChrom(RandomReads3.java:1758)
at align2.RandomReads3.<init>(RandomReads3.java:585)
at align2.RandomReads3.main(RandomReads3.java:389)

Thanks!
Daren

**tamu_anand** · 06-03-2019, 12:46 AM

Has anyone used bbmap for QuantSeq Data Analysis (more precisely the QuantSeq FWD protocol). The Lexogen website recommends bbduk for quality trimming and suggests use of STAR for downstream analysis.

Is it possible to do something similar (to STAR) with bbmap? In other words, is there an analogous bbmap command similar to how one does mapping with STAR (using the genome index and gtf together)?

Thanks in advance.

**kmavrommatis** · 12-09-2019, 01:55 PM

reformat.sh Failed to auto-detect quality coding

Hi,
when I process some fastq files from SRA using the following command

reformat.sh maxcalledquality=40 out=SRR1_R1.fq.gz qout=33 mincalledquality=2 out2=SRR1_R2.fq.gz qin=auto fixjunk=t in=/RNA-Seq/Raw/fastq.20190712/SRR1_1.fastq.gz in2=/RNA-Seq/Raw/fastq.20190712/SRR1_2.fastq.gz

reformat crashes with the following error:

Set INTERLEAVED to false
Changed from ASCII-33 to ASCII-64 on input quality 64 for base N while prescanning.
Changed from ASCII-64 to ASCII-33 on input quality 35 while prescanning.
Exception in thread "main" java.lang.AssertionError: Failed to auto-detect quality coding; quitting. Please manually set qin=33 or qin=64.
at stream.FASTQ.testQuality(FASTQ.java:218)
at stream.FASTQ.isInterleaved(FASTQ.java:129)
at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentRea at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:119)
at jgi.ReformatReads.process(ReformatReads.java:377)
m(ConcurrentReadInputStream.java:55)
at jgi.ReformatReads.process(ReformatReads.java:377)
at jgi.ReformatReads.main(ReformatReads.java:45)

Any suggestions on how to address this problem?
I would prefer to avoid checking each file individually with a different tool and setting the qin value for each. This is part of a bigger pipeline that is intented to be applied to 100s of samples.

Thanks in advance for your help

**kmavrommatis** · 12-09-2019, 02:30 PM

reformat.sh Failed to auto-detect quality coding

Hi,
when I am running reformat.sh on a set of fastq files downloaded from SRA

reformat.sh maxcalledquality=40 out=SRR1_R1.fq.gz qout=33 mincalledquality=2 out2=SRR1_R2.fq.gz qin=auto fixjunk=t in=/RNA-Seq/Raw/fastq.20190712/SRR1_1.fastq.gz in2=/RNA-Seq/Raw/fastq.20190712/SRR1_2.fastq.gz

The program crashes with the following error:

Set INTERLEAVED to false
Changed from ASCII-33 to ASCII-64 on input quality 64 for base N while prescanning.
Changed from ASCII-64 to ASCII-33 on input quality 35 while prescanning.
Exception in thread "main" java.lang.AssertionError: Failed to auto-detect quality coding; quitting. Please manually set qin=33 or qin=64.
at stream.FASTQ.testQuality(FASTQ.java:218)
at stream.FASTQ.isInterleaved(FASTQ.java:129)
at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentRea at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:119)
at jgi.ReformatReads.process(ReformatReads.java:377)
m(ConcurrentReadInputStream.java:55)
at jgi.ReformatReads.process(ReformatReads.java:377)
at jgi.ReformatReads.main(ReformatReads.java:45)

The version of BBMAP used is 37.64.
Any advice on how to address this issue?
This is part of a bigger pipeline and the intention is to apply it on 100s of files.
Thanks in advance for your help.

**SNPsaurus** · 12-09-2019, 09:14 PM

Did you try the suggested "Please manually set qin=33 or qin=64"?

**robinsleith** · 12-12-2019, 08:08 AM

Change in minid implementation?

Hello, I recently noticed that newer versions >=38.68 do not seem to implement minid in the same way as previous versions.

For the same command (same minid, 99), same reads, same reference, with versions <38.68 I get 0 mated pairs mapped, 4 Read 1 mapped, and 6 Read 2 mapped.

Code:

bbmap.sh nodisk minid=99 mappedonly=T threads=4 ref=parcu18S.fasta in=LKH421_FPE_q24_minlen100.fastq.gz in2=LKH421_RPE_q24_minlen100.fastq.gz out=stdout.sam | reformat.sh in=stdin.sam out=stdout.sam minlength=80 | samtools view -h -b -S | samtools sort >67_99parcu.bam

For versions >=38.68 I get 94 mated pairs mapped, 94 Read 1 reads mapped, and 94 Read 2 reads mapped.

Is the minid command just not working in newer versions or is there a new detail I am missing?
Thanks!

**GenoMax** · 12-12-2019, 09:36 AM

@Robinsleith: I suggest that you post this as a ticket on BBMap SF page. Brian no longer visits SeqAnswers regularly. He would be the only person who can answer this.

**OwenLeiser** · 03-02-2021, 12:42 PM

mapPacBio strange behavior

Hi there, I really like this tool for Illumina reads and am trying it on some PacBio/Nanopore reads using the mapPacBio.sh function.

I am "successfully" getting aligned bam files out the other end, but the log file lists >99% of reads as Low-Q Discards. I also get unmapped fastq files out (I do this for my own sanity) and these are usually larger than the input files themselves.

Should I be suspicious of this behavior? I have also tried using the low-quality data suggestion (setting the key to a lower value, adjusting the minimum score ratio, etc), but this did not result in any improvement. I know the input files themselves - subread fastq.gz - are "good enough" for Canu assembly but for I'd also like to align the reads for consensus sequence generation.

Thanks!

**pck0** · 05-11-2021, 12:03 PM

Is mapping stochastic?

Hey, did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The amount of reads mapping to each were as follows:

1st run:

>one: 47,699
>two: 330

2nd run:

>three: 47,688
>four: 338

BBmap options were all default, just minidentity = 90 and T=12

How come that (1) BBmap apparently misses some reads that map on the first sequence and then maps them on the second, identical sequence, and (2) how come the runs give different results?

Just curious, my apologies if this was addressed elsewhere!

cheers

**GenoMax** · 05-12-2021, 05:10 AM

Originally posted by pck0 View Post

Hey, did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The amount of reads mapping to each were as follows:

1st run:

>one: 47,699
>two: 330

2nd run:

>three: 47,688
>four: 338

BBmap options were all default, just minidentity = 90 and T=12

How come that (1) BBmap apparently misses some reads that map on the first sequence and then maps them on the second, identical sequence, and (2) how come the runs give different results?

Just curious, my apologies if this was addressed elsewhere!

cheers

Most NGS aligners are non-deterministic i.e. they will not produce exactly identical results if run multiple times.

Fortunately, BBMap does have an option to run in deterministic mode.

Code:

deterministic=f         Run in deterministic mode.  In this case it is good
                        to set averagepairdist.  BBMap is deterministic
                        without this flag if using single-ended reads,
                        or run singlethreaded.

You could also run the analysis using just a single thread.

**mewu3** · 09-06-2021, 12:14 AM

bbmap.sh unfished job

Hello, Brian,

I am wondering whether bbmap.sh could resume an unfished job.

mewu3

**GenoMax** · 09-07-2021, 08:30 AM

@mewu3: Unfortunately it can't. You will need to restart the job.

Originally posted by mewu3 View Post

Hello, Brian,

I am wondering whether bbmap.sh could resume an unfished job.

mewu3

**mewu3** · 09-10-2021, 12:13 AM

[help]

Hello,

I am using bbmap on HPC and I get the fallowing message :

Aligning C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001 reads to fasta file ...
java -Djava.library.path=/opt/apps/bbtools-37.97/jni/ -ea -Xmx50G -cp /opt/apps/bbtools-37.97/current/ align2.BBWrap build=1 overwrite=true fastareadlen=500 build=1 in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null trimreaddescriptions=t outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam threads=8 pairlen=1000 pairedonly=t minid=0.9 mdtag=t xstag=fs nmtag=t sam=1.3 ambiguous=best secondary=t saa=f maxsites=10 -Xmx50G
Executing align2.BBWrap [build=1, overwrite=true, fastareadlen=500, build=1, in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null, trimreaddescriptions=t, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, -Xmx50G]

Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]
Version 37.97 [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]

Set threads to 8
Retaining first best site only for ambiguous mappings.
Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.816
Set genome to 1

Loaded Reference: 3.463 seconds.
Loading index for chunk 1-1, build 1
Generated Index: 2.556 seconds.
Analyzed Index: 7.028 seconds.
Started output stream: 0.043 seconds.
Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files.
at stream.ReadStreamWriter.<init>(ReadStreamWriter.java:51)
at stream.ReadStreamByteWriter.<init>(ReadStreamByteWriter.java:17)
at stream.ConcurrentGenericReadOutputStream.<init>(ConcurrentGenericReadOutputStream.java:40)
at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:52)
at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:29)
at align2.AbstractMapper.openStreams(AbstractMapper.java:873)
at align2.BBMap.testSpeed(BBMap.java:437)
at align2.BBMap.main(BBMap.java:34)
at align2.BBWrap.execute(BBWrap.java:144)
at align2.BBWrap.main(BBWrap.java:22)

I get the impression that bbmap is stuck on something and i don't know what's wrong with it. Please help !

mewu3

**HESmith** · 09-10-2021, 08:04 AM

"Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files."

Typically, BBMap tools keep paired reads together. You're attempting to write aligned and unaligned reads to separate files, which violates that function.

**kesmarl** · 09-10-2021, 09:27 AM

Thank you very so much !

I downloaded it

Topics	Statistics	Last Post
New Model Aims to Explain Polygenic Diseases by Connecting Genomic Mutations and Regulatory Networks by seqadmin Started by seqadmin, Yesterday, 05:31 AM	0 responses 10 views 0 likes	Last Post by seqadmin Yesterday, 05:31 AM
Small Blood Stem Cell Subset Linked to Immune System Aging by seqadmin Started by seqadmin, 10-24-2024, 06:58 AM	0 responses 20 views 0 likes	Last Post by seqadmin 10-24-2024, 06:58 AM
New AI Model Designs Synthetic DNA Switches for Targeted Gene Expression in Specific Cell Types by seqadmin Started by seqadmin, 10-23-2024, 08:43 AM	0 responses 50 views 0 likes	Last Post by seqadmin 10-23-2024, 08:43 AM
Microbes in Urban Spaces Adapt to Disinfectants and Scarce Resources by seqadmin Started by seqadmin, 10-17-2024, 07:29 AM	0 responses 58 views 0 likes	Last Post by seqadmin 10-17-2024, 07:29 AM

Seqanswers Leaderboard Ad

Announcement

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News