Seqanswers Leaderboard Ad

**Brian Bushnell** · 03-26-2015, 02:45 PM

You can simulate a 4000bp jump library from your existing data like this (using BBTools):

cat assembly1.fa assembly2.fa > combined.fa
bbmap.sh ref=combined.fa
randomreads.sh reads=1000000 length=100 paired interleaved mininsert=3500 maxinsert=4500 bell perfect=1 q=35 out=jump.fq.gz

That may not help much, since it is only recycling information that is already present in your assemblies, unlike a real jump library.

The orientation will come out forward/reverse; if you need them in a different orientation you can reverse-complement one or both reads with the associated "reformat.sh" tool.

**mjm** · 03-27-2015, 08:49 AM

Hi Brian,

Thanks a lot for your answer!
I'm trying to run BBMap but I get the following error message:
==
mjm@mining:~/sequence_assembly/bbmap> bbmap.sh ref=A.fa
java -Djava.library.path=/home/mjm/sequence_assembly/bbmap/jni/ -ea -Xmx39107m -cp /home/mjm/sequence_assembly/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=A.fa
Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, ref=A.fa]

BBMap version 34.72
Retaining first best site only for ambiguous mappings.
No output file.
NOTE: Deleting contents of ref/genome/1 because reference is specified and overwrite=true
Writing reference.
Executing dna.FastaToChromArrays2 [A.fa, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]

Set genScaffoldInfo=true
Writing chunk 1
Set genome to 1

Loaded Reference: 0.015 seconds.
Loading index for chunk 1-1, build 1
No index available; generating from reference genome: /home/mjm/sequence_assembly/bbmap/ref/index/1/chr1_index_k13_c5_b1.block
Indexing threads started for block 0-1
Indexing threads finished for block 0-1
Generated Index: 9.051 seconds.
No reads to process; quitting.

Total time: 10.184 seconds.

mjm@mining:~/sequence_assembly/bbmap> randomreads.sh reads=1000000 length=100 paired interleaved mininsert=3500 maxinsert=4500 bell perfect q=35 out=jump.fq.gz
java -ea -Xmx39106m -cp /home/mjm/sequence_assembly/bbmap/current/ align2.RandomReads3 build=1 reads=1000000 length=100 paired interleaved mininsert=3500 maxinsert=4500 bell perfect q=35 out=jump.fq.gz
Exception in thread "main" java.lang.NumberFormatException: For input string: "true"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
at java.lang.Float.parseFloat(Float.java:451)
at align2.RandomReads3.main(RandomReads3.java:296)

My config is as follows:
mjm@mining:~/sequence_assembly/bbmap> uname -a
Linux mining 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

mjm@mining:~/sequence_assembly/bbmap> java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

Any idea why it crashes?

mjm

**Brian Bushnell** · 03-27-2015, 09:24 AM

Oh, sorry - instead of "perfect" it should be "perfect=1". That's the fraction of reads that are generated with no errors, and for your usage, you don't want any errors.

randomreads.sh reads=1000000 length=100 paired interleaved mininsert=3500 maxinsert=4500 bell perfect=1 q=35 out=jump.fq.gz

**mjm** · 03-30-2015, 08:58 AM

Hi Brian,

Thanks a lot for your answer. Now it works fine!
I'm wondering if I could switch the quality values to a given numerical value?
For instance, I'd like to replace:
====================
@0_chr1_1_22737261_22737360_185068_scaffold_2
TATCCAGGCGTTCCTCGATACGGAGACCGATGGCACCTGCAAGCGCAATGCATTCGCGGCCCTCATGTCCATCAGCCACCAAAAGGCCCTGGAGTACCTG
+
BGEBEFEHHHJJIGJJJIEJJGIJHGDHGGJJHJHJIHHJJGGJFGJJGIJGHHIIJJHJJHHHIJJHIDIJHHJJJFIGJEIIFHJJIIGIFHCCBBBC
====================

by

====================
@0_chr1_1_22737261_22737360_185068_scaffold_2
TATCCAGGCGTTCCTCGATACGGAGACCGATGGCACCTGCAAGCGCAATGCATTCGCGGCCCTCATGTCCATCAGCCACCAAAAGGCCCTGGAGTACCTG
+
2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
====================

I've tried bbmap.sh ref=A.fa fakefastaquality=2 but it doesn't seem to be the solution.

Many thanks for your help,

mjm

**Brian Bushnell** · 03-30-2015, 10:06 AM

mjm,

I'm not sure why you would want to do that. Are you aware that the symbol "2" indicates the quality value 17, which is fairly low?

Anyway, you could do it like this, assuming your fake LMP reads are in "lmp.fq":

reformat.sh in=lmp.fq out=lmp.fa
reformat.sh in=lmp.fa out=fixed.fq qfake=17

A better value for qfake on synthetic data would be something like 30.

-Brian

**mjm** · 03-30-2015, 12:00 PM

Hi Brian,

Thanks for your solution.
I'm still trying to use the jumping libs I created with bbmap for running matassembler, and I wanted the libs to be as close as possible to the samples provided by metassember.
I'll update the thread once it works...

mjm

**malm** · 07-29-2015, 03:06 AM

Issues...

Hello,

I'm trying to run this exact pipeline on my genomes, and I run into this

"java.lang.OutOfMemoryError: Java heap space" errors, even when using
"-Xmx3200m"

However, if i hit "ctrl+c" while it's still running (but not doing anything), it continues, and quite quickly produces the jump.fq.gz file.

And the file consist of the right amount of reads, but I'm worried because I had to force it through, and also because all the quality values are all either just A's, B's....G's like this:

@0_chr2_0_77814904_77815003_17861_flattened_line_6732
AGGATCTCCAGCCGCTCCTCGCTCTCCTCCTGATGCTCTCTCTCGCTCCTCGCCAGCTGTGACGGCCGCGACCTGGCCTCCTTACGGGACATCTCCTGCC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

Is this OK, or is there a way to fix this?

I'm running it on a quite large cluster, so there should be no problem allocating the needed memory resources...

Martin

**Brian Bushnell** · 07-29-2015, 09:56 AM

So.... first off, that's really strange about the ctrl+c thing. Though I suppose that maybe you are allocating enough memory for the reference but not BBMap's index, which is not really necessary for randomreads.

How much memory is on the computer, and how big is the reference you are using?

I suggest deleting the "ref" folder, which may be corrupt, and then doing something like this (with the specific numbers adjusted as needed):

randomreads.sh ref=reference.fa -Xmx25g reads=1000000 length=100 paired interleaved mininsert=3500 maxinsert=4500 bell perfect=1 q=35 out=jump.fq.gz

As for the fixed quality values, you can remove "q=35" and reads will get random quality values, but that won't affect anything other than making the output file bigger due to worse compression. -Xmx25g will allocate 25 gigabytes of memory; that's probably way more than you need. It should require something like 1.5x the size of the reference file.

Please let me know if this still causes problems.

**malm** · 07-29-2015, 10:22 AM

Hi Brian!

That worked perfectly! I tried up to 10g, as I figured that that should be more than enough, but apparently 25g did the trick.

Thanks a lot!

Best regards,
Martin

Topics	Statistics	Last Post
Gene Misexpression in the Healthy Human Population by seqadmin Started by seqadmin, Yesterday, 06:46 AM	0 responses 9 views 0 likes	Last Post by seqadmin Yesterday, 06:46 AM
New Method for Rapid Genetic Diagnosis of Mendelian Disorders by seqadmin Started by seqadmin, 07-24-2024, 11:09 AM	0 responses 26 views 0 likes	Last Post by seqadmin 07-24-2024, 11:09 AM
Advancing Nanopore Technology for Portable Sensing Devices by seqadmin Started by seqadmin, 07-19-2024, 07:20 AM	0 responses 160 views 0 likes	Last Post by seqadmin 07-19-2024, 07:20 AM
New RNA-Based Gene Writing Technology Achieves Precise Gene Integration by seqadmin Started by seqadmin, 07-16-2024, 05:49 AM	0 responses 127 views 0 likes	Last Post by seqadmin 07-16-2024, 05:49 AM

Seqanswers Leaderboard Ad

Announcement

Jumping library

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News