
  • GenoMax
    replied
    Has the grid engine job completed?

    Most likely not. After printing the message above, BBMap loads the "genome" index into memory before it starts aligning, so be patient as long as the job is still "running".

    If the job completed then look in the grid engine error file and let us know what is there.


  • DNA Sorcerer
    replied
    Oops, I had forgotten to load the java module before. I have now; bbmap ran for a couple of minutes and stopped. This is the output:

    Hello from inside a Grid Engine job running on cl157
    Job beginning at Thu Feb 18 17:03:05 NST 2016
    Job ending at Thu Feb 18 17:03:05 NST 2016
    java -Djava.library.path=/home/cslamovi/CLARKSCV1.2.2-b/bbmap/jni/ -ea -Xmx1310m -cp /home/cslamovi/CLARKSCV1.2.2-b/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=contigs.fasta nodisk in1=scratch/s_3_1_sequence.fastq in2=scratch/s_3_2_sequence.fastq covstats=omgen.coverage
    Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, ref=contigs.fasta, nodisk, in1=scratch/s_3_1_sequence.fastq, in2=scratch/s_3_2_sequence.fastq, covstats=omgen.coverage]

    BBMap version 35.82
    Retaining first best site only for ambiguous mappings.
    No output file.
    Executing dna.FastaToChromArrays2 [contigs.fasta, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=false, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=true]

    Set genScaffoldInfo=true
    Set genome to 1

    Loaded Reference: 0.335 seconds.
    Loading index for chunk 1-1, build 1
    Indexing threads started for block 0-1
    Indexing threads finished for block 0-1
    Generated Index: 8.592 seconds.
    Executing jgi.CoveragePileup [covhist=null, covstats=omgen.coverage, basecov=null, bincov=null, physcov=false, 32bit=false, nzo=false, twocolumn=false, secondary=false, covminscaf=0, ksb=true, binsize=1000, startcov=false, strandedcov=false, rpkm=null, normcov=null, normcovo=null, in1=scratch/s_3_1_sequence.fastq, in2=scratch/s_3_2_sequence.fastq]

    Set NONZERO_ONLY to false
    Set TWOCOLUMN to false
    Set USE_SECONDARY_ALIGNMENTS to false
    Set KEEP_SHORT_BINS to true
    Set USE_COVERAGE_ARRAYS to false
    Set USE_BITSETS to true
    Analyzed Index: 6.904 seconds.
    Changed from ASCII-33 to ASCII-64 on input quality 97 while prescanning.
    Cleared Memory: 1.798 seconds.
    Processing reads in paired-ended mode.
    Started read stream.
    Started 1 mapping thread.


  • GenoMax
    replied
    @DNA_sorcerer: @Brian will be along later today with an official answer, but the first thing to check is which version of java is running on your node/cluster.

    Post the output of

    Code:
    $ java -version

    If I remember this right, @Brian only validates the BBMap suite against java v1.7 and 1.8.

    You may also be missing a leading "/" in your file paths (scratch/s_3_1_sequence.fastq), unless the "scratch" directory is inside the directory from which you are running your command.
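
An UnsupportedClassVersionError like the one reported in this thread generally means the class files were compiled for a newer Java than the one launching them. Below is a minimal sketch of checking the Java version on a node before submitting a job. The helper name `java_major` is hypothetical, and the parsing assumes the usual `java -version` output, where the version appears in double quotes (for example `java version "1.6.0_45"`).

```shell
#!/bin/sh
# Hypothetical helper: reduce a Java version string to its major version.
# Old-style strings such as "1.6.0_45" give 6; new-style "11.0.2" gives 11.
java_major() {
    echo "$1" | awk -F. '{
        v = ($1 == 1) ? $2 : $1     # "1.x" scheme: the major version is x
        sub(/[^0-9].*$/, "", v)     # drop suffixes such as "-ea"
        print v
    }'
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it before parsing.
    ver=$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')
    major=$(java_major "$ver")
    echo "Java version string: $ver (major: $major)"
    # The thread says BBMap is validated against Java 1.7 and 1.8.
    if [ -n "$major" ] && [ "$major" -lt 7 ]; then
        echo "Java is older than 1.7; try loading a newer java module." >&2
    fi
else
    echo "java not found on PATH; load the java module first." >&2
fi
```

Running this inside the grid engine job script, rather than on the login node, shows which Java the compute node actually picks up.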


  • DNA Sorcerer
    replied
    Hi Brian,

    I tried to run bbmap but got an error. Before I start debugging, I wanted to ask whether you can tell if I have a general configuration issue on the cluster (e.g. java), so I know what to tell the sysadmin. Thanks a lot in advance.

    My line is: CLARKSCV1.2.2-b/bbmap/bbmap.sh ref=contigs.fasta nodisk in1=scratch/s_3_1_sequence.fastq in2=scratch/s_3_2_sequence.fastq covstats=omgen.coverage

    And the error message is:
    Hello from inside a Grid Engine job running on cl339
    Job beginning at Thu Feb 18 16:30:54 NST 2016
    Job ending at Thu Feb 18 16:30:54 NST 2016
    java -Djava.library.path=/home/cslamovi/CLARKSCV1.2.2-b/bbmap/jni/ -ea -Xmx1310m -cp /home/cslamovi/CLARKSCV1.2.2-b/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=contigs.fasta nodisk in1=scratch/s_3_1_sequence.fastq in2=scratch/s_3_2_sequence.fastq covstats=omgen.coverage
    Exception in thread "main" java.lang.UnsupportedClassVersionError: Bad version number in .class file
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)


  • habib
    replied
    Thanks Brian,

    Actually, there is also another peak at a coverage of around 1000, which, as you suggest, could be the organelle genomic sequences (I did not include all the data points in my previous post).

    With the possibility of a tetraploid, I think I am in a bit of trouble as to how to get a good assembly of this genome...


  • Brian Bushnell
    replied
    Originally posted by habib
    Thank you westerman.

    So each unique kmer would be present, on average, 'coverage depth' many times.

    How about the several peaks that I observed after depth 1 (which is error kmers)? There is a small peak at depth 23, a bigger one at 46 and the highest at 83. Do they indicate that my genome is a heterozygous diploid?

    It's always difficult to determine exactly what this means. With only a primary peak at 46 and a smaller one at 23, that would indicate a heterozygous diploid. However, the peak at 83 could be a lot of things. Looking at the "unique_kmers" column, the first two peaks are actually at 21 and 42, so a 3rd peak at around 84 probably indicates a tetraploid. It could also be a contaminant or an organelle such as mitochondria or chloroplast, but organelles would usually have higher coverage. It could also be 2-copy repeats in the genome. But I suspect it's tetraploid.
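
The reasoning above rests on the peak depths being roughly successive doublings (21, 42, 84). Here is a small sketch for checking that on any set of peaks. The function name `peak_ratios` is hypothetical; it simply divides every peak depth by the first one given.

```shell
#!/bin/sh
# Hypothetical helper: print each peak depth divided by the first (lowest) peak.
# Pass the peak depths in ascending order as arguments.
peak_ratios() {
    echo "$@" | awk '{
        for (i = 1; i <= NF; i++)
            printf "%s%.2f", (i > 1 ? " " : ""), $i / $1
        print ""
    }'
}

# Peaks read from the unique_kmers column, as described in the reply above.
peak_ratios 21 42 84
# Peaks as originally reported in the question.
peak_ratios 23 46 83
```

Ratios close to 1, 2 and 4 fit the doubling pattern described in the reply, but they do not by themselves rule out the other explanations listed there (contaminant, organelle, 2-copy repeats).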


  • habib
    replied
    Thank you westerman.

    So each unique kmer would be present, on average, 'coverage depth' many times.

    How about the several peaks that I observed after depth 1 (which is error kmers)? There is a small peak at depth 23, a bigger one at 46 and the highest at 83. Do they indicate that my genome is a heterozygous diploid?


  • westerman
    replied
    Try dividing the raw count by the depth and see that the result equals unique_kmers. That might give you a clue as to what everything means.
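
The check suggested above can be sketched with awk. This assumes a tab-separated histogram with three columns (depth, raw_count, unique_kmers) and a header line starting with `#`; the function name `check_hist` and the sample numbers are made up for illustration and are not taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper: for each histogram row on stdin, divide raw_count by
# depth and report whether the quotient equals the unique_kmers column.
check_hist() {
    awk -F'\t' '!/^#/ && $1 > 0 {
        q = $2 / $1
        printf "%d\t%d\t%s\n", $1, q, (q == $3 ? "match" : "differs")
    }'
}

# Made-up sample rows: depth, raw_count, unique_kmers.
printf '#Depth\tRaw_Count\tUnique_Kmers\n1\t5000\t5000\n23\t46000\t2000\n46\t460000\t10000\n' |
    check_hist
```

If every row reports "match", then raw_count is the total number of kmer occurrences at that depth, while unique_kmers is the number of distinct kmers seen that many times.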


  • habib
    replied
    Hi Brian,

    I produced the following graph using khist.sh for my 100bp PE Illumina reads. Could you help me interpret the graph please? What is the difference between raw_count and unique_kmers?


  • Brian Bushnell
    replied
    Originally posted by dkainer
    I hadn't noticed the Seal command. Thanks for responding so fast!

    So I assume that if I were to input paired-end reads to Seal with a barcodes.fa as the ref, it would try and match the barcodes in both the R1 and R2 reads? Hence the need for skipr1 and skipr2...?

    That's correct.

    Originally posted by dkainer
    Additionally, would seal let you left trim off the barcode bases from the R1 read?

    Yes, it has a flag "ftl" (forcetrimleft) for doing that... "ftl=6" would remove the first 6 bases of all reads. Unfortunately it would do that for both read 1 and read 2. So... if you have reads in 2 files, that's fine; you just process the read1 file with "ftl=6". If they are interleaved it's more complicated - you'd have to split them first (for example, reformat.sh in=reads.fq out=read#.fq). I'll consider adding the ability to restrict operations to only left or right reads... it seems useful.
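
To make the effect concrete, here is a plain awk stand-in (not BBTools itself) that left-trims only read 1 in an interleaved FASTQ stream. It assumes strict 4-line records that alternate read 1, read 2; the function name `trim_r1_left` and the sample reads are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove the first N bases (and quality values) from
# read 1 only, reading interleaved 4-line FASTQ records on stdin.
trim_r1_left() {
    awk -v n="$1" '{
        rec  = int((NR - 1) / 4)    # 0-based record index
        line = (NR - 1) % 4         # 0=header, 1=bases, 2=plus, 3=quality
        if (rec % 2 == 0 && (line == 1 || line == 3))
            print substr($0, n + 1) # read 1: drop the first n characters
        else
            print                   # read 2 and header/plus lines unchanged
    }'
}

# Made-up read pair: read 1 starts with the 6-base barcode AACTGA.
printf '@r1/1\nAACTGAGGGG\n+\nIIIIIIIIII\n@r1/2\nTTTTCCCC\n+\nIIIIIIII\n' |
    trim_r1_left 6
```

With real data, the split-then-trim route described in the reply above is the safer choice, since it also handles compressed and multi-line input.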


  • dkainer
    replied
    I hadn't noticed the Seal command. Thanks for responding so fast!

    So I assume that if I were to input paired-end reads to Seal with a barcodes.fa as the ref, it would try and match the barcodes in both the R1 and R2 reads? Hence the need for skipr1 and skipr2...?

    Additionally, would seal let you left trim off the barcode bases from the R1 read?


  • Brian Bushnell
    replied
    It is almost possible to do this with Seal, which outputs reads into bins based on kmer matching.

    seal.sh in=reads.fq pattern=%.fq k=6 restrictleft=6 mm=f ref=barcodes.fa rcomp=f

    That would require a file "barcodes.fa" like this:
    >AACTGA
    AACTGA
    >GGCCTT
    GGCCTT

    etc., with one fasta entry per barcode, so the output reads would be in file AACTGA.fq and so forth. This is sort of a common request, so maybe I will make it unnecessary to provide a fasta file of the barcodes. Does that matter to you either way?

    However, BBDuk has the flags "skipr1" and "skipr2", which allow it to only do kmer operations on one read or the other. Seal currently lacks this, but it's essential for processing inline barcodes. I'll add it for the next release.
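
The "barcodes.fa" layout described above, where each entry is named after its own sequence, can be generated from a plain list of barcodes. This is a minimal sketch; the function name `make_barcode_fasta` is hypothetical, and it assumes one barcode per line on stdin.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of barcodes (one per line) into FASTA
# entries whose header is the barcode itself. Blank lines are skipped.
make_barcode_fasta() {
    awk 'NF { print ">" $1; print $1 }'
}

# The two barcodes used as the example in the post above.
printf 'AACTGA\nGGCCTT\n' | make_barcode_fasta
```

Redirecting the output to a file (for example `> barcodes.fa`) gives a reference that can be passed as `ref=barcodes.fa` in the seal.sh command quoted above.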


  • dkainer
    replied
    Brian,

    Is there a way with the BB Suite to demultiplex paired-end reads based on inline barcodes, like Flexbar does?

    I can see it can be done one barcode at a time by outputting matching reads based on the first 6 left bases. But can it be done in one command to demultiplex for multiple barcodes?

    cheers
    DK


  • vmikk
    replied
    Hello Brian! Thanks a lot for the implementation of this feature!
    In the meantime I had thought of modifying the SAM files from msa.sh, but the out-of-the-box functionality is much more convenient!
    Thanks again!


  • Brian Bushnell
    replied
    I added the "include" flag to cutprimers. Default is "include=f". If you set "include=t" the primers will be retained for the output.
