  • jazz710
    replied
    Kmer Counting

    Hi Brian, tried to send you a private message, but I guess you're popular:

    "Brian Bushnell has exceeded their stored private messages quota and cannot accept further messages until they clear some space."

    Anyhow!

    New odd question for you Brian: I'm doing some kmer counting with kmercountexact.sh. I'm running a kmer series on 100bp DNA-Seq reads from k=21 to k=55. Two things I noticed:

    1) After k=31, the program only calculates kmer counts for even values of k. For example, with k=55 the output reads "K was changed from 55 to 54", and it did that for every odd value of k from 33 to 55.

    2) The number of unique kmers rises as k rises, but only up to a point. In my two samples, after k=31 the number of unique kmers decreases as k increases. How can this be? Am I missing something conceptual about how kmer counts should behave?

    Best,
    Bob

    EDIT: Added kmer graph

    Last edited by jazz710; 03-14-2016, 06:57 AM. Reason: Added Image
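
    A small arithmetic sketch may help frame question 2. It does not describe how kmercountexact.sh works internally, and it is not offered as the explanation for the observed curve; it only shows that a read of length L contains L - k + 1 kmers, so each 100 bp read contributes fewer kmers as k grows. The function names are made up for illustration, and only the forward strand is counted.

```python
def kmers_per_read(read_len: int, k: int) -> int:
    """Number of kmer start positions in one read of length read_len."""
    return max(read_len - k + 1, 0)


def distinct_kmers(reads, k):
    """Count distinct kmers (forward strand only) across a list of reads."""
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return len(seen)


if __name__ == "__main__":
    # Kmers contributed by a single 100 bp read at a few values of k.
    for k in (21, 31, 41, 55):
        print(k, kmers_per_read(100, k))
```

    Whether the total number of distinct kmers goes up or down with k depends on the data (coverage, error rate, repeats), so this is only one term in the balance.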


  • GenoMax
    replied
    @mcauchy: Consider @Brian's reply above when choosing a memory setting.

    If you know that your files fall into the second category, you may want to redo the trimming with a paired-end-aware trimmer (so the reads don't get out of sync in the first place), e.g. bbduk.sh from BBMap. That option does not require a lot of RAM and would be convenient.


  • Brian Bushnell
    replied
    Originally posted by GenoMax:
    "@Brian: How much memory does repair.sh need?"

    That depends. If you have an interleaved file whose interleaving was broken because some reads were discarded, you can run it with the flag "fixinterleaving", and it needs only a trivial amount of memory (at most 2 reads need to be remembered at any given time).

    For an arbitrarily disordered file or pair of files, in the worst case, it would store all reads in memory, so the amount of memory needed would be somewhat greater than the size of the uncompressed files.

    But in the common case of a pair of files that are ordered correctly but some reads were deleted in each file without removing their mate, the amount of memory needed is proportional to the number of singleton reads.
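
    The memory behaviour described above can be illustrated with a toy re-pairing routine. This is a minimal sketch, not the code inside repair.sh (jgi.SplitPairsAndSingles): it assumes each read is a (name, sequence) tuple and that mates share exactly the same name, and the function name repair_pairs is hypothetical. Reads are buffered in a dict only until their mate appears, so the peak buffer size tracks how far out of sync the two files are.

```python
def repair_pairs(reads1, reads2):
    """Re-pair two lists of (name, seq) records by read name.

    Returns (pairs, singletons, peak_buffered), where peak_buffered is the
    largest number of unmatched reads held in memory at any one time.
    """
    pending1, pending2 = {}, {}   # unmatched reads seen so far, keyed by name
    pairs = []
    peak = 0
    it1, it2 = iter(reads1), iter(reads2)
    while True:
        r1 = next(it1, None)
        r2 = next(it2, None)
        if r1 is None and r2 is None:
            break
        if r1 is not None:
            mate = pending2.pop(r1[0], None)
            if mate is not None:
                pairs.append((r1, mate))
            else:
                pending1[r1[0]] = r1
        if r2 is not None:
            mate = pending1.pop(r2[0], None)
            if mate is not None:
                pairs.append((mate, r2))
            else:
                pending2[r2[0]] = r2
        peak = max(peak, len(pending1) + len(pending2))
    # Anything still buffered never found a mate.
    singletons = list(pending1.values()) + list(pending2.values())
    return pairs, singletons, peak
```

    With two correctly ordered files where one mate was deleted, only a couple of reads are ever buffered; with one file in reverse order, half of all reads end up buffered before any pair can be emitted.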


  • GenoMax
    replied
    @Brian: How much memory does repair.sh need?


  • Brian Bushnell
    replied
    Hi mcauchy,

    Were you able to resolve this? The shell scripts (anything.sh) do not work in Windows, so the full Java syntax is needed. If you are still having trouble, please tell me the location of bbmap.sh (which is probably something like C:\something\bbmap\current\bbmap.sh).

    -Brian


  • GenoMax
    replied
    Originally posted by mcauchy:
    "I'm not sure what the syntax would be in DOS. I've tried ./bbmap.sh, /bbmap.sh and bbmap.sh

    Could you tell me what the command would be?"

    This information is in the BBMap thread. Here is how to do it (note: you need to substitute the correct path to the "current" directory on your machine, and keep the space between that path and align2.BBMap):

    Code:
    c:\> java -Xmx3g -cp c:\path_to\current align2.BBMap in=reads.fq out=mapped.sam
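
    If it helps to see the pieces of that command separately, here is a small sketch that assembles the same argument list. The helper name bbtools_java_command is hypothetical, and c:\path_to\current is the same placeholder as above, not a real path; only the class name and arguments shown in this thread are used.

```python
def bbtools_java_command(current_dir, main_class, args, xmx="3g"):
    """Build the argument list for calling a BBTools class directly via java."""
    # -cp points at the "current" directory; the class name is a separate argument.
    return ["java", f"-Xmx{xmx}", "-cp", current_dir, main_class, *args]


cmd = bbtools_java_command(
    r"c:\path_to\current",            # placeholder path from the post above
    "align2.BBMap",
    ["in=reads.fq", "out=mapped.sam"],
)
print(" ".join(cmd))
```

    The resulting list could be passed to subprocess.run(cmd, check=True) once the placeholder path is replaced with the real one.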


  • mcauchy
    replied
    I'm not sure what the syntax would be in DOS. I've tried ./bbmap.sh, /bbmap.sh and bbmap.sh

    Could you tell me what the command would be?


  • GenoMax
    replied
    How about setting the VM aside and running BBMap directly on Windows 10? How much RAM is there on the machine? BBMap is written in Java and will run there, but you would need to use the Windows form of the BBMap command line.


  • mcauchy
    replied
    Didn't run for very long....

    $ ./repair.sh -Xmx2g in1=/media/sf_D_DRIVE/champagnefastqs/srr1290816_1.fastq in2=/media/sf_D_DRIVE/champagnefastqs/srr1290816_2.fastq out1=fixed1.fq out2=fixed2.fq outsingle=single.fq

    java -ea -Xmx2g -cp /media/sf_D_DRIVE/bbmap/current/ jgi.SplitPairsAndSingles rp -Xmx2g in1=/media/sf_D_DRIVE/champagnefastqs/srr1290816_1.fastq in2=/media/sf_D_DRIVE/champagnefastqs/srr1290816_2.fastq out1=fixed1.fq out2=fixed2.fq outsingle=single.fq
    Executing jgi.SplitPairsAndSingles [rp, -Xmx2g, in1=/media/sf_D_DRIVE/champagnefastqs/srr1290816_1.fastq, in2=/media/sf_D_DRIVE/champagnefastqs/srr1290816_2.fastq, out1=fixed1.fq, out2=fixed2.fq, outsingle=single.fq]

    Set INTERLEAVED to false
    Started output stream.
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.resize(HashMap.java:580)
    at java.util.HashMap.addEntry(HashMap.java:879)
    at java.util.LinkedHashMap.addEntry(LinkedHashMap.java:427)
    at java.util.HashMap.put(HashMap.java:505)
    at jgi.SplitPairsAndSingles.repair(SplitPairsAndSingles.java:751)
    at jgi.SplitPairsAndSingles.process3_repair(SplitPairsAndSingles.java:538)
    at jgi.SplitPairsAndSingles.process2(SplitPairsAndSingles.java:304)
    at jgi.SplitPairsAndSingles.process(SplitPairsAndSingles.java:230)
    at jgi.SplitPairsAndSingles.main(SplitPairsAndSingles.java:45)


  • westerman
    replied
    Also it seems to me that '-Xmx-211m' is odd. Why the negative 211? I am not sure that makes a difference but it might.
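
    For what it's worth, a rough sanity check on the heap flag can be sketched as follows. This is only an illustration: the function name xmx_is_plausible is made up, the pattern covers the common k/m/g suffixes, and the JVM applies further rules of its own. The thread does not establish why the script produced a negative value.

```python
import re

# -Xmx followed by a positive integer and an optional k/m/g suffix.
_XMX = re.compile(r"^-Xmx(\d+)([kKmMgG]?)$")


def xmx_is_plausible(flag: str) -> bool:
    """Return True if flag looks like a usable -Xmx setting."""
    m = _XMX.match(flag)
    return bool(m) and int(m.group(1)) > 0


print(xmx_is_plausible("-Xmx-211m"))  # the value from the failing run
print(xmx_is_plausible("-Xmx2g"))     # an explicit setting
```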


  • GenoMax
    replied
    Originally posted by mcauchy:
    "I have allocated 2.9 GB, which is all I have to give. It seems that is not enough. Thanks for your help."

    That may be true, but in case BBMap was not able to allocate RAM correctly, can you try running the command as follows:

    Code:
    $ ./repair.sh -Xmx2g in1=/media/sf_D_DRIVE/champagnefastqs/srr1290816_1.fastq in2=/media/sf_D_DRIVE/champagnefastqs/srr1290816_2.fastq out1=fixed1.fq out2=fixed2.fq outsingle=single.fq


  • mcauchy
    replied
    I have allocated 2.9 GB, which is all I have to give. It seems that is not enough. Thanks for your help.


  • GenoMax
    replied
    How much memory have you allocated to the VM? You should have at least 2 GB so that enough is available for programs to run.


  • mcauchy
    replied
    What I mean is I run:
    $ ./repair.sh in1=/media/sf_D_DRIVE/champagnefastqs/srr1290816_1.fastq in2=/media/sf_D_DRIVE/champagnefastqs/srr1290816_2.fastq out1=fixed1.fq out2=fixed2.fq outsingle=single.fq

    ...and get:
    java -ea -Xmx-211m -cp /media/sf_D_DRIVE/bbmap/current/ jgi.SplitPairsAndSingles rp in1=/media/sf_D_DRIVE/champagnefastqs/srr1290816_1.fastq in2=/media/sf_D_DRIVE/champagnefastqs/srr1290816_2.fastq out1=fixed1.fq out2=fixed2.fq outsingle=single.fq
    Invalid maximum heap size: -Xmx-211m
    Error: Could not create the Java Virtual Machine.
    Error: A fatal exception has occurred. Program will exit.


  • GenoMax
    replied
    Originally posted by mcauchy:
    "Newbie here! I have unzipped and untarred bbmap but it won't run any commands. I have a Linux virtual box in Windows 10. Am I missing some software to use BBMap?"

    What do you mean by "it won't run any commands"? Can you see the shell scripts in the "bbmap" folder? Change to the bbmap directory, then try the following command and see if it prints help output on screen.

    Code:
    $ ./bbmap.sh
