This topic is closed.

  • Brian Bushnell
    replied
    sdmoore,

    By default, BBMap will look for much longer indels than BWA/Bowtie2, over 16000bp. You can limit this with the maxindel flag (e.g. "maxindel=40"). Soft-clipping (via "local" flag) can also reduce erroneous variation calls from chimeric or low-quality reads.
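    To check whether long indels are what is inflating the variant calls, the CIGAR strings in a SAM file can be scanned directly. The following is a minimal sketch, not part of BBMap: the function names and the example records are hypothetical, and it counts only `I` and `D` operations, assuming long deletions are not reported as `N`.

```python
import re

# Matches one CIGAR operation, e.g. "50M" or "12000D".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def longest_indel(cigar):
    """Return the length of the longest insertion or deletion in a CIGAR string."""
    lengths = [int(n) for n, op in CIGAR_OP.findall(cigar) if op in "ID"]
    return max(lengths, default=0)


def reads_with_long_indels(sam_lines, max_indel=40):
    """Yield (read name, longest indel) for alignments whose longest indel exceeds max_indel."""
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":  # malformed or unmapped
            continue
        longest = longest_indel(fields[5])
        if longest > max_indel:
            yield fields[0], longest


# Hypothetical records for illustration only.
example = [
    "@HD\tVN:1.4\tSO:unsorted",
    "r1\t0\tchr1\t100\t40\t50M12000D50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t500\t40\t60M2I38M\t*\t0\t0\t*\t*",
]
flagged = list(reads_with_long_indels(example))
print(flagged)
```

    The threshold of 40 mirrors the "maxindel=40" example; any read this reports would have been aligned differently (or clipped) under that limit.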


  • sdmoore
    replied
    Thanks Brian and dpryan.
    I had to give up on BBMap for now, though not because of this problem (I found the AddOrReplaceReadGroups tool later; I edited the SAMs or the BAMs). Rather, the resulting VCF from mpileup on the BBMap alignments was "all over the place" (and took forever to process too). I don't know how else to describe it: large insert calls for a bunch of positions. Viewing the file was no help (tons of inserts/asterisks displayed). I got the same mess from FreeBayes. BWA-MEM and Bowtie2 alignments don't show this, and I can easily identify known errors in the reference file with either mpileup or FreeBayes. The alignment looked more like what I got from CUSHAW2 (which I also dropped). We are at a stage now where we will Sanger sequence a few loci to clear things up (e.g., BWA never shows a collection of mutations that Bowtie2 does). I was hoping to have a third aligner "take sides", but I think it's faster for us to sequence and be sure.


  • dpryan
    replied
    @sdmoore: You actually just want AddOrReplaceReadGroups from Picard tools. The command I expect you were going for is "samtools reheader", though that won't really do what you want since read group information is also added to each alignment.
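    The point about read groups living in two places can be illustrated with a small text-level sketch. This is not how Picard is implemented and is not a replacement for AddOrReplaceReadGroups; the function name, default values, and example record are hypothetical. It shows only that a read group needs both an `@RG` header line and an `RG:Z:` tag on every alignment, which is why rewriting the header alone is not enough.

```python
def add_read_group(sam_lines, rg_id, sample, platform="ILLUMINA", library="lib1"):
    """Return SAM lines with one @RG header line added and an RG:Z tag on every alignment.

    Simplification: any @RG lines already in the header are left in place.
    """
    rg_header = "@RG\tID:{}\tSM:{}\tPL:{}\tLB:{}".format(rg_id, sample, platform, library)
    stripped = [line.rstrip("\n") for line in sam_lines]
    header = [line for line in stripped if line.startswith("@")]
    records = [line for line in stripped if line and not line.startswith("@")]

    tagged = []
    for record in records:
        fields = record.split("\t")
        # The first 11 fields are mandatory; optional tags follow.
        # Drop any existing RG tag so the record ends up with exactly one.
        optional = [f for f in fields[11:] if not f.startswith("RG:Z:")]
        tagged.append("\t".join(fields[:11] + optional + ["RG:Z:" + rg_id]))
    return header + [rg_header] + tagged


# Hypothetical input for illustration only.
sam = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t0\tchr1\t1\t40\t4M\t*\t0\t0\tACGT\tIIII",
]
result = add_read_group(sam, rg_id="rg1", sample="sampleA")
print("\n".join(result))
```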

    @Brian: It would be great if you could add read group support. That'll be needed by anyone doing SNP calling.


  • Brian Bushnell
    replied
    sdmoore,

    BBMap does not have an option for setting the readgroup, since I never encountered a situation where I needed it. But if it's useful, I can add it to the next release. The solution in your linked thread looks reasonable and I'm not sure why it didn't work for you; I will let you know if I find a better solution.


  • sdmoore
    replied
    Possible to add Read Group in BBmap header?

    *sorry, probably wrong thread, I found more activity in the release announcement thread*

    Hello,
    I used BBDuk to process my read pairs and then mapped them using BBMap, converted the SAM to BAM, and sorted it.
    I plan to use an alternative to mpileup to process this set (for comparison of the outputs), so I am trying to use GATK tools.

    When running a GATK tool, it reports an error that the read group is not found in the header. With other mappers, this is an option (like -R for BWA). I found methods to manually add read group information to the header (such as here), but I have limited Linux skills and get errors when trying that approach (command "header" not found). I am also concerned that if I put in the wrong RG info, I may pooch a downstream tool.

    Is there a way to make the BBmap output compatible with GATK?
    Last edited by sdmoore; 07-05-2014, 09:34 AM. Reason: wrong thread?


  • muol
    replied
    Excellent, just did a test run. This is very useful software!

    Olaf


  • Brian Bushnell
    replied
    Olaf,

    This has been fixed in the latest release, 33.04.


  • muol
    replied
    Thanks for the info Brian, it wasn't a big issue.

    Olaf


  • Brian Bushnell
    replied
    Olaf,

    Currently, BBNorm uses single interleaved files for temporary storage when using multiple passes. And I have not implemented any way to specify dual files in intermediate stages, since everyone at JGI uses interleaved files for everything.

    You have two options.
    1) You could set "passes=1", which is faster, but I don't recommend it because it doesn't give as good results as 2-pass normalization.
    or
    2) You could specify only a single output file, which will get interleaved reads:

    bbnorm.sh in1=R1.fastq.gz in2=R2.fastq.gz out=R12.bbnorm.fastq.gz prefilter=t tossbadreads=t ecc=t fixspikes=t qin=33 -Xmx72g target=40

    ...Then, if you need to, de-interleave it afterward:

    reformat.sh in=R12.bbnorm.fastq.gz out1=R1.bbnorm.fastq.gz out2=R2.bbnorm.fastq.gz

    Sorry for the inconvenience! I'll try to fix that by the next release, though unlike documenting the "qin" flag, this will take more work so no guarantees. Thanks for bringing it to my attention. FYI, the flag "interleaved" has no effect on output, only input.
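    For readers unsure what de-interleaving involves, here is a minimal sketch of the idea. It is not how reformat.sh is implemented; the function names are hypothetical, and it assumes plain 4-line FASTQ records with each read 1 immediately followed by its mate.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as lists of 4 lines (assumes no multi-line sequences)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def deinterleave(interleaved, out1, out2):
    """Write alternating records to out1 (read 1) and out2 (read 2); return the pair count."""
    records = read_fastq(interleaved)
    pairs = 0
    for read1 in records:
        read2 = next(records, None)
        if read2 is None:
            raise ValueError("odd number of records: a read is missing its mate")
        out1.writelines(read1)
        out2.writelines(read2)
        pairs += 1
    return pairs


# Hypothetical in-memory example; real files would be opened with open() or gzip.open().
interleaved = io.StringIO("@p1/1\nACGT\n+\nIIII\n@p1/2\nTTTT\n+\nIIII\n")
out1, out2 = io.StringIO(), io.StringIO()
n_pairs = deinterleave(interleaved, out1, out2)
print(n_pairs)
```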

    -Brian
    Last edited by Brian Bushnell; 06-23-2014, 06:05 PM.


  • muol
    replied
    Brian,

    I ran into a smaller issue with bbnorm. When trying to input and output separate files for a PE library like this:

    Code:
    bbnorm.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1.bbnorm.fastq.gz out2=R2.bbnorm.fastq.gz prefilter=t tossbadreads=t ecc=t fixspikes=t qin=33 -Xmx72g target=40
    I receive this error during pass 2:

    Code:
    Exception in thread "main" java.lang.AssertionError: Please do not set 'interleaved=true' with dual input files.
    	at stream.ConcurrentGenericReadInputStream.<init>(ConcurrentGenericReadInputStream.java:132)
    	at stream.ConcurrentGenericReadInputStream.getReadInputStream(ConcurrentGenericReadInputStream.java:661)
    	at stream.ConcurrentGenericReadInputStream.getReadInputStream(ConcurrentGenericReadInputStream.java:641)
    	at kmer.KmerCount7MTA.countFastq(KmerCount7MTA.java:355)
    	at kmer.KmerCount7MTA.makeKca(KmerCount7MTA.java:222)
    	at jgi.KmerNormalize.runPass(KmerNormalize.java:1006)
    	at jgi.KmerNormalize.main(KmerNormalize.java:736)
    Setting interleaved=false doesn't change that. Outputting to a single, interleaved file (in1=xxx in2=xxx out=xxx) on the other hand works fine. Any ideas?

    Olaf


  • muol
    replied
    Indeed, just tried it and it works well with bbnorm.

    Thanks
    Olaf


  • Brian Bushnell
    replied
    Olaf,

    It's there, I just forgot to document it; sorry! I'll add that to the shellscript in the next release. I think that all of the programs in the package that read fastq input allow the "qin" flag.

    -Brian


  • muol
    replied
    Hi Brian,

    Is there an option to set read quality encoding in bbnorm? I had to set qin=33 in bbduk for some Illumina 1.9 paired end libraries, but this option doesn't seem to exist in bbnorm (used BBMap v. 32.32 for Java 7).
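    As background on what "qin=33" selects: FASTQ quality characters are ASCII codes shifted by an offset, 33 for Illumina 1.8 and later and 64 for several older Illumina pipelines. The sketch below is a common heuristic for guessing the offset from the characters seen; it is not BBTools code, the function name is hypothetical, and the thresholds assume typical Illumina score ranges.

```python
def guess_phred_offset(quality_strings):
    """Guess the quality offset (33 or 64) from quality strings; return None if ambiguous."""
    codes = [ord(char) for quality in quality_strings for char in quality]
    if not codes:
        return None
    if min(codes) < 59:
        # Characters below ';' cannot occur in offset-64 encodings.
        return 33
    if max(codes) > 74:
        # Characters above 'J' are unusual for offset-33 Illumina data.
        return 64
    return None  # every character is valid under both encodings


# Hypothetical quality strings for illustration only.
print(guess_phred_offset(["IIII#"]))
print(guess_phred_offset(["hhhh"]))
print(guess_phred_offset(["ABCD"]))
```

    When the guess is ambiguous, passing the encoding explicitly (as with "qin=33") avoids relying on autodetection.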

    Thanks
    Olaf


  • Corydoras
    replied
    Hi Brian,

    Thanks so much for that explanation. I thought I wouldn't be able to go past 31 but it is best to double check.

    Sorry as well for just deleting my post (and bombarding you with simple questions, new to the world of NGS!), I played around with updating the Java on our Linux machine and that did the trick.

    Thanks again for your help! And the fantastic and easy to use script!!

    Sarah


  • Brian Bushnell
    replied
    Sarah,

    It might be better to normalize using a kmer length of 41, but BBNorm only supports a maximum of 31. In practice, it should make very little difference, though. Using long kmers is important for assembly, as it helps span short repeats that would otherwise cause contigs to terminate. But normalization is much less sensitive to that issue, and very long kmers can cause problems in the presence of errors. With k=31, a 100bp read with 1 error could yield 31 kmers with a depth of 1, out of a total of 70 kmers; in that case, the median depth would not be impacted. With k=63, the same read has only 38 kmers (100-63+1), and an error near the middle would be spanned by all of them, each with a depth of 1, so the median depth of the read would look like 1 instead of its correct value. And BBNorm normalizes based on the median kmer depth of a read.
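    The median-depth argument can be checked with a short calculation. This is a sketch under simplifying assumptions, not BBNorm's code: the function name is hypothetical, the true depth of 40 and the error position are made-up values, and every kmer overlapping the error is assumed to have a depth of exactly 1. A read of length L contains L-k+1 kmers.

```python
from statistics import median


def kmer_depths_with_error(read_len, k, error_pos, true_depth):
    """Return the depth of each kmer in a read with a single error.

    Kmers overlapping error_pos (0-based) get depth 1; all others get true_depth.
    """
    n_kmers = read_len - k + 1
    depths = []
    for start in range(n_kmers):
        spans_error = start <= error_pos < start + k
        depths.append(1 if spans_error else true_depth)
    return depths


summary = {}
for k in (31, 63):
    depths = kmer_depths_with_error(read_len=100, k=k, error_pos=50, true_depth=40)
    summary[k] = (len(depths), depths.count(1), median(depths))
    print(k, summary[k])
```

    With these inputs, k=31 gives 70 kmers of which 31 span the error, and the median depth stays at 40. With k=63 the read has 38 kmers, all of which span the error, so the median drops to 1.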

    It's a lot more computationally efficient to use a max kmer length of 31, so that's how I designed it. I've tried shorter kmers down to about k=25 and not noticed an appreciable difference in normalization or error correction.

    As for your prior (deleted) post, sorry for not responding - I think the problem was that you were running Java 6 instead of Java 7. Most of the programs in BBTools work fine in Java 6 but it looks like BBNorm requires Java 7 (or higher).
