  • arkilis
    replied
    How can I use BBSplit to check the match details?

    That is, for a given read, I would like to know which sequence in the reference it matched. Is it possible to do that?

    Cheers,
    a


  • Brian Bushnell
    replied
    Yes, it is. Also, with BBSplit, I think it will try to regenerate the index every time as long as "ref=" is specified, even if it already exists, so only do that once.
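A minimal sketch of that build-once, reuse-later workflow. It only prints the commands. The index directory and read file names are hypothetical, and it assumes that omitting ref= on later runs makes BBSplit load the index found under path= instead of rebuilding it; check the usage text printed by bbsplit.sh before relying on this.

```shell
# Sketch only: prints the commands; drop the "echo" to actually run them.
# Assumption: path= names the directory holding the index, and leaving out
# ref= on later runs makes BBSplit load the existing index from there.
INDEX_DIR=/path/to/index   # hypothetical location

build_index_cmd() {
    # Run once: ref= triggers index generation
    echo bbsplit.sh ref=ref1.fa,ref2.fa path="$INDEX_DIR"
}

reuse_index_cmd() {
    # $1 = reads file; no ref= here, so the index should not be regenerated
    echo bbsplit.sh in="$1" path="$INDEX_DIR" basename=out_%.fq
}

build_index_cmd
reuse_index_cmd reads_a.fq
reuse_index_cmd reads_b.fq
```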


  • sdriscoll
    replied
    After running bbsplit once using the syntax: ref=ref1.fa,ref2.fa, is it possible to re-use that index on subsequent runs using the path= parameter?


  • Brian Bushnell
    replied
    Originally posted by vingomez
    Thanks Brian for your effort in providing bioinformatic applications.

    As you mentioned in your post, BBSplit can use two paired-end files as input. In addition to those two files, can I add a third file (e.g. a merged-read file) as input?
    No, you'll have to do that in two runs, one for the paired reads and one for the merged reads. But ultimately that won't affect the output or runtime (other than the fact that the index will need to be loaded twice, and you'll end up with 2 sam files that need to be merged).

    P.S. In the previous post, "java -Xmx29g -cp /path/to/current align2.BBSplit" should read "align2.BBSplitter".
    Fixed, thanks!
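The two-run approach described above could be sketched as follows. The commands are only printed, all file names are hypothetical, and the separate output prefixes are an assumption made here so that the second run does not overwrite the first. ref= is repeated for simplicity, which means the index is handled in both runs.

```shell
# Prints the two commands; remove "echo" to execute them.
# All file names are hypothetical placeholders.
REFS=ecoli.fa,salmonella.fa

paired_run_cmd() {
    # Run 1: paired reads in two files
    echo bbsplit.sh in1=reads1.fq in2=reads2.fq ref="$REFS" \
        basename=paired_%.fq outu1=paired_clean1.fq outu2=paired_clean2.fq
}

merged_run_cmd() {
    # Run 2: merged reads, treated as single-end input
    echo bbsplit.sh in=merged.fq ref="$REFS" \
        basename=merged_%.fq outu=merged_clean.fq
}

paired_run_cmd
merged_run_cmd
```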


  • vingomez
    replied
    Thanks Brian for your effort in providing bioinformatic applications.

    As you mentioned in your post, BBSplit can use two paired-end files as input. In addition to those two files, can I add a third file (e.g. a merged-read file) as input?

    Thanks again

    P.S. In the previous post, "java -Xmx29g -cp /path/to/current align2.BBSplit" should read "align2.BBSplitter".
    Last edited by vingomez; 09-16-2014, 07:22 AM.


  • Introducing BBSplit: Read Binning Tool for Metagenomes and Contaminated Libraries

    BBSplit is a tool that bins reads by mapping to multiple references simultaneously, using BBMap. The reads go to the bin of the reference they map to best. There are also disambiguation options, such that reads that map to multiple references can be binned with all of them, none of them, one of them, or put in a special "ambiguous" file for each of them. Paired reads will always be kept together.

    For example, if you had a library that was contaminated with E. coli and Salmonella, you could do this:

    bbsplit.sh in=reads.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu=clean.fq int=t

    This will produce 3 output files:
    out_ecoli.fq (E. coli reads)
    out_salmonella.fq (Salmonella reads)
    clean.fq (unmapped reads)

    In this case, "int=t" means that the input file is paired and interleaved. For single-end reads you would leave that out. For paired reads in 2 files, you would do this:
    bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq
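In the examples above, the "%" in basename is replaced by the name of each reference. The helper below is a hypothetical sketch that predicts those file names before a run. It assumes the reference name is simply the file name without its directory and final extension, which matches the ecoli/salmonella example but may not match how BBSplit names every reference.

```shell
# Predict per-reference output names from a basename pattern.
# Assumption: reference name = file name minus directory and final extension.
expected_outputs() {
    pattern=$1   # e.g. out_%.fq
    refs=$2      # e.g. ecoli.fa,salmonella.fa
    old_ifs=$IFS
    IFS=,
    for ref in $refs; do
        name=${ref##*/}   # drop any directory part
        name=${name%.*}   # drop the final extension
        # Text before the first "%", then the name, then the text after it
        printf '%s\n' "${pattern%%\%*}${name}${pattern#*\%}"
    done
    IFS=$old_ifs
}

expected_outputs 'out_%.fq' 'ecoli.fa,salmonella.fa'
# prints:
# out_ecoli.fq
# out_salmonella.fq
```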

    You can get more information about parameters by running bbsplit.sh with no arguments, or reading /bbmap/docs/readme.txt. But I will mention here the inter-reference ambiguity modes, which decide what to do with reads that map to multiple references and pairs where each read maps to a different reference:

    ambig2=best
    Default. Ambiguous reads go to the first best site.

    ambig2=toss
    Ambiguous reads are considered unmapped.

    ambig2=all
    Write a copy to the output for each reference to which it maps.

    ambig2=split
    Write a copy to the AMBIGUOUS_ output file for each reference to which it maps.
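To see how the four modes differ on the same data, one option is to run each mode into its own set of output files. The sketch below only prints the commands. The per-mode output prefixes are hypothetical, chosen here to keep the runs apart.

```shell
# Prints one command per ambig2 mode; remove "echo" to execute them.
# Output prefixes (best_, toss_, ...) are hypothetical.
ambig_cmd() {
    case $1 in
        best|toss|all|split) ;;   # the four modes listed above
        *) echo "unknown ambig2 mode: $1" >&2; return 1 ;;
    esac
    echo bbsplit.sh in=reads.fq ref=ecoli.fa,salmonella.fa \
        basename="$1"_%.fq outu="$1"_clean.fq int=t ambig2="$1"
}

for mode in best toss all split; do
    ambig_cmd "$mode"
done
```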

    If your OS cannot process bash shell scripts, replace "bbsplit.sh" with "java -Xmx29g -cp /path/to/current align2.BBSplitter", where /path/to/current is the location of the 'current' directory (a subdirectory of bbmap), and -Xmx29g specifies the amount of memory to use (so this would be the command line for a 32GB computer). This should be set to about 85% of physical memory.
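As a rough illustration of the 85% guideline, the hypothetical helper below turns a physical memory size in GB into an -Xmx flag, rounding down to a whole gigabyte. The rule is approximate: for 32 GB it gives 27g, slightly below the 29g used in the example above.

```shell
# Build a -Xmx flag sized at 85% of physical memory (given in GB),
# rounded down to a whole gigabyte by integer division.
xmx_flag() {
    echo "-Xmx$(( $1 * 85 / 100 ))g"
}

xmx_flag 32   # prints -Xmx27g
xmx_flag 64   # prints -Xmx54g
```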

    BBSplit is extremely fast and highly sensitive, using BBMap for the mapping. So, all flags and features supported by BBMap can be used with BBSplit (aside from sam output).

    BBSplit is available here:
    Download BBMap for free. This package includes BBMap, a short read aligner, as well as various other bioinformatic tools. It is written in pure Java, can run on any platform, and has no dependencies other than Java being installed (compiled for Java 6 and higher).


    P.S. Some people have asked why BBSplit has a lower alignment rate than BBMap. That is because it has a lower default sensitivity, as the original intent was to bin reads using known assemblies. The sensitivity can be raised to be equivalent to BBMap with these flags: "minratio=0.56 minhits=1 maxindel=16000"
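For instance, the sensitivity flags above could be appended to the earlier interleaved example like this. The sketch prints the command rather than running it, and the variable and function names are made up for illustration.

```shell
# Sensitivity flags quoted from the post above.
SENSITIVE_FLAGS="minratio=0.56 minhits=1 maxindel=16000"

sensitive_cmd() {
    # $SENSITIVE_FLAGS is left unquoted on purpose so it splits into three arguments.
    # Remove "echo" to execute the command.
    echo bbsplit.sh in=reads.fq ref=ecoli.fa,salmonella.fa \
        basename=out_%.fq outu=clean.fq int=t $SENSITIVE_FLAGS
}

sensitive_cmd
```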
    Last edited by Brian Bushnell; 09-16-2014, 08:29 AM.
