I recently posted a new version of BEDTools (v2.4.1, http://bedtools.googlecode.com). Most notably, this release includes extensive support for sequence alignments in BAM format and provides a convenient means to refine sequence datasets based on biological interest. GFF support has also been added. Here are the highlights:
(1) Native support for features in GFF format. One can mix and match BED and GFF, BAM (see below) and GFF, etc.
(2) Native support for alignment files in BAM format (http://samtools.sourceforge.net/).
- intersectBed will compare each BAM alignment (if paired, each end is treated separately) to BED/GFF annotations. Alignments that overlap (or don't overlap) the annotations can be written in BED format or written to a new BAM file.
- e.g. $ intersectBed -abam reads.bam -b exons.bed > reads.in.exons.bam
- e.g. $ intersectBed -abam reads.bam -b ssrs.bed -v > reads.notIn.simplerepeats.bam
- pairToBed will compare paired-end BAM alignments (the two ends treated as a logical unit) to BED/GFF annotations. Pairs that overlap (or don't overlap) the annotations can be written in BEDPE (paired-end) format or written to a new, refined BAM file. One can require that both, either, neither, exactly one (xor), or "not both" ends overlap the annotations. Moreover, one can test the "span" of intra-chromosomal pairs against the annotations (e.g. "does the ~500bp span of my paired-end read overlap an exon?").
- e.g. $ pairToBed -abam stdin -b segdups.bed -type both > reads.bothEndsInSD.bam
- e.g. $ pairToBed -abam reads.bam -b ssrs.bed -type notboth > reads.oneOrNoEndsIn.simplerepeats.bam
- bamToBed will convert BAM alignments to either BED or BEDPE format. This allows BAM alignments to work with the other 14 BEDTools.
- e.g. $ samtools view -b -f 0x2 reads.bam | bamToBed -i stdin | coverageBed -a stdin -b 5kb.windows.bed
- The tools play nicely with samtools.
- e.g. $ samtools view -b -F 0x2 reads.bam | pairToBed -abam stdin -b genes.bed -type both | samtools view -
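For anyone wondering how the pairToBed "-type" choices described above relate to one another, here is a rough Python sketch of the logic as I read it from the description. It is not the BEDTools source: the function names (overlaps, end_hits, keep_pair) are made up for illustration, the linear scan over annotations stands in for whatever indexing the real tool uses, and the "span" modes are left out.

```python
from typing import List, Tuple

# (chrom, start, end) using 0-based, half-open coordinates as in BED.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def end_hits(end: Interval, annotations: List[Interval]) -> bool:
    """True if one read end overlaps any annotation (linear scan, for clarity)."""
    return any(overlaps(end, feat) for feat in annotations)


def keep_pair(end1: Interval, end2: Interval,
              annotations: List[Interval], pair_type: str) -> bool:
    """Decide whether a read pair passes the requested overlap rule."""
    hit1 = end_hits(end1, annotations)
    hit2 = end_hits(end2, annotations)
    rules = {
        "either": hit1 or hit2,          # at least one end overlaps
        "neither": not (hit1 or hit2),   # no end overlaps
        "both": hit1 and hit2,           # both ends overlap
        "xor": hit1 != hit2,             # exactly one end overlaps
        "notboth": not (hit1 and hit2),  # one end or no end overlaps
    }
    return rules[pair_type]


if __name__ == "__main__":
    # Hypothetical annotations and a pair with only its first end inside one.
    segdups = [("chr1", 100, 200), ("chr1", 500, 600)]
    pair = (("chr1", 150, 186), ("chr1", 900, 936))
    for t in ("either", "neither", "both", "xor", "notboth"):
        print(t, keep_pair(pair[0], pair[1], segdups, t))
```

With the example pair, only one end falls in an annotation, so "either", "xor" and "notboth" keep the pair while "neither" and "both" reject it.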
(3) Native support for "blocked" BED features (aka BED12). Note that the individual blocks are not considered separately: BEDTools merely allows one to use BED12 files, and the last 6 fields are "passed through" the tools.
(4) A comprehensive user manual has been posted to the website. This includes all options, the details of the file formats, as well as many usage examples.
(5) Several improvements to the code base and algorithms, as well as fixes for a few minor bugs.
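To make the BED12 note in item (3) concrete, here is a small Python sketch of what "passed through" means in practice. It is only an illustration under my own assumptions, not how BEDTools is implemented: parse_bed_line and intersect_passthrough are hypothetical names, and the sketch simply reports the whole original entry when any overlap is found rather than modelling how the real tools report coordinates.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_bed_line(line: str) -> Tuple[str, int, int, List[str]]:
    """Split a tab-delimited BED line into chrom, start, end and the remaining fields."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2]), fields[3:]


def intersect_passthrough(a_lines: Iterable[str],
                          b_lines: Iterable[str]) -> Iterator[str]:
    """Yield each A entry whose chrom/start/end overlaps any B entry.

    Only the first three columns take part in the comparison. Every other
    column, including the BED12 block fields, is carried along untouched.
    """
    b_feats = [parse_bed_line(line)[:3] for line in b_lines]
    for line in a_lines:
        chrom, start, end, extra = parse_bed_line(line)
        if any(chrom == c and start < e and s < end for c, s, e in b_feats):
            yield "\t".join([chrom, str(start), str(end)] + extra)


if __name__ == "__main__":
    # A made-up two-block feature: blocks at 1000-1500 and 4500-5000.
    gene = "chr1\t1000\t5000\tgeneA\t0\t+\t1000\t5000\t0\t2\t500,500\t0,3500"
    # This interval lies between the two blocks, yet it still counts as an
    # overlap because the blocks are not examined individually.
    between_blocks = "chr1\t2000\t3000"
    for hit in intersect_passthrough([gene], [between_blocks]):
        print(hit)
```

The example shows the practical consequence: a feature falling entirely between two blocks is still reported, with all 12 columns intact.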
Support for alignments in BAM format was made possible by the elegant BAMTools API (http://bamtools.sourceforge.net) led by Derek Barnett in Gabor Marth's lab.
Regards,
Aaron