Hi all,
During my Master's thesis I developed a stand-alone tool named SSPACE for scaffolding pre-assembled contigs using paired-read data. I wrote it because I couldn't find a program able to do this, apart from Bambus. However, we had lots of issues with Bambus, including errors and complicated input datasets.
Therefore, SSPACE was developed. The main features are:
* Inputs are simple FASTA contig sequences as well as (multiple) FASTA/FASTQ paired-read data
* High-quality scaffolds with a short runtime and limited memory requirements
* Large reduction in the number of contigs by merging them into scaffolds, and a high N50 value
* Multiple library input of paired-end and/or mate-pair datasets
* Possible contig extension using unmapped sequence reads
* Easy interpretation of the final scaffolds
* Visualization of the final scaffolds using GraphViz
SSPACE has been tested on the E. coli, Grosmannia clavigera and giant panda genomes, where it produced fewer scaffolds and a higher N50 value than the scaffolds produced by common de novo assemblers such as ABySS and SOAPdenovo.
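For anyone unfamiliar with the N50 value mentioned above, here is a minimal sketch of how it is commonly computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. This is not code from SSPACE; the function name `n50` and the example lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (illustrative helper, not part of SSPACE)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical example: four contigs, then the same bases joined into two scaffolds.
contig_lengths = [100, 200, 300, 400]
scaffold_lengths = [300, 700]
print(n50(contig_lengths), n50(scaffold_lengths))
```

Joining contigs into fewer, longer scaffolds keeps the total length roughly the same while raising the N50, which is why the two numbers are usually reported together.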
SSPACE is freely available at
The publication has been accepted at Bioinformatics and will be online soon. It gives more detailed information about the produced scaffolds and their quality, including runtime and memory usage.
I hope it is useful; any comments or questions are of course welcome.
Cheers,
Boetsie