  • Bambus2 ... setup for large(ish) genomes

    All,

    We've been happily using SOAPdenovo to scaffold Ray contigs for a year or so, but it was recently suggested that Bambus may be a more flexible scaffolder. It is certainly turning out to be a more troublesome one.

    Parsing our scaffolding mate-pair data into a link file after alignment with Novoalign was relatively straightforward. We then used toAmos to merge our link file and contig file into an afg file. This also seemed to go OK, although I'm not sure how Bambus knows about read position and orientation from just a link file. However, this is how the example data is presented.
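
    To illustrate the kind of parsing involved, here is a minimal sketch, not our actual script. It assumes a hypothetical tab-separated alignment summary (read name, contig, position, strand) with read names ending in /1 or /2, and the function name contig_links is made up for the example. It only collects mate pairs whose two reads land on different contigs; it does not write the link format that toAmos expects.

```python
from collections import defaultdict


def contig_links(alignment_lines):
    """Collect mate pairs whose two reads align to different contigs.

    Each input line is assumed (hypothetically) to hold four tab-separated
    fields: read name, contig name, 1-based position, strand (+ or -).
    """
    mates = defaultdict(dict)
    for line in alignment_lines:
        line = line.strip()
        if not line:
            continue
        read, contig, pos, strand = line.split("\t")
        # "frag7/1" -> pair "frag7", end "1"
        pair, _, end = read.rpartition("/")
        mates[pair][end] = (contig, int(pos), strand)

    links = []
    for pair, ends in sorted(mates.items()):
        # Keep only complete pairs that span two different contigs.
        if "1" in ends and "2" in ends and ends["1"][0] != ends["2"][0]:
            links.append((pair, ends["1"], ends["2"]))
    return links


example = [
    "frag1/1\tcontig_A\t950\t+",
    "frag1/2\tcontig_B\t120\t-",
    "frag2/1\tcontig_A\t300\t+",   # both mates on the same contig: skipped
    "frag2/2\tcontig_A\t2800\t-",
    "frag3/1\tcontig_C\t40\t+",    # mate missing: skipped
]

for link in contig_links(example):
    print(link)
```

    Note that each retained pair here still carries position and strand for both reads, which is exactly the information I can't see how a bare link file conveys.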

    The final step before running the Bambus2 pipeline, according to the cbcb page, is to run Minimus on the afg file to create a bnk directory. This is a stumbling block for us: Minimus is an assembler designed for small genomes, and our set of contigs is not small. Consequently, Minimus ran for about a week trying to generate hash-overlaps without producing any output or status messages, and we eventually had to kill it.

    We tried using bank-transact to create the bnk directory from the afg file directly, but clearly the afg file didn't contain everything that was needed, since this then failed at the clk stage with errors like 'no contig account found'.

    I'm guessing that Minimus is needed to create a set of files in the bnk directory describing contig overlaps. However, (i) it is not a good tool for this and (ii) if these contigs were unambiguously overlapping, the earlier contigging stage would have merged them. So essentially what we want to say is simply 'no contig overlaps at this stage'. But I can't find a description of the bnk directory contents anywhere (perhaps it is on sourceforge, but that is presenting errors).

    Could anyone with experience of getting the pipeline working when starting from mate-pair (MP) and fasta data perhaps give a hint as to their pipeline? With the Python error in goBambus2 (described in another post) and problems getting the compile to work with Boost, this is turning out to be a bit more of a problem than we were expecting for a 'simple standalone scaffolder'.

    Many thanks
