building scaffolds using a contig and mate pair

sabiha replied

03-09-2014, 09:46 PM
MeganS,

Thank you for helping me out. The stdout from bank-transact reported the following:
Messages read: 4963548
Objects added: 4963548
Objects deleted: 0
Objects replaced: 0

I have Illumina paired-end reads in FASTQ format, with the quality scores included in the same file. I am planning to split the sequences and the qualities into two files and will try using those.
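
A minimal sketch of that split, assuming plain four-line FASTQ records (no wrapped sequence lines) and Phred+33 quality encoding. Older Illumina pipelines used an offset of 64, so the offset is a parameter. The function name `fastq_to_fasta_qual` and the output layout (a FASTA file plus a .qual file of space-separated numeric scores) are illustrative assumptions, not something the thread or AMOS prescribes:

```shell
fastq_to_fasta_qual() {
    # $1 = input FASTQ, $2 = output FASTA, $3 = output QUAL
    # $4 = Phred offset (assumed 33 unless given)
    awk -v fa="$2" -v qu="$3" -v off="${4:-33}" '
        BEGIN {
            # Lookup table from printable ASCII character to its code
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
        }
        NR % 4 == 1 {
            # Header line: keep the first token and drop the leading "@"
            split($0, h, " ")
            hdr = ">" substr(h[1], 2)
            print hdr > fa
            print hdr > qu
        }
        NR % 4 == 2 { print $0 > fa }
        NR % 4 == 0 {
            # Quality line: convert each character to a numeric score
            line = ""
            n = length($0)
            for (i = 1; i <= n; i++) {
                q = ord[substr($0, i, 1)] - off
                sep = (i > 1) ? " " : ""
                line = line sep q
            }
            print line > qu
        }
    ' "$1"
}
```

It would be called as `fastq_to_fasta_qual reads_1.fastq reads_1.fasta reads_1.qual 33`, where the file names are placeholders.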

How did you generate the XML link information?
MeganS replied

03-07-2014, 10:33 AM
sabiha,

Does the stdout from bank-transact report any "Objects added"?

Are your .xml files links of some sort (paired reads, etc.)? I have gotten Bambus to work, but I brought in the .xml link information at a much earlier step. Before running Bambus, I ran toAmos with FASTA reads, a TIGR .contig file, and the XML link information. Then I built an AMOS bank by running bank-transact on the resulting .afg file, and finally ran Bambus (via the goBambus script) on the bank.
sabiha replied

03-06-2014, 06:00 AM
I've already tried SSPACE. I just wanted to see how Bambus works, as it does hierarchical scaffolding.
relipmoc replied

03-06-2014, 05:18 AM
Originally posted by sabiha:

Hi,
I am trying to run Bambus on Velvet contigs.
I generated the .afg file while assembling with Velvet and then used the following command lines, as suggested on some forum ...

Why not try SSPACE? For more details, please see http://seqanswers.com/forums/showthread.php?t=8350

and

Boetzer, M., Henkel, C.V., Jansen, H.J., Butler, D. and Pirovano, W. (2011) Scaffolding pre-assembled contigs using SSPACE, Bioinformatics, 27, 578-579.
sabiha replied

03-06-2014, 04:20 AM
Hi,
I am trying to run Bambus on Velvet contigs.
I generated the .afg file while assembling with Velvet and then used the following command lines, as suggested on some forum:
/amos-3.0.1/src/Bank/bank-transact -cf -b j99.bnk -m velvet_asm.afg

/amos-3.0.1/src/Bambus/Bundler/clk -b j99.bnk/

/amos-3.0.1/src/Bambus/Bundler/Bundler -b j99.bnk

/amos-3.0.1/src/Bambus/Bundler/MarkRepeats -b j99.bnk -redundancy 2 -agressive >repeat_fi

/amos-3.0.1/src/Bambus/Bundler/OrientContigs -b j99.bnk -prefix j99scaff -redundancy 2 -repeats repeat_fi -all -agressive -linearize

perl /amos-3.0.1/src/Bambus/Untangler/untangle.pl -e j99scaff.evidence.xml -s j99scaff.out.xml -o j99scaff.untangle.xml

/amos-3.0.1/src/Bank/bank2fasta -d -b j99.bnk >bambus_contigs.fa

perl /amos-3.0.1/src/Bambus/Untangler/printScaff.pl -e j99scaff.evidence.xml -s j99scaff.untangle.xml -l j99scaff.library -f bambus_contigs.fa -merge -o bambus_scaff

and it generated the following stats:

no. valid links: 0
no. incorrect len. links: 0
no. incorrect ori. links: 0
no. unchecked links: 18129

I can see that no scaffolding is being done, since there are no valid links.
Can anyone tell me whether my approach is right, and whether a mates file is required?
rahularjun86 replied

02-23-2012, 08:48 AM
Hi all,
I have a question regarding the mate file. I ran Velvet (with no scaffolding option) and got 4-5 nice assemblies with different k-mers. Then I merged all 4 of these assemblies into a single one using minimus2 from AMOS. Now I have a .contig file and a .bnk/ directory. How can I generate the .mate file? Should I use the sed command discussed in some posts? The IDs in my .mate file and .contig file do not show any link: my .contig file has IDs like #NODE_1_length_1305_cov_18.627586(0) from Velvet, while the .mate file has Illumina IDs such as @HWUSI-EAS100R:6:73:941:1973#0/1 and @HWUSI-EAS100R:6:73:941:1973#0/2. How can I link this information? Can anybody please help?
Regards,
Rahul
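
One way to build a mates file from Illumina read names is sketched below. It assumes a four-line FASTQ of the /1 reads, that each mate carries the same name ending in /2, and the `library <name> <min> <max>` header plus three-column pair layout shown in catfisher's mates excerpt elsewhere in this thread. The function name `make_mates` and the insert-size bounds are placeholders. The pairs can only yield links if the same read names also appear in the .contig file; a .contig whose entries are Velvet NODE_ names has no reads for the mates to refer to.

```shell
make_mates() {
    # $1 = FASTQ with the /1 reads, $2 = library name
    # $3 = minimum insert size, $4 = maximum insert size
    awk -v lib="$2" -v min="$3" -v max="$4" '
        BEGIN { print "library", lib, min, max }
        NR % 4 == 1 {
            # Header line: keep the first token and drop the leading "@"
            split($0, h, " ")
            name = substr(h[1], 2)
            if (name ~ /\/1$/) {
                # Derive the mate name by swapping the /1 suffix for /2
                mate = name
                sub(/\/1$/, "/2", mate)
                print name, mate, lib
            }
        }
    ' "$1"
}
```

For example, `make_mates reads_1.fastq libname 200 500 > reads.mates`, with hypothetical file names and the library line taken from the excerpt in this thread.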
elisadouzi replied

09-08-2011, 08:39 AM
Hi shaohua,
I get the same error as you. Have you figured it out?

Thanks!
shaohua.fan replied

06-01-2011, 11:34 AM
Originally posted by catfisher:

Marten, thanks for your quick reply. I edited my configuration file as you suggested and ran goBambus again, but it still failed.
I used this .conf:
# Priorities
priority ALL 1
# The following lines can be un-commented to specify certain
# per-library settings

# Redundancies
# redundancy lib_some 1

# allowed error
# error MUMmer 0.5

# overlaps allowed
# overlaps MUMmer Y

# Global redundancy
redundancy 2

# min group size
mingroupsize 0

The log output from goBambus is:
Parsing links out of input file
Step 100: running detective
Combining XML files
Step 200: making the xmls
starting
Done
Step 300: Preparing contig links
starting
Done
Step 400: Running scaffolder
Grommit(/home/aubsxl/bin/bambus/bin/grommit -i ctg2660_BES_mapping_704.inp -o ctg2660_BES_mapping_704.out.xml -C ctg2660_BES_mapping_704.grommit.conf --append --logfile goBambus.log --debug 1) script failed

The error from the goBambus.error file is:
20100712|123807| 10451| Grommit(/home/aubsxl/bin/bambus/bin/grommit -i ctg2660_BES_mapping_704.inp -o ctg2660_BES_mapping_704.out.xml -C ctg2660_BES_mapping_704.grommit.conf --append --logfile goBambus.log --debug 1) script failed

The first several lines of my mates file are:
library libname 200 500
HWUSI-EAS1665_0002:2:1:1022:18088#0/1 HWUSI-EAS1665_0002:2:1:1022:18088#0/2 libname
HWUSI-EAS1665_0002:2:1:1029:11872#0/1 HWUSI-EAS1665_0002:2:1:1029:11872#0/2 libname
HWUSI-EAS1665_0002:2:1:1029:11034#0/1 HWUSI-EAS1665_0002:2:1:1029:11034#0/2 libname
HWUSI-EAS1665_0002:2:1:1030:19457#0/1 HWUSI-EAS1665_0002:2:1:1030:19457#0/2 libname
HWUSI-EAS1665_0002:2:1:1031:12133#0/1 HWUSI-EAS1665_0002:2:1:1031:12133#0/2 libname
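
Since the grommit failure gives no hint about its cause, a quick sanity check of the mates file may help rule out format problems. The sketch below assumes only the layout visible in the excerpt above (a `library <name> <min> <max>` line followed by three-column pair lines); it does not cover any other line types the Bambus format may allow, and the function name `check_mates` is hypothetical:

```shell
check_mates() {
    # $1 = mates file; prints one message per suspicious line
    # and returns non-zero if any were found
    awk '
        NF == 0 { next }
        $1 == "library" {
            if (NF < 4) {
                bad++
                print "line " NR ": library line needs a name, min and max"
            } else {
                libs[$2] = 1
            }
            next
        }
        NF != 3 { bad++; print "line " NR ": expected 3 fields, found " NF; next }
        !($3 in libs) { bad++; print "line " NR ": undeclared library " $3; next }
        $1 == $2 { bad++; print "line " NR ": both mates have the same name" }
        END { exit (bad > 0 ? 1 : 0) }
    ' "$1"
}
```

A clean file produces no output; anything reported is worth inspecting before rerunning goBambus.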

Marten, could you look at this information and point out what is wrong? I have no idea. Thanks a lot,

Kevin

Hi catfisher,

I ran into exactly the same problem. Have you found a solution?
seb567 replied

02-08-2011, 06:14 PM
Originally posted by boetsie:

Hmmm, that is the program I meant, since I'm the developer, haha. I referred to the wrong link in my previous reply.

Boetsie

That's funny !
boetsie replied

02-03-2011, 12:11 AM
Originally posted by seb567:

You can try SSPACE too. It is a scaffolder for next-gen data.

-seb

Hmmm, that is the program I meant, since I'm the developer, haha. I referred to the wrong link in my previous reply.

Boetsie
seb567 replied

02-02-2011, 02:06 PM
You can try SSPACE too. It is a scaffolder for next-gen data.

Bioinformatics paper:

http://bioinformatics.oxfordjournals.org/content/early/2010/12/12/bioinformatics.btq683.short

SEQanswers thread:

http://seqanswers.com/forums/showthread.php?t=8350

Download:

http://www.baseclear.com/sequencing/data-analysis/bioinformatics-tools/sspace/

-seb
boetsie replied

12-13-2010, 12:52 AM
This thread was solved by a program I developed, which can scaffold assembled contigs in .fasta format with paired-end and/or mate-pair sequences. No conversion of file formats is required. See this thread:

http://seqanswers.com/forums/showthread.php?t=4124
catfisher replied

07-15-2010, 01:56 PM
Originally posted by themerlin:

Catfisher,

I had the same error months ago. I ended up filtering my contigs so I only kept longer contigs (>500nts) with high coverage (depends on your dataset). I didn't change my mates file and then it suddenly worked. I'm not quite sure why, but it might be worth a shot.

Jason

I took the first 100k lines (with head) of the contig and mates files and reran the program on that subset, and it worked too.
Does anyone know how much data Bambus can handle? I am afraid it has a built-in limit on the size of the input. I have 704 contigs in the .contig file and 3434936 x2 paired ends, and the program didn't work when I loaded all of them. I also tested a run that included contigs shorter than 500 bp (some are about 200 bp), and it worked as well. How big was the input data when you used Bambus? Thanks,

Kevin
themerlin replied

07-13-2010, 07:31 AM
Catfisher,

I had the same error months ago. I ended up filtering my contigs so I only kept longer contigs (>500nts) with high coverage (depends on your dataset). I didn't change my mates file and then it suddenly worked. I'm not quite sure why, but it might be worth a shot.

Jason
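
A length filter along those lines could look like the sketch below. It assumes the contigs are available as FASTA (the thread does not say which file was filtered), takes the 500 nt cutoff as its default, and ignores coverage. The function name `filter_contigs_by_length` is hypothetical.

```shell
filter_contigs_by_length() {
    # $1 = contigs in FASTA format
    # $2 = length cutoff; only contigs longer than this are kept (default 500)
    awk -v min="${2:-500}" '
        function emit() {
            # Print the buffered record only if it is longer than the cutoff
            if (name != "" && length(seq) > min) { print name; print seq }
        }
        /^>/ { emit(); name = $0; seq = ""; next }
        { seq = seq $0 }   # join wrapped sequence lines
        END { emit() }
    ' "$1"
}
```

Each kept contig is written with its sequence on a single line, for example `filter_contigs_by_length contigs.fa 500 > contigs.min500.fa` (placeholder file names).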
boetsie replied

07-13-2010, 12:32 AM
Hi catfisher,

Hmm, a weird error, since it doesn't indicate where things go wrong. Is that the only error?

Something that might help:

Replace ":" and "#" in the read names with underscores ("_"). For example:

HWUSI-EAS1665_0002:2:1:1022:18088#0/1
becomes:
HWUSI-EAS1665_0002_2_1_1022_18088_0/1

Do this in both the .mates file and the .contig file.

Code to do this:

sed -e 's/#/_/g' -e 's/:/_/g' input.mates > output.mates

where input.mates is the input file, and output.mates the converted output file.
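
For the .contig file a blanket replacement of "#" is risky. Assuming the TIGR .contig layout, contig headers start with "##" and read headers start with "#", and those markers have to survive. The sketch below (function names are illustrative) replaces ":" everywhere but only replaces a "#" that follows some other character:

```shell
# Replace ":" and "#" with "_" throughout a .mates file
sanitize_mates() {
    sed -e 's/[:#]/_/g' "$1"
}

# Same idea for a TIGR .contig file, but leave the "#" and "##"
# record markers at the start of header lines untouched
sanitize_contig() {
    sed -e 's/:/_/g' -e 's/\([^#]\)#/\1_/g' "$1"
}
```

Usage would be `sanitize_mates input.mates > output.mates` and `sanitize_contig input.contig > output.contig`; it is worth comparing a few read names between the two outputs before running Bambus.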

I don't know if this really works...

Otherwise it might be a good idea to contact the Bambus developers, since I'm not too familiar with Bambus.

Good luck.

Cheers,
Marten