How do we scaffold contigs assembled from different phases of sequencing? Say I have:
1) BAC ends from Sanger sequencing, single end only (~1kb);
2) Contigs (200 bp to 500 kb) from 454 paired-end data; the raw data has been deleted;
3) Contigs (100 bp to 200 kb) from Illumina data, also paired-end, assembled with CLCbio;
4) Some sequences from GenBank or my local deposit, maybe 10 Mb.
Now I want to combine these 4 data sets to build scaffolds. Is there a package for this type of job? Most packages handle short reads; I am not sure whether there is one that deals with long sequences in FASTA format.
I have read some posts in this forum; some said it is impossible to do scaffolding without paired-end information. CAP3 seems to be the one for the job, but it was extremely slow on my first try. CLCbio can do the job, but not from the command line, which does not help in my case: I have ~7300 jobs (individual BAC clones).
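To make the batch side of the question concrete, here is a rough sketch of the per-clone pooling step I have in mind before handing each clone to whatever assembler or scaffolder gets recommended. The directory layout (one folder per BAC clone), the file names (`bac_ends.fasta`, `454.fasta`, `illumina.fasta`, `reference.fasta`) and the function names are assumptions made up for illustration, not part of any existing tool.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pool_clone(clone_dir, out_path,
               sources=("bac_ends", "454", "illumina", "reference")):
    """Merge <clone_dir>/<source>.fasta into one FASTA file.

    Each record ID is prefixed with its (hypothetical) source name so the
    origin of a sequence stays visible downstream. Returns the number of
    records written.
    """
    clone_dir = Path(clone_dir)
    written = 0
    with open(out_path, "w") as out:
        for source in sources:
            fasta = clone_dir / f"{source}.fasta"
            if not fasta.exists():
                continue  # a clone may lack one of the data sets
            for header, seq in read_fasta(fasta):
                out.write(f">{source}|{header}\n{seq}\n")
                written += 1
    return written
```

Looping `pool_clone` over the ~7300 clone folders would give one pooled FASTA per BAC, so any command-line tool could then be run once per file.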
Asking around for advice.
Thanks a lot!
YT