Some assemblers, such as Edena and Velvet, have few parameters; others have many. I'd like to use VCAKE or SSAKE to assemble some Solexa data. However, there are so many parameters that it is not possible to try them all. How can I find the best parameters?
How to confirm the best parameter for assembling?
-
Originally posted by anyone1985:
Some assemblers, such as Edena and Velvet, have few parameters; others have many. I'd like to use VCAKE or SSAKE to assemble some Solexa data. However, there are so many parameters that it is not possible to try them all. How can I find the best parameters?
Your first problem is to strictly define what you mean by "best". This is called your "objective function".
Once you have that, you can use various methods to navigate the parameter space to find those parameters which maximize (or minimize) your objective function.
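As a rough illustration of that idea, here is a minimal Python sketch of the simplest way to navigate a parameter space: an exhaustive grid search. Everything in it is an assumption made for illustration. `assemble` is a hypothetical wrapper you would write yourself around whichever assembler you use, returning the contig sequences for one parameter set, and the parameter name `k` is a placeholder, not a real SSAKE or VCAKE option. The objective shown is just "fewest contigs above a minimum length"; swap in whatever you decide "best" means.

```python
from itertools import product


def contig_count(contigs, min_len=100):
    """Example objective: number of contigs at least min_len long (lower is better)."""
    return sum(1 for c in contigs if len(c) >= min_len)


def grid_search(param_grid, assemble, objective):
    """Try every combination in param_grid and return (best_score, best_params).

    param_grid: dict mapping a parameter name to a list of candidate values.
    assemble:   hypothetical callable, params dict -> list of contig sequences.
    objective:  callable, list of contigs -> score to MINIMIZE.
    """
    names = sorted(param_grid)
    best = None
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = objective(assemble(params))
        # Keep the first parameter set that achieves the lowest score.
        if best is None or score < best[0]:
            best = (score, params)
    return best
```

A full grid quickly becomes too expensive as the number of parameters grows, which is the original complaint; in that case the same `objective` can be reused with a coarser grid, random sampling, or varying one parameter at a time.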
Torst.
-
I define the best parameters as those that give the fewest contigs with high accuracy, because I want to use Solexa data to finish a bacterial genome for which I have a draft reference genome.
Originally posted by Torst:
Your first problem is to strictly define what you mean by "best". This is called your "objective function".
Once you have that, you can use various methods to navigate the parameter space to find those parameters which maximize (or minimize) your objective function.
Torst.
-
While "the least number of contigs" is a good variable (measurable and meaningful), "high accuracy" is hard to measure. What does "high accuracy" mean in terms of an assembly?
Salzberg's group at UMD has developed some tools for dealing with this question. You can take a look at http://www.cbcb.umd.edu/research/ass...lidation.shtml.
I hope this is useful,
Luis M. Rodriguez-R
-
I am finishing a bacterial genome with Solexa data. It is said that Solexa is not suitable for de novo assembly. The contigs that I assembled with Velvet may contain errors that I will never know about, which could keep me from ever finishing the genome. This is what I am worried about.
Originally posted by lmrodriguezr:
While "the least number of contigs" is a good variable (measurable and meaningful), "high accuracy" is hard to measure. What does "high accuracy" mean in terms of an assembly?
Salzberg's group at UMD has developed some tools for dealing with this question. You can take a look at http://www.cbcb.umd.edu/research/ass...lidation.shtml.
I hope this is useful,
Luis M. Rodriguez-R
Last edited by anyone1985; 05-03-2009, 11:07 PM.
-
Any papers comparing assemblies?
I'd like to know which assemblies are best ... I don't care *how* best is defined, just so long as it *is* defined in the paper...
I think as a community we really need to think about infrastructure to allow rigorous comparison of different assembly methods over different datasets... I know this is hard, and efforts like the "Genome Assembly Validation" project are very welcome, but I think there is still a lot to do.
For example, people are rushing to perform hybrid assembly - where can we find guidelines about how best to do this? Is there any theoretical study of what mix of technologies gives the 'best' assembly?
Sorry for ranting, but the sooner we can get these questions sorted, the sooner we can turn to more interesting analysis.
-
I think it would be difficult to judge assembly methods based on papers right now -- the publication rate is slower than the rate of improvement in these methods, so you might be stuck trying out the latest of different methods. Perhaps the best way to "publish" assembly results is to simply post them in a blog format.
Within the realm of assembly development, the common objectives are:
First: Correctly order read overlaps so as to have the largest N50, without creating any mis-assemblies. With next-gen data, coverage is so high that one typically doesn't have to "bet" on any overlaps as being correct, whereas in Sanger sequencing there are regions with only two reads overlapping with possible ambiguity, so decisions are made to consider overlaps that are more likely to cause mis-assemblies.
Second: Get base calling correct. This is typically post-processing (or as in euler-sr, only planned for post-processing).
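For anyone who wants to use N50 as their objective, it is commonly defined as the length of the contig at which the contigs of that length or longer contain at least half of the total assembled bases. A minimal Python sketch under that definition (the function name `n50` is just for illustration, and some tools use slightly different conventions for ties or minimum contig length):

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if it is empty)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0
```

Note that N50 alone says nothing about mis-assemblies: wrongly joining two contigs raises N50, so it only makes sense together with some check on correctness.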
Regarding the hybrid assembly, there are some tools that I started writing for euler-sr that would indicate what experiments are necessary to finish the genome, but development in this has been handed off to a new crop of students, so it may take some time to figure out. I'm not sure what Birney and Zerbino plan for Velvet, but it would be pretty easy to do. The best way to judge what is going on in an assembly is to look at the repeat graph, but typically it's difficult to make any sense of it unless you have looked at a lot of repeat graphs before.
-mark
Originally posted by dan:
Any papers comparing assemblies?
I'd like to know which assemblies are best ... I don't care *how* best is defined, just so long as it *is* defined in the paper...
I think as a community we really need to think about infrastructure to allow rigorous comparison of different assembly methods over different datasets... I know this is hard, and efforts like the "Genome Assembly Validation" project are very welcome, but I think there is still a lot to do.
For example, people are rushing to perform hybrid assembly - where can we find guidelines about how best to do this? Is there any theoretical study of what mix of technologies gives the 'best' assembly?
Sorry for ranting, but the sooner we can get these questions sorted, the sooner we can turn to more interesting analysis.