Genome assembly.
-
Originally posted by kenietz: Genome assembly.
You can also check the slides posted here -
If you would like to split the reads into parts, the paper by Titus Brown in the first link should help you.
Please email me (samanta at homolog.us) if you need more explanation of the algorithms, because I do not check the forum frequently. The state of the art is now far ahead of running Velvet on a machine with 512 GB of RAM, etc.
-
Originally posted by ymc: If I classify the reads into different chromosomes using bwa, can I assemble the chromosomes de novo on a 64 GB machine?
i) For the kind of de novo assembly we are talking about, the chromosome sequences are not known. If they were known, why would you need de novo assembly in the first place?
ii) Where chromosome sequences exist and you are trying to do a reassembly, yes, it is possible to reduce the RAM requirement by partitioning the reads. Remember, however, that the RAM requirement for error-free reads is capped no matter how many reads you have, whereas in a world with sequencing errors the RAM requirement goes up linearly with the number of reads.
iii) If you are trying to do a reassembly of the human genome using BWA, you are most likely interested in the parts of the chromosomes with indels, etc. Unfortunately, BWA may not be able to capture the reads from those regions and assign them to the reference chromosome.
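Point ii) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it treats the number of distinct k-mers as a stand-in for the memory a de Bruijn graph style assembler needs, and the function names, the toy genome size, the read length, k, and the 1% substitution error rate are all illustrative choices, not values from this thread. With error-free reads the distinct k-mer count can never exceed the number of k-mers in the genome; with errors, each wrong base creates new k-mers, so the count keeps growing with coverage.

```python
import random


def sample_reads(genome, n_reads, read_len, error_rate, rng):
    """Sample reads uniformly from the genome, substituting bases at error_rate."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        read = list(genome[start:start + read_len])
        for i in range(read_len):
            if rng.random() < error_rate:
                # Replace the base with one of the three other bases.
                read[i] = rng.choice([b for b in "ACGT" if b != read[i]])
        reads.append("".join(read))
    return reads


def distinct_kmers(reads, k):
    """Count distinct k-mers across all reads (a rough proxy for graph memory)."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return len(kmers)


def kmer_growth(genome_len=2000, read_len=50, k=21, error_rate=0.01,
                coverages=(10, 20, 40), seed=0):
    """Return (coverage, clean_count, noisy_count) for each coverage level."""
    rng = random.Random(seed)
    genome = "".join(rng.choice("ACGT") for _ in range(genome_len))
    rows = []
    for cov in coverages:
        n_reads = cov * genome_len // read_len
        clean = distinct_kmers(
            sample_reads(genome, n_reads, read_len, 0.0, rng), k)
        noisy = distinct_kmers(
            sample_reads(genome, n_reads, read_len, error_rate, rng), k)
        rows.append((cov, clean, noisy))
    return rows


if __name__ == "__main__":
    for cov, clean, noisy in kmer_growth():
        print(f"{cov}x coverage: error-free k-mers={clean}, with errors={noisy}")
```

Running the script prints one line per coverage level. The error-free column stays at or below genome_len - k + 1, while the column with errors rises as coverage increases.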
-
Originally posted by kenietz: @SES:
Thank you for the information. The client wants to try 10x coverage at first and then proceed with higher coverage. Yes, I understand that SGA would probably be able to do the job. Now I am reading about Readjoiner. I'm still considering whether to take the job at all.
By the way, what kind of computing power would I really need to assemble a 3 Gb genome?
-
Originally posted by samanta: Interesting question.
i) For the kind of de novo assembly we are talking about, the chromosome sequences are not known. If they were known, why would you need de novo assembly in the first place?
-
Originally posted by ymc: I want to have better variant phasing than GATK's ReadBackedPhasing. Will that route do a better job?
Of late, people have been recognizing the need for algorithms that handle problems of the type you mention. Please take a look at the following two papers and check their programs, which are freely distributed on their websites.
Genome assembly methods produce haplotype phase ambiguous assemblies due to limitations in current sequencing technologies. Determining the haplotype phase of an individual is computationally challenging and experimentally expensive. However, haplotype phase information is crucial in many bioinforma …
The paper mentioned in the following link is not directly relevant to your problem, but it could help with de novo assembly of a highly polymorphic genome, where the assumption of no haplotype difference breaks down -
-
Originally posted by samanta: Please check the Minia program discussed here. You can assemble a 3 Gbase genome using about 6-8 GB of RAM.
You can also check the slides posted here -
If you would like to split the reads into parts, the paper by Titus Brown in the first link should help you.
Please email me (samanta at homolog.us) if you need more explanation of the algorithms, because I do not check the forum frequently. The state of the art is now far ahead of running Velvet on a machine with 512 GB of RAM, etc.
-
Originally posted by ymc: If I classify the reads into different chromosomes using bwa, can I assemble the chromosomes de novo on a 64 GB machine?
-
Readjoiner Features
It was interesting to read the article on Readjoiner and notice that it has several features that improve on SGA. Is Readjoiner MPI compatible? I read that it is multithreaded; how good is the scalability?
However, I notice that the tool does not perform well on erroneous reads, as you showed with your E. coli data. Would it be possible to integrate a data cleaner and filters into Readjoiner itself?
Also, on the Plantagora metrics it seems that Readjoiner performs worse than SGA! It produced more insertions, deletions, and misassembled contig bases than SGA or Edena!