I have never used MIRA, so I cannot comment on the specific cause, but I have occasionally seen this when assembling PacBio data with HGAP - Celera Assembler. It is generally due to Celera Assembler conservatively breaking contigs based on some heuristic. To force the overlap, I generally use a simpler overlapper, such as minimus2, then resequence and call a consensus with Quiver to check whether any misassemblies were introduced.
Mira: Contigs failing to collapse despite similarity
I have been using MIRA to assemble PacBio data for a very small circular genome, and I have been observing a strange result in the output. For several datasets, when the contigs are compared to the closest available reference, a large number of contigs in certain regions represent the same region of the genome.
Even though these contigs have a high degree of overlap, they are not joined into single contigs.
The problem is especially obvious in one dataset, where the whole genome can be represented as two contigs that overlap extensively at both ends but are not collapsed into a single contig (shown in the attached MUMmer mapview output).
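As a rough way to confirm the "two contigs overlapping at both ends" picture without a reference, one could check for suffix-prefix overlaps directly. The sketch below is only an illustration under stated assumptions: the function names, the `min_len` and `min_identity` thresholds, and the toy sequences are all hypothetical, and the comparison is ungapped, so it will miss overlaps that contain insertions or deletions. A real check would use an aligner such as MUMmer.

```python
def suffix_prefix_overlap(a, b, min_len=20, min_identity=0.9):
    """Return (length, identity) of the longest ungapped overlap in which
    the end of `a` matches the start of `b`, or (0, 0.0) if none qualifies.

    Assumption: mismatches only, no indels (a simplification).
    """
    max_len = min(len(a), len(b))
    # Try the longest candidate overlap first and shrink until one passes.
    for length in range(max_len, min_len - 1, -1):
        tail, head = a[-length:], b[:length]
        matches = sum(x == y for x, y in zip(tail, head))
        identity = matches / length
        if identity >= min_identity:
            return length, identity
    return 0, 0.0


def overlaps_at_both_ends(contig1, contig2, **kwargs):
    """True if contig1 runs into contig2 AND contig2 runs back into contig1,
    which is the pattern expected for a circular genome split in two."""
    forward, _ = suffix_prefix_overlap(contig1, contig2, **kwargs)
    wrap, _ = suffix_prefix_overlap(contig2, contig1, **kwargs)
    return forward > 0 and wrap > 0


# Toy circular "genome" (hypothetical sequence), cut into two contigs
# that share 10 bases at each junction.
genome = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATCGGATCCTAGGCTAACGTAGCTAGGATCCA"
contig1 = genome[0:40]
contig2 = genome[30:] + genome[0:10]

print(suffix_prefix_overlap(contig1, contig2, min_len=5, min_identity=1.0))
print(overlaps_at_both_ends(contig1, contig2, min_len=5, min_identity=1.0))
```

If both directions report an overlap, the two contigs are consistent with a single circular molecule, and the open question becomes why the assembler declined to join them.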
I've been running MIRA with just the most basic settings for whole genome, de novo, accurate.
The closest theory I can come up with for why this is happening is that errors are prevalent enough in the PacBio data that two distinct versions of the same sequence can each be assembled as a separate contig.
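To get a feel for how plausible this theory is, a small simulation can show how far apart two independently error-laden copies of the same sequence end up. This is a sketch only: the 10% error rate, the sequence length, and the helper names are assumptions chosen for illustration, not measurements from this dataset, and it models substitutions only, with no insertions or deletions.

```python
import random


def mutate(seq, error_rate, rng):
    """Return a copy of `seq` with independent substitution errors.

    Assumption: substitutions only; indels are not modelled.
    """
    out = []
    for base in seq:
        if rng.random() < error_rate:
            # Replace with one of the three other bases.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(42)  # fixed seed so the run is repeatable
truth = "".join(rng.choice("ACGT") for _ in range(20000))

# Two "versions" of the same region, each with its own errors.
copy1 = mutate(truth, 0.10, rng)
copy2 = mutate(truth, 0.10, rng)

print("copy1 vs truth:", identity(copy1, truth))
print("copy1 vs copy2:", identity(copy1, copy2))
```

Under this model, two copies that each carry an error rate e agree with each other at roughly (1 - e)^2 of positions, so they look more different from each other than either does from the true sequence. Whether that gap is large enough to stop an assembler from merging them depends on its overlap thresholds.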
I would love to hear any suggestions on how to properly collapse these contigs, as I am worried that I am losing valuable read and quality information by having identical regions represented by different contigs.