Hi guys. Could someone show me on some example how we identify that certain node(s) in the graph represent one single contig?
Acquiring contigs, de Bruijn graphs, velvet
-
I'm not completely clear on what you are asking but I think you want to know how to decide when to merge distinct nodes in the de Bruijn graph into a single node. This is done when two nodes are unambiguously connected. In other words, if two nodes x and y are connected by an edge and neither x nor y branches, then they can be merged. I've attached a simple diagram of this. In the diagram the connected blue nodes can be merged together, so can the middle red nodes. In this case 19 nodes would be merged into 5 "contigs".
Let me know if that isn't clear or if you are asking something else.
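The merging rule above can be sketched in a few lines of Python. This is only an illustration, not how Velvet or any other assembler implements it. The function name `compact`, the toy edge list, and the exact reading of "unambiguously connected" (an edge u -> v is merged when it is the only edge leaving u and the only edge entering v) are assumptions made for the example; the graph is not the one in the attached diagram.

```python
from collections import defaultdict

def compact(edges):
    """Group the nodes of a directed graph into maximal unbranched chains."""
    succ, pred = defaultdict(set), defaultdict(set)
    nodes = set()
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
        nodes.update((u, v))

    def mergeable(u, v):
        # Assumed rule: u -> v is unambiguous when it is u's only way out
        # and v's only way in (self-loops are never merged).
        return u != v and len(succ[u]) == 1 and len(pred[v]) == 1

    def walk(start, seen):
        # Extend a chain forward for as long as the next edge is mergeable.
        chain, cur = [start], start
        seen.add(start)
        while len(succ[cur]) == 1:
            nxt = next(iter(succ[cur]))
            if nxt in seen or not mergeable(cur, nxt):
                break
            chain.append(nxt)
            seen.add(nxt)
            cur = nxt
        return chain

    seen, chains = set(), []
    # First pass: a chain starts at every node that cannot be merged
    # into its predecessor.
    for n in sorted(nodes):
        has_mergeable_pred = len(pred[n]) == 1 and mergeable(next(iter(pred[n])), n)
        if not has_mergeable_pred:
            chains.append(walk(n, seen))
    # Second pass: anything left sits on an isolated cycle; break it at
    # an arbitrary node.
    for n in sorted(nodes):
        if n not in seen:
            chains.append(walk(n, seen))
    return chains

# Hypothetical toy graph: a linear stretch, a bubble (d / e), another stretch.
toy = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"),
       ("d", "f"), ("e", "f"), ("f", "g")]
print(compact(toy))  # [['a', 'b', 'c'], ['d'], ['e'], ['f', 'g']]
```

Here seven nodes collapse into four chains; each chain would be reported as one "contig".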
-
Originally posted by bioinf:
Ok. If I understand correctly, the image below represents just one single contig, because this graph fragment (although it is a full graph itself) can be unambiguously assembled into just one node, namely the one with the following sequence: TAGTCGAGGCTTTAGATCCGATGAGGCTTTAGAGACAG. Or am I getting it wrong, and there are actually 4 contigs?
x = TAGTCGAG
y = GAGGCTTTAGA
z = AGAGACAG
w = AGATCCGAGATGAG
Note that node y branches in both directions (to x/w on its left and to w/z on its right). This branch means that the graph cannot be unambiguously simplified further.
In particular, the path of the assembly that you suggest is:
x -> y -> w -> y -> z
This is ambiguous as the following is also a valid assembly:
x -> y -> w -> y -> w -> y -> z
The second assembly is the same as the first except it travels through the y/w loop twice. In general there is no way to know how many times to travel through the loop, so most assemblers will output 4 contigs here.
Last edited by jts; 01-10-2011, 05:05 AM.
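To make the ambiguity concrete, here is a small Python sketch that spells out the sequence for a walk through this graph. The node labels are taken exactly as listed above; the edge set is read off the description of y's branches, and the 3-base overlap between adjacent nodes (GAG or AGA) is inferred from the labels. The helper names `spell` and `loop_path` are made up for the example.

```python
NODES = {
    "x": "TAGTCGAG",
    "y": "GAGGCTTTAGA",
    "z": "AGAGACAG",
    "w": "AGATCCGAGATGAG",
}
EDGES = {("x", "y"), ("y", "w"), ("w", "y"), ("y", "z")}
OVERLAP = 3  # assumed: adjacent node labels share 3 bases

def spell(path):
    """Return the sequence spelled by a walk, or raise if the walk is invalid."""
    seq = NODES[path[0]]
    for prev, cur in zip(path, path[1:]):
        if (prev, cur) not in EDGES:
            raise ValueError(f"no edge {prev} -> {cur}")
        if NODES[prev][-OVERLAP:] != NODES[cur][:OVERLAP]:
            raise ValueError(f"labels of {prev} and {cur} do not overlap")
        seq += NODES[cur][OVERLAP:]  # append only the non-overlapping part
    return seq

def loop_path(n_loops):
    """Walk x -> y -> (w -> y) * n_loops -> z."""
    return ["x", "y"] + ["w", "y"] * n_loops + ["z"]

once, twice = spell(loop_path(1)), spell(loop_path(2))
print(len(once), len(twice))  # 40 59
```

Both walks use only edges that exist in the graph, so the graph alone cannot tell them apart. Each extra trip through the loop adds 19 bases with these labels.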
-
I see. I guess even mate pairs can't help in such a graph. The only solution in this case is to know the length of the original DNA strand. Then we can deduce the number of times the repeat occurred.
What is generally done in such cases? What is the common approach?
-
It varies from assembler to assembler. Some will select the most likely number of copies of the repeat based on read pairs spanning the loop and the insert size distribution. Others will just build a scaffold of x and z and leave the sequence in between as a run of "N"s.
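As a rough illustration of the first strategy, the sketch below picks the number of loop traversals whose implied distance between x and z is closest to the distance suggested by read pairs. It is a toy model under stated assumptions, not the method of any particular assembler: the node lengths and the 3-base overlap come from the x/y/w/z example earlier in the thread, the observed gap values are made up, and the function names are hypothetical. A real assembler would work with the full insert size distribution rather than a simple mean.

```python
from statistics import mean

Y_LEN, W_LEN, OVERLAP = 11, 14, 3  # from the x/y/w/z example; overlap assumed

def expected_gap(n_loops):
    """Bases strictly between x and z on the walk x -> y -> (w -> y)*n -> z."""
    per_loop = (W_LEN - OVERLAP) + (Y_LEN - OVERLAP)  # 19 bases per extra loop
    return (Y_LEN - 2 * OVERLAP) + n_loops * per_loop

def most_likely_loops(observed_gaps, max_loops=10):
    """Pick the loop count whose expected gap is closest to the mean estimate."""
    target = mean(observed_gaps)
    return min(range(max_loops + 1), key=lambda n: abs(expected_gap(n) - target))

# Made-up x-to-z gap estimates, as if derived from pairs spanning the loop.
observed = [41, 45, 44, 40, 46]
print(most_likely_loops(observed))  # 2
```

If no loop count fits the estimates well, falling back to a scaffold of x and z with a run of "N"s between them is the safer output.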
-
Simple model of De Bruijn graph
Dear jts or bioinf,
I have trouble understanding de Bruijn graphs because the concept of a graph is vague to me. What does it mean when we say that k-mers act as edges and (k-1)-mers act as nodes? Why are (k-1)-mers used instead of only k-mers? Do you have a simpler model of assembly using a de Bruijn graph? I would appreciate any help.
Thanks for your time,
Scientist1