  • Acquiring contigs, de Bruijn graphs, velvet

    Hi guys. Could someone show me, with an example, how we identify that certain node(s) in the graph represent a single contig?

  • #2
    I'm not completely clear on what you are asking, but I think you want to know how to decide when to merge distinct nodes in the de Bruijn graph into a single node. This is done when two nodes are unambiguously connected. In other words, if two nodes x and y are connected by an edge and neither x nor y branches, then they can be merged. I've attached a simple diagram of this. In the diagram, the connected blue nodes can be merged together, and so can the middle red nodes. In this case, 19 nodes would be merged into 5 "contigs".

    Let me know if that isn't clear or if you are asking something else.
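The merging rule described above can be sketched in a few lines of Python. This is only an illustration, not Velvet's actual code: the function name `merge_unbranched` and the toy edge list are made up here, the graph is treated as a plain directed graph (no reverse complements), and "unambiguously connected" is read as "the source node has exactly one outgoing edge and the target node has exactly one incoming edge".

```python
from collections import defaultdict

def merge_unbranched(edges):
    """Group nodes into chains joined only by unambiguous edges.

    `edges` is a list of (source, target) pairs. Returns a list of chains,
    each chain being a list of node names that could be merged into one node.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    # Keep nodes in order of first appearance so the output is deterministic.
    nodes = list(dict.fromkeys(n for edge in edges for n in edge))

    def mergeable(u, v):
        # u -> v is unambiguous: u has one way out, v has one way in.
        return u != v and len(succ[u]) == 1 and len(pred[v]) == 1

    chains, seen = [], set()
    for n in nodes:
        # Skip nodes that would be absorbed into their predecessor's chain.
        if len(pred[n]) == 1 and mergeable(pred[n][0], n):
            continue
        chain = [n]
        seen.add(n)
        while len(succ[chain[-1]]) == 1 and mergeable(chain[-1], succ[chain[-1]][0]):
            nxt = succ[chain[-1]][0]
            if nxt in seen:
                break
            chain.append(nxt)
            seen.add(nxt)
        chains.append(chain)

    # Whatever is left belongs to isolated cycles with no branch at all.
    for n in nodes:
        if n not in seen:
            chain = [n]
            seen.add(n)
            nxt = succ[n][0]
            while nxt not in seen:
                chain.append(nxt)
                seen.add(nxt)
                nxt = succ[nxt][0]
            chains.append(chain)
    return chains

# Hypothetical toy graph: a linear stretch, a bubble (d / e), then a tail.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"),
             ("d", "f"), ("e", "f"), ("f", "g")]
toy_chains = merge_unbranched(toy_edges)
print(toy_chains)
```

In this toy graph the 7 nodes collapse into 4 chains: a-b-c, d, e and f-g. The branch at c and the join at f are what stop the merging.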


    • #3
      "Let me know if that isn't clear or if you are asking something else."
      Thank you very much. That's exactly what I'm interested in: identifying and acquiring contigs, then using scaffolding to merge them and recover the original DNA.
      "In this case 19 nodes would be merged into 5 'contigs'."
      OK. If I understand correctly, the image below represents just one contig, because this graph fragment (although it is a full graph itself) can be unambiguously assembled into a single node with the following sequence: TAGTCGAGGCTTTAGATCCGATGAGGCTTTAGAGACAG. Or am I getting it wrong, and there are actually 4 contigs?


      • #4
        Originally posted by bioinf:
        "OK. If I understand correctly, the image below represents just one contig, because this graph fragment (although it is a full graph itself) can be unambiguously assembled into a single node with the following sequence: TAGTCGAGGCTTTAGATCCGATGAGGCTTTAGAGACAG. Or am I getting it wrong, and there are actually 4 contigs?"
        Not quite, since the graph contains a loop. Label the nodes as follows:

        x = TAGTCGAG
        y = GAGGCTTTAGA
        z = AGAGACAG
        w = AGATCCGAGATGAG

        Note that node y branches in both directions (to x/w on its left and to w/z on its right). Because of this branching, the graph cannot be unambiguously simplified further.

        In particular, the path of the assembly that you suggest is:

        x -> y -> w -> y -> z

        This is ambiguous as the following is also a valid assembly:

        x -> y -> w -> y -> w -> y -> z

        The second assembly is the same as the first, except that it travels through the y/w loop twice. In general, there is no way to tell from the graph alone how many times to travel through the loop, so most assemblers will output 4 contigs here.
        Last edited by jts; 01-10-2011, 05:05 AM.
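To make the ambiguity concrete, here is a small Python sketch that spells out both paths using the four labels above. It assumes that neighbouring labels overlap by 3 bases (x ends in GAG and y starts with GAG, and so on); that overlap is inferred from the labels, it is not stated in the post. The helper name `spell` is made up for the illustration. Note also that the labels as typed give a 40-base sequence for the single pass, two bases longer than the 38-base sequence quoted in the question; the sketch takes the labels as given.

```python
OVERLAP = 3  # assumption: adjacent node labels share 3 bases

NODES = {
    "x": "TAGTCGAG",
    "y": "GAGGCTTTAGA",
    "z": "AGAGACAG",
    "w": "AGATCCGAGATGAG",
}
EDGES = {("x", "y"), ("y", "w"), ("w", "y"), ("y", "z")}

def spell(path):
    """Spell the sequence of a path, writing each shared overlap only once."""
    seq = NODES[path[0]]
    for prev, cur in zip(path, path[1:]):
        assert (prev, cur) in EDGES, "not an edge of the graph"
        assert NODES[prev][-OVERLAP:] == NODES[cur][:OVERLAP], "labels do not overlap"
        seq += NODES[cur][OVERLAP:]
    return seq

once = spell(["x", "y", "w", "y", "z"])
twice = spell(["x", "y", "w", "y", "w", "y", "z"])
print(len(once), once)
print(len(twice), twice)

# Which edges could be merged? Only those whose source has a single way out
# and whose target has a single way in.
def out_degree(n):
    return sum(1 for u, _ in EDGES if u == n)

def in_degree(n):
    return sum(1 for _, v in EDGES if v == n)

mergeable = [(u, v) for u, v in sorted(EDGES)
             if out_degree(u) == 1 and in_degree(v) == 1]
print(mergeable)  # every edge touches y, which branches both ways
```

Both paths are valid walks through the same graph, yet they spell sequences of different length. Since no edge is mergeable, the four nodes stay as four separate contigs.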



        • #5
          I see. I guess even the mate-pairs can't help in such a graph. The only solution in this case is to know the length of the original DNA strand. Then we can deduce the number of times the repeat occurred.

          What is generally done in such cases? What is the common approach?
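The length argument can be written down as a bit of arithmetic. A rough Python sketch, assuming the four node labels from post #4, a 3-base overlap between neighbouring labels (inferred from the labels, not stated in the thread), and that the original strand is exactly one walk from x to z; the function name `loop_passes` is hypothetical.

```python
OVERLAP = 3  # assumption: neighbouring labels share 3 bases
X, Y, Z, W = "TAGTCGAG", "GAGGCTTTAGA", "AGAGACAG", "AGATCCGAGATGAG"

# Length of x -> y -> z, i.e. zero trips through the y/w loop.
ZERO_PASS = len(X) + (len(Y) - OVERLAP) + (len(Z) - OVERLAP)
# Each trip through the loop adds one w and one more y.
PER_LOOP = (len(W) - OVERLAP) + (len(Y) - OVERLAP)

def loop_passes(total_length):
    """Return how many trips through the loop give `total_length`, or None."""
    passes, remainder = divmod(total_length - ZERO_PASS, PER_LOOP)
    if remainder != 0 or passes < 0:
        return None  # this length cannot come from this graph
    return passes

print(ZERO_PASS, PER_LOOP)
print(loop_passes(40), loop_passes(59), loop_passes(50))
```

A length that is not the base length plus a whole number of loop units returns `None`, which in practice would point to a wrong length estimate or a wrong graph.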



          • #6
            It varies from assembler to assembler. Some will select the most likely number of copies of the repeat based on read pairs spanning the loop and the insert size distribution. Others will just build a scaffold of x and z and leave the sequence in between as a run of "N"s.
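Both strategies can be sketched roughly in Python. This is not how any particular assembler does it: the function names, the normal model for insert sizes, and all the numbers in the example are invented for illustration. The idea is that each candidate copy number implies a gap length, each spanning read pair then implies an insert size, and the copy number whose implied insert sizes best fit the library's distribution wins.

```python
def most_likely_copies(flanks, mean_insert, sd_insert, unit_len, base_gap,
                       max_copies=10):
    """Pick the repeat copy number that best explains the spanning pairs.

    flanks: for each spanning read pair, the number of bases it covers
            outside the gap (left-contig part plus right-contig part).
    With n copies the gap is base_gap + n * unit_len, so a pair's implied
    insert size is flank + gap. Candidates are scored with a normal
    log-likelihood (constant terms dropped).
    """
    def log_likelihood(n):
        gap = base_gap + n * unit_len
        return sum(-((f + gap - mean_insert) ** 2) / (2 * sd_insert ** 2)
                   for f in flanks)
    return max(range(max_copies + 1), key=log_likelihood)

def n_scaffold(left, right, gap_estimate):
    """Join two contigs with a run of N's standing in for unknown sequence."""
    return left + "N" * max(gap_estimate, 1) + right

# Invented toy numbers: a 300-base repeat unit, 50 non-repeat bases in the
# gap, and a library with inserts of about 3000 +/- 300 bases.
best = most_likely_copies(flanks=[2000, 2100, 1900], mean_insert=3000,
                          sd_insert=300, unit_len=300, base_gap=50)
print(best)
print(n_scaffold("TAGTCGAG", "AGAGACAG", 5))
```

With only a few spanning pairs or a wide insert distribution, neighbouring copy numbers score almost the same, which is one reason an assembler might prefer the run of N's.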



            • #7
              Now everything is clear. Thank you.



              • #8
                Simple model of De Bruijn graph

                Dear jts or bioinf,

                I have trouble understanding the de Bruijn graph because the concept of a graph is vague to me. What does it mean when we say k-mer acts as edge and k-1 mers act as edges? Why are (k-1)-mers used instead of only k-mers? Do you have a simpler model of assembly using a de Bruijn graph? I would appreciate it if you could help.

                Thanks for your time,
                Scientist1



                • #9
                  Sorry, my first question is:
                  What does it mean when we say k-mer acts as edge and k-1 mers act as nodes?
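One way to picture the wording in the question: every k-mer found in the reads is drawn as an arrow (edge) from its first k-1 letters to its last k-1 letters, so the (k-1)-mers are the nodes. A minimal Python sketch under that convention, with a made-up function name `de_bruijn` and a made-up read; it ignores reverse complements and sequencing errors, which real assemblers have to handle.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Build a de Bruijn graph as {(k-1)-mer: [successor (k-1)-mers]}.

    Each k-mer contributes one edge: prefix (first k-1 bases) -> suffix
    (last k-1 bases). A k-mer seen several times adds several parallel edges.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)

# Toy read, k = 3: the 3-mers ATG, TGG, GGC, GCG become four edges
# between the 2-mers AT, TG, GG, GC, CG.
toy_graph = de_bruijn(["ATGGCG"], 3)
for node, successors in toy_graph.items():
    print(node, "->", successors)
```

Walking the arrows AT -> TG -> GG -> GC -> CG and adding one new letter per step spells ATGGCG again, which is the basic idea behind reading contigs off the graph.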
