Question on the de Bruijn Graphs

  • Question on the de Bruijn Graphs

    I ran Velvet, SOAP, and ABySS a few times and found something intriguing (or it may just be my ignorance) in all of the assemblies. Here's my question:

    1) Why are two contigs that overlap by approximately 68 base pairs not merged together? I tried to come up with hypotheses based on the structure of the de Bruijn graph, but I still do not understand what is happening here. I ran all assemblers with k = 51, and the data set has millions of reads, all with the same length of 54 bp.

    Both contigs share enough overlapping k-mers to generate one big merged contig, so why is this not happening?

    Thanks in advance,
    André.

  • #2
    Originally posted by aloliveira
    I ran Velvet, SOAP, and ABySS a few times and found something intriguing (or it may just be my ignorance) in all of the assemblies. Here's my question:

    1) Why are two contigs that overlap by approximately 68 base pairs not merged together? I tried to come up with hypotheses based on the structure of the de Bruijn graph, but I still do not understand what is happening here. I ran all assemblers with k = 51, and the data set has millions of reads, all with the same length of 54 bp.

    Both contigs share enough overlapping k-mers to generate one big merged contig, so why is this not happening?

    Thanks in advance,
    André.

    Hello,


    The 68-base sequence is likely a repeat that is present at least twice in the real genome.

    Therefore, joining the two contigs you described would likely produce a chimeric contig, which is truly undesirable at any stage of a genome project.

    It is usual for contigs to start and/or end with repeated sequences. Otherwise, they would have been longer in the first place, unless there was a true "data gap" (no sequences at all for a genomic region).
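
To make the point about repeats concrete, here is a minimal Python sketch. It is not how Velvet, SOAP, ABySS, or Ray are implemented: the toy genome, the read length of 8, k = 5, and the helper names `build_graph` and `unitigs` are all assumptions made for illustration, and reverse complements and sequencing errors are ignored. It builds a de Bruijn graph from error-free reads of a genome that contains the same 6-base repeat (GGATCC) twice, then reports the maximal non-branching paths as contigs.

```python
from collections import defaultdict

def build_graph(reads, k):
    """Nodes are (k-1)-mers; each k-mer in a read adds one edge."""
    succ, pred = defaultdict(set), defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            succ[kmer[:-1]].add(kmer[1:])
            pred[kmer[1:]].add(kmer[:-1])
    return succ, pred

def unitigs(succ, pred):
    """Spell every maximal non-branching path (isolated cycles are ignored)."""
    nodes = set(succ) | set(pred)

    def simple(n):
        # Exactly one way in and one way out: safe to walk through.
        return len(pred.get(n, ())) == 1 and len(succ.get(n, ())) == 1

    contigs = []
    for start in nodes:
        if simple(start):
            continue  # paths only begin at branching or terminal nodes
        for node in succ.get(start, ()):
            path = start
            while True:
                path += node[-1]
                if not simple(node):
                    break  # reached a branch or a dead end: the contig stops
                node = next(iter(succ[node]))
            contigs.append(path)
    return sorted(contigs)

# Hypothetical toy genome: unique + GGATCC + unique + GGATCC + unique
genome = "ACTGAGGATCCTTAGCGGATCCCAAGT"
reads = [genome[i:i + 8] for i in range(len(genome) - 7)]  # perfect coverage
succ, pred = build_graph(reads, k=5)
contigs = unitigs(succ, pred)
print(contigs)
# ['ACTGAGGAT', 'ATCCCAAGT', 'ATCCTTAGCGGAT', 'GGATCC']
```

Two contigs end in GGAT and two begin with ATCC, so neighbouring contigs visibly share repeat sequence, yet the graph alone cannot say which exit of the repeat follows which entrance. Joining them on the strength of the shared bases would be a guess, and a wrong guess gives a chimeric contig.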


    In my experience, ABySS is the best at avoiding misassemblies caused by repeats. So if ABySS leaves the two contigs separate, you should not join them either.


    Ray is also quite good (I work on Ray).



    Sébastien Boisvert
