
  • genome assembly with only mate pair reads

    Hi,

    I am mostly comfortable with DNA resequencing, mRNA-seq, ChIP-seq, and similar data, but I always find de novo assembly work difficult. It has come my way anyway.

    I have a mate pair sequencing data set for a ~1 Gb genome. Coverage is close to 30x after linker removal, and the insert size is about 8 kb. I don't think it is a good idea to use mate pairs only (I would rather have libraries of various insert sizes). Without evidence, my feeling is that a single mate pair library is worse than a paired-end library at the same depth. Let me know if I am wrong.
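
    For reference, the coverage arithmetic behind those numbers can be sketched as below. The ~1 Gb genome size, ~30x depth, and ~8 kb insert size come from the post; the 100 bp read length is an assumption (the post does not state it), and the function names are only illustrative. The sketch separates sequence coverage (bases read) from physical coverage (genome spanned by the inserts).

```python
# Minimal sketch: sequence coverage vs. physical (span) coverage for a
# mate pair library. READ_LEN is an assumed value, not taken from the post.

def pairs_for_coverage(seq_cov: float, genome_size: float, read_len: float) -> float:
    """Number of read pairs needed to reach a given sequence coverage."""
    return seq_cov * genome_size / (2 * read_len)


def physical_coverage(n_pairs: float, insert_size: float, genome_size: float) -> float:
    """Average number of inserts spanning each genomic position."""
    return n_pairs * insert_size / genome_size


GENOME_SIZE = 1e9   # ~1 Gb genome (from the post)
SEQ_COV = 30        # ~30x after linker removal (from the post)
INSERT_SIZE = 8000  # ~8 kb insert size (from the post)
READ_LEN = 100      # ASSUMPTION: read length is not given in the post

n_pairs = pairs_for_coverage(SEQ_COV, GENOME_SIZE, READ_LEN)
span_cov = physical_coverage(n_pairs, INSERT_SIZE, GENOME_SIZE)

print(f"read pairs: {n_pairs:.3g}")
print(f"physical coverage: {span_cov:.0f}x")
```

    Under the assumed 100 bp read length, 30x sequence coverage corresponds to roughly 1200x physical coverage. Physical coverage is what gives mate pairs their linking power for scaffolding, while the contigs themselves still have to be built from only 30x of sequence.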

    Now I am asked to get the best out of this data. Without diving in too deep (spending too much time), what are the best (practical) case and worst case scenarios I should prepare the collaborator for?

    I have access to a 512 GB, 32-core machine, and have Velvet, SOAPdenovo, and SPAdes available, plus a CLC bio license that can be moved to that computer. What are the recommended methods, programs, and parameters?

    Very much appreciate your thoughts and suggestions!

    By the way, I did recommend that they sequence (at least) another 50x in 2x100~150 bp reads, but I don't think it is going to fly.

    Thanks!!!

  • #2
    Hello, I'm a newcomer.



    • #3
      You need to consider several things.
      Is it a plant or animal genome? Do you have a reference?
      How complex is the genome, i.e. ploidy etc.?
      I don't think mate pairs alone can do much, and you have just one mate pair library.
      A starting point would be to sequence several paired-end libraries with varying insert sizes (e.g. 180 bp, 300 bp, 600 bp) for the contig-level assembly, and later couple them with several mate pair libraries (e.g. 2 kb, 5 kb, 8 kb) for scaffolding. Longer reads, e.g. PacBio, may also help you resolve large repetitive regions.
      You need to carefully plan each stage of your project: sequencing, quality control and error correction of reads, preliminary contig assembly, scaffolding, and gap closing. And of course there is no single best assembler/pipeline for all assembly problems. You need to evaluate multiple assemblers to find the one that gives you the best assembly.
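
      As a rough illustration of comparing assemblers, one widely used contiguity metric is N50. This metric is not named in the post, so treat it as an assumed choice. The sketch below computes it from a list of contig or scaffold lengths; the function name and the example lengths are made up for illustration. N50 only measures contiguity, not correctness, so it should be read alongside other checks.

```python
# Minimal sketch: N50 from a list of contig/scaffold lengths.
# N50 is the length L such that sequences of length >= L together
# cover at least half of the total assembly size.
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of the given sequence lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first length where the cumulative sum
        # reaches half of the total assembly size.
        if running * 2 >= total:
            return length
    return 0


# Hypothetical contig lengths from two assemblers (illustrative only).
assembly_a = [2, 2, 2, 3, 3, 4, 8, 8]
assembly_b = [1, 1, 1, 1, 1, 1, 1, 25]

print("N50 A:", n50(assembly_a))
print("N50 B:", n50(assembly_b))
```

      In practice the lengths would come from each assembler's output FASTA, and the same calculation would be run on contigs and scaffolds separately.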



      • #4
        Thanks for the reply.

        This is exactly what I thought and recommended to the researcher. Unfortunately, I have no control over how the sequencing was designed, but I can refuse to perform the analysis without adequate data :-)



        • #5
          It sounds like a waste of your time. You'll end up with a bad assembly that they probably won't like.
