
  • contigs to fully assembled genome

    Hi all,
    I've done a bit of E. coli resequencing, but I am new to the art of de novo assembly and could use some input on which strategies would give me a fully contiguous genome (and at what cost).

    I'm trying to assemble a 5.2 Mb bacterial genome (size estimated from pulsed-field gel electrophoresis) with 220 bp SE MiSeq reads, quality-filtered (Q>=25), at ~150x coverage. VelvetOptimiser tells me:

    k=147
    numContigs = 161
    n50 = 81.6kb
    longest contig = 279kb
    num contigs > 1kb = 107
    total bases in contigs = 5.1Mb (~98.9% covered)
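For readers comparing the figures above, here is a minimal sketch of how an N50 value is typically computed from a list of contig lengths. The function name `n50` and the example lengths are hypothetical and are not taken from this assembly; tools may differ in details such as minimum-length cutoffs applied before the calculation.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
example_lengths = [279_000, 150_000, 81_600, 40_000, 12_000, 900]
print(n50(example_lengths))
```

Because the statistic depends only on the sorted lengths, a handful of long contigs can dominate it, which is why N50 is usually reported alongside the contig count and total assembled bases.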

    I don't necessarily need a fully contiguous genome, but what would it take? From what I've read, a mate-pair run could help, but it seems that I am at the point of diminishing returns.

    Thanks for any suggestions!

  • #2
    I'd be interested to know why you chose 220 bp SE over 2x150 PE.

    Mate-pair libraries will likely improve the contiguity of your assembly: you'll be able to use a scaffolder (such as SSPACE) to join contigs into scaffolds. In many cases there will still be gaps between contigs, and some of those gaps will be spurious, where the two contigs actually overlap but by so little that the scaffolder can't recognize it.

    Another option would be Pacific Biosciences long reads. You'll need PacBioToCA or a similar tool to combine them with your Illumina data, but they can also help. A number of core facilities and service providers offer this.



    • #3
      User error is the reason 2x150 PE reads weren't used.



      • #4
        What is the purpose of your sequencing project? Are you sure that mapping or a reference-guided assembly wouldn't suffice?



        • #5
          Get 4 SMRT cells of PacBio Continuous Long Reads. That will give you about 70-80x coverage and should bring you down to around 50 contigs, with a max contig length close to 1 Mb and an N50 around 200 kb.

          Should cost you less than 3000, including library prep, etc... I think.
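As a rough check on the coverage figures in this thread, the arithmetic can be sketched as below. It assumes the 5.2 Mb genome size from the first post and treats coverage as simply total bases divided by genome size; the helper names are hypothetical, and the per-cell number is only what the 70-80x claim implies, not a measured yield.

```python
def bases_needed(genome_size_bp, coverage):
    """Total sequenced bases required for a given average coverage."""
    return genome_size_bp * coverage


def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Number of fixed-length reads required, rounded up."""
    total = bases_needed(genome_size_bp, coverage)
    return -(-total // read_length_bp)  # integer ceiling division


GENOME_SIZE = 5_200_000  # bp, estimate quoted in the first post

# Illumina run from the first post: 220 bp SE reads at ~150x.
print(reads_needed(GENOME_SIZE, 150, 220))

# PacBio suggestion: 70-80x spread over 4 SMRT cells.
for cov in (70, 80):
    total = bases_needed(GENOME_SIZE, cov)
    print(cov, total, total // 4)  # coverage, total bases, implied bases per cell
```

Under these assumptions, 70-80x works out to roughly 364-416 Mb in total, or about 91-104 Mb per SMRT cell.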
