
  • contigs to fully assembled genome

    Hi all,
    I've done a bit of E. coli resequencing, but I am new to the art of de novo assembly and could use some input on which strategies would give me a fully contiguous genome (and at what cost).

    I'm trying to assemble a 5.2 Mb bacterial genome (size estimated from pulsed-field gel electrophoresis) with 220 bp SE MiSeq reads, quality filtered (Q>=25), at ~150x coverage. VelvetOptimiser tells me:

    k=147
    numContigs = 161
    n50 = 81.6kb
    longest contig = 279kb
    num contigs > 1kb = 107
    total bases in contigs = 5.1Mb (~98.9% covered)
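
For readers checking figures like these, here is a minimal sketch of how N50 and k-mer coverage are usually computed. The function names and the toy contig lengths are illustrative assumptions, not VelvetOptimiser output. The k-mer coverage formula is the one given in the Velvet manual, Ck = C * (L - k + 1) / L. Note that the rounded figures quoted above (5.1 Mb of 5.2 Mb) give about 98.1%, so the ~98.9% presumably comes from unrounded totals.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


def kmer_coverage(base_coverage, read_length, k):
    """Convert base coverage C to k-mer coverage Ck = C * (L - k + 1) / L."""
    return base_coverage * (read_length - k + 1) / read_length


# Hypothetical toy contig lengths, only to show the N50 calculation.
print(n50([100, 80, 60, 40, 20]))

# Values from the post: ~150x coverage, 220 bp reads, k=147.
print(round(kmer_coverage(150, 220, 147), 1))
```

With k=147 on 220 bp reads, each read contributes only 74 k-mers, so the effective k-mer coverage is roughly a third of the base coverage.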

    I don't necessarily need a fully contiguous genome, but what would it take? According to what I've read, a mate-pair run could help, but it seems that I am at a point of diminishing returns.

    Thanks for any suggestions!

  • #2
    I'd be interested to know why you chose 220 bp SE over 2x150 PE.

    Mate-pair libraries will likely improve the contiguity of your assembly: you'll be able to use a scaffolder (such as SSPACE) to join contigs into scaffolds. In many cases there will still be gaps between contigs, and some of those gaps will be spurious: the two contigs actually overlap, but by so little that the scaffolder can't recognize it.
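
To illustrate why a scaffold gap can hide a small overlap, here is a minimal sketch of the usual gap arithmetic for a mate pair that spans two contigs. This is a generic illustration under assumed inputs, not SSPACE's actual algorithm; the function name, parameters, and numbers are hypothetical.

```python
from statistics import median


def estimate_gap(insert_size, left_span, right_span):
    """Estimate the gap between two contigs linked by one mate pair.

    insert_size: expected outer distance between the two mates
    left_span:   bases from the outer end of the left mate to the end of its contig
    right_span:  bases from the start of the next contig to the outer end of the right mate
    A negative result suggests the contigs overlap rather than leave a gap.
    """
    return insert_size - left_span - right_span


def consensus_gap(insert_size, links):
    """Take the median estimate over all (left_span, right_span) links."""
    return median(estimate_gap(insert_size, left, right) for left, right in links)


# Hypothetical 3 kb mate-pair library.
print(estimate_gap(3000, 1200, 1500))  # positive: a real gap is implied
print(estimate_gap(3000, 1600, 1500))  # negative: the contigs probably overlap
```

Because the insert size of a real library varies by hundreds of bases, a small negative estimate is hard to tell apart from a small positive one, which is why short overlaps can end up reported as gaps.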

    Another option would be Pacific Biosciences long reads. You'll need to use PacBioToCA or similar to combine them with your Illumina data, but these can also help. A number of cores and service providers offer this.


    • #3
      User error is the reason 2x150 PE reads weren't used.


      • #4
        What is the purpose of your sequencing project? Are you sure a mapping or reference-guided assembly wouldn't suffice?


        • #5
          Get 4 SMRT cells of PacBio Continuous Long Reads. That will give you about 70-80x coverage and should bring you down to around 50 contigs, with a maximum contig length close to 1 Mb and an N50 around 200 kb.

          That should cost you less than 3000, including library prep etc., I think.
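
As a rough check on the coverage figures in this thread, here is a small sketch of the arithmetic (coverage = total bases / genome size). The per-cell yield it prints is only what the 70-80x estimate above implies for a 5.2 Mb genome, not a platform specification; the function names are illustrative.

```python
import math

GENOME_SIZE_BP = 5_200_000  # pulsed-field estimate from the original post


def total_bases(coverage, genome_size_bp=GENOME_SIZE_BP):
    """Bases of sequence needed to reach a given fold coverage."""
    return coverage * genome_size_bp


def reads_needed(coverage, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Number of fixed-length reads needed to reach a given fold coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


# Implied yield per SMRT cell if 4 cells give 70-80x coverage.
for cov in (70, 80):
    print(cov, total_bases(cov) // 4)

# Reads implied by ~150x coverage with 220 bp single-end reads.
print(reads_needed(150, 220))
```

Under these assumptions, 4 cells would have to deliver roughly 91-104 Mb each, and the existing MiSeq data set corresponds to about 3.5 million reads.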
