
  • 'n' in PacBio assembled sequences?

    I used Celera Assembler to assemble PacBio reads, with error correction using 454 sequences.

    However, I found the letter n (rather than a, t, c, or g) in the assembled result. Why does this happen, and how do I fix it?

  • #2
    You can get an "N" in assemblies from all read types. Typically this means there was no clear consensus: some of the reads suggested one base, other reads another. Some assemblers might use other IUPAC ambiguity codes if they can tell, for example, that the base is either an A or a C.
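To make the "no clear consensus" idea concrete, here is a minimal sketch of a per-position consensus call. It is a toy illustration, not what Celera Assembler actually does: the function name `call_consensus` and the `min_fraction` threshold are assumptions made up for this example, and only the two-base IUPAC codes are handled.

```python
from collections import Counter

# Standard IUPAC codes for two-base ambiguities.
# Three-base codes (B, D, H, V) are left out to keep the sketch short.
IUPAC_PAIRS = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def call_consensus(column, min_fraction=0.7):
    """Return one consensus character for a column of read bases.

    column       -- string of bases observed at one position, one per read
    min_fraction -- hypothetical threshold: share of reads needed for a call
    """
    # Ignore anything that is not a plain base (including N in the reads).
    counts = Counter(b for b in column.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return "N"  # no usable evidence at all

    ranked = counts.most_common()
    top_base, top_count = ranked[0]
    if top_count / total >= min_fraction:
        return top_base  # clear consensus

    if len(ranked) >= 2:
        second_base, second_count = ranked[1]
        if (top_count + second_count) / total >= min_fraction:
            # Two bases explain the column: emit an ambiguity code.
            return IUPAC_PAIRS[frozenset((top_base, second_base))]

    return "N"  # evidence is too mixed to say anything

print(call_consensus("AAAAAAAAAC"))  # one base dominates
print(call_consensus("AAAACCCC"))    # split between A and C
print(call_consensus("ACGT"))        # no agreement
```

A real assembler weighs base qualities and alignment context as well, so treat the threshold here purely as an illustration of why mixed evidence ends up as an N or an ambiguity code.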

    Such positions could be SNPs if you are sequencing a mixed population, or different alleles if you are sequencing something with two (or more) copies of each chromosome, or errors in assembly (e.g. merging two similar regions into one), etc.

    It is also possible, by bad luck in a low-coverage region, that all your reads at that position happen to have an N rather than a clear base.
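Before deciding how to fix the Ns, it may help to see where they are and how many there are. The sketch below scans FASTA text and reports every run of non-ACGT characters per contig. The helper names (`parse_fasta`, `ambiguous_runs`, `summarize`) and the example contig names are hypothetical, chosen only for this illustration.

```python
import re
from collections import Counter

def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def ambiguous_runs(seq):
    """Return (start, end, text) for each run of non-ACGT characters.

    Coordinates are 0-based and half-open, like Python slices.
    """
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(r"[^ACGTacgt]+", seq)]

def summarize(text):
    """Map each contig name to its length, ambiguity counts, and run positions."""
    report = {}
    for name, seq in parse_fasta(text):
        runs = ambiguous_runs(seq)
        report[name] = {
            "length": len(seq),
            "symbols": Counter("".join(r[2] for r in runs).upper()),
            "runs": [(start, end) for start, end, _ in runs],
        }
    return report

example = ">ctg1 len=16\nACGTnnACGTAC\nGGRT\n>ctg2\nACGT\n"
for contig, info in summarize(example).items():
    print(contig, info["length"], dict(info["symbols"]), info["runs"])
```

For a real assembly you would pass in the file contents instead, for example `summarize(open("assembly.fasta").read())`, where `assembly.fasta` stands in for your own output file. Isolated single Ns in well-covered regions point toward mixed evidence, while Ns in thin regions are more likely the low-coverage case described above.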



    • #3
      Good eye. The N comes from Celera Assembler in this case, rather than from PacBio.

      If you've got enough PacBio coverage, you can run Quiver for assembly polishing, which can lead to a final accuracy of > 99.999%. See www.pacbiodevnet.com/quiver. Polishing can remove or introduce Ns, depending on the situation. In certain cases, the following option reduces the number of Ns:

      --noEvidenceConsensusCall=reference



      • #4
        Do you mean use Quiver instead of Celera?

        Is Quiver a standard/common tool for assembling PacBio? (I'm new to PacBio)



        • #5
          Quiver is a beta tool for PacBio that will become standard in the next release. It's available on github now. Here's a documentation link:




          • #6
            Thank you for your reply. I have 2 more questions.

            1. Does Quiver still need next-generation sequences for correction?

            2. Is there any chance that, by changing parameters, I could reduce the number of Ns in Celera?



            • #7
              Good questions. Quiver is a consensus and variant caller, like the GATK Haplotype Caller, rather than a de novo assembly algorithm.

              1. HGAP does not need 2nd gen sequences (or PacBio CCS) for error correction, if that's your question. Quiver is used by HGAP, but only for generating the final assembly sequence after you've got your contigs. It's not an "error correction" algorithm.

              2. I'm not aware of which parameters would reduce the Ns in Celera Assembler. Sorry!
