  • How to explain this scenario ?

    When I was assembling the reads, I found this scenario:

    TAA-CCTCCCCC-AAANTT-CAGA Consensus
    TAA-CCTCCCCC-AAACTT
    TAA-CCTCCCCC-AAACTTACAGA
    TAA-CCT-CCCC-AAACTTACAGA
    TAA-CCTCCCCCAAAACTT-CAGA
    TAACCCTCCCCC-AAAATT-CAGA
    TAA-CCTCCCCC-AAAATT-CAGA
    TAA-CCTCCCCA-AAACTTACAGA
    TAA-CCTCCCCC-AAAATT-CAGA
    TAA-CCTCCCCC-AAAATT-CAGA
    ---A-CCTCCCCC-AAAATT-CAGA

    The first line is the consensus sequence, and it contains an N.
    The N arises because 5 reads show C and 5 reads show A at that position.
    Someone told me this is caused by the homopolymer: the C observed at
    that position is likely to be part of the homopolymer ahead of it.
    Have you met this problem before? Do you think it is possible?
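
For readers following along, here is a minimal sketch of how a simple column-wise majority vote would reproduce that N. It is not the assembler or script used in this thread; the function names and the 0.6 threshold are assumptions chosen for illustration, and one leading '-' has been dropped from the last read so that all rows line up.

```python
from collections import Counter

# Reads copied from the alignment in the post. Assumption: the last read is
# shown with one leading '-' removed so that every row has the same columns.
READS = [
    "TAA-CCTCCCCC-AAACTT",
    "TAA-CCTCCCCC-AAACTTACAGA",
    "TAA-CCT-CCCC-AAACTTACAGA",
    "TAA-CCTCCCCCAAAACTT-CAGA",
    "TAACCCTCCCCC-AAAATT-CAGA",
    "TAA-CCTCCCCC-AAAATT-CAGA",
    "TAA-CCTCCCCA-AAACTTACAGA",
    "TAA-CCTCCCCC-AAAATT-CAGA",
    "TAA-CCTCCCCC-AAAATT-CAGA",
    "--A-CCTCCCCC-AAAATT-CAGA",
]


def column_counts(reads, col):
    """Count the symbols shown at this column by the reads that reach it."""
    return Counter(r[col] for r in reads if col < len(r))


def call_column(counts, min_fraction=0.6):
    """Return the most common symbol if it reaches min_fraction, else 'N'."""
    total = sum(counts.values())
    symbol, n = counts.most_common(1)[0]
    return symbol if n / total >= min_fraction else "N"


def consensus(reads, min_fraction=0.6):
    """Column-by-column majority vote over gapped, pre-aligned reads."""
    width = max(len(r) for r in reads)
    return "".join(
        call_column(column_counts(reads, c), min_fraction) for c in range(width)
    )


print(column_counts(READS, 16))  # 5 reads with C and 5 with A at the N column
print(consensus(READS))          # TAA-CCTCCCCC-AAANTT-CAGA
```

A plain vote like this treats every read equally, which is why a 5:5 split cannot be resolved without extra information such as quality scores.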

  • #2
    You really need to give a lot more information than what you've supplied for any reasonable hypothesis to be provided. Just off the top of my head, I would say there are several plausible explanations.

    Homopolymers are definitely possible - but the likelihood depends on the platform. (Oops! I didn't realize this was in the 454 forum! Homopolymers are more common with 454 than some of the other platforms, so yes, this is possible. However, I think my other comments stand; homopolymers are far from the only reason you would see the above scenario.)

    If it's from a diploid organism, there could be two alleles - and one of them has a SNP.

    If it's from a haploid organism, there could be paralogs, one of which has a single base difference compared to the other, while the reference genome has only one copy.

    I'm sure there are many other biological explanations. Since you haven't given probability scores or any other useful information, all we can do is guess.
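
To make the guessing a bit more concrete, here is a rough sketch comparing two of these explanations for a 5-of-10 split using a plain binomial model. The 1% per-base error rate is an assumption, not a number from this thread, and the model treats errors as independent, which systematic homopolymer errors on 454 are not. Treat it as arithmetic for intuition only.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


n_reads, n_alt = 10, 5
error_rate = 0.01  # assumed independent per-base error rate (hypothetical)

# Chance that 5 or more of 10 reads show the same wrong base by random error.
p_error_only = binom_tail(n_alt, n_reads, error_rate)

# Chance of exactly 5 of 10 reads per allele at a heterozygous site
# (or two equally represented paralogs).
p_two_copies = binom_pmf(n_alt, n_reads, 0.5)

print(f"random error only : {p_error_only:.2e}")
print(f"two alleles/copies: {p_two_copies:.3f}")
```

Under these assumptions independent random error is a poor explanation for an even split, which leaves systematic error (such as homopolymer miscalls) or real biology (a SNP or paralogs) as the candidates.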

    Good luck figuring it out.
    Last edited by apfejes; 01-15-2009, 07:50 AM. Reason: didn't realize this was posted to the 454 forum!
    The more you know, the more you know you don't know. —Aristotle


    • #3
      Hi apfejes,

      Thanks for your reply. This is a pilot study on how to assemble a genome from 454 data. We found that the consensus produced by the 454 software (runAssembly, runMapping) is too long to be true, which is due to the influence of homopolymers, and the result is even worse with Seqman. We therefore wrote our own script to do the assembly. So far we haven't integrated the quality values (quality score, flow value), which is why we ran into the problem described above. (The 454 software gives no N at such positions; instead, they get a very low quality score.)

      Now I am trying to figure out how the 454 software makes use of the "flow value" and "quality score". Could anyone point me to a reference about it? It does not seem to be covered in the manuals.

      I have a feeling that the "quality score" is derived from the "flow value". Is that true?


      • #4
        Mingkunli,
        Any hits with your aligner of 454 for mito data? We are also looking at mt using 454 ..
        --
        bioinfosm


        • #5
          mingkunli

          The quality score algorithm used by 454 Titanium and the most recent FLX is based on a Broad Institute paper. The 454 offline toolkit also has a script called "sffrescore" that lets you rescore old read quality scores into the new Broad Institute ones.

          Here is the Broad Institute paper that the 454 read quality score is based on:
          An international, peer-reviewed genome sciences journal featuring outstanding original research that offers novel insights into the biology of all organisms
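
As background on how flow values and quality scores relate in general, here is a minimal sketch. It is not the 454 or Broad Institute algorithm. It assumes the usual description of pyrosequencing, where each flow value is roughly proportional to the homopolymer length for the nucleotide in that flow, and it uses the standard Phred definition Q = -10 * log10(P_error). The flow order "TACG", the function names, and the example flow values are assumptions for illustration.

```python
import math

FLOW_ORDER = "TACG"  # assumed nucleotide flow cycle, for illustration only


def call_bases(flow_values, flow_order=FLOW_ORDER):
    """Naive base calling: round each flow value to a homopolymer length."""
    seq = []
    for i, signal in enumerate(flow_values):
        base = flow_order[i % len(flow_order)]
        n = int(signal + 0.5)  # round half up (round() would round half to even)
        seq.append(base * n)
    return "".join(seq)


def ambiguity(signal):
    """Distance from the nearest integer; values near 0.5 are hard to call."""
    return abs(signal - round(signal))


def phred_to_error_prob(q):
    """Standard Phred scale: P_error = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Standard Phred scale: Q = -10 * log10(P_error)."""
    return -10 * math.log10(p)


# Hypothetical flow values: the 4.55 signal sits between a 4-mer and a 5-mer.
flows = [1.02, 0.05, 2.10, 0.00, 0.10, 0.97, 4.55, 0.03]
print(call_bases(flows))                   # TCCACCCCC
print(round(ambiguity(4.55), 2))           # 0.45
print(round(phred_to_error_prob(20), 4))   # 0.01
```

The point of the sketch is only that a signal far from an integer is an uncertain homopolymer length, so it is plausible for a quality score to reflect that uncertainty; how the real software does this is described in the paper referenced above.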
          Last edited by hlu; 02-19-2009, 09:25 AM.


          • #6
            A suggestion: to judge whether the 5th 'A' is a homopolymer artifact or a SNP, you can amplify this fragment by PCR, clone the product into a T vector, then pick 10 clones and sequence them on an ABI3730. I think you'll get the correct answer.
