  • foolishbrat
    Member
    • Nov 2008
    • 45

    Next Gen versus SAGE sequencing error

    Dear all,

    What is the primary difference between Next Gen sequencing
    and SAGE in terms of sequencing error?

    In particular, I am interested in the errors that affect the tag counts.
  • westerman
    Rick Westerman
    • Jun 2008
    • 1104

    #2
    I think that you have to rephrase your question to be more specific. I presume you are asking about the difference between expression profiling with Next Gen sequencing and expression profiling via SAGE with classical Sanger sequencing.

    Hands down, a single Sanger read will be more accurate than a single Next Gen read. That is one answer.

    However, since Next Gen platforms produce many more reads per unit cost than Sanger, you can sequence to a greater depth. If you are willing to throw away any Next Gen reads that do not have significant depth, then your accuracy will go way up. So that may be your answer.
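
A minimal sketch of the depth filter described above: count each tag and discard tags seen fewer than a chosen number of times. The function names, the example reads, and the `min_count` threshold are illustrative assumptions, not part of any particular pipeline.

```python
from collections import Counter


def count_tags(reads):
    """Count how often each tag sequence occurs in a list of reads."""
    return Counter(reads)


def filter_by_depth(tag_counts, min_count):
    """Keep only tags observed at least min_count times.

    min_count is a hypothetical threshold; a sensible value depends on
    the platform's error rate and the total number of reads.
    """
    return {tag: n for tag, n in tag_counts.items() if n >= min_count}


# Toy reads (real tags would be longer). "ACGA" is a likely miscall of "ACGT".
reads = ["ACGT", "ACGT", "ACGT", "ACGA", "TTGC", "TTGC"]
counts = count_tags(reads)
kept = filter_by_depth(counts, min_count=2)
print(kept)  # {'ACGT': 3, 'TTGC': 2}
```

The trade-off is that genuinely rare transcripts are discarded together with the error-derived tags, so the threshold sets a floor on the expression level you can detect.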

    The NextGen technology that you use will greatly influence the answer. I suspect that the SOLiD, despite not being the most accurate platform on a per-read basis, will be a good expression profiling platform simply due to the large quantity of reads at a low cost.

    Unfortunately I know of no papers that cover this question. I am not even sure if it is a question that someone would want to go through the effort of answering.


    • Josliu
      Junior Member
      • Nov 2008
      • 4

      #3
      There are several types of error in the sequence tag counts from Next Gen sequencing.
      1. The base-call error rate is high, 0.5%-1% per base. When we count the 17-base tags of LongSAGE, up to 17% of tags or more may contain an error. Since different systems may have different error profiles, it can be difficult to compare results between labs that use different systems.
      2. Tags from low-abundance genes may be affected by tags from high-abundance genes whose sequences differ by only 1 bp, since expression levels may differ by as much as seven orders of magnitude.
      3. There is also shot noise of sqrt(N), where N is the count of the tag. This is a problem for low-abundance genes.
      4. Two or more genes may share the same tag. We have no way to tell how much of the count comes from one gene and how much from the other gene(s).
      5. One gene might have two tags because of multiple isoforms. It is a challenge to decide how to report them.
      6. Many gene tags are shorter than 17 bp, for example 12 bp. The tag counts for those genes will have high error.
      7. Errors may also depend on the channel location in the flow cell.
      8. The enzyme efficiency might depend on the sequence content.
      You may use NextGENe software to handle such problems. Generally, the error will be minimal once a tag reaches 500 counts.
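
As a rough back-of-the-envelope check on points 1 to 3, the sketch below computes the chance that a tag contains at least one miscalled base, the relative shot noise for a given count, and the expected number of reads that an abundant tag leaks into one specific 1-bp neighbor. It assumes independent per-base errors, Poisson counting, and substitutions spread evenly over the three alternative bases. These are simplifying assumptions of mine, not properties of any particular platform, and the function names are made up for illustration.

```python
import math


def tag_error_probability(per_base_error, tag_length):
    """P(at least one miscalled base in a tag), assuming independent errors."""
    return 1.0 - (1.0 - per_base_error) ** tag_length


def relative_shot_noise(count):
    """Relative counting uncertainty sqrt(N)/N = 1/sqrt(N), assuming Poisson counts."""
    return 1.0 / math.sqrt(count)


def neighbor_spillover(count, per_base_error, tag_length):
    """Expected reads of an abundant tag miscalled as ONE specific 1-bp neighbor.

    Assumes exactly one substituted base, with the error equally likely to
    produce any of the three alternative bases (a simplification).
    """
    one_specific_miscall = per_base_error / 3.0
    other_bases_correct = (1.0 - per_base_error) ** (tag_length - 1)
    return count * one_specific_miscall * other_bases_correct


for rate in (0.005, 0.01):
    p = tag_error_probability(rate, 17)
    print(f"per-base error {rate:.3f}: {p:.1%} of 17-base tags contain an error")

for n in (10, 100, 500):
    print(f"N = {n}: relative shot noise {relative_shot_noise(n):.1%}")

leak = neighbor_spillover(1_000_000, 0.01, 17)
print(f"a tag with 1,000,000 reads leaks about {leak:.0f} reads into each 1-bp neighbor")
```

Under these assumptions a 1% per-base error rate gives about 16% erroneous 17-base tags, close to the simple 17 x 1% estimate, and at 500 counts the relative shot noise is about 4.5%.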


      josliu


      • foolishbrat
        Member
        • Nov 2008
        • 45

        #4
        Thanks so much for the reply. This is truly invaluable.

        Do you know of any existing programs or papers that correct
        such tag count errors?
