  • Longer reads => more errors?

    Hi,

    I am working on transcript quantification, where multi-reads, i.e. reads that can be mapped to multiple locations within a transcriptome or genome, are an important issue because they add ambiguity to transcript counts.

    Although longer reads have a better chance of mapping uniquely to a genomic location, I am concerned about read errors in longer reads. Specifically, does the number of errors within a read increase linearly with read length, or not?

    For example, if an 80 bp read contains 1 error on average, is it fair to assume that a 160 bp read will contain 2 errors on average, or actually more?

    As far as I know, read quality deteriorates from the 5' end to the 3' end, so errors occur more often at the 3' end. Suppose the low-quality 3' end begins in the middle of an 80 bp read (i.e. at the 41st base from the 5' end). Can I assume the low-quality end of a 160 bp read also starts in the middle (i.e. at the 81st base from the 5' end), or will it still start at the 41st base of the read?

    Any suggestions would be appreciated. Thanks in advance.

    Billy

  • #2
    You need to specify which platform you are working with.

    Because 454, Illumina & SOLiD all get their signal from an ensemble of molecules, dephasing (some molecules lagging behind others after a failure to extend) is a problem, and the per-base error rate increases with position along the read. So a read twice as long is indeed expected to have more than twice as many errors, because errors are not evenly distributed across the length.
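
To make the "more than twice as many" point concrete, here is a short worked derivation. The symbols are illustrative assumptions rather than measurements from any platform: $p_i$ is the probability that base $i$ of a read is wrong, $E(L)$ is the expected number of errors in a read of length $L$, and the linear model with constants $a$ and $b$ is only a toy example.

```latex
% Expected number of errors in a read of length L
E(L) = \sum_{i=1}^{L} p_i

% If p_i never decreases with position i, every base in the second half
% is at least as error-prone as every base in the first half, so
E(2L) = E(L) + \sum_{i=L+1}^{2L} p_i \;\ge\; E(L) + L\,p_L \;\ge\; 2\,E(L),
% with strict inequality as soon as p_i increases anywhere along the read.

% Toy example: p_i = a + b\,i with b > 0
E(L)  = aL + \tfrac{b}{2}\,L(L+1), \qquad
E(2L) = 2aL + b\,L(2L+1), \qquad
E(2L) - 2E(L) = b\,L^{2} > 0 .
```

With a constant per-base error rate ($b = 0$) the relationship would be exactly linear; any upward trend along the read makes the longer read accumulate errors faster than linearly.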

    The precise relationship depends on the platform. I've seen plots, though I can never find one when I really need one. Ideally, you could estimate this from your dataset.
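
One rough way to estimate this from a dataset is sketched below. It assumes a plain 4-line-per-record FASTQ file with Phred+33 quality encoding, and it relies on the base caller's quality scores (error probability = 10^(-Q/10)) rather than on observed mismatches, so treat the result as an approximation; counting mismatches against a reference after alignment would be a more direct measure. The function names `per_position_error` and `expected_errors` are made up for this example.

```python
import io
from collections import defaultdict


def per_position_error(fastq_handle, phred_offset=33):
    """Mean predicted error probability at each read position.

    Assumes 4-line FASTQ records (no wrapped sequence lines) and that
    quality characters encode Phred scores with the given ASCII offset.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line_no, line in enumerate(fastq_handle):
        if line_no % 4 != 3:  # only the 4th line of each record holds qualities
            continue
        for pos, ch in enumerate(line.rstrip("\r\n")):
            q = ord(ch) - phred_offset
            sums[pos] += 10 ** (-q / 10)  # Phred score -> error probability
            counts[pos] += 1
    return [sums[i] / counts[i] for i in range(len(counts))]


def expected_errors(profile, length):
    """Expected number of errors in the first `length` bases of a read."""
    return sum(profile[:length])


# Tiny in-memory demo: one 4-base read whose quality drops toward the 3' end.
demo = io.StringIO("@r1\nACGT\n+\nII5+\n")
profile = per_position_error(demo)
print(expected_errors(profile, 2), expected_errors(profile, 4))
```

On real data you would pass an open file handle instead of the in-memory demo, then compare `expected_errors(profile, 80)` with `expected_errors(profile, 160)` to see how far the data depart from a linear relationship.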

    • #3
      Thanks Keith. I have been working on a publicly available Illumina data set. I might have other data sets to analyze later, but I don't yet know which platform they will come from. It's good to know that the simple linear relationship will not hold in general, and that the relationship varies across platforms.
