  • yoyoming1001
    Member
    • Aug 2010
    • 15

    Read Length to Gigabase conversion

    Really dumb question: how do you mathematically convert reads into gigabases and then calculate coverage?

    If I have 600 million clusters sequenced at 2x100, then is it 600*200 = 120,000 gigabases sequenced? Or is that megabases?

    Then do you divide 120,000/3,000 (3 billion bases in the human genome) = 40? So 40x coverage?

    I ask because I'm trying to figure out how many clusters/lane I need to get when sequencing a whole genome, and then how many lanes our lab would need to run. Would also be curious what people are doing typically nowadays.
  • krobison
    Senior Member
    • Nov 2007
    • 734

    #2
    2x100 yields 200 bases per cluster = 2x10^2
    600M clusters = 6x10^8 clusters
    so 1.2x10^11 bases (120 gigabases, or 120,000 megabases)
    human genome is approximately 3x10^9 bases

    1.2x10^11 / 3x10^9 = 0.4x10^2 or 4x10^1, so yes, 40X coverage
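
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `coverage` and its parameters are made up for this example, and the 3x10^9 genome size is the same round figure used in the thread.

```python
def coverage(clusters, read_length, reads_per_cluster=2,
             genome_size=3_000_000_000):
    """Return (total bases sequenced, fold coverage).

    reads_per_cluster=2 models paired-end reads (2x100);
    genome_size is the approximate human genome size from the thread.
    """
    total_bases = clusters * read_length * reads_per_cluster
    return total_bases, total_bases / genome_size


# 600 million clusters at 2x100
total, fold = coverage(600_000_000, 100)
print(f"{total / 1e9:.0f} Gb sequenced, {fold:.0f}X coverage")
```

Dividing the base count by 1e9 gives gigabases and by 1e6 gives megabases, which is where the 120 versus 120,000 figures come from.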

    • yoyoming1001
      Member
      • Aug 2010
      • 15

      #3
      Thanks krobison. Good to see the math confirmed. It's a little odd to me since I thought labs had to push cluster densities on the HiSeq a bit in order to get 30x coverage in three lanes, but that's only 150,000 clusters PF per lane, which isn't all that much nowadays. Am I missing something?

      • kmcarr
        Senior Member
        • May 2008
        • 1181

        #4
        Originally posted by yoyoming1001
        ...but that's only 150,000 clusters PF per lane, which isn't all that much nowadays. Am I missing something?
        Yes, three orders of magnitude. (I know it was probably just a typo.)

        You need 150 million (150,000,000) PF clusters @ 2x100bp per lane to get 30X (90Gbp) in three lanes.

        Illumina specifications are for 187.5 million PF clusters per lane; in practice this can be pushed to 200-220 million PF clusters per lane with minimal loss of quality.
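
The per-lane numbers above can be checked with a rough sketch like the one below. The helper names `coverage_per_lane` and `lanes_needed` are hypothetical, the genome size is the thread's round 3x10^9 figure, and no allowance is made for reads lost to duplicates or poor mapping.

```python
import math

GENOME_SIZE = 3_000_000_000  # approximate human genome size, as in the thread


def coverage_per_lane(pf_clusters, read_length=100, reads_per_cluster=2):
    """Fold coverage contributed by one lane of PF clusters."""
    return pf_clusters * read_length * reads_per_cluster / GENOME_SIZE


def lanes_needed(target_coverage, pf_clusters, read_length=100,
                 reads_per_cluster=2):
    """Whole lanes required to reach the target coverage."""
    bases_per_lane = pf_clusters * read_length * reads_per_cluster
    bases_required = target_coverage * GENOME_SIZE
    return math.ceil(bases_required / bases_per_lane)


# PF cluster counts per lane mentioned in the post above
for clusters in (150_000_000, 187_500_000, 220_000_000):
    print(f"{clusters / 1e6:.1f}M PF clusters: "
          f"{coverage_per_lane(clusters):.1f}X per lane, "
          f"{lanes_needed(30, clusters)} lanes for 30X")
```

Under these assumptions every one of the three cluster counts still rounds up to three whole lanes for 30X; the higher densities add headroom rather than removing a lane.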

        • yoyoming1001
          Member
          • Aug 2010
          • 15

          #5
          Haha, yeah, I knew it in my head but skipped over it when writing it down.

          I wasn't sure if there were any bioinformatics factors that would decrease the usable reads and therefore the lab would need to push higher on average.

          Thanks though! My understanding is a lot clearer now.
