  • vkpilla
    Junior Member
    • Apr 2012
    • 2

    How do I calculate the probability of sequence coverage in a particular window?

    Hi all,

    A newbie question:

    How do I calculate the probability of a random set of sequences (at a specified length, say short reads of 25bp) aligning to a set window length (say 10kb)? Essentially, I'd like to know the sequence coverage probability along a specified length of DNA.

    I'd like to use this sequence coverage probability to test whether what I see (for example, 3 reads within a particular 10kb window) is truly significant or simply the result of reads aligning by random chance.

    Please let me know your thoughts and whether this is a valid question to ask in the first place.

    Thanks!
  • Simon Anders
    Senior Member
    • Feb 2010
    • 995

    #2
    If I understand your question correctly, you want to calculate the probability that a read with a completely random sequence aligns to a place in your genome by chance. The answer is usually: zero.

    There are 4^25 = 1.1e15 possible 25-bp reads. The human genome has about 3e9 base pairs, hence the probability of a random 25-mer actually occurring in the human genome is roughly 3e9/1.1e15 ≈ 3e-6.
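
    The arithmetic above can be checked with a minimal sketch. It assumes a haploid human genome of roughly 3e9 bp, counts one strand only, and treats every genome position as a distinct 25-mer, so the result is an upper bound. The function name `random_read_hit_probability` is made up for this illustration.

```python
# Rough upper bound on the chance that one uniformly random read
# occurs somewhere in a genome. All names here are illustrative.

READ_LENGTH = 25      # bp, as in the question
GENOME_SIZE = 3e9     # bp, approximate human genome size (assumption)


def random_read_hit_probability(read_length, genome_size):
    """Return genome_size / 4**read_length, capped at 1.

    The genome contains at most `genome_size` distinct reads of this
    length (one per start position), out of 4**read_length possible
    sequences. Repeats in a real genome only lower the true value.
    """
    possible_reads = 4 ** read_length
    return min(1.0, genome_size / possible_reads)


possible = 4 ** READ_LENGTH
p = random_read_hit_probability(READ_LENGTH, GENOME_SIZE)
print(f"possible {READ_LENGTH}-mers: {possible:.1e}")
print(f"probability of a chance hit: {p:.1e}")
```

    With these inputs the probability comes out at a few in a million per random read. Shorter reads change the picture quickly, since the number of possible sequences falls by a factor of 4 for every base removed.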

    Hence, if you see a read aligning somewhere, it has almost certainly been amplified from a real biological template, i.e., it is either from the sample or from contamination.

    Contrary to popular belief, there is no such thing as alignment noise in high-throughput sequencing.
    Last edited by Simon Anders; 04-03-2012, 10:13 AM. Reason: corrected distorting grammer mistake


    • Fad2012
      Member
      • Sep 2012
      • 62

      #3
      Hello guys

      I am having almost the same problem, and I am confused about how to calculate the probability. Here is my issue:

      I have sequenced a 2kb PCR fragment from an original reference sequence of 7.5kb. I used Illumina HiSeq paired-end technology to generate 5 million 80bp reads at 30x coverage, as I am looking for a recombination event between two virus serotypes, which is a rare event.

      I know that the event occurs at a frequency of 1%, so I expect 1% of the reads to represent the recombination event. Some of these reads will span the junction point of the recombinants and will therefore not align to either of the reference sequences. I want to calculate the probability of coverage by the reads that span the junction point, so that I can calculate the error rate between the expected and the observed.

      Need help!

      Many thanks


      • Fad2012
        Member
        • Sep 2012
        • 62

        #4
        Hi again

        I forgot to add that the junction point could occur at any nucleotide within the 2kb fragment.

        Thanks a lot
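
        One possible way to frame the expected number of junction-spanning reads is sketched below. It is only a simple model, not a method proposed in this thread. It assumes that read start positions are uniform along the fragment, that reads are single-end, that the junction position is fixed, and that a read only counts as spanning if it has at least `min_overhang` bases on each side. The function names and the `min_overhang` parameter are hypothetical.

```python
# Expected junction-spanning reads under a uniform-start model.
# Every name and default below is an illustrative assumption.


def spanning_fraction(fragment_len, read_len, junction, min_overhang=1):
    """Fraction of uniformly placed reads that cross a junction.

    `junction` is the number of bases to the left of the breakpoint
    (1 .. fragment_len - 1). A read starting at 1-based position s
    covers s .. s + read_len - 1 and spans the junction if it keeps
    at least `min_overhang` bases on both sides of it.
    """
    n_starts = fragment_len - read_len + 1
    first = max(1, junction + min_overhang - read_len + 1)
    last = min(n_starts, junction - min_overhang + 1)
    return max(0, last - first + 1) / n_starts


def expected_spanning_reads(n_reads, recombinant_fraction, fragment_len,
                            read_len, junction, min_overhang=1):
    """Expected count of reads from recombinants that span the junction."""
    frac = spanning_fraction(fragment_len, read_len, junction, min_overhang)
    return n_reads * recombinant_fraction * frac


# Figures from the post: 5 million 80bp reads, 2kb fragment, 1% recombinants.
# The junction position (1000) and overhang (20) are arbitrary examples.
expected = expected_spanning_reads(5_000_000, 0.01, 2000, 80, 1000, 20)
print(f"expected junction-spanning reads: {expected:.0f}")
```

        Because the junction could lie anywhere in the fragment, the same function can be evaluated for each candidate position. Junctions closer than `min_overhang` to either end give a fraction of zero under this model.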
