  • Does a reliable consensus mean more reliable SNPs?

    Dear All,

    I'm relatively new to WGS analysis so please excuse any naivety on my part.

    Before getting the WGS sequences, I confirmed the presence or absence of certain oligonucleotides in various bacterial DNA samples, so I know which of these sequences I should see in the final consensus sequence.

    Is it reasonable to think that if I can produce a more reliable consensus sequence, then the SNP calls are also likely to be more reliable? I appreciate that there are many SNP quality filters etc. that will also be applied and that can lead to differences between a consensus and a SNP call, but I just wanted to get an idea of the overall correlation between the consensus and the SNPs.

    If there is a high correlation between the two, then surely if I make my consensus sequences as reliable as possible, the SNPs I call from the same mapped reads will be more reliable too?

    Apologies if I'm totally wrong about this.

    Best wishes lg36

  • #2
    Hi lg36,

    you're not wrong about this at all -- this is in fact a pretty important factor in SNP discovery.

    Your SNPs can only ever be as good as your reference and your mapping. If your reference contains errors, these will propagate right through into your SNP calls, and similarly, if you mismap lots of reads, you will also increase your false positive SNP rate.

    I routinely map the reads from the individual used to make the reference back to the reference before I do any mapping of other individuals onto that reference for SNP discovery. I then call SNPs on that mapping first, and I always get SNPs here.

    In a homozygous or haploid organism this will give you a list of positions where the reference most likely contains errors -- in an ideal case there should be zero SNPs when I map the reads back onto the reference that was made from the same reads. I don't know what you work with, but I am fortunate in that I do a lot of work with cultivated barley, which is essentially homozygous, and that obviously simplifies matters.

    I then subtract the list of SNPs called there from any list of SNPs generated with reads from a different individual -- it's essentially a way of removing background noise. I guess if you have a heterozygous organism and it's well curated you could probably use a public, curated list of SNPs instead.
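The subtraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact procedure used in this post: it assumes the SNPs are available as minimal VCF-like tab-separated lines (CHROM, POS, ID, REF, ALT), that matching on chromosome and position alone is good enough, and the function names `parse_snps` and `subtract_background` are made up for the example.

```python
# Minimal sketch: remove "background" SNPs (called when the reference
# individual's own reads are mapped back to the reference) from the SNPs
# called in another individual.
# Assumption: input lines are tab-separated as CHROM, POS, ID, REF, ALT.

def parse_snps(vcf_lines):
    """Return {(chrom, pos): (ref, alt)}, skipping header and blank lines."""
    snps = {}
    for line in vcf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        snps[(chrom, pos)] = (ref, alt)
    return snps


def subtract_background(sample_snps, background_snps):
    """Keep only sample SNPs whose position is absent from the background set."""
    return {
        site: alleles
        for site, alleles in sample_snps.items()
        if site not in background_snps
    }


# Hypothetical toy data: SNPs from the reference self-mapping ...
background_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t250\t.\tC\tT",
]
# ... and SNPs called in a different individual.
sample_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t400\t.\tG\tA",
    "chr2\t250\t.\tT\tC",
]

background = parse_snps(background_vcf)
sample = parse_snps(sample_vcf)
filtered = subtract_background(sample, background)
print(sorted(filtered))  # [('chr1', 400), ('chr2', 250)]
```

Matching on position only is the conservative choice here: any site that looks suspicious in the self-mapping is dropped regardless of the allele called, which fits the "reliability first" goal but is also where false negatives can creep in.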

    This gives you much cleaner SNP sets and reduces the false positive rate but the caveat is that potentially you may be increasing your false negative rate (I don't have any data on this yet). It all depends on what your SNPs are for - if reliability is key, then this works well. You may also want to remove duplicates from the mapping -- that also reduces your FP rate.
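To make the duplicate-removal idea concrete, here is a simplified sketch. It assumes that reads mapping to the same chromosome, start position and strand are duplicates of each other and keeps the one with the highest mapping quality. The `Read` record and the `remove_duplicates` function are hypothetical names for this example, and real duplicate-marking tools use more information than this (mate positions, for instance).

```python
from collections import namedtuple

# Hypothetical minimal read record; real alignments carry many more fields.
Read = namedtuple("Read", ["name", "chrom", "start", "strand", "mapq"])


def remove_duplicates(reads):
    """Keep one read per (chrom, start, strand), preferring the highest MAPQ."""
    best = {}
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        kept = best.get(key)
        if kept is None or read.mapq > kept.mapq:
            best[key] = read
    return list(best.values())


# Toy example: r1 and r2 share position and strand, so only one survives.
reads = [
    Read("r1", "chr1", 100, "+", 30),
    Read("r2", "chr1", 100, "+", 60),
    Read("r3", "chr1", 100, "-", 40),
    Read("r4", "chr1", 180, "+", 50),
]
unique_reads = remove_duplicates(reads)
print([r.name for r in unique_reads])  # ['r2', 'r3', 'r4']
```

The point for SNP calling is that a PCR error copied into many duplicate reads then counts once instead of many times, so it is less likely to look like a well-supported variant.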

    cheers

    Micha
