  • Filter with dbSNP or 1000 genomes or ...

    Hi all,

    I’ve been working on whole exome sequencing using SureSelect and Illumina GAIIx.
    I have a long list of single nucleotide variants, certainly too many to check them all, so I was wondering if I could do some filtering. What I did was remove all variants that are present in dbSNP131. However, when I presented these data, a lot of people said that dbSNP131 is not a good filter to use. According to them, I should use either dbSNP130 or data from the 1000 Genomes Project.
    But as far as I understand, data from the 1000 Genomes Project are incorporated into dbSNP (although with some delay). And how can you be sure that variants found in the 1000 Genomes Project are truly polymorphisms and not mutations?

    What do you think of this? Can I use dbSNP130 and ‘upgrade’ the coordinates to hg19? Where can I find a list of SNPs from the 1000 Genomes Project? Or should I just continue with dbSNP131?

    Any input would be greatly appreciated.
    Lien

  • #2
    Hi Lien.
    I also do exome sequencing, in my case on patients with rare diseases. I'm interested in people's responses too.

    I annotate my samples with dbSNP131 using Annovar. If I had an easy way to use dbSNP129 then I would, and I would then filter using this field. However, I haven't found a dbSNP129 file in hg19 coordinates that Annovar can use. As a result, I LOOK at the dbSNP column in my spreadsheet output, but I don't generally filter things out based on it. When a particular mutation looks interesting to me, I manually look up the dbSNP entry to see in what context it was added.

    I also annotate my variants with the 1000 Genomes allele frequency. I do filter on this column, specifying that I'm only interested in variants with a frequency below, for example, 1% or 2%. Any frequency below 1% I consider to be "somewhere between 0% and 1%"... I don't think the data are precise yet for rare variants. In some cases we think we have found recessive disease variants that were nevertheless observed in the 1000 Genomes Project at a very low frequency.
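
    The frequency filter described above can be sketched roughly as follows. This is only an illustration, not Annovar's own output format: the record fields (`id`, `dbsnp`, `kg_freq`), the example values and the `keep_variant` helper are all made up for the example. The dbSNP column is carried along for manual inspection but is not used to discard anything, and variants without a 1000 Genomes frequency are kept.

```python
from typing import Optional


def keep_variant(kg_freq: Optional[float], max_freq: float = 0.01) -> bool:
    """Keep a variant if it is absent from 1000 Genomes or rarer than max_freq.

    kg_freq is the annotated allele frequency as a fraction (0.01 == 1%),
    or None when the variant was not observed. Hypothetical helper.
    """
    if kg_freq is None:
        return True
    return kg_freq < max_freq


# Hypothetical annotated variants; a real table would come from the annotation tool.
variants = [
    {"id": "var1", "dbsnp": "rsA", "kg_freq": 0.25},   # common: filtered out
    {"id": "var2", "dbsnp": None, "kg_freq": None},    # novel: kept
    {"id": "var3", "dbsnp": "rsB", "kg_freq": 0.004},  # rare, in dbSNP: kept for review
]

# dbSNP membership is deliberately not part of the filter.
rare = [v for v in variants if keep_variant(v["kg_freq"])]
rare_ids = [v["id"] for v in rare]
```

    Raising `max_freq` to 0.02 gives the more permissive 2% cut-off.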

    • #3
      I would be wary of using any of the coordinate mapping tools for SNPs. You are much better off mapping the flanking sequences, so you aren't caught out by underlying sequence changes in the assembly.

      dbSNP 132 is in hg19/GRCh37 coordinates and contains all the 1000 Genomes pilot SNP data.

      You can also get the most recent 1000 Genomes release, which is all in GRCh37, from ftp://ftp.1000genomes.ebi.ac.uk/vol1...hase1_release/

      You can find answers to many 1000 genomes questions in our FAQ http://www.1000genomes.org/faq
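
      To illustrate the idea of placing a SNP by its flanking sequences rather than by converted coordinates, here is a toy sketch. It assumes exact matches on the forward strand only and uses invented sequences and a hypothetical `locate_by_flanks` helper; in practice you would align the flanks against the new assembly with a proper aligner.

```python
from typing import List


def locate_by_flanks(reference: str, left_flank: str, right_flank: str) -> List[int]:
    """Return 0-based positions of the variant base where both flanks match exactly.

    An empty list means no placement was found; more than one entry means the
    flanks are ambiguous in this reference. Hypothetical helper.
    """
    positions = []
    start = reference.find(left_flank)
    while start != -1:
        snp_pos = start + len(left_flank)
        # The right flank must begin immediately after the variant base.
        if reference.startswith(right_flank, snp_pos + 1):
            positions.append(snp_pos)
        start = reference.find(left_flank, start + 1)
    return positions


# Invented sequences: the "new" assembly has four extra bases upstream,
# so the same SNP sits at a different coordinate.
left, right = "TTGACCA", "TTCAGGT"
old_assembly = left + "G" + right
new_assembly = "ACGT" + old_assembly

old_positions = locate_by_flanks(old_assembly, left, right)
new_positions = locate_by_flanks(new_assembly, left, right)
```

      A SNP whose flanks place zero times or more than once is exactly the kind of case that a plain coordinate conversion can hide.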
