  • Filter with dbSNP or 1000 genomes or ...

    Hi all,

    I’ve been working on whole exome sequencing using SureSelect and Illumina GAIIx.
    I have a long list of single nucleotide variants, certainly too many to check them all, so I was wondering if I could do some filtering. What I did was remove all variants that are present in dbSNP131. However, when I presented these data, a lot of people said that dbSNP131 is not a good filter to use. According to them, I should use either dbSNP130 or data from the 1000 Genomes Project.
    But as far as I understand, data from the 1000 Genomes Project are incorporated into dbSNP (although with some delay). And how can you be sure that variants found in the 1000 Genomes Project are truly polymorphisms and not mutations?
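For readers following along, the "remove everything that is in dbSNP" step described above can be sketched as a set lookup. This is only an illustration: the `Variant` record, the `remove_known` helper and the toy data are hypothetical names made up for the example, not part of any dbSNP or Annovar tooling.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (assumption, not a real file format)."""
    chrom: str
    pos: int  # 1-based position
    ref: str
    alt: str


def remove_known(variants: Iterable[Variant], known: Iterable[Variant]) -> List[Variant]:
    """Return the variants that do not appear in the known-variant list."""
    known_set = set(known)  # hashing the whole record makes each lookup O(1)
    return [v for v in variants if v not in known_set]


# Toy data, invented for the example.
calls = [
    Variant("1", 1000, "A", "G"),
    Variant("1", 2000, "C", "T"),
    Variant("2", 1500, "G", "A"),
]
dbsnp = [
    Variant("1", 1000, "A", "G"),
    Variant("2", 1500, "G", "C"),  # same position as a call, different alt allele
]

novel = remove_known(calls, dbsnp)
```

Matching on chromosome, position and both alleles keeps the call at 2:1500, because the database entry there has a different alternate allele. Matching on position alone would have removed it.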

    What do you think of this? Can I use dbSNP130 and ‘upgrade’ the coordinates to hg19? Where to find a list of SNPs from 1000 genomes project? Or do I just continue with dbSNP131?

    Any input would be greatly appreciated.
    Lien

  • #2
    Hi Lien.
    I also do exome sequencing, in my case on patients with rare diseases. I'm interested in people's responses too.

    I annotate my samples with dbSNP131 using Annovar. If I had an easy way to use dbSNP129 then I would, and I would then filter using this field. However, I haven't found a dbSNP129 file in hg19 coordinates that Annovar can use. As a result, I LOOK at the dbSNP column in my spreadsheet output, but I don't generally filter things out based on it. When a particular mutation looks interesting to me, I manually look up the dbSNP entry to see in what context it was added.

    I also annotate my variants with the 1000 Genomes allele frequency. I do filter on this column, specifying that I'm only interested in variants with, for example, a frequency below 1% or 2%. Any frequency below 1% I consider to be "somewhere between 0% and 1%"... I don't think the data are precise yet for rare variants. In some cases we think we have found recessive disease variants that were nevertheless observed in the 1000 Genomes Project at a very low frequency.
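A minimal sketch of the frequency filter described above, assuming each variant carries an annotated 1000 Genomes allele frequency. The `af_1000g` field name, the `is_rare` helper and the example values are assumptions made for illustration. Variants with no frequency annotation are kept, on the assumption that they were not observed in the panel.

```python
from typing import Optional


def is_rare(af: Optional[float], max_af: float = 0.01) -> bool:
    """True if the variant is unannotated (None) or its frequency is below max_af."""
    return af is None or af < max_af


# Toy annotated variants, invented for the example.
variants = [
    {"id": "var1", "af_1000g": None},   # not seen in the panel
    {"id": "var2", "af_1000g": 0.004},  # seen, but very rare
    {"id": "var3", "af_1000g": 0.015},  # between 1% and 2%
    {"id": "var4", "af_1000g": 0.15},   # common polymorphism
]

rare_1pct = [v["id"] for v in variants if is_rare(v["af_1000g"])]
rare_2pct = [v["id"] for v in variants if is_rare(v["af_1000g"], max_af=0.02)]
```

A threshold filter like this retains a recessive disease variant that appears in the panel at very low frequency, whereas a presence/absence filter would discard it.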


    • #3
      I would be wary of using any of the coordinate mapping tools for SNPs. You are much better off mapping the flanking sequences, so you aren't caught out by underlying sequence changes in the assembly.
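The flanking-sequence idea can be illustrated with a toy example. This is a sketch under strong assumptions: it uses exact string matching on short in-memory sequences, whereas a real remapping would align the flanks to the new assembly with a proper aligner. The `flank` and `relocate` helpers and the sequences are hypothetical.

```python
from typing import Optional, Tuple


def flank(reference: str, pos: int, size: int) -> Tuple[str, str]:
    """Return the (left, right) flanks of up to `size` bases around 1-based `pos`."""
    i = pos - 1  # 0-based index of the SNP base
    return reference[max(0, i - size):i], reference[i + 1:i + 1 + size]


def relocate(old_ref: str, new_ref: str, pos: int, size: int = 5) -> Optional[int]:
    """Find the 1-based SNP position in new_ref by exact flank matching.

    Returns None when the flanks match nowhere or in more than one place.
    """
    left, right = flank(old_ref, pos, size)
    hits = []
    start = new_ref.find(left)
    while start != -1:
        i = start + len(left)  # 0-based index of the candidate SNP base
        if i < len(new_ref) and new_ref[i + 1:i + 1 + len(right)] == right:
            hits.append(i + 1)  # convert back to 1-based
        start = new_ref.find(left, start + 1)
    return hits[0] if len(hits) == 1 else None


# Toy assemblies: the "new" one has three extra bases inserted at the start.
old_ref = "ACGTACGGTCATTGCA"
new_ref = "TTT" + old_ref

new_pos = relocate(old_ref, new_ref, pos=8)
```

Because the position is recovered from the surrounding sequence rather than from an offset table, an ambiguous or missing match is reported as `None` instead of silently producing a wrong coordinate.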

      dbSNP 132 is in hg19/GRCh37 coordinates and contains all the 1000 Genomes pilot SNP data.

      You can also get the most recent 1000genomes release which is all in GRCh37 from ftp://ftp.1000genomes.ebi.ac.uk/vol1...hase1_release/

      You can find answers to many 1000 genomes questions in our FAQ http://www.1000genomes.org/faq
