how to tell SNPs synonymous v. nonsynonymous
-
How good is snpEff's annotation? Does it only output "synonymous" or "nonsynonymous", or can it give full cDNA- and protein-level output in HGVS format, e.g. c.1279G>T and p.T289F? And if it does, can it handle splice-site mutations in HGVS format (e.g. c.196-1G>T or c.241+2A>T)?
Does anyone know if the Ensembl variant annotator API can do that?
I know ANNOVAR can output this in HGVS format.
These are the formats used in the literature, and not having the variants reported in this format makes comparison and integration with other data sets a bit clumsy.
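For reference, the HGVS strings mentioned above can be assembled with a few lines of Python. This is only a minimal sketch of the string format for simple substitutions. It is not how snpEff, ANNOVAR, or the Ensembl API build their output, and the function names and the `intron_offset` parameter are made up for illustration.

```python
def hgvs_coding_substitution(cds_pos, ref, alt, intron_offset=0):
    """Format a single-base substitution in HGVS coding-DNA (c.) style.

    cds_pos       -- 1-based CDS position of the variant, or of the nearest
                     exonic base when the variant is intronic
    intron_offset -- 0 for an exonic variant; negative when the variant lies
                     before the exonic base (acceptor side), positive when it
                     lies after it (donor side). Hypothetical parameter.
    """
    if intron_offset == 0:
        return f"c.{cds_pos}{ref}>{alt}"
    # The '+d' format spec always prints the sign, giving e.g. 196-1 or 241+2.
    return f"c.{cds_pos}{intron_offset:+d}{ref}>{alt}"


def hgvs_protein_substitution(ref_aa, codon_number, alt_aa):
    """Format an amino-acid substitution in one-letter p. style."""
    return f"p.{ref_aa}{codon_number}{alt_aa}"


print(hgvs_coding_substitution(1279, "G", "T"))
print(hgvs_coding_substitution(196, "G", "T", intron_offset=-1))
print(hgvs_coding_substitution(241, "A", "T", intron_offset=2))
print(hgvs_protein_substitution("T", 289, "F"))
```

The hard part in practice is not the formatting but choosing the transcript and computing `cds_pos` and the offset from genomic coordinates, which is what the annotation tools do for you.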
-
Thanks for the idea on snpEff. It's working very well for me and even supports some bacterial genomes out of the box. Its own input format is easy to generate, and it also accepts VCF or pileup.
-
Yeah, I'm not saying it is necessarily the easiest solution for you, but it is incorrect to state that it requires annotation databases from UCSC.
-
Originally posted by jnfass:
@Michael.James.Clark ... Annovar seems to require annotation databases from the UCSC Genome Browser. That doesn't exist for at least part of the genome I'm working with.
@nexgengirl ... Polyphen-2 seems to be restricted to human
@laura ... thanks for the suggestion; I might try contacting ensembl.
But in the meantime someone responded to my Biostar post and suggested snpEff (snpeff.sourceforge.net) and it seems to fit the bill. If anyone has had good or bad experience with it, I'd appreciate hearing about it.
-
@Michael.James.Clark ... Annovar seems to require annotation databases from the UCSC Genome Browser. That doesn't exist for at least part of the genome I'm working with.
@nexgengirl ... Polyphen-2 seems to be restricted to human
@laura ... thanks for the suggestion; I might try contacting ensembl.
But in the meantime someone responded to my Biostar post and suggested snpEff (snpeff.sourceforge.net) and it seems to fit the bill. If anyone has had good or bad experience with it, I'd appreciate hearing about it.
-
Originally posted by jnfass:
Thanks, but it seems to only do SNPs ... we need to annotate indels, too.
-
For help with setting up a custom Ensembl database, I would suggest emailing [email protected].
-
Thanks, but it seems to only do SNPs ... we need to annotate indels, too.
-
You could also try :
The requirement is that the input snps be in VCF format.
-Abhi
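If your SNP calls are in a simple table, getting them into VCF is not much work. Below is a minimal sketch, assuming each SNP is a (chromosome, 1-based position, reference base, alternate base) tuple. The function name is hypothetical, and a particular tool may expect extra header lines or INFO fields beyond this bare minimum.

```python
import io


def write_minimal_vcf(snps, handle):
    """Write SNPs as a bare-bones VCF: one header line, eight fixed columns.

    snps   -- iterable of (chrom, pos, ref, alt), pos is 1-based
    handle -- any writable text file object
    """
    handle.write("##fileformat=VCFv4.1\n")
    handle.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
    # Sort by chromosome name, then position; '.' marks a missing value.
    for chrom, pos, ref, alt in sorted(snps, key=lambda s: (s[0], s[1])):
        handle.write(f"{chrom}\t{pos}\t.\t{ref}\t{alt}\t.\t.\t.\n")


buffer = io.StringIO()
write_minimal_vcf([("chr2", 500, "A", "G"), ("chr1", 1234, "C", "T")], buffer)
vcf_text = buffer.getvalue()
print(vcf_text, end="")
```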
-
We're looking at using Ensembl's Variant Effect Predictor right now ... but with a non-model organism (or at least a model organism with some regions replaced by a "better" assembly). It seems like we'll need to set up a local database in order to provide the reference and gene models. Does anyone have experience setting up a db like this?
-
If you have hg19/GRCh37 positions for all your SNPs, I would suggest using a tool like the Ensembl Variant Effect Predictor to get the consequences of your SNPs and then tracing to RefSeq IDs using the Ensembl xref system, rather than doing it the other way around.
RefSeq models and Ensembl models should mostly have the same CDS coordinates (though not in all models), but to get the models which are identical across both sets it is best to look at the CCDS models: http://www.ncbi.nlm.nih.gov/projects...CcdsBrowse.cgi
Do remember that UTR coordinates may differ between the two sets.
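To check whether a RefSeq model and an Ensembl model agree on the coding region even when their UTRs differ, you can clip each model's exons to its CDS and compare the results. This is a minimal sketch, assuming both models have already been converted to the same coordinate convention (0-based, half-open here). The function names and the example coordinates are invented for illustration.

```python
def cds_blocks(exons, cds_start, cds_end):
    """Return the coding part of each exon as (start, end) tuples.

    All coordinates are assumed to be 0-based, half-open.
    """
    blocks = []
    for start, end in sorted(exons):
        s, e = max(start, cds_start), min(end, cds_end)
        if s < e:  # skip exons that are entirely UTR
            blocks.append((s, e))
    return blocks


def same_cds(model_a, model_b):
    """True if two (exons, cds_start, cds_end) models share identical CDS blocks."""
    return cds_blocks(*model_a) == cds_blocks(*model_b)


# Made-up models: same CDS, different UTR extents.
refseq_model = ([(100, 200), (300, 400)], 150, 350)
ensembl_model = ([(90, 200), (300, 420)], 150, 350)
# Made-up isoform whose CDS starts 3 bases later.
other_isoform = ([(90, 200), (300, 420)], 153, 350)

print(same_cds(refseq_model, ensembl_model))
print(same_cds(refseq_model, other_isoform))
```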
-
I'm an intern working on exome analysis and I am facing the same problem. I want to implement this annotation in my own Java pipeline, and although the solution may seem very easy, I am having trouble finding the right approach.
I have tried using the RefSeq annotation file of SeqCap EZ Exome v2 (matching the UCSC Genome Browser on hg19), which holds cdsStart, cdsEnd, and exon start and end positions. This file also holds an Ensembl gene reference for each RefSeq gene, which should make it easy to link to the Ensembl cDNA FASTA file and get exactly what I want...
... a few problems though:
1) RefSeq and Ensembl genes overlap, and the same Ensembl reference may occur multiple times in the Ensembl FASTA file, making the entries hard to tell apart. This is most likely due to different isoforms.
2) Looking at a few cases, I noticed that some RefSeq genes have cdsStart and cdsEnd positions that cannot be traced back to Ensembl. In other words: when I read the Ensembl reference from the RefSeq file and look it up in the Ensembl file, I can find multiple isoforms, but none with the same cdsStart and/or cdsEnd. I already take into account that RefSeq and Ensembl differ by 1 nucleotide in cdsStart. Both files are based on hg19, so that can't be the problem either.
What would be the best approach to solving this puzzle? Should I just walk through the entire genome and annotate all the information to my SNPs as I go along? Any thoughts are welcome.
Thanks a bunch!
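For anyone who wants to see the core logic, here is a rough sketch of the two steps involved: mapping a genomic position into the CDS using exon and cdsStart/cdsEnd coordinates, then comparing the reference and altered codons. It assumes a plus-strand transcript, UCSC-style 0-based half-open coordinates (start is 0-based, end is exclusive, which is the usual source of the 1-nucleotide difference against 1-based coordinates), and the standard genetic code. The function names and the toy transcript are invented, and minus-strand genes, indels, and splice sites are not handled.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def genomic_to_cds_pos(pos0, exons, cds_start, cds_end):
    """Map a 0-based genomic position to a 1-based CDS position.

    exons, cds_start, cds_end are 0-based, half-open (plus strand assumed).
    Returns None if the position is not inside a coding exon.
    """
    offset = 0  # coding bases seen in earlier exons
    for start, end in sorted(exons):
        s, e = max(start, cds_start), min(end, cds_end)
        if s >= e:
            continue  # exon is entirely UTR
        if s <= pos0 < e:
            return offset + (pos0 - s) + 1
        offset += e - s
    return None


def classify_snp(cds_seq, cds_pos, alt):
    """Return 'synonymous' or 'nonsynonymous' for a SNP at a 1-based CDS position."""
    idx = cds_pos - 1
    frame = idx % 3
    ref_codon = cds_seq[idx - frame: idx - frame + 3]
    alt_codon = ref_codon[:frame] + alt + ref_codon[frame + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


# Toy transcript: two exons, CDS covers 3 bases of each; CDS sequence ATG GCT.
exons, cds_start, cds_end = [(100, 106), (200, 206)], 103, 203
cds_seq = "ATGGCT"

third_base = genomic_to_cds_pos(202, exons, cds_start, cds_end)  # codon 2, base 3
first_base = genomic_to_cds_pos(200, exons, cds_start, cds_end)  # codon 2, base 1
print(third_base, classify_snp(cds_seq, third_base, "C"))  # GCT -> GCC
print(first_base, classify_snp(cds_seq, first_base, "A"))  # GCT -> ACT
```

Because isoforms of the same gene can put a position in different codons or outside the CDS altogether, the result should be reported per transcript rather than per gene.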