**husamia** · 12-16-2010, 01:52 PM

I think you mean that you have a chromosomal position such as chr1:222222 and a DNA change A>T, and you want to know the coding sequence change relative to the start of the coding sequence (the ATG). If this isn't what you want, please give an example. If it is, the problem is that there may be more than one version of the coding sequence (isoforms), and you have to decide which isoform you want; that's probably why no tool does this automatically. I have done it myself based on Ensembl's exon definitions. I found errors in the UCSC browser, which is another place you can go. I wanted highly accurate, manually annotated exons, and Ensembl worked best for me. There are a lot of other issues that I won't go into. It's not as straightforward as it seems; most people have genes of interest, in which case you have to prepare it yourself.

**Boel** · 12-16-2010, 02:01 PM

Hi husamia, and thanks for your reply.
No, I am not interested in the coding consequence, just interested in the position in the transcript, in the mRNA sequence.

For example, if the DNA position is chr1:30000 and this falls within gene X's first exon, I want to know the corresponding position in the mRNA (position 1 if gene X starts at chr1:30000). If a gene has several isoforms, this will be reflected in my GTF file. It's a fairly simple mathematical exercise, just very nitty-gritty to do, so I wanted to hear if someone had a simple script. Thanks though.
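The exercise described above can be sketched in a few lines of Python. This is only an illustration, not a script anyone in the thread posted: the function name `genomic_to_transcript` and the example exon coordinates are made up, and it assumes exons are given as 1-based inclusive (start, end) pairs, as in a GTF file, and do not overlap within one transcript.

```python
def genomic_to_transcript(pos, exons, strand):
    """Return the 1-based transcript position of genomic `pos`, or None.

    exons  -- list of (start, end) pairs, 1-based inclusive (GTF style)
    strand -- "+" or "-"
    """
    # Walk the exons in transcript order: ascending on "+", descending on "-".
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0  # number of exonic bases in the exons already passed
    for start, end in ordered:
        if start <= pos <= end:
            # Distance from the 5' end of this exon, in transcript direction.
            within = pos - start if strand == "+" else end - pos
            return offset + within + 1
        offset += end - start + 1
    return None  # intronic, or outside the transcript


# Hypothetical two-exon transcript starting at chr1:30000.
exons = [(30000, 30099), (30500, 30599)]
print(genomic_to_transcript(30000, exons, "+"))  # 1
print(genomic_to_transcript(30500, exons, "+"))  # 101
print(genomic_to_transcript(30599, exons, "-"))  # 1
print(genomic_to_transcript(30200, exons, "+"))  # None
```

For a gene with several isoforms, the same function would be called once per transcript, each with its own exon list.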

**kmcarr** · 12-16-2010, 02:46 PM

I had to do this exact exercise myself (though going further, to the amino acid as husamia described). I wrote my own script but it is not simple. It makes use of the BioPerl module Bio::Coordinate::GeneMapper, which is meant for these types of transformations between coordinate spaces. But to use it, everything must be a Bio::SeqFeature object. Since I was working in Arabidopsis, I already had a Bio::DB::SeqFeature database of TAIR9 set up (back end for GBrowse). If you are conversant with some serious BioPerl I could offer some guidance.

**Boel** · 12-16-2010, 02:57 PM

Hi kmcarr,

I'm looking into biopython, and there is some functionality there. Might cross over to BioPerl if I feel the need later on. Thanks a lot.

**joa_ds** · 12-17-2010, 02:53 AM

drop me an email @ joachim dot deschrijver at ugent dot be

I have such a script ready in Perl that you could use

**Giulietta** · 01-05-2011, 02:46 AM

Ensembl's Variant Effect Predictor may be of use here. If you enter a genomic position and allele(s), it will report the position in the cDNA and the protein (if there is one) and the amino acid change. Have a look at the example:

http://www.ensembl.org/info/website/upload/var.html

It's available online, or through the API:

http://www.ensembl.org/tools.html

Email us at [email protected] for more help.

**husamia** · 01-05-2011, 05:58 AM

Originally posted by Giulietta:

Ensembl's variant effect predictor may be of use, here. If you enter in a genomic position and allele(s) it will let you know the position in the cDNA and the protein (if there is one) and the amino acid change. Have a look at the example:

http://www.ensembl.org/info/website/upload/var.html

It's available online, or through the API:

http://www.ensembl.org/tools.html

Email us at [email protected] for more help.

The link [http://uswest.ensembl.org/info/website/upload/var.html] gives a 404 error, but I think the correct link is [http://uswest.ensembl.org/Homo_sapie...oadVariations]

**Giulietta** · 01-05-2011, 07:06 AM

Originally posted by husamia:

The link [http://uswest.ensembl.org/info/website/upload/var.html] gives 404 error but I think the correct link is [http://uswest.ensembl.org/Homo_sapie...oadVariations]

Sorry about the broken link; we will endeavor to fix it.

The link at www.ensembl.org is working:

http://www.ensembl.org/info/website/upload/var.html

Try changing uswest to www (and go back to the UK site if it redirects you again!). The UploadVariations link you quote is not quite the one I was trying to point you to.

Cheers.

**amias** · 10-28-2014, 09:15 AM

Originally posted by Boel:

Hi kmcarr,

I'm looking into biopython, and there is some functionality there. Might cross over to BioPerl if I feel the need later on. Thanks a lot.

Hi Boel, could you share the Biopython functionality you used for converting the genomic coordinates to transcript coordinates? I have a GFF file in which I would like to convert the genomic coordinates of the UTR and CDS features to transcript coordinates, but I am having a hard time finding a script or function that can do this. Thanks!
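Whatever library ends up doing the conversion, the first step is usually to collect each transcript's exons from the annotation file. Below is a minimal plain-Python sketch (not Biopython, and not something posted in this thread), assuming a GFF3 file in which exon records carry a `Parent=` attribute naming their transcript; GTF files use `transcript_id "..."` instead and would need a different attribute parser. The function name `exons_by_transcript` and the demo records are hypothetical.

```python
from collections import defaultdict


def exons_by_transcript(gff_lines):
    """Map transcript ID -> (strand, sorted list of (start, end) exons).

    Expects GFF3 lines; coordinates stay 1-based inclusive as in the file.
    """
    exons = defaultdict(list)
    strands = {}
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # only exon records define the transcript structure
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # An exon shared by several isoforms lists several parents.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((int(cols[3]), int(cols[4])))
                strands[parent] = cols[6]
    return {tid: (strands[tid], sorted(ex)) for tid, ex in exons.items()}


# Made-up records for illustration.
demo = [
    "##gff-version 3",
    "chr1\tdemo\texon\t100\t199\t.\t+\t.\tID=e1;Parent=tx1",
    "chr1\tdemo\texon\t300\t399\t.\t+\t.\tID=e2;Parent=tx1,tx2",
    "chr1\tdemo\tCDS\t150\t199\t.\t+\t0\tID=c1;Parent=tx1",
]
print(exons_by_transcript(demo))
# {'tx1': ('+', [(100, 199), (300, 399)]), 'tx2': ('+', [(300, 399)])}
```

With real data, passing an open file handle instead of the `demo` list should work the same way, since the function only iterates over lines.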

**m_two** · 10-29-2014, 07:43 AM

Ensembl VEP is a good bet for custom annotation (fast, robust, reliable, and easily automated).

http://useast.ensembl.org/info/docs/tools/vep/script/vep_custom.html

http://useast.ensembl.org/info/docs/tools/vep/script/vep_cache.html

**amias** · 10-29-2014, 11:57 AM

Originally posted by m_two:

Ensembl VEP is a best bet for custom annotation (fast, robust, reliable, and easily automated)

http://useast.ensembl.org/info/docs/tools/vep/script/vep_custom.html

http://useast.ensembl.org/info/docs/...vep_cache.html

As far as I understand from the documentation, Ensembl VEP requires variant information as input. The sites I would like to convert are not SNP positions but miRNA target sites, so I could not use VEP for that conversion.
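For intervals such as target sites, rather than single variant positions, the same exon arithmetic can be applied to both ends of the interval. The following is a hedged sketch, not a tested tool: `interval_to_transcript` is a hypothetical name, exons are assumed to be non-overlapping 1-based inclusive (start, end) pairs for one transcript, and any intronic bases inside the interval are simply dropped.

```python
def interval_to_transcript(g_start, g_end, exons, strand):
    """Map genomic interval [g_start, g_end] to transcript coordinates.

    All coordinates are 1-based inclusive. Returns (t_start, t_end) covering
    the exonic part of the interval, or None if no base of it is exonic.
    """
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0  # exonic bases in exons already passed, in transcript order
    pieces = []
    for ex_start, ex_end in ordered:
        lo, hi = max(g_start, ex_start), min(g_end, ex_end)
        if lo <= hi:  # the interval overlaps this exon
            if strand == "+":
                first = offset + (lo - ex_start) + 1
                last = offset + (hi - ex_start) + 1
            else:
                # On the minus strand the exon's genomic end comes first.
                first = offset + (ex_end - hi) + 1
                last = offset + (ex_end - lo) + 1
            pieces.append((first, last))
        offset += ex_end - ex_start + 1
    if not pieces:
        return None
    # A contiguous genomic interval yields contiguous transcript pieces.
    return pieces[0][0], pieces[-1][1]


exons = [(100, 199), (300, 399)]  # made-up transcript
print(interval_to_transcript(190, 309, exons, "+"))  # (91, 110)
print(interval_to_transcript(350, 360, exons, "-"))  # (40, 50)
print(interval_to_transcript(210, 250, exons, "+"))  # None
```

The first example spans the intron: ten bases at the end of exon 1 and ten at the start of exon 2 become one 20-base stretch in the transcript.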

**SrCardgage** · 06-04-2015, 05:55 AM

You basically need to subtract the position of the transcription start site from the position of the variant. This info is in several places. The source I use is the UCSC Table Browser.

Table Browser

http://genome.ucsc.edu/cgi-bin/hgTables

The values for clade, genome, and assembly should be obvious.

Group: Genes and Gene Predictions
Track: RefSeq Genes
Table: refGene
Output format: all fields from selected table
Output file: refGene_human (or whatever your organism is)
File type returned: gzip (speeds up download a lot)

Unzip the file and either load it into an SQL table set up with the refGene schema (click the "describe table schema" button for info) or programmatically search the unzipped text file for your gene to pull its TSS.

If you don't know databases, searching the plain text will be faster in the short run. But if this is part of a major pipeline you will be running a lot, it would be worthwhile to become comfortable with a relational database system and embedding calls to that database inside your language of choice. That may sound like a major hurdle, but all the info you need is on the web. Message me if you need help finding resources to learn this.
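As a rough illustration of the plain-text route, here is a sketch that pulls the TSS and exon blocks for one gene from the downloaded file. It assumes the "all fields" refGene layout (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, ...), in which start coordinates are 0-based; the function names and the demo record are made up. Note that subtracting the TSS from a position only gives the transcript position inside the first exon; further downstream the intron lengths have to be removed too, which is why the exon blocks are returned as well.

```python
import gzip


def parse_refgene_line(line):
    """Parse one refGene row into a dict with 1-based inclusive coordinates."""
    cols = line.rstrip("\n").split("\t")
    strand = cols[3]
    tx_start, tx_end = int(cols[4]), int(cols[5])
    # exonStarts / exonEnds are comma-separated with a trailing comma.
    starts = [int(x) for x in cols[9].split(",") if x]
    ends = [int(x) for x in cols[10].split(",") if x]
    return {
        "name": cols[1],
        "gene": cols[12],
        "chrom": cols[2],
        "strand": strand,
        # UCSC starts are 0-based, ends are 1-based: shift the starts only.
        "tss": tx_start + 1 if strand == "+" else tx_end,
        "exons": [(s + 1, e) for s, e in zip(starts, ends)],
    }


def find_gene(path, gene):
    """Return all transcripts of `gene` from a gzipped refGene download."""
    with gzip.open(path, "rt") as handle:
        rows = (l for l in handle if not l.startswith("#"))  # skip header
        return [r for r in map(parse_refgene_line, rows) if r["gene"] == gene]


# Made-up record for illustration.
demo = ("0\tNM_DEMO\tchr1\t+\t29999\t30599\t30049\t30549\t2\t"
        "29999,30499,\t30099,30599,\t0\tDEMO\tcmpl\tcmpl\t0,1,")
rec = parse_refgene_line(demo)
print(rec["tss"], rec["exons"])  # 30000 [(30000, 30099), (30500, 30599)]
```

A call such as `find_gene("refGene_human.gz", "DEMO")` would then return one record per RefSeq transcript of that gene; the file name here simply follows the example output file name above with a `.gz` suffix assumed.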

Converting DNA position to transcript position