Hello,
I have a list of mRNA NM_ accession numbers.
From the UCSC hg19 refGene table, I can get the exon and CDS coordinates for every NM_ entry.
However, when I pull a subsequence out of hg19 using the refGene coordinates, the result does not look correct for transcripts on the reverse strand. Taking the reverse complement of the extracted exons doesn't work either.
-------
example:
I have NM_012345.3.
From UCSC I know that for NM_012345 the first CDS block is between 50000 and 50100 on chr1, strand "-".
Then I use:
Code:
samtools faidx /path/hg19.fa chr1:50000-50100
The result doesn't start with ATG, although it should.
Where is the problem? I know that UCSC doesn't use the version suffix (NM_012345 instead of NM_012345.3), but it should still work.
(hg19 is downloaded from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/)
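For reference, here is a minimal sketch of the coordinate handling involved, not a definitive implementation. It assumes that refGene start coordinates are 0-based and end coordinates are exclusive (the convention UCSC uses in its tables), while samtools faidx regions are 1-based and inclusive. It also assumes that for a "-" strand transcript the start codon lies at the cdsEnd side, so the CDS blocks are joined in genomic order first and the whole thing is reverse-complemented once. All function names and the `fetch` callable are hypothetical and only serve the illustration.

```python
# Hypothetical helpers; names are illustrative, not part of any library.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def refgene_to_region(chrom, start0, end0):
    """Turn a 0-based, half-open interval into a 1-based, inclusive region string."""
    return f"{chrom}:{start0 + 1}-{end0}"


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def cds_blocks(exon_starts, exon_ends, cds_start, cds_end):
    """Clip each exon to the CDS interval and drop exons that are fully UTR."""
    blocks = []
    for start, end in zip(exon_starts, exon_ends):
        start, end = max(start, cds_start), min(end, cds_end)
        if start < end:
            blocks.append((start, end))
    return blocks


def cds_sequence(fetch, chrom, strand, exon_starts, exon_ends, cds_start, cds_end):
    """Join the CDS blocks in genomic order; reverse-complement once for '-'.

    `fetch(chrom, start0, end0)` is an assumed callable that returns the
    plus-strand sequence for a 0-based, half-open interval.
    """
    blocks = cds_blocks(exon_starts, exon_ends, cds_start, cds_end)
    seq = "".join(fetch(chrom, start, end) for start, end in blocks)
    return reverse_complement(seq) if strand == "-" else seq


# Toy genome: a minus-strand gene whose CDS is split over two exons.
toy_genome = {"chr1": "AA" + "CTATT" + "GGGG" + "TCAT" + "CC"}


def toy_fetch(chrom, start0, end0):
    return toy_genome[chrom][start0:end0]


toy_cds = cds_sequence(toy_fetch, "chr1", "-", [2, 11], [7, 15], 2, 15)
```

With real data, `fetch` could wrap a call to samtools faidx using the string from `refgene_to_region`. Under the assumptions above, the block given as 50000:50100 would be requested as chr1:50001-50100, and on the "-" strand the ATG would be expected at the end of the last CDS block in genomic order, not at the start of the first one.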