I have a large number of gene models derived from EST data, so the coding sequence is correct but the introns are missing. I also have a set of 454 genomic sequence, enriched for these particular genes, which should let me locate the introns.
Obviously I can just map the raw 454 genomic reads to the gene models to see where the introns are, and I have done this. However, I don't really want to do it manually for each gene model.
I was thinking of deriving some sort of consensus sequence from the genomic reads mapped onto each gene model (used as the reference sequence), with a large tolerance for introducing gaps in the gene model (which would be the introns). I'm at a bit of a loss as to how to do this. Any suggestions on how to automate the process?
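To make the idea concrete, here is a rough Python sketch of one way the intron positions might be pulled out automatically. It assumes the genomic reads have already been aligned to the gene models with a local aligner that reports a leftmost position and a CIGAR string (as in SAM output). A read that runs from an exon into an intron should then show up as a long clip or insertion at the junction. The function names `breakpoints` and `candidate_introns` and the thresholds `min_len` and `min_reads` are made up for illustration, not taken from any existing tool.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def breakpoints(ref_start, cigar, min_len=20):
    """Yield gene-model positions where a read departs from the model.

    ref_start is the 0-based leftmost aligned position on the gene model.
    A clip or insertion of at least min_len bases is treated as the read
    running into intron sequence that the gene model lacks. The yielded
    position is the index of the first model base after the junction.
    """
    pos = ref_start
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "SHI" and length >= min_len:
            yield pos
        if op in "MDN=X":  # operations that consume reference bases
            pos += length

def candidate_introns(alignments, min_reads=3, min_len=20):
    """Return sorted positions supported by at least min_reads reads.

    alignments is an iterable of (ref_start, cigar) pairs for reads
    mapped to a single gene model.
    """
    counts = Counter()
    for ref_start, cigar in alignments:
        counts.update(breakpoints(ref_start, cigar, min_len))
    return sorted(p for p, n in counts.items() if n >= min_reads)

if __name__ == "__main__":
    # Toy alignments, invented purely to exercise the functions.
    reads = [(100, "50M30S"), (120, "30M40S"), (150, "35S60M"), (10, "80M")]
    print(candidate_introns(reads))
```

This only counts exact positions, so with real 454 reads the breakpoints would probably need to be clustered within a few bases of each other, and each gene model would be processed separately.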