Hello Seqanswers,
I need some help deciding which tool (preferably web-based) is best for finding very distantly related homologous genes. I have a set of 100 human genes (all with nucleotide/protein RefSeqs) and I’m trying to find homologs in the model plant Arabidopsis (I know, I know, but bear with me please). I realize that there’s going to be extreme sequence divergence between homologs, if they even exist in the first place, but I’m willing to go on a wild goose chase if it means even a decent chance of finding a small subset of conserved genes.
Right now I’m just trying a few things with the first five accessions of interest. tblastx with a lenient matrix (BLOSUM45, and now trying PAM250) did a good job of finding some Arabidopsis genes with the same functional domain (e.g. zinc finger TF hits), but I’m learning that this isn’t necessarily the best approach when the species diverged well above the family/order level. NCBI’s help page recommends PSI-BLAST for distantly related proteins, so I’m going to try that next.
I’ve looked around SEQanswers, Biostars, and random pages via Google, but I’ve mostly turned up threads looking for faster BLAST alternatives, and speed isn’t really a concern for me. At some point I was led to a page describing HMMER, which I’ve heard about but never used, and which apparently now has a dedicated server at EBI. Is this likely to be a helpful resource for me?
Would walking up and down the phylogenetic tree be a worthwhile endeavor? I.e., searching for homologs piecemeal in well-documented model organisms between human and Arabidopsis? Also, I’ve confirmed that none of my genes of interest have a KOG/CEG designation, which would have been awesome.
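To make the piecemeal idea concrete, here is a minimal Python sketch of what I have in mind: hits found in one species become the queries for the next species along the path. Everything in it is an assumption for illustration only. `walk_tree`, `toy_search`, the species labels, and the gene IDs are all made up, and `toy_search` is just a lookup table standing in for whatever real search (PSI-BLAST, HMMER, etc.) would be run at each step.

```python
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for a real search tool: takes a gene ID and a
# target species label, returns the IDs of hits found in that species.
SearchFn = Callable[[str, str], List[str]]


def walk_tree(query: str, species_path: List[str], search: SearchFn) -> Dict[str, Set[str]]:
    """Chain searches through intermediate species.

    Hits found in one species become the queries for the next species
    on the path. Returns the set of hits collected at each step.
    """
    found: Dict[str, Set[str]] = {}
    queries = {query}
    for species in species_path:
        hits: Set[str] = set()
        for gene in sorted(queries):
            hits.update(search(gene, species))
        found[species] = hits
        if not hits:
            break  # trail went cold; later species are not searched
        queries = hits
    return found


# Toy lookup table with made-up gene IDs, only to show the control flow.
TOY_HITS = {
    ("human_gene", "intermediate_1"): ["int1_gene"],
    ("int1_gene", "intermediate_2"): ["int2_gene_a", "int2_gene_b"],
    ("int2_gene_a", "arabidopsis"): ["At_candidate"],
}


def toy_search(gene: str, species: str) -> List[str]:
    return TOY_HITS.get((gene, species), [])


path = ["intermediate_1", "intermediate_2", "arabidopsis"]
result = walk_tree("human_gene", path, toy_search)
print(result["arabidopsis"])
```

My worry with this kind of chaining is that each hop could drift onto a different shared domain, so I assume any Arabidopsis candidate would still need to be checked back against the original human protein.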
Thanks for any guidance,
NYGen