Hi
I have built a repeat library for my metazoan genome using RepeatModeler and would like to know how best to proceed with it for gene prediction in MAKER. Some people say to use the consensi.fa.classified file as is; others apply filtering steps to remove library sequences that derive from real genes containing repetitive elements, so that those genes are not masked.
What I have done so far is analyse the library with TEclass and remove all entries that are classified as "unknown" by RepeatModeler AND "unclear" by TEclass. I have also run blastx and removed the sequences that contained part of, or a full, conserved eukaryotic domain and had positive hits from multiple species.
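To make the filtering concrete, here is a minimal Python sketch of the logic I applied. It is not the exact script I ran. It assumes the library headers look like `>family#Class` (with the RepeatModeler class after the `#`), and that the TEclass calls and the blastx results have already been parsed into a dict and a set. The names `filter_library`, `teclass_calls` and `protein_hit_ids` are made up for illustration, and treating a family with no TEclass call as "unclear" is my own assumption.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_library(fasta_text, teclass_calls, protein_hit_ids):
    """Return the (header, sequence) pairs that survive both filters.

    teclass_calls:   hypothetical dict, family name -> TEclass call
    protein_hit_ids: hypothetical set of family names flagged by blastx
                     (conserved domain, hits in multiple species)
    """
    kept = []
    for header, seq in parse_fasta(fasta_text):
        # Assumed header layout: "family#Class optional description"
        seq_id = header.split()[0]
        family, _, rm_class = seq_id.partition("#")

        # Assumption: a family with no TEclass call counts as "unclear".
        te_call = teclass_calls.get(family, "unclear")

        # Filter 1: drop only if BOTH tools failed to classify the entry.
        unclassified = (rm_class.lower() == "unknown"
                        and te_call.lower() == "unclear")
        # Filter 2: drop entries that look like protein-coding genes.
        gene_like = family in protein_hit_ids

        if not (unclassified or gene_like):
            kept.append((header, seq))
    return kept


if __name__ == "__main__":
    demo = ">fam1#Unknown\nACGT\n>fam2#LTR/Gypsy\nGGCC\n"
    for header, seq in filter_library(demo, {"fam1": "unclear"}, set()):
        print(f">{header}\n{seq}")
```

Note that an entry that is "Unknown" in RepeatModeler but gets a class from TEclass is kept; only entries unclassified by both tools are dropped.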
Does this make sense?