Hello.
We have RNA-seq data from a legume crop. Neither a reference genome nor EST data is available.
We assembled the data with MIRA using the highly repetitive option, since the species is very closely related to soybean (we assumed it would also be highly repetitive, like all legumes).
However, the percentage of SSRs calculated with the MISA microsatellite tool appears to be really low: 4.38%.
What is the standard method to determine the repetitiveness of a genome from RNA-seq data?
During previous runs, MIRA warned about repetitiveness and about exceeding the megahub ratio, as is normal for highly repetitive legumes.
Is it safe to assume that the genome is highly repetitive and go ahead with the assembly using the --highlyrepetitive option?
Thank you.