I saw that there are several kinds of files on the Ensembl FTP site:
ftp://ftp.ensembl.org/pub/release-76...lanoleuca/dna/
* 'dna' - unmasked genomic DNA sequences.
* 'dna_rm' - masked genomic DNA. Interspersed repeats and low
complexity regions are detected with the RepeatMasker tool and masked
by replacing repeats with 'N's.
* 'dna_sm' - soft-masked genomic DNA. All repeats and low complexity regions
have been replaced with lowercased versions of their nucleic base
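
To make the difference between the three flavours concrete, here is a minimal sketch that counts upper-case, lower-case and 'N' bases in FASTA text. The function name `masking_summary` and the toy record are made up for illustration, and it assumes plain (already decompressed) FASTA input. Note that 'N' is also used for assembly gaps, so some 'N's can show up even in the unmasked 'dna' files.

```python
def masking_summary(fasta_text):
    """Count upper-case, lower-case (soft-masked) and N (hard-masked or gap) bases."""
    counts = {"upper": 0, "lower": 0, "n": 0}
    for line in fasta_text.splitlines():
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA header lines
        for base in line.strip():
            if base in "Nn":
                counts["n"] += 1      # hard-masked repeat or assembly gap
            elif base.islower():
                counts["lower"] += 1  # soft-masked repeat
            else:
                counts["upper"] += 1  # ordinary, unmasked base
    return counts


# Hypothetical toy record: 8 unmasked, 4 soft-masked, 4 N bases
toy = ">toy_record\nACGTacgtNNNNACGT\n"
print(masking_summary(toy))  # {'upper': 8, 'lower': 4, 'n': 4}
```

Run on the first few thousand lines of each download, a 'dna_sm' file should show a sizeable lower-case count and a 'dna_rm' file a large 'N' count, while the 'dna' file should show little of either.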
I still don't know which file I should download to format a BLAST database. Thanks!