Hello,
I am trying to build a custom database that includes all reference genomes of fungi, bacteria, archaea, viruses, and Homo sapiens.
I am facing a problem in the first step, creating the sequence ID to taxonomy ID map.
I got this message:
Found 10361157/19252012 targets, searched through 948300298 accession IDs..
lookup_accession_numbers: 8890855/19252012 accession numbers remain unmapped, see unmapped.txt in DB directory
Sequence ID to taxonomy ID map complete. [17m6.413s]
I tried these 3 commands separately:
$kraken2-build --build --db microbialDB/ --threads 3
$kraken-build --build --db microbialDB/ --kmer-len 100 --threads 3
$kraken2-build --build --threads 3 --fast-build --db microbialDB/ --kmer-len 100
But the same thing happens with all 3 commands. I always end up with unmapped sequences, which could affect my taxonomic classification later.
I also tried fix_unmapped.py from KrakenTools, but it does not change anything:
$python3 fix_unmapped.py -i unmapped.txt --accession2taxid microbialDB/taxonomy/nucl_gb.accession2taxid -o output.map -r still_unmapped.txt
$python3 fix_unmapped.py -i unmapped.txt --accession2taxid microbialDB/taxonomy/nucl_wgs.accession2taxid -o output.map -r still_unmapped.txt
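To narrow this down, a check like the one below might show whether the unmapped accessions are present in a given accession2taxid file at all. This is only a minimal sketch: I am assuming that unmapped.txt holds one accession per line, and that the accession2taxid file is the tab-separated NCBI format with a header row (accession, accession.version, taxid, gi). The function name count_mapped is made up for illustration and is not part of Kraken or KrakenTools.

```python
import csv


def count_mapped(unmapped_path, accession2taxid_path):
    """Return (found, total): how many unmapped accessions occur in the map file."""
    # Assumption: unmapped.txt lists one accession per line (blank lines ignored).
    with open(unmapped_path) as handle:
        wanted = {line.strip() for line in handle if line.strip()}

    found = set()
    with open(accession2taxid_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 3:
                continue  # skip malformed or truncated lines
            # Assumption: columns 0 and 1 are accession and accession.version,
            # so an entry matches with or without its version suffix.
            for key in row[:2]:
                if key in wanted:
                    found.add(key)
    return len(found), len(wanted)
```

Calling count_mapped("unmapped.txt", "microbialDB/taxonomy/nucl_gb.accession2taxid") returns a pair (found, total). If found stays at or near zero for every accession2taxid file, the accessions are probably missing from those files rather than being skipped by the lookup.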
Does anyone have an idea how to solve the problem?
Thanks in advance