Kraken2 - unmapped sequences during the first step of building the database

    Hello,
    I am trying to build a custom database that includes all reference genomes of fungi, bacteria, archaea, viruses, and Homo sapiens.

    I am running into a problem at the first step, "Creating sequence ID to taxonomy ID map".
    I get the following message:

    Found 10361157/19252012 targets, searched through 948300298 accession IDs..
    lookup_accession_numbers: 8890855/19252012 accession numbers remain unmapped, see unmapped.txt in DB directory
    Sequence ID to taxonomy ID map complete. [17m6.413s]
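
    To put the message above in proportion, here is a minimal sketch that pulls the two counts out of the `lookup_accession_numbers` line and reports the unmapped share. The function name `unmapped_fraction` and the regular expression are illustrative assumptions, not part of Kraken2; the log line is copied from the output above.

```python
import re

# Hypothetical helper: matches the "N/M accession numbers remain unmapped"
# part of the log line quoted in this post.
UNMAPPED_RE = re.compile(r"(\d+)/(\d+) accession numbers remain unmapped")


def unmapped_fraction(log_text):
    """Return (unmapped, total, fraction), or None if the line is absent."""
    match = UNMAPPED_RE.search(log_text)
    if match is None:
        return None
    unmapped, total = int(match.group(1)), int(match.group(2))
    if total == 0:
        return None
    return unmapped, total, unmapped / total


log_line = (
    "lookup_accession_numbers: 8890855/19252012 accession numbers "
    "remain unmapped, see unmapped.txt in DB directory"
)
result = unmapped_fraction(log_line)
if result is not None:
    unmapped, total, fraction = result
    print(f"{unmapped} of {total} accessions unmapped ({fraction:.1%})")
```

    For the numbers in this post, that is a little under half of all sequences (about 46%), and the found and unmapped counts add up to the total.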
    I tried these three commands separately:

    $kraken2-build --build --db microbialDB/ --threads 3
    $kraken-build --build --db microbialDB/ --kmer-len 100 --threads 3
    $kraken2-build --build --threads 3 --fast-build --db microbialDB/ --kmer-len 100
    But with all three commands the same thing happens: I always end up with unmapped sequences, which could affect my taxonomic classification afterwards.

    I also tried fix_unmapped.py from KrakenTools, but it did not change anything:

    $python3 fix_unmapped.py -i unmapped.txt --accession2taxid microbialDB/taxonomy/nucl_gb.accession2taxid -o output.map -r still_unmapped.txt
    $python3 fix_unmapped.py -i unmapped.txt --accession2taxid microbialDB/taxonomy/nucl_wgs.accession2taxid -o output.map -r still_unmapped.txt
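
    As a cross-check independent of fix_unmapped.py, here is a minimal sketch that looks up the unmapped accessions in several accession2taxid tables in one pass. It assumes unmapped.txt holds one accession per line and that the tables are uncompressed, tab-separated NCBI files with a header row and the columns accession, accession.version, taxid, gi. The function name `resolve_unmapped` is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def resolve_unmapped(
    unmapped: Iterable[str], map_paths: Iterable[str]
) -> Tuple[Dict[str, str], Set[str]]:
    """Look up accessions in accession2taxid tables.

    Returns (found, remaining): found maps accession -> taxid, remaining
    is the set of accessions that no table contained.
    """
    # Drop blank lines and surrounding whitespace from the input IDs.
    remaining = {acc.strip() for acc in unmapped if acc.strip()}
    found: Dict[str, str] = {}
    for path in map_paths:
        if not remaining:
            break  # everything resolved, no need to read further tables
        with open(path) as handle:
            handle.readline()  # skip the header row (assumed present)
            for line in handle:
                fields = line.rstrip("\n").split("\t")
                if len(fields) < 3:
                    continue  # skip malformed rows
                accession, versioned, taxid = fields[0], fields[1], fields[2]
                # Accept either the versioned or the bare accession.
                for key in (versioned, accession):
                    if key in remaining:
                        found[key] = taxid
                        remaining.discard(key)
    return found, remaining
```

    It could be called with the lines of unmapped.txt and the two files used above (microbialDB/taxonomy/nucl_gb.accession2taxid and microbialDB/taxonomy/nucl_wgs.accession2taxid). Whatever is still in `remaining` afterwards is absent from both tables, so the IDs would have to come from another source.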


    Does anyone have an idea how to solve this problem?
    Thanks in advance.
