Problem with blastdbcmd. “Error: [blastdbcmd] Skipped TRANSCRIPT_4865”

arredondoea


03-06-2024, 12:32 PM

Hello, I am looking for help with running a Reciprocal BLAST. I cannot query any of my transcripts using blastdbcmd because I get this error message: “Error: [blastdbcmd] Skipped TRANSCRIPT_4865”. I appreciate any insight that can help resolve this issue!

I am using high-quality FASTA files containing the compiled transcripts of the subspecies I am working with. For context, I ran LORDEC on my transcript files to reduce redundancy. Now I am running a reciprocal BLAST to identify comparable contigs between the subspecies.

This is the script I am using for the Reciprocal BLAST (macOS):

#Making a database

makeblastdb -in ~/Desktop/tegula_blast/Hq_transcripts_2.fa -dbtype 'nucl' -out ~/Desktop/tegula_blast/Tfunebralis_DB -parse_seqids

#Output message

Building a new DB, current time: 03/06/2024 11:23:46

New DB name: /Users/lanigleason/Desktop/tegula_blast/Tfunebralis_DB

New DB title: /Users/lanigleason/Desktop/tegula_blast/Hq_transcripts_2.fa

Sequence type: Nucleotide

Keep MBits: T

Maximum file size: 3000000000B

Adding sequences from FASTA; added 99388 sequences in 2.75653 seconds.

#Blast for all possible pairwise comparisons

blastn -query ~/Desktop/tegula_blast/Hq_transcripts_1.fa -db ~/Desktop/tegula_blast/Tfunebralis_DB -out ~/Desktop/tegula_blast/eiseni_to_funebralis.txt -evalue 1E-20 -outfmt 6 -max_target_seqs 1

#Output message

Warning: [blastn] Examining 5 or more matches is recommended

(See attached photo for output file)

#Retrieve subset of assembly using blastdbcmd

Here, I took the queried transcripts that had matches in the previous blastn output (eiseni_to_funebralis.txt) and made a list of the transcript names (eiseni_names.txt).

One observation: the same transcript is listed on multiple lines of the blastn output file. My original FASTA file of transcripts does not contain multiple copies, so this has to be a product of the blastn command. I am wondering whether this could be the reason for the error in the next step.
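A minimal sketch of building the name list without repeats, in case it helps. In the default -outfmt 6 layout, column 1 is the query ID and column 2 is the subject ID, and the same query can appear on several lines even with -max_target_seqs 1 when it aligns to its subject in more than one segment. The function name unique_ids is hypothetical, and which column to use is an assumption that depends on which database the names will be looked up in.

```shell
#!/bin/sh
# unique_ids FILE COLUMN   (hypothetical helper, not part of BLAST+)
# Print each distinct value found in COLUMN of a tab-separated BLAST
# report (-outfmt 6), keeping the order of first appearance.
unique_ids() {
    awk -F '\t' -v col="$2" '!seen[$col]++ { print $col }' "$1"
}

# Example with the paths from this post (run after blastn has finished):
# unique_ids ~/Desktop/tegula_blast/eiseni_to_funebralis.txt 1 \
#     > ~/Desktop/tegula_blast/eiseni_names.txt
```

Passing 2 instead of 1 would list the subject IDs, which are the names stored in the database that blastn searched.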

blastdbcmd -db ~/Desktop/tegula_blast/Tfunebralis_DB -dbtype 'nucl' -entry_batch ~/Desktop/tegula_blast/eiseni_names.txt -out ~/Desktop/tegula_blast/eiseni_to_funebralis.subset.reciprocal.fasta

#Output message

Error: [blastdbcmd] Skipped transcript_4866

Error: [blastdbcmd] Skipped transcript_9439

Error: [blastdbcmd] Skipped transcript_9439

Error: [blastdbcmd] Skipped transcript_13586

Error: [blastdbcmd] Skipped transcript_17909

Error: [blastdbcmd] Skipped transcript_38053

Error: [blastdbcmd] Skipped transcript_38053

Error: [blastdbcmd] Skipped transcript_22088

Error: [blastdbcmd] Skipped transcript_34418

Error: [blastdbcmd] Skipped transcript_45393

Error: [blastdbcmd] Skipped transcript_45393

Error: [blastdbcmd] Skipped transcript_13587

Error: [blastdbcmd] Skipped transcript_13587

Error: [blastdbcmd] Skipped transcript_13587

etc…

Most of the transcripts are skipped in this step. As seen, all duplicates of the same transcript are skipped too.

I have tried using a smaller subset of transcripts and removing the duplicate matches before running the reciprocal BLAST, but I encounter the same errors.

I also tried looking up database identifiers that I could be missing but I am not sure how to use these identifiers to query with blastdbcmd.

blastdbcmd -db ~/Desktop/tegula_blast/Tfunebralis_DB -dbtype 'nucl' -entry all -out ~/Desktop/tegula_blast/eiseni_to_funebralis.subset.reciprocal.all.fasta -outfmt "OID: %o GI: %g ACC: %a IDENTIFIER: %i"

#Example output
OID: 0 GI: N/A ACC: transcript_3756 IDENTIFIER: lcl|transcript_3756
OID: 1 GI: N/A ACC: transcript_9791 IDENTIFIER: lcl|transcript_9791
OID: 2 GI: N/A ACC: transcript_7816 IDENTIFIER: lcl|transcript_7816
OID: 3 GI: N/A ACC: transcript_1853 IDENTIFIER: lcl|transcript_1853
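A possible way to use these identifiers is to dump every accession in the database (the %a field shown above) and compare that dump with the name list, which shows exactly which names blastdbcmd cannot find. This is a sketch, not a confirmed diagnosis: the helper name missing_ids and the file db_ids.txt are hypothetical, and it assumes one name per line with no trailing whitespace.

```shell
#!/bin/sh
# missing_ids NAMES DB_IDS   (hypothetical helper, not part of BLAST+)
# Print every line of NAMES that has no exact, whole-line match in DB_IDS.
# -F treats the IDs as fixed strings, -x requires whole-line matches
# (so transcript_1 does not match transcript_13), -v inverts the selection.
missing_ids() {
    grep -F -x -v -f "$2" "$1"
}

# Example with the paths from this post (db_ids.txt is an assumed file name):
# blastdbcmd -db ~/Desktop/tegula_blast/Tfunebralis_DB -dbtype nucl \
#     -entry all -outfmt "%a" > db_ids.txt
# missing_ids ~/Desktop/tegula_blast/eiseni_names.txt db_ids.txt
```

If the printed names match the ones reported as skipped, the list probably contains IDs that are not in this database, for example IDs taken from the query file rather than from the file used to build the database.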

Please let me know of any ways to reduce the number of transcripts skipped by blastdbcmd. Any recommended parameter changes or other explanations are welcome. Thank you!

Attached Files

blastn output.png (608.7 KB, 29 views)

Last edited by arredondoea; 03-06-2024, 12:43 PM.
Tags: bioinformatics, reciprocal blast hits, rna-seq
