How to parse blastn output

Sufia


02-27-2024, 10:37 AM

I have aligned my shotgun metagenomic reads against the NCBI eukaryotic reference database with blastn to assess diet from fecal samples of black bears. The blastn output is in tabular format (outfmt 6). I am currently trying to see whether PCR bias or PCR duplicates are influencing our results. I want to see whether the ratio of unique subjects to unique queries differs between enriched and non-enriched samples (a difference might indicate that the enrichment process, rather than the PCR, is what changes the results). So I extracted the numbers of unique queries and unique subject sequences using the following commands:

for i in $(ls blastn_out_nt/); do cut -f 1 blastn_out_nt/$i | sort | uniq | wc -l >> query; done

for i in $(ls blastn_out_nt/); do sort -k2,2 blastn_out_nt/$i | cut -f 2,9,10 | uniq | wc -l >> unique_subjects; done

I should mention that the first column of the blastn output is the query ID, the second column is the subject ID, and the 9th and 10th columns are the start and end of the alignment in the subject. I would like to verify that this approach is error-free and also get an idea of what explains the pattern.
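One detail that may be worth checking when counting this way: uniq only collapses identical lines that are adjacent, so the sort has to cover every field being compared. A sort keyed on column 2 alone can leave identical (subject, start, end) triples separated and inflate the count. The following is only a minimal sketch of a cross-check, not a definitive pipeline. It assumes the default outfmt 6 column order described above; the function name count_unique and the output file unique_counts.tsv are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: per-file counts of unique queries and unique subject intervals
# from BLAST tabular output (outfmt 6), plus the subjects/queries ratio.
# Assumed columns: 1 = query ID, 2 = subject ID, 9 = subject start, 10 = subject end.

count_unique() {
    local file=$1
    local queries subjects
    # Extract the fields first, then sort -u on the whole extracted line,
    # so every duplicate ends up adjacent and is collapsed.
    queries=$(cut -f 1 "$file" | sort -u | wc -l)
    subjects=$(cut -f 2,9,10 "$file" | sort -u | wc -l)
    # awk does the floating-point division; empty files give NA.
    awk -v f="$(basename "$file")" -v q="$queries" -v s="$subjects" \
        'BEGIN { r = (q > 0) ? s / q : "NA"; printf "%s\t%d\t%d\t%s\n", f, q, s, r }'
}

# Hypothetical usage: one row per file in the directory used above.
# Output columns: file name, unique queries, unique subject intervals, ratio.
if [ -d blastn_out_nt ]; then
    for f in blastn_out_nt/*; do
        [ -f "$f" ] && count_unique "$f"
    done > unique_counts.tsv
fi
```

Keeping the file name on each row avoids having to match two separate count files line by line, and the glob avoids parsing the output of ls.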

Here's what I have got:

Does a smaller ratio of unique subjects to unique queries potentially indicate that the input FASTA sequences were redundant (PCR duplicates), because they hit the same database entry? Also, how should I explain these figures?