Hi All,
Does anyone have a script to pull out paired-end reads after a BLAST search? I built a separate BLAST database from each paired-end read file (R1 and R2), then used one gene as the query and searched each database with tabular output. What I'd like to do next is use the R1 hit names to pull out the matching R2 reads, use the R2 hit names to pull out the matching R1 reads, and then remove any redundancy.
I think that I can retrieve the names of the hit reads using:
cat gene_blast_output | cut -f 2 | sort -u > contig_ids.txt
But I'm not sure where to go from here.
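To make the goal concrete, here is a rough Python sketch of the steps described above, not a tested pipeline. It assumes several things that are not stated in the post: the reads are plain 4-line FASTQ, mate names differ at most by a trailing /1 or /2, and the hit read name is in column 2 of the tabular output. The function names and all file names are made up for illustration. Taking the union of the R1 and R2 hit names as a set handles the redundancy, and both mates are then written for every name in that set.

```python
import re

# Assumption: mates are distinguished only by a trailing "/1" or "/2".
_MATE_SUFFIX = re.compile(r"/[12]$")


def base_name(read_id):
    """Strip a trailing /1 or /2 so both mates share one name."""
    return _MATE_SUFFIX.sub("", read_id)


def hit_names(blast_tab_path):
    """Collect column 2 (subject ID) of a BLAST tabular file as base names."""
    names = set()
    with open(blast_tab_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # not a tabular hit line
            names.add(base_name(fields[1]))
    return names


def extract_reads(fastq_in, fastq_out, wanted):
    """Copy 4-line FASTQ records whose base name is in `wanted`; return the count."""
    kept = 0
    with open(fastq_in) as src, open(fastq_out, "w") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0].strip():
                break  # end of file (or trailing blank line)
            # Header is "@name optional description"; keep only the name.
            read_id = record[0][1:].split()[0]
            if base_name(read_id) in wanted:
                dst.writelines(record)
                kept += 1
    return kept


def extract_pairs(r1_blast, r2_blast, r1_fastq, r2_fastq, r1_out, r2_out):
    """Write both mates for every read hit in either BLAST search."""
    # The set union removes reads that were hit in both searches.
    wanted = hit_names(r1_blast) | hit_names(r2_blast)
    n1 = extract_reads(r1_fastq, r1_out, wanted)
    n2 = extract_reads(r2_fastq, r2_out, wanted)
    return n1, n2
```

With hypothetical file names, a call would look like extract_pairs("gene_vs_R1.tsv", "gene_vs_R2.tsv", "reads_R1.fastq", "reads_R2.fastq", "hits_R1.fastq", "hits_R2.fastq"). If the reads are FASTA rather than FASTQ, or the mates are named differently, the record parsing and base_name would need to change.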