Hi all,
I try to fix the following issue:
I have a BLAST output from the command line, using PacBio data as queries and a custom database. I have some filtering criteria for my hits, e.g. >db_1 should at least cover 80bp, while >db_2 should cover at least 1000bp.
In the end I'd like to have a graphical representation of my queries and the corresponding (filtered) hits.
A refinement would be, if the alignment of db_2 is in total longer than the alignment of db_3, db_3 should be discarded...
I tried to use Bio::Search::Result::ResultI first. I ended up with a very ugly script, which works and processes my custom blast tab, but I even failed to visualize my results.
Next thing I gave a try was Bio::SearchIO::Writer::HTMLResultWriter;
The problem is, the syntax doesn't work. Of cause, because I mixed up hits and hsp. Is it possible to combine hit and hsp filters?
I know, that this question is very specific. But maybe anybody struggles on similar issues parsing blast outputs, trying to filter them somehow and visualize the results. I hope I could make my problem clear and maybe anybody could give me a suggestion how to solve it.
Thanks in advance!
ps: it doesn't matter whether I parse xml, pure blast or tables, just want to get it to run
I try to fix the following issue:
I have a BLAST output from the command line, using PacBio data as queries and a custom database. I have some filtering criteria for my hits, e.g. >db_1 should at least cover 80bp, while >db_2 should cover at least 1000bp.
In the end I'd like to have a graphical representation of my queries and the corresponding (filtered) hits.
A refinement would be, if the alignment of db_2 is in total longer than the alignment of db_3, db_3 should be discarded...
I tried to use Bio::Search::Result::ResultI first. I ended up with a very ugly script, which works and processes my custom blast tab, but I even failed to visualize my results.
Code:
while( my $result = $in->next_result ) {
## $result is a Bio::Search::Result::ResultI compliant object
while( my $hit = $result->next_hit ) {
## $hit is a Bio::Search::Hit::HitI compliant object
while( my $hsp = $hit->next_hsp ) {
## $hsp is a Bio::Search::HSP::HSPI compliant object
if( $hit->name =~ m/someexpression/ ) {
if($hit->name =~ m/db_2/ && $hsp->length('hit')>=90){
my $aln = $hsp->get_aln;
my $alnIO = Bio::AlignIO->new(-format =>"msf", -file => ">hsp.msf");
$alnIO->write_aln($aln);
print $result->query_name, "\t",
$hit->name, "\t",
$hsp->length('total'), "\t",
$hsp->percent_identity, "\t",
$hsp->start('query'), "\t",
$hsp->end('query'), "\t",
$hsp->start('hit'), "\t",
$hsp->end('hit'), "\t",
$alnIO, "\n";
}
elsif($hit->name =~ m/db_2/ && $hsp->length('hit')>=80){
my $aln = $hsp->get_aln;
my $alnIO = Bio::AlignIO->new(-format =>"msf", -file => ">hsp.msf");
$alnIO->write_aln($aln);
print $result->query_name, "\t",
$hit->name, "\t",
$hsp->length('total'), "\t",
$hsp->percent_identity, "\t",
...
Code:
my $in = new Bio::SearchIO( -format => 'blast',
-file => $ARGV[0]);
sub hsp_filter {
my $hsp = shift;
return 1 if ($hit->name =~ m/db_1/ && $hsp->length('hit')>=70);
return 1 if ($hit->name =~ m/db_2/ && $hsp->length('hit')>=65);
...
}
my $writer = Bio::SearchIO::Writer::HTMLResultWriter ->new (-filters => {'HSP' => \&hsp_filter});
my $out = Bio::SearchIO ->new (-writer =>$writer);
$out ->write_result($in->next_result);
I know, that this question is very specific. But maybe anybody struggles on similar issues parsing blast outputs, trying to filter them somehow and visualize the results. I hope I could make my problem clear and maybe anybody could give me a suggestion how to solve it.

Thanks in advance!
ps: it doesn't matter whether I parse xml, pure blast or tables, just want to get it to run
Comment