Comparing large sets of sequences

Suppose you have two large sequences, or two sets of sequences, that you want to compare for matching entries. For example, you have sequenced some ancient bone and want to check it for bacterial contamination.

For simplicity, assume you have two sets of 1000 nucleotide sequences of length 1000, 1 GB per set, that you want to compare against each other to find the best pairs of matching sequences or subsequences.

Sounds like a standard problem, doesn't it?

How is it done? What is the best, fastest method?
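
To make the question concrete, here is a minimal sketch of the kind of comparison I mean. It assumes exact shared k-mers are a reasonable first proxy for "matching subsequences", which is my assumption and not necessarily the best or fastest method. The function names (`build_index`, `shared_kmer_counts`, `best_pairs`) and the default `k=11` are made up for illustration. Each set is a dict mapping a sequence id to its nucleotide string.

```python
from collections import defaultdict

def kmers(seq, k):
    """Yield (offset, k-mer) for every window of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]

def build_index(seqs, k):
    """Map each k-mer to the list of (sequence id, offset) where it occurs."""
    index = defaultdict(list)
    for sid, seq in seqs.items():
        for off, kmer in kmers(seq, k):
            index[kmer].append((sid, off))
    return index

def shared_kmer_counts(set_a, set_b, k=11):
    """Count exact k-mer hits for every (id in A, id in B) pair.

    Only set A is indexed; set B is streamed against the index,
    so pairs with no k-mer in common are never touched.
    """
    index = build_index(set_a, k)
    counts = defaultdict(int)
    for bid, seq in set_b.items():
        for _, kmer in kmers(seq, k):
            for aid, _ in index.get(kmer, ()):
                counts[(aid, bid)] += 1
    return counts

def best_pairs(set_a, set_b, k=11, top=5):
    """Return the `top` pairs with the most shared k-mer hits."""
    counts = shared_kmer_counts(set_a, set_b, k)
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top]

if __name__ == "__main__":
    # Toy data, hypothetical: real input would be read from FASTA files.
    bone = {"a1": "ACGTACGTTAGC", "a2": "GGGGGGGGCCCC"}
    bacteria = {"b1": "TTACGTACGTTA", "b2": "CCCCAAAATTTT"}
    for pair, hits in best_pairs(bone, bacteria, k=4):
        print(pair, hits)
```

This only counts exact seed hits. It ignores mismatches, indels and reverse complements, and the index has to fit in memory, so it is a baseline to state the problem rather than an answer to it.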

## Comment