BBSplit looks great, thanks! Can't believe I hadn't seen it before; I'll try it out.
Cheers
N
Blasting contigs against reference database
-
This is a classic case for using BBSplit (http://seqanswers.com/forums/showthread.php?t=41288). Use the cyanobacterial genome(s) as the reference and the reads will be binned automatically. If you need help with the actual command line let us know.
-
Bowtie2 looks useful, certainly. However, wouldn't this only keep reads that mapped directly to the reference genome, losing some good reads from my genome of interest? I was going along the lines of assembling contigs first and searching within them for matches to the reference.
-
Apologies if this has been covered elsewhere, couldn't find a satisfactory answer easily....
The problem: I have HiSeq 2500 PE reads from a microbial culture that contains ONE cyanobacterial genome of interest and several contaminating genomes. My understanding is that by blasting against a local reference database containing only cyanobacterial genomes, I could bin my contigs into those which contain any cyanobacterial genes and those which do not.
Further analysis of G-C content and tetranucleotide frequencies could then be used to eliminate chimeric contigs, leaving me with a draft genome.
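For the GC content and tetranucleotide step, a minimal sketch in plain Python might look like the following. The function names are made up for illustration, and how the resulting profiles are compared between contigs (clustering, distance cutoffs, etc.) is left open, since that depends on the data.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous (A/C/G/T) bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # nothing but N's or an empty sequence
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_freqs(seq):
    """Relative frequency of each of the 256 possible 4-mers in a sequence.

    Windows containing anything other than A/C/G/T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}
```

Contigs whose GC content or 4-mer profile sits far from the bulk of the 'keep' pile would be the ones to inspect as possible chimeras or contaminants.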
Could anybody point me in the direction of resources to help me write a BLAST script to perform this task, maybe using Biopython (I have just started learning Python)? I don't need long stretches of sequence to align; just the presence of a single gene with a good match in a whole contig would be enough to put it in the 'keep' pile.
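To make the 'keep' pile idea concrete, here is a rough sketch of the binning logic, assuming the contigs have already been searched against the cyanobacterial database and the results saved in BLAST+ tabular format (`-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names and the thresholds are placeholders, not recommended values.

```python
def contigs_with_hits(blast_lines, max_evalue=1e-10, min_identity=70.0, min_length=100):
    """Return the set of query (contig) IDs with at least one hit passing the cutoffs.

    blast_lines: iterable of tab-separated lines in default BLAST+ outfmt 6 order.
    The default thresholds are arbitrary placeholders and should be tuned.
    """
    keep = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid = fields[0]
        pident = float(fields[2])   # percent identity
        length = int(fields[3])     # alignment length
        evalue = float(fields[10])  # expect value
        if evalue <= max_evalue and pident >= min_identity and length >= min_length:
            keep.add(qseqid)
    return keep


def bin_contigs(all_contig_ids, keep_ids):
    """Split contig IDs into (keep, discard) lists, preserving input order."""
    keep = [c for c in all_contig_ids if c in keep_ids]
    discard = [c for c in all_contig_ids if c not in keep_ids]
    return keep, discard
```

In practice `blast_lines` could simply be an open file handle on the tabular output, and `all_contig_ids` the sequence IDs read from the assembly FASTA. A single passing hit is enough to put a contig in the keep list, which matches the criterion described above.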
I'm new to bioinformatics and essentially teaching myself so any pointers much appreciated...
Cheers
Nathan