I have a set of contigs, de novo assembled, from a new bacterial strain. What ways are there to identify which contigs come from plasmids? I have done the following:
1) Used Mauve to order contigs to the core chromosome of a reference genome, and considered those contigs that were not included in the ordering as potential plasmids.
2) Checked the GC content of each contig, with the assumption that plasmids tend to have lower GC than the core chromosome.
3) Ran blastn against the NCBI nucleotide collection (nr/nt) with each contig as query, checking whether the top hit is to a core chromosome or a plasmid sequence.
4) Checked for each contig if it contains any replication initiation gene.
5) Made dot plots of candidate plasmids vs known plasmids (chosen based on the blastn results).
6) Tried to run the cBar program for plasmid recognition, but could not get it to work.
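
The GC comparison in step 2 could be sketched roughly as below. This is only a minimal illustration, not the exact script used: the input file name, the function names, and the 0.03 deviation cutoff are all assumptions made for the example. It reports the signed deviation of each contig from the assembly-wide GC, so contigs that are unusually low (or high) in GC stand out.

```python
import sys


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_counts(seq):
    """Return (number of G/C bases, number of unambiguous A/C/G/T bases)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, gc + at


def gc_fraction(seq):
    """GC fraction over unambiguous bases; 0.0 if there are none."""
    gc, total = gc_counts(seq)
    return gc / total if total else 0.0


def gc_deviations(records, threshold=0.03):
    """Compare each contig's GC with the assembly-wide GC.

    Returns a list of (name, gc, deviation, flagged) tuples.
    The threshold is an arbitrary, hypothetical cutoff.
    """
    records = list(records)
    all_gc = sum(gc_counts(seq)[0] for _, seq in records)
    all_bases = sum(gc_counts(seq)[1] for _, seq in records)
    overall = all_gc / all_bases if all_bases else 0.0
    result = []
    for name, seq in records:
        gc = gc_fraction(seq)
        deviation = gc - overall
        result.append((name, gc, deviation, abs(deviation) > threshold))
    return result


if __name__ == "__main__":
    # Assumed usage: python gc_check.py contigs.fasta
    with open(sys.argv[1]) as handle:
        for name, gc, dev, flagged in gc_deviations(read_fasta(handle)):
            mark = "*" if flagged else ""
            print(f"{name}\t{gc:.3f}\t{dev:+.3f}\t{mark}")
```

Because the assembly-wide GC is dominated by the chromosomal contigs, a negative deviation marks a contig with lower GC than the chromosome. Short contigs give noisy GC estimates, so the flag is a hint rather than evidence on its own.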
Any suggestions for additional things to do, or any flawed reasoning in the steps above? Based on this I have two very strong candidate plasmids, but I want to be really sure before publishing.