Hi everyone,
I'm a bit new to the genome assembly process (it's my first time) and need some help and guidance.
I sequenced a bacterium twice on the Ion Torrent PGM with the 200 template/sequencing kit and then twice with the 400 template/sequencing kit. So now I have four sequences of the same bacterium that I wish to assemble (each is from a different colony, though, as I wish to deduce variants within the sequence).
For the de novo assembly I used the Ion Torrent assembler to form contigs.
Now I have contigs for the four runs and I wish to order and assemble them. I have three reference genomes that are similar to the bacterium that I'm studying.
I have access to CLC Genomics Workbench but not the ‘microbial finishing module plugin’ (as it's expensive), and have been using Mauve's ‘align with progressive mauve’ tool, but I'm not exactly sure what I'm doing when I use it.
What I wish to know is this: when I use the ‘align with progressive mauve’ tool, should I add the two Torrent assembler results from the 400 kit runs together with a reference sequence of interest, then run progressive Mauve again with the contigs from the 200 kit runs, and then align the two results to form as complete a sequence as possible for my bacterium? Also, once I obtain this data, should I annotate using RAST, or are other steps required before that?
Also, when I run progressive Mauve I obtain 4 files:
1) .file
2) .backbone
3) .bbcols
4) .guide_tree
I'm assuming the .file is the file with my assembled contigs, so I converted it to multi-FASTA format using Perl (I had never used Perl before, so it took me a while to figure out how to do this). I don't know how to interpret the data that I get.
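For reference, here is a minimal Python sketch of the kind of conversion I mean. It is not my actual Perl script, and it assumes the .file is in Mauve's XMFA layout: comment lines starting with "#", one "> ..." header per aligned sequence, and "=" closing each alignment block. The function names (parse_xmfa, xmfa_to_fasta) and the "blockN|" header prefix are just made up for illustration.

```python
from typing import Iterator, List, Tuple


def parse_xmfa(text: str) -> Iterator[List[Tuple[str, str]]]:
    """Yield each alignment block as a list of (header, aligned_sequence) pairs.

    Assumes XMFA layout: '#' comment lines, '>' header lines, sequence
    lines, and '=' marking the end of a block.
    """
    block: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and file-level comments
        if line.startswith(">"):
            if header is not None:
                block.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif line.startswith("="):
            if header is not None:
                block.append((header, "".join(chunks)))
            if block:
                yield block
            block, header, chunks = [], None, []
        else:
            chunks.append(line)
    # Handle a final block that is not terminated by '='
    if header is not None:
        block.append((header, "".join(chunks)))
    if block:
        yield block


def xmfa_to_fasta(text: str, keep_gaps: bool = False) -> str:
    """Flatten all blocks into multi-FASTA, one record per aligned segment."""
    lines: List[str] = []
    for number, block in enumerate(parse_xmfa(text), start=1):
        for header, seq in block:
            if not keep_gaps:
                seq = seq.replace("-", "")  # drop alignment gap characters
            if seq:
                lines.append(f">block{number}|{header}")
                lines.append(seq)
    return "".join(line + "\n" for line in lines)
```

If that assumption about the format holds, each FASTA record produced this way is one aligned segment from one input genome rather than a whole contig, which may be part of why I find the output hard to interpret.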
Moreover, if anyone can guide me on how to assemble my contigs in CLC Genomics Workbench without the microbial/genomic finishing module, this would be very useful, as I could compare the results I obtain from Mauve and CLC. I also think CLC would be easier to use.
Any help with the steps involved in aligning and assembling my contigs to a reference and/or obtaining as complete a sequence as possible would be much appreciated.
P.S. Please find attached an example of the type of Mauve result I'm getting.
Thanks