Dear collective,
I would like to use my second-generation Illumina sequencing data in population genomic investigations to do strain-to-strain comparative gene content assessment. This would be analogous to what used to be done with the old DNA-DNA spotted arrays, or what is currently done with Affymetrix probe set arrays. My intention is to map the Illumina reads to a reference and then use the depth of coverage relative to that reference to make a presence or absence call for each gene. Is anybody aware of a program or pipeline that has already been set up to do this? Alternatively, does anybody have a suggestion for how to modify some other application to accomplish it? In my opinion this is analogous to the read counting done in RNA-seq analysis. Before going to the trouble of writing my own code, I wanted to be sure that I had not simply missed software that has already been developed. I realize that this is fairly straightforward to do in a spreadsheet when comparing only a handful of strains, but I need to apply the process to literally thousands of strains, so it has to be scripted.
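To make the idea concrete, here is a rough sketch of the kind of call I have in mind. It is only an illustration under my own assumptions: the function name `call_gene_presence`, the two thresholds (`min_breadth`, `min_rel_depth`) and the use of the median depth of covered positions as the baseline are all hypothetical choices, not taken from any existing tool. It assumes a single reference sequence whose per-base depth has already been loaded into a list (for example, parsed from the output of `samtools depth -a`) and gene coordinates that are 1-based and inclusive.

```python
from statistics import median


def call_gene_presence(depth, genes, min_breadth=0.8, min_rel_depth=0.2):
    """Call each gene present or absent from per-base read depth.

    depth: list of per-base depths for one reference sequence
           (index 0 corresponds to reference position 1).
    genes: dict mapping gene_id -> (start, end), 1-based inclusive.
    min_breadth, min_rel_depth: illustrative thresholds (assumptions).

    Returns a dict mapping gene_id -> (present, breadth, rel_depth).
    """
    # Baseline: median depth over positions with at least one read, so that
    # large deleted regions do not drag the baseline towards zero.
    covered = [d for d in depth if d > 0]
    baseline = median(covered) if covered else 0

    calls = {}
    for gene_id, (start, end) in genes.items():
        window = depth[start - 1:end]
        # Breadth: fraction of the gene's bases covered by at least one read.
        breadth = sum(1 for d in window if d > 0) / len(window)
        # Relative depth: mean gene depth divided by the baseline.
        mean_depth = sum(window) / len(window)
        rel_depth = mean_depth / baseline if baseline else 0.0
        present = breadth >= min_breadth and rel_depth >= min_rel_depth
        calls[gene_id] = (present, breadth, rel_depth)
    return calls


# Toy example: a 300 bp reference with a 100 bp region that has no reads.
toy_depth = [30] * 100 + [0] * 100 + [30] * 100
toy_genes = {"geneA": (1, 100), "geneB": (101, 200), "geneC": (151, 250)}
for gene_id, call in call_gene_presence(toy_depth, toy_genes).items():
    print(gene_id, call)
```

Run once per strain, something like this would give one presence/absence column per strain that could be joined into a gene-by-strain matrix. The thresholds would obviously need tuning, and genes with paralogs or repeats would need extra care because of multi-mapping reads.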
Thanks for any assistance.
Steve Beres