I have 20 million short reads, 25 bp each, and no reference genome is available.
I would like to examine the patterns in those reads; that is, some of them should be derived from the same gene locus.
Any suggestions about a quick method to cluster them together so that I know how many loci they might be derived from?
Thank you very much for any thoughts.