I have 46 genomes from different strains of a bacterium, along with their protein sequences and the nucleotide sequences encoding those proteins.
I am trying to cluster the genes so that I can make a plot with genomes 1-46 on the x-axis and the gene clusters on the y-axis.
How do I go about doing this? I have looked into USEARCH and CD-HIT, but I am very confused about how to use them for this.