Hi:
I have three datasets of sequences that I have clustered with CD-HIT. I would like to cluster the three datasets combined, but that task is too big for my computer. The CD-HIT manual has instructions for comparing / adding new sequences to an already clustered database (CD-HIT-2D), but my computer also runs out of memory that way.
I wonder if there is a way to compare / merge the three FASTA files and their CD-HIT cluster results into one, or whether there is a software tool that is less memory intensive, so that I can combine the three original sequence files and cluster them all at once.
Thanks in advance for any suggestions.