I want to run a PCA on metagenomic contigs based on their tetranucleotide composition. The literature commonly mentions using NORMALIZED tetranucleotide frequencies for this purpose, but gives no detail about how the normalization is done.
What do they mean by "normalization"? Is it a z-score transformation? Or is it a normalization by contig length (and if so, how do I calculate it)?
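To make the question concrete, here is a minimal Python sketch of the two interpretations I have in mind. It is only my guess at what "normalized" could mean, not what any particular paper does. The function names (`tetra_counts`, `length_normalize`, `zscore_columns`) and the two toy contig sequences are made up for illustration, and I count overlapping 4-mers on one strand only.

```python
from itertools import product
from statistics import mean, pstdev

# All 256 possible tetranucleotides over A, C, G, T, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetra_counts(seq):
    """Count overlapping 4-mers; windows with non-ACGT characters are skipped."""
    counts = dict.fromkeys(KMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    return [counts[k] for k in KMERS]

def length_normalize(counts):
    """Guess 1: divide each count by the total number of counted windows,
    so contigs of different lengths become comparable (each row sums to 1)."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]

def zscore_columns(rows):
    """Guess 2: z-score each tetranucleotide (column) across contigs.
    Columns with zero variance are set to 0 to avoid dividing by zero."""
    out_cols = []
    for col in zip(*rows):
        mu, sd = mean(col), pstdev(col)
        out_cols.append([(x - mu) / sd if sd > 0 else 0.0 for x in col])
    return [list(r) for r in zip(*out_cols)]

# Toy sequences, invented for this example only.
contigs = {
    "contig_1": "ACGTACGTACGTTTGACA",
    "contig_2": "GGGCCCATATATCGCGAAAT",
}
freqs = [length_normalize(tetra_counts(s)) for s in contigs.values()]
z = zscore_columns(freqs)
```

Either `freqs` or `z` (one row per contig, 256 columns) would then be the input matrix for the PCA. Is one of these what is meant, or both applied in sequence as above?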
Some references:
Thanks.