Hi all,
I have three libraries from three experiments. Our sequences are tags with a poly(A) tail, like the poly(A)-tail sequences in mRNA-seq but produced with a different protocol. Most of the tags are located in 3'-UTR regions.
I think I need to normalize the data before comparing different libraries.
There is a normalization method called TPM:
TPM (tags per million) according to the total tag count in each library
as follows: TPM_{j,i} = C_{j,i} * 10^6 / T_j, where TPM_{j,i} is the TPM for SAGE tag i in SAGE library j, C_{j,i} is the count of SAGE tag i in library j, and T_j is the total SAGE tag count in SAGE library j.
As far as I can tell, this method simply scales each tag's count by the total tag count of its library.
But if I use TPM, should the total be the raw sequence count or only the count of mapped sequences? Or is there another normalization method I should use instead?
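To make the question concrete, here is a minimal Python sketch of the calculation as I understand it from the formula. The function name, the tag names, and the counts are made up for illustration (they are not my real data), and the library total is a parameter so that either the raw or the mapped count can be passed in.

```python
def tags_per_million(tag_counts, library_total=None):
    """Scale the per-tag counts of one library to tags per million (TPM).

    tag_counts    -- dict mapping tag ID to its count in this library
    library_total -- denominator T_j; if omitted, the sum of the counts
                     in tag_counts is used (i.e. only the tags counted)
    """
    if library_total is None:
        library_total = sum(tag_counts.values())
    if library_total <= 0:
        raise ValueError("library total must be positive")
    # TPM_{j,i} = C_{j,i} * 10^6 / T_j
    return {tag: count * 1_000_000 / library_total
            for tag, count in tag_counts.items()}


# Hypothetical library: 400 raw sequences, of which 200 mapped to two tags.
counts = {"tagA": 50, "tagB": 150}

tpm_mapped = tags_per_million(counts)                   # T_j = 200 (mapped only)
tpm_raw = tags_per_million(counts, library_total=400)   # T_j = 400 (all raw reads)

print(tpm_mapped)
print(tpm_raw)
```

In this toy case the two choices of total give TPM values that differ by a factor of two, which is why I am unsure which one is appropriate.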
Thanks a lot.