Hi all,
I have downloaded paired-end RNA-seq data for 150 patient samples. The read lengths vary across samples: 50 bp, 75 bp, 100 bp, 101 bp, and 103 bp. The library sizes also differ considerably (from around 20 million to 110 million reads). I would like to divide these 150 samples into two groups, 1p36 positive or negative, based on the expression levels of genes located in chromosome region 1p36. My question is how to normalize this RNA-seq dataset given these differences. Any suggestions?
Thanks a lot!
Yao