Hey,
I'm totally new to the field and have never used R before, but I need to normalize some expression data.
Each of my files holds many samples (different source cells) as columns, and I have many files corresponding to different experiments, each with a different number of rows (each row is a different gene).
I want to use RLE normalization, which is implemented in the function calcNormFactors from the package edgeR, but I don't understand how I can put the read counts into a matrix, since my files contain different numbers of genes (rows).
I thought I should have a giant matrix containing data from all of my files where the columns are the samples and rows are the genes.
Should I perhaps take the file that has the highest number of genes and enter "0" for any of its genes that are not present in the other files?
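To make the question concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration: the two small data frames stand in for my files, the gene names and counts are made up, and I am assuming each file has a column called gene holding the gene identifiers. It takes the union of genes across files rather than just the largest file, and fills the gaps with 0, which is exactly the step I am unsure about.

```r
# Toy stand-ins for two experiment files (invented gene IDs and counts).
exp1 <- data.frame(gene = c("geneA", "geneB", "geneC"),
                   s1 = c(10, 0, 5),
                   s2 = c(12, 3, 7))
exp2 <- data.frame(gene = c("geneB", "geneC", "geneD"),
                   s3 = c(4, 6, 9))

# Outer join on the gene column: keeps every gene seen in any file.
# Reduce() lets the same merge be applied to a longer list of files.
merged <- Reduce(function(a, b) merge(a, b, by = "gene", all = TRUE),
                 list(exp1, exp2))

# Genes absent from a file come back as NA; replace them with 0.
# (Whether 0 is the right value for an unmeasured gene is my question.)
merged[is.na(merged)] <- 0

# Numeric count matrix: rows are genes, columns are samples.
counts <- as.matrix(merged[, -1])
rownames(counts) <- merged$gene
print(counts)

# If I read the edgeR documentation correctly, the next step would be:
# library(edgeR)
# y <- DGEList(counts = counts)
# y <- calcNormFactors(y, method = "RLE")
```

Is building the matrix like this reasonable, or would the filled-in zeros distort the normalization factors?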
Thanks for your time.