I apologize if this question has already been answered in this forum.
I am analyzing Solexa single-end reads of either 35 or 75 bp.
If I understand correctly, htseq-count and bedtools return the total number of raw reads mapped to each gene. I had written a Perl script that instead takes the maximum number of reads within the gene, which I think is similar to peak finding in ChIP-seq. I was operating under the assumption that the read count should reflect the number of copies of the transcript in the cell at a specific time or under a specific condition. A simple example: suppose 10 reads map to a gene at positions 1-75 and another 10 reads map at positions 1000-1075. Does this mean there were 20 copies of this transcript in the cell, or were there really 10?
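To make the two counting approaches concrete, here is a minimal Python sketch of the example above. It is not my actual Perl script, and the function names are made up for illustration. I am assuming that "maximum" means the highest per-base read depth anywhere in the gene, and that read coordinates are 1-based and inclusive.

```python
from collections import Counter

def total_read_count(reads):
    """Total number of reads assigned to the gene (the 'sum' approach)."""
    return len(reads)

def max_coverage(reads):
    """Highest per-base read depth in the gene (the 'peak' approach).

    Assumes each read is a (start, end) tuple, 1-based and inclusive.
    """
    depth = Counter()
    for start, end in reads:
        for pos in range(start, end + 1):
            depth[pos] += 1
    return max(depth.values(), default=0)

# The example from the post: 10 reads at 1-75 and 10 reads at 1000-1075.
reads = [(1, 75)] * 10 + [(1000, 1075)] * 10

print(total_read_count(reads))  # 20
print(max_coverage(reads))      # 10
```

For this toy input the total is 20 and the peak depth is 10, which is exactly the discrepancy I am asking about.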
Has my counting approach been incorrect? What is the biological reasoning behind counting the total number of reads per gene?
My apologies if this is a trivial question.
Many thanks,
Tirza Doniger