Hello all.
I have a couple of basic questions regarding threshold values and standard deviation.
I have a FASTQ file of about 40 million 50 bp single-end reads that I aligned to a genome with Bowtie. Using the SAM file generated by Bowtie and the GFF file for the genome, I wrote a Perl script to count how many reads aligned to each gene in the genome.
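To make the counting step concrete, here is a minimal sketch of that kind of per-gene tally. It is written in Python rather than Perl and is not the actual script. The function names (`parse_gff_genes`, `count_reads_per_gene`) are hypothetical, and it assumes GFF3-style `ID=` attributes, assigns each read by its leftmost aligned position only, and ignores strand and the CIGAR string.

```python
from collections import defaultdict


def parse_gff_genes(gff_lines):
    """Return {chrom: [(start, end, gene_id), ...]} for 'gene' features."""
    genes = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # only count against 'gene' features
        # Assumption: the attributes column carries an ID=... tag (GFF3 style).
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        gene_id = attrs.get("ID", f"{cols[0]}:{cols[3]}-{cols[4]}")
        genes[cols[0]].append((int(cols[3]), int(cols[4]), gene_id))
    return genes


def count_reads_per_gene(sam_lines, genes):
    """Count mapped reads whose leftmost position falls inside each gene."""
    counts = defaultdict(int)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 4:
            continue  # not a valid alignment record
        flag, chrom, pos = int(cols[1]), cols[2], int(cols[3])
        if flag & 4:
            continue  # FLAG bit 0x4 set: read is unmapped
        # Simplification: linear scan over genes, no strand or CIGAR handling.
        for start, end, gene_id in genes.get(chrom, []):
            if start <= pos <= end:
                counts[gene_id] += 1
    return dict(counts)
```

With real files this would be called as `count_reads_per_gene(open("reads.sam"), parse_gff_genes(open("genome.gff")))`, where both file names are placeholders. A read that starts inside two overlapping genes is counted once for each of them in this sketch.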
Question 1: What is the minimum number of reads that must map to a gene before I can say the gene is "expressed"? Is there a standard threshold?
I have a gene of interest that, let's say, has 100 reads aligned to it. Due to cost constraints, I cannot run the same sample multiple times to calculate the standard deviation.
Question 2: What is the approximate standard deviation of that read count? Is there some quick calculation I could perform to estimate it?
Thanks in advance.