I'm new to bioinformatics, and I have a PI whose instructions I find incomprehensible. Could someone clue me in on some of what is going on in the prompt below? For instance:
1) What's "LD"?
2) What do the different coverage values signify? How do I implement that in R?
3) I have no idea what the RefAlleleCount, etc. bit means. Aren't all of those ratios the same anyhow?
I realize I'm asking really broad, noobish questions, but a little context would be reeeaaally helpful. Thanks.
" RNAseq reads covered 5,000 biallelic human autosomal SNPs (all are heterogeneous, assume they are not in LD). Please write simple R statements to plot the distribution of the alternative allelic frequency under sequencing coverage 2x, 5x, 10x, 20x, respectively. Also calculate the p-values with the hypothesis that there are no differential allelic expression for these three SNPs: (RefAlleleCount:AltAlleleCount) 1:4, 2:8, 4:16. "
Edit: I've discovered that LD is linkage disequilibrium. Still fairly lost though.
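For anyone else puzzling over this, here is a rough sketch of one possible reading of the prompt, not a definitive answer. It assumes "heterogeneous" means heterozygous, that "Nx coverage" means exactly N reads over each SNP, that each read carries the alternative allele with probability 0.5 when there is no allelic imbalance, and that "not in LD" lets the SNPs be treated as independent draws. The variable names (`n_snps`, `coverages`, `p_values`) and the choice of a two-sided exact binomial test are my own assumptions, not something the prompt specifies.

```r
set.seed(1)                      # arbitrary seed, only so the plots are reproducible
n_snps    <- 5000                # number of heterozygous SNPs in the prompt
coverages <- c(2, 5, 10, 20)     # assumed: exactly this many reads per SNP

op <- par(mfrow = c(2, 2))       # one panel per coverage level
for (n in coverages) {
  # Assumed model: alt-allele read count ~ Binomial(n, 0.5) at each SNP
  alt_reads <- rbinom(n_snps, size = n, prob = 0.5)
  # Proportion of SNPs at each possible alternative allele frequency k/n
  freq_tab <- table(factor(alt_reads, levels = 0:n)) / n_snps
  barplot(freq_tab,
          names.arg = round((0:n) / n, 2),
          main = paste0("Coverage ", n, "x"),
          xlab = "Alternative allele frequency",
          ylab = "Proportion of SNPs")
}
par(op)

# Two-sided exact binomial test of "no differential allelic expression",
# i.e. H0: probability of an alternative-allele read is 0.5
ref <- c(1, 2, 4)
alt <- c(4, 8, 16)
p_values <- mapply(function(r, a) binom.test(a, r + a, p = 0.5)$p.value, ref, alt)
names(p_values) <- paste(ref, alt, sep = ":")
print(p_values)
```

Under these assumptions the three p-values come out to roughly 0.375, 0.109, and 0.0118. So although 1:4, 2:8, and 4:16 are the same ratio, they are not equivalent: the more reads behind the ratio, the stronger the evidence against equal expression of the two alleles. The plots show the same effect from the other direction, since the spread of observed allele frequencies around 0.5 narrows as coverage increases.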