Hi all,
I was wondering whether anyone has experience using maq mapcheck to estimate substitution rates by substitution type. After an RNA-seq analysis, I get the following as the first 3 lines of the mapcheck file.
1 20.4 23.2 29.0 27.5 : 7 18 8 6 7 11 3 3 2 4 7 5 : 5 10 14 969 : 72 41 29 18
2 23.2 25.1 30.7 21.1 : 4 11 3 4 5 5 2 2 1 3 6 4 : 6 11 14 969 : 115 40 24 11
3 28.1 23.8 33.0 15.0 : 2 7 1 2 2 3 2 1 0 2 5 3 : 5 10 14 969 : 29 20 14 6
As far as I understand, I can approximate the overall substitution rate for cycle 1 from the last 8 columns of the first row as (969/1000)*(18/1000) + (14/1000)*(29/1000) + (10/1000)*(41/1000) + (5/1000)*(72/1000) ≈ 0.0186. The columns in the middle represent the substitution rates by substitution type, i.e. for all A's in the reference genome, 7/1000 of the aligned reads had a C at this position. Is this the correct interpretation? Sorry if this question seems quite simple.
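In case it helps to see the arithmetic spelled out, here is a rough Python sketch of the calculation above. The way it splits the row into four colon-separated groups, and the pairing of the third group with the fourth, are only my own reading of the output and not something I have confirmed against the maq documentation. The function and variable names are made up for illustration.

```python
def parse_mapcheck_row(line):
    """Split one mapcheck row into its four colon-separated groups.

    Assumed layout (my interpretation, unconfirmed):
      group 0: cycle number followed by four base-composition percentages
      group 1: twelve per-1000 substitution counts by substitution type
      group 2: four per-1000 fractions (e.g. 5 10 14 969)
      group 3: four per-1000 rates paired with group 2 (e.g. 72 41 29 18)
    """
    groups = [part.split() for part in line.split(":")]
    cycle = int(groups[0][0])
    base_pct = [float(x) for x in groups[0][1:]]
    subst_by_type = [int(x) for x in groups[1]]
    fractions = [int(x) for x in groups[2]]
    rates = [int(x) for x in groups[3]]
    return cycle, base_pct, subst_by_type, fractions, rates


def overall_rate(fractions, rates):
    """Weighted sum of the paired per-1000 values, as in my calculation."""
    return sum((f / 1000) * (r / 1000) for f, r in zip(fractions, rates))


row1 = "1 20.4 23.2 29.0 27.5 : 7 18 8 6 7 11 3 3 2 4 7 5 : 5 10 14 969 : 72 41 29 18"
cycle, base_pct, subst_by_type, fractions, rates = parse_mapcheck_row(row1)
rate = overall_rate(fractions, rates)
print(cycle, rate)
```

If my reading of the columns is wrong, only the pairing inside `overall_rate` should need to change.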
Thanks,
John