Hello,
I’m hoping somebody can help me understand the qmap format used to describe the bis-seq methylome data from recent papers. I downloaded the data from GEO (GSE17972). It would be great if you could take a look at the questions below; alternatively, if you know of the software that generates qmap files, I could check whether its manual explains the format further. I’ve read the format description (attached) from the GEO entry, but it hasn’t answered all my questions. I’ve pasted an example line below to help illustrate my problems.
1. Columns 6-10 relate to tags that map repeatedly. Does this mean that these tags mapped to many different locations on the reference genome with the same mapping quality?
2. I don’t understand the difference between column 3 and column 4. How can some tags (in this example, 4 tags) provide no information regarding methylation? The bisulphite-converted read would presumably show the base as either C (methylated) or T (unmethylated) and would therefore be counted in column 2 or 3. What goes in column 4?
3. Column 10 shows the sum of tags mapped repeatedly to the reference at this site. I took this to mean columns 6 + 7 + 8 + 9. However, in the example below that would equal 0+1+3+0=4, not the given value of 9896. Clearly I’m missing something!
C 0 9 4 1 0 1 3 0 9896 550.56 Wh_V^hYUh
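To make question 3 concrete, here is a minimal Python sketch of the check I did by hand. It assumes the fields are whitespace-separated and that the column numbers in the format description are 1-based; the helper name `col` is my own and does not come from any qmap documentation.

```python
# Example line pasted from the qmap file.
line = "C 0 9 4 1 0 1 3 0 9896 550.56 Wh_V^hYUh"

# Assumption: fields are separated by whitespace.
fields = line.split()


def col(n):
    """Return column n, using the 1-based numbering assumed above."""
    return fields[n - 1]


# My reading of the description: column 10 = columns 6 + 7 + 8 + 9.
repeat_sum = sum(int(col(n)) for n in range(6, 10))
col10 = int(col(10))

print("columns 6-9 summed:", repeat_sum)
print("column 10 as given:", col10)
print("match:", repeat_sum == col10)
```

The two numbers disagree for this line, so either my column numbering is off or column 10 is not a simple sum of the four preceding columns.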
Many Thanks,
Gareth