Hi, everybody,
I have result files generated by blastn that were then sorted on the second field. A typical file looks like:
360 miR156a
1 miR156a
9 miR156a
1 miR156a
10 miR156a
7 miR156a
1 miR156a
705 miR157a
2 miR157a
1 miR157a
5 miR157a
4 miR157a
67 miR157a
5 miR157a
11 miR157a
2 miR157a
34 miR159
3 miR162
3 miR166a
17 miR166a
4 miR166a
103 miR167a
1 miR167a
... .....
The first column is the deep-sequencing read count for each unique sequence. The second column is the miR ID that the sequence aligns to.
I would like to:
1)
Sum the read counts for each miR ID (e.g. for miR156a, sum rows 1-7);
Generate a bar graph showing the total read count for each miR ID.
I have more than 20 files like this, so I would like to automate the process. R came to mind,
but I have not used R before. Could you give me some tips or suggestions about which R packages or tools to use? (I can then learn them and figure it out.)
2)
If possible, generate a table that summarizes the total read counts from all 20 files.
The table that I would like to have is as follows:
miRID sample1 sample2 sample3 ......... sample 20
miR156 103 300 450 .......... 33
miR157 205 300 ..........
miR167 .....
.... .......
Thanks a lot!!
Jian
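For reference, here is a minimal sketch of the two tasks described above, written in Python rather than R (a substitution, not what the post asks for). It assumes whitespace-separated two-column files as shown, and everything else is hypothetical: the function names, a `results` directory holding one `.txt` file per sample, and the `summary.csv` output name. The bar graph part relies on the third-party matplotlib package and is optional.

```python
import csv
from collections import defaultdict
from pathlib import Path


def sum_counts(lines):
    """Sum the read counts (column 1) for each miR ID (column 2)."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines, e.g. a "... ....." placeholder row.
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        totals[fields[1]] += int(fields[0])
    return dict(totals)


def combine(per_sample):
    """Build a header and rows of [miRID, total per sample...].

    per_sample maps a sample name to the dict returned by sum_counts.
    A miR ID that is absent from a sample gets 0 in that column.
    """
    samples = sorted(per_sample)  # plain string sort, so sample10 precedes sample2
    mir_ids = sorted({m for totals in per_sample.values() for m in totals})
    header = ["miRID"] + samples
    rows = [[m] + [per_sample[s].get(m, 0) for s in samples] for m in mir_ids]
    return header, rows


def plot_totals(totals, out_png):
    """Optional: save a bar graph of the totals (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    names = sorted(totals)
    plt.figure()
    plt.bar(names, [totals[n] for n in names])
    plt.xticks(rotation=90)
    plt.ylabel("total read count")
    plt.tight_layout()
    plt.savefig(out_png)
    plt.close()


def main(input_dir="results", out_csv="summary.csv"):
    # Hypothetical layout: one file per sample, e.g. results/sample1.txt
    per_sample = {}
    for path in sorted(Path(input_dir).glob("*.txt")):
        with open(path) as handle:
            per_sample[path.stem] = sum_counts(handle)
    header, rows = combine(per_sample)
    # Write the combined table as tab-separated text.
    with open(out_csv, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
```

Calling `main("results", "summary.csv")` would write one row per miR ID and one column per input file. For the miR156a rows shown in the example file, `sum_counts` gives 389. Since the post asks about R, the same idea could presumably be expressed there with base functions such as `read.table`, `aggregate`, `merge`, and `barplot`.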