Counting the number of identical reads in a SAM file and writing them to a BED file
Hi all,
After mapping, I have a SAM file containing approximately 30 million mapped reads. Using bedtools bamtobed, I was able to obtain a BED file from it. However, a large number of the reads are identical.
To reduce the size of the final BED file, is there a way to list each distinct read only once, together with the number of times it occurs in the SAM file?
For instance:
var_1 0 15 ATGCATGCATGCCGTA
var_1 0 15 ATGCATGCATGCCGTA
var_1 0 15 ATGCATGCATGCCGTA
var_2 5 20 ATGCATGCGGGCCCC
Will become:
var_1 0 15 ATGCATGCATGCCGTA 3
var_2 5 20 ATGCATGCGGGCCCC 1
Thank you in advance!
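One possible way to do this, sketched in Python under a few assumptions: the records are whitespace-separated lines like the example above, two records count as "the same read" only when every column matches, and the count is appended as a new last column. The function name collapse_records is made up for illustration and is not part of bedtools or any other tool.

```python
from collections import Counter


def collapse_records(lines):
    """Collapse identical records and append each record's count as a last column.

    Assumption: records are whitespace-separated and are considered identical
    only when all of their columns match.
    """
    counts = Counter()
    for line in lines:
        fields = tuple(line.split())
        if not fields:  # skip blank lines
            continue
        counts[fields] += 1
    # Counter is a dict subclass, so records come back in first-seen order.
    return ["\t".join(fields + (str(n),)) for fields, n in counts.items()]


if __name__ == "__main__":
    example = [
        "var_1 0 15 ATGCATGCATGCCGTA",
        "var_1 0 15 ATGCATGCATGCCGTA",
        "var_1 0 15 ATGCATGCATGCCGTA",
        "var_2 5 20 ATGCATGCGGGCCCC",
    ]
    for row in collapse_records(example):
        print(row)
```

Because a file object yields its lines one at a time, an open BED file can be passed in directly instead of a list. Only the distinct records are held in memory, so this should stay manageable when most of the 30 million reads are duplicates, though that depends on how many distinct records there actually are.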