Counting the number of identical reads in a SAM file and writing the counts to a BED file

  • Counting the number of identical reads in a SAM file and writing the counts to a BED file

    Hi all,
    After mapping, I have a SAM file containing approximately 30 million mapped reads. Using bedtools bamtobed, I converted it to a BED file. However, many of the reads are identical.
    To reduce the size of the final BED file, is there a way to list each read only once, together with the number of times it occurs in the SAM file?

    For instance:

    var_1 0 15 ATGCATGCATGCCGTA
    var_1 0 15 ATGCATGCATGCCGTA
    var_1 0 15 ATGCATGCATGCCGTA
    var_2 5 20 ATGCATGCGGGCCCC

    Will become:
    var_1 0 15 ATGCATGCATGCCGTA 3
    var_2 5 20 ATGCATGCGGGCCCC 1

    Thank you in advance!
    Last edited by bobbyle0210; 11-08-2019, 08:14 PM. Reason: cross-post

  • #2
    Cross-posted and answered on biostars: https://www.biostars.org/p/407019/
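
For readers landing here, one possible approach can be sketched in Python. This is only a minimal illustration of the collapsing step described in the question, not the answer given on biostars. It assumes duplicate records are identical in every column, as in the example above; `collapse_bed` is a hypothetical helper name, and the output is tab-separated because BED files normally are. If one column holds a unique read name, that column would have to be dropped before counting.

```python
from collections import Counter


def collapse_bed(lines):
    """Collapse identical BED records and append an occurrence count.

    Hypothetical helper: records are compared on all whitespace-separated
    fields, and first-seen order is preserved.
    """
    counts = Counter()
    for line in lines:
        fields = tuple(line.split())
        if not fields:
            continue  # skip blank lines
        counts[fields] += 1
    # Counter keeps insertion order (Python 3.7+), so output follows input order.
    return ["\t".join(fields + (str(n),)) for fields, n in counts.items()]


if __name__ == "__main__":
    example = [
        "var_1 0 15 ATGCATGCATGCCGTA",
        "var_1 0 15 ATGCATGCATGCCGTA",
        "var_1 0 15 ATGCATGCATGCCGTA",
        "var_2 5 20 ATGCATGCGGGCCCC",
    ]
    for record in collapse_bed(example):
        print(record)
```

With about 30 million reads, the counter only holds one entry per distinct record, so memory use depends on how many unique records there are rather than on the total number of lines.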
