Parse CIGAR string in C/C++


  • Parse CIGAR string in C/C++

    What'd be the best way to parse a CIGAR string fully according to the specification in C/C++? Would regular expression work?

  • #2
    Originally posted by tedwong
    What'd be the best way to parse a CIGAR string fully according to the specification in C/C++? Would regular expression work?
    No. Unless you simply want to detect the presence of some operation, the best way is with a custom loop.

    Here's an example in Java that can easily be translated to C++:
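
The Java example referred to above is not reproduced in this thread. As a rough idea of what such a custom loop could look like in C++, here is a minimal sketch. The names `parse_cigar` and `CigarOp` are made up for illustration and are not part of any library; the accepted operation characters (MIDNSHP=X) and the "*" placeholder follow the SAM specification, and the 32-bit length cap is my own assumption to guard against overflow.

```cpp
#include <cctype>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One CIGAR operation: (length, operation character).
using CigarOp = std::pair<std::uint32_t, char>;

// Hypothetical helper: parses a textual CIGAR such as "5S90M2I4M" into
// (length, op) pairs. Throws std::invalid_argument on malformed input.
std::vector<CigarOp> parse_cigar(const std::string& cigar)
{
    std::vector<CigarOp> ops;
    if (cigar == "*") return ops;  // "*" means the CIGAR is unavailable
    if (cigar.empty()) throw std::invalid_argument("empty CIGAR");

    const std::string valid = "MIDNSHP=X";
    std::uint64_t len = 0;
    bool have_digits = false;

    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            // Accumulate the decimal length that precedes each operation.
            len = len * 10 + static_cast<std::uint64_t>(c - '0');
            if (len > std::numeric_limits<std::uint32_t>::max())
                throw std::invalid_argument("CIGAR length too large: " + cigar);
            have_digits = true;
        }
        else if (have_digits && valid.find(c) != std::string::npos)
        {
            // A valid operation character closes the current (length, op) pair.
            ops.emplace_back(static_cast<std::uint32_t>(len), c);
            len = 0;
            have_digits = false;
        }
        else
        {
            // Unknown character, or an operation with no length before it.
            throw std::invalid_argument("malformed CIGAR: " + cigar);
        }
    }

    if (have_digits)
        throw std::invalid_argument("CIGAR ends with a length: " + cigar);
    return ops;
}
```

Calling `parse_cigar("5S90M2I4M")` should give four pairs, starting with (5, 'S') and (90, 'M'), while inputs like "M10" or "10" throw instead of being silently accepted.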




    • #3
      If you're already using C/C++, just use htslib. The functions for this have already been written (after all, it's what samtools uses), and the API is generally convenient.



      • #4
        Cross-posted: http://stackoverflow.com/questions/2...lar-expression



        • #5
          For reference (using the htslib library)



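          #include <cstdint>
          #include <string>
          #include <htslib/sam.h>

          int main(int argc, char *argv[])
          {
              if (argc < 2) return 1;
              const std::string file = argv[1];  // path to a SAM/BAM file

              auto f = sam_open(file.c_str(), "r");
              if (!f) return 1;
              auto h = sam_hdr_read(f);
              if (!h) { sam_close(f); return 1; }
              auto t = bam_init1();

              while (sam_read1(f, h, t) >= 0)
              {
                  // Reference name of this record; tid is -1 when the read has no reference
                  const auto id = t->core.tid >= 0 ? std::string(h->target_name[t->core.tid]) : std::string("*");
                  const auto mapped = !(t->core.flag & BAM_FUNMAP);

                  // Packed CIGAR array: each element holds one operation and its length
                  const auto cigar = bam_get_cigar(t);

                  for (uint32_t k = 0; k < t->core.n_cigar; k++)
                  {
                      const int op = bam_cigar_op(cigar[k]);
                      const int ol = bam_cigar_oplen(cigar[k]);

                      if (op == BAM_CMATCH || op == BAM_CINS || op == BAM_CDEL)
                      {
                          // your code, you have the length in ol (e.g. 101M -> ol == 101)
                      }
                  }
              }

              // Release the record and header before closing the file
              bam_destroy1(t);
              bam_hdr_destroy(h);
              sam_close(f);
              return 0;
          }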
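
As one possible thing to put in the "your code" spot, the sketch below adds up how many reference and query bases a CIGAR covers. It is self-contained and does not call htslib: it assumes the packed BAM layout (length in the upper 28 bits, operation code in the lower 4 bits, codes in the order MIDNSHP=X), which is what `bam_cigar_op` and `bam_cigar_oplen` decode. The names `pack_cigar`, `reference_span`, `query_length` and the `CIG_*` constants are hypothetical. htslib itself provides `bam_cigar2rlen` and `bam_cigar2qlen` for the same purpose, so in real code those are probably the better choice.

```cpp
#include <cstdint>
#include <vector>

// BAM operation codes, in the order given by the SAM specification.
enum : std::uint32_t {
    CIG_M = 0, CIG_I = 1, CIG_D = 2, CIG_N = 3, CIG_S = 4,
    CIG_H = 5, CIG_P = 6, CIG_EQ = 7, CIG_X = 8
};

// Pack a length and an operation code the way BAM stores them.
inline std::uint32_t pack_cigar(std::uint32_t len, std::uint32_t op)
{
    return (len << 4) | op;
}

// Reference bases covered: M, D, N, = and X consume the reference.
std::uint64_t reference_span(const std::vector<std::uint32_t>& cigar)
{
    std::uint64_t span = 0;
    for (std::uint32_t c : cigar)
    {
        const std::uint32_t op = c & 0xF;
        const std::uint32_t len = c >> 4;
        if (op == CIG_M || op == CIG_D || op == CIG_N || op == CIG_EQ || op == CIG_X)
            span += len;
    }
    return span;
}

// Query bases covered: M, I, S, = and X consume the read sequence.
std::uint64_t query_length(const std::vector<std::uint32_t>& cigar)
{
    std::uint64_t n = 0;
    for (std::uint32_t c : cigar)
    {
        const std::uint32_t op = c & 0xF;
        const std::uint32_t len = c >> 4;
        if (op == CIG_M || op == CIG_I || op == CIG_S || op == CIG_EQ || op == CIG_X)
            n += len;
    }
    return n;
}
```

For 5S90M2I4M3D this gives a reference span of 97 (90 + 4 + 3) and a query length of 101 (5 + 90 + 2 + 4).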
