  • Extract the XS field from bam

    Hello everyone,

    BWA-MEM writes an "XS" tag (the suboptimal alignment score) for each read. In the output of samtools view it looks like this:
    - NS500801:90:HY7JVBGXY:2:21205:8003:11253 147 chrM 958 60 76M = 920 -114 CCCCCTCCCCAATAAAGCTAAAACTCACCTGAGTTGTAAAAAACTCCAGTTGACACAAAATAGACTACGAAAGTGG >>;@B@CC1C??=??AAC=???>C@C>CC@BAAA?@<>>>>>=B?BB@@@?A=B=B>>>><@A=B<=;A>=@=;>= BD:Z:IIIMPOLKNKJJJBIMOMIBBJLKKIKLMKJKJIIKHAAAAILKKKLJIHJKHHHH@@GGIHLLLKKLJCKOJLJJ PG:Z:MarkDuplicates RG:Z:id BI:Z:LLLPTSOOSROPQHOTSQOGGNPPQNQPROLPNMMNNFFFFLNNONPOMLNOMLMNEEKLNMOOPOONMHOROPNN NM:i:0 AS:i:76 XS:i:55

    Does anyone know an easy way to extract it, for example with R? I know I could use samtools view + awk, but that will take a long time.

    Thanks in advance!

  • #2
    Using bioalcidaejdk: http://lindenb.github.io/jvarkit/BioAlcidaeJdk.html

    Code:
    java -jar dist/bioalcidaejdk.jar -e 'stream().forEach(R->println(R.getAttribute("XS")));'  in.bam


    • #3
      Hi lindenb, thank you for your answer,

      How can I get the read name too? I would like a table with the read name in the first column and the XS value in the second.


      • #4
        > How can I get the read name too ?

        Code:
        ... println(R.getReadName()+" "+R.getAttribute("XS"))
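
For comparison, the same two-column table can be sketched in Python by parsing the text that samtools view prints. This is only a minimal sketch under stated assumptions: the names `name_and_xs` and `xs_table` are made up for illustration, input lines are assumed to be tab-separated SAM records with optional tags starting at column 12, and the example record is abridged from the one in the first post (SEQ, QUAL and most tags replaced or dropped). Because it still goes through text, it is unlikely to be faster than samtools view + awk.

```python
def name_and_xs(sam_line):
    """Return (read name, XS value) for one SAM line; XS is None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    xs = None
    # Optional tags (TAG:TYPE:VALUE) start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if tag.startswith("XS:i:"):
            xs = int(tag[5:])
            break
    return fields[0], xs


def xs_table(lines):
    """Yield 'read name<TAB>XS' rows, skipping header lines (which start with '@')."""
    for line in lines:
        if line.startswith("@"):
            continue
        name, xs = name_and_xs(line)
        yield f"{name}\t{'NA' if xs is None else xs}"


# Abridged version of the record from the first post (hypothetical '*' for SEQ/QUAL).
example = "\t".join([
    "NS500801:90:HY7JVBGXY:2:21205:8003:11253", "147", "chrM", "958", "60",
    "76M", "=", "920", "-114", "*", "*", "NM:i:0", "AS:i:76", "XS:i:55",
])

for row in xs_table([example]):
    print(row)
```

To run it on a real file, the lines of `samtools view in.bam` could be passed to `xs_table` instead of the hard-coded example, for instance by iterating over `sys.stdin`.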


        • #5
          Thank you very much for your help!

          The output file is too big, so I'm trying to get the chromosome as well in order to split the output per chromosome. I tried "getReferenceIndex" but it returns "null". Do you know how I could do this?
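
The thread does not show why "getReferenceIndex" returns "null" here. htsjdk's SAMRecord also has a getReferenceName() method, which may be worth trying in the expression above. Independently of that, the per-chromosome split can be sketched in Python on the text output of samtools view, since the chromosome is simply the third column (RNAME). The function name `write_per_chromosome`, the `.tsv` file naming and the "unmapped" label for RNAME `*` are all assumptions made for this illustration, and the sketch assumes reference names that are valid file names.

```python
import os
from contextlib import ExitStack


def write_per_chromosome(lines, out_dir):
    """Write 'read name<TAB>XS' rows to out_dir/<chromosome>.tsv.

    Returns the sorted list of chromosome labels for which a file was written.
    """
    handles = {}
    with ExitStack() as stack:  # closes every opened file on exit
        for line in lines:
            if line.startswith("@"):  # skip SAM header lines
                continue
            fields = line.rstrip("\n").split("\t")
            # RNAME is column 3; '*' means the read has no reference (unmapped).
            chrom = "unmapped" if fields[2] == "*" else fields[2]
            # Optional tags start at column 12; use 'NA' when XS is absent.
            xs = next((t[5:] for t in fields[11:] if t.startswith("XS:i:")), "NA")
            if chrom not in handles:
                path = os.path.join(out_dir, chrom + ".tsv")
                handles[chrom] = stack.enter_context(open(path, "w"))
            handles[chrom].write(f"{fields[0]}\t{xs}\n")
    return sorted(handles)
```

Rows are written as they are read, so nothing but the open file handles is kept in memory, which matters when the combined output is too large to handle as one file.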
