  • sasignor
    Member
    • Sep 2009
    • 28

    editing header files

    I will admit I have little understanding of exactly how read groups work with regard to tags on the reads within the file, as opposed to just the header. Here is my problem:

    1. I originally processed all my files with @RG tags for ID and SM, where ID was always 1 (the problem) and SM was the sample name (which is fine).

    2. A downstream application I use has problems with the ID:1 tag, and I need it to be the sample name (I think, the downstream application has little documentation, but on other files where I did this originally it worked).

    3. Using samtools view -H file > file.txt, I edited each header file to have ID: sample id, then used samtools reheader.

    4. Before using the downstream application, I need to merge my files, and I did this:

    /home/jdk1.7.0_55/bin/java -Xmx2g -jar ../bin/picard-tools-1/picard-tools-1.89/MergeSamFiles.jar

    followed by a list of I= and O=

    5. The error:
    Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 1, Read name M01533:70:000000000-A5W2F:1:2113:28047:12326, RG ID on SAMRecord not found in header: 1
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:541)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.<init>(BAMFileReader.java:500)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.<init>(BAMFileReader.java:488)
    at net.sf.samtools.BAMFileReader.getIterator(BAMFileReader.java:290)
    at net.sf.samtools.SAMFileReader.iterator(SAMFileReader.java:322)
    at net.sf.picard.sam.MergingSamRecordIterator.startIterationIfRequired(MergingSamRecordIterator.java:100)
    at net.sf.picard.sam.MergingSamRecordIterator.hasNext(MergingSamRecordIterator.java:115)
    at net.sf.picard.sam.MergeSamFiles.doWork(MergeSamFiles.java:147)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.sam.MergeSamFiles.main(MergeSamFiles.java:79)


    So, can you not edit @RG tags after the file has been created? The very existence of samtools reheader seems to suggest that you can. Is there a way of doing this that does a better job and actually changes the read tags, assuming some kind of tag on each read is the problem?
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    Just use picard tools AddOrReplaceReadGroups. That'll make your life easier.

    Unlike many other things, the read group ID is stored as text in each alignment, so changing the header alone won't be sufficient. I agree that this is somewhat annoying.
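
    To make that concrete, here is a minimal Python sketch that works on the text (SAM) form of a file and renames a read group in both places: the @RG header line and the RG:Z: optional field on each alignment. The function name rename_read_group and the sample name sampleA are hypothetical, chosen only for illustration. It is not what AddOrReplaceReadGroups does internally, just a picture of why both edits are needed.

```python
def rename_read_group(sam_lines, old_id, new_id):
    """Yield SAM text lines with read group `old_id` renamed to `new_id`.

    Hypothetical helper for illustration; operates on plain SAM text only.
    """
    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@RG":
            # Header: the identifier is the ID: field of the @RG line.
            fields = [f"ID:{new_id}" if f == f"ID:{old_id}" else f
                      for f in fields]
        elif not line.startswith("@"):
            # Alignment: the 11 mandatory columns come first; optional
            # fields follow, and the read group is stored as RG:Z:<id>.
            fields[11:] = [f"RG:Z:{new_id}" if f == f"RG:Z:{old_id}" else f
                           for f in fields[11:]]
        yield "\t".join(fields)


# Toy input: one read group with ID 1, one read tagged with it.
sam = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "@RG\tID:1\tSM:sampleA",
    "read1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:1",
]

fixed = list(rename_read_group(sam, "1", "sampleA"))
for row in fixed:
    print(row)
```

    samtools reheader only performs the first of the two edits, which is why the records still carry RG:Z:1 afterwards.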

    • sasignor
      Member
      • Sep 2009
      • 28

      #3
      Yes, that seems to work. The existence of samtools reheader seems somewhat pointless.

      I still wish I knew more about how each tag is attached to each read, and how that is then interpreted down the line, but oh well.

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        It's fine if you just want to change chromosome names (e.g., from UCSC to Ensembl names).

        As for understanding how things are stored, this is partly covered in the SAM spec, in the section about the BAM format, but it's generally easier to remember if you've played with the samtools source code a bit.
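
        As a rough illustration in the text (SAM) form: the header declares read groups on @RG lines, and each alignment points at one of them through an RG:Z: optional field after the 11 mandatory columns. The Python sketch below collects both sets and reports IDs used on reads but missing from the header, which is the situation behind the "RG ID on SAMRecord not found in header: 1" error. The function name orphan_read_groups and the toy records are hypothetical; on a real BAM one would first convert to text, for example with samtools view -h.

```python
def orphan_read_groups(sam_lines):
    """Return read group IDs used on alignments but absent from the header.

    Hypothetical helper for illustration; operates on plain SAM text only.
    """
    header_ids = set()
    record_ids = set()
    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@RG":
            # Header declaration: @RG <TAB> ID:<id> <TAB> SM:<sample> ...
            header_ids.update(f[3:] for f in fields[1:]
                              if f.startswith("ID:"))
        elif not line.startswith("@"):
            # Per-read reference: optional field RG:Z:<id>, column 12 onward.
            record_ids.update(f[5:] for f in fields[11:]
                              if f.startswith("RG:Z:"))
    return record_ids - header_ids


# Toy file after a header-only edit: the header says sampleA,
# but the read still carries the old ID 1.
reheadered = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "@RG\tID:sampleA\tSM:sampleA",
    "read1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:1",
]

print(orphan_read_groups(reheadered))  # {'1'}
```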
