  • Missing header information in BAM causes GATK UnifiedGenotyper to fail

    Hi, I have mitochondrial sequence data that I aligned with BWA twice: once against a chrM-only reference, and once against HG19.

    The header of the BAM file from the chrM-only alignment looks like:
    @SQ SN:chrM LN:16569

    When I run GATK UnifiedGenotyper on it, I get this error message:
    ##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/home3/guoy1/CaiMitochondria/analysis_chrM/SNPcall/../out.bam} is malformed: Read HWUSI-EAS614_0001:5:1:1051:19990#0 is missing the read group, which is required by the GATK

    But when I use the BAM file aligned against HG19, the header looks like:
    @SQ SN:chr1 LN:249250621
    @SQ SN:chr2 LN:243199373
    @SQ SN:chr3 LN:198022430
    @SQ SN:chr4 LN:191154276
    @SQ SN:chr5 LN:180915260
    @SQ SN:chr6 LN:171115067
    @SQ SN:chr7 LN:159138663
    @SQ SN:chr8 LN:146364022
    @SQ SN:chr9 LN:141213431
    @SQ SN:chr10 LN:135534747
    @SQ SN:chr11 LN:135006516
    @SQ SN:chr12 LN:133851895
    @SQ SN:chr13 LN:115169878
    @SQ SN:chr14 LN:107349540
    @SQ SN:chr15 LN:102531392
    @SQ SN:chr16 LN:90354753
    @SQ SN:chr17 LN:81195210
    @SQ SN:chr18 LN:78077248
    @SQ SN:chr19 LN:59128983
    @SQ SN:chr20 LN:63025520
    @SQ SN:chr21 LN:48129895
    @SQ SN:chr22 LN:51304566
    @SQ SN:chrX LN:155270560
    @SQ SN:chrY LN:59373566
    @SQ SN:chrM LN:16571

    And UnifiedGenotyper runs without error. Neither BAM file has read group information in its header, so why would UnifiedGenotyper complain only about the chrM-only one?

    If anyone knows anything about this, please let me know.

    Thanks
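
The error message above refers to a read that carries no read group, which in SAM terms means an alignment record without an `RG:Z:` tag. The following is a minimal sketch, assuming the BAM has first been converted to SAM text (for example with `samtools view -h`), of how one might list the declared `@RG` IDs and the reads lacking an `RG:Z:` tag. The function name and the two sample records are hypothetical and are not taken from the poster's data.

```python
def reads_missing_read_group(sam_lines):
    """Return (set of @RG IDs declared in the header, names of reads with no RG tag)."""
    declared = set()
    missing = []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: @RG lines declare a read group through an ID: field.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        declared.add(field[3:])
            continue
        # Alignment record: the first 11 columns are mandatory,
        # optional TAG:TYPE:VALUE fields follow.
        if not any(tag.startswith("RG:Z:") for tag in fields[11:]):
            missing.append(fields[0])
    return declared, missing


# Made-up example: one read tagged with a read group, one without.
example = [
    "@SQ\tSN:chrM\tLN:16569",
    "@RG\tID:grp1\tSM:sample1",
    "readA\t0\tchrM\t1\t37\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:grp1",
    "readB\t0\tchrM\t5\t37\t4M\t*\t0\t0\tACGT\tIIII",
]
declared, missing = reads_missing_read_group(example)
print("declared read groups:", declared)
print("reads without RG tag:", missing)
```

Running the same check on both BAM files would show whether they really differ in their per-read tags, which the `@SQ` lines alone cannot tell.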

  • #2
    Is this related to this: "BWA patch to generate read group"

    • #3
      My advice is to use the pre-built genomes in the gatk_resources.tgz file, which you can obtain from their site. In my experience the program is too finicky if you try using anything else. Also, you do need RG (read group) tags for it to work properly regardless. The link lshen contributed explains how to add them downstream if you didn't add them during alignment. Good luck!
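
Adding read groups after alignment amounts to declaring an `@RG` line in the header and tagging every read with a matching `RG:Z:` value. Below is a rough sketch of that idea on SAM text, not the method from the linked page. The function name, the read group ID `grp1`, and the sample name `sample1` are all hypothetical, and which `@RG` fields GATK actually requires is not covered here. In practice a dedicated tool such as Picard's AddOrReplaceReadGroups would normally be used instead of hand-written code.

```python
def add_read_group(sam_lines, rg_id, sample):
    """Assign every read to a single read group and declare it in the header."""
    header = []
    records = []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Drop any existing @RG lines; they are replaced by the new one.
            if line.split("\t")[0] != "@RG":
                header.append(line)
            continue
        fields = line.split("\t")
        # Keep the 11 mandatory columns, remove any old RG tag, add the new one.
        optional = [tag for tag in fields[11:] if not tag.startswith("RG:Z:")]
        records.append("\t".join(fields[:11] + optional + ["RG:Z:" + rg_id]))
    header.append("@RG\tID:" + rg_id + "\tSM:" + sample)
    return header + records


# Made-up example: a chrM-only header and one untagged read.
example = [
    "@SQ\tSN:chrM\tLN:16569",
    "readA\t0\tchrM\t1\t37\t4M\t*\t0\t0\tACGT\tIIII",
]
tagged = add_read_group(example, "grp1", "sample1")
print("\n".join(tagged))
```

The sketch places all reads in one read group, which only makes sense when the file holds a single sample from a single sequencing run.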

      • #4
        Originally posted by lshen
        Is this related to this: "BWA patch to generate read group"

        http://www.broadinstitute.org/gsa/wi...ate_read_group
        Hello-

        I've never used patch before. It told me 9 of 9 hunks (and then in subsequent attempts 7 of 7 hunks) failed. When I tried recompiling I got the following errors:

        make[1]: Entering directory `/home/arup/software/bwa-0.5.8c'
        make[1]: Nothing to be done for `lib'.
        make[1]: Leaving directory `/home/arup/software/bwa-0.5.8c'
        make[1]: Entering directory `/home/arup/software/bwa-0.5.8c/bwt_gen'
        make[1]: Nothing to be done for `lib'.
        make[1]: Leaving directory `/home/arup/software/bwa-0.5.8c/bwt_gen'
        gcc -c -g -Wall -O2 -m64 -DHAVE_PTHREAD bntseq.c -o bntseq.o
        In file included from bntseq.c:32:
        bntseq.h:87: error: conflicting types for ‘read_group_t’
        bntseq.h:74: error: previous declaration of ‘read_group_t’ was here
        bntseq.h:100: error: conflicting types for ‘read_group_t’
        bntseq.h:87: error: previous declaration of ‘read_group_t’ was here
        make: *** [bntseq.o] Error 1

        Any suggestions?
