  • read group: GATK or BWA option?

    Hi NGS users,
    when I run the GATK quality score recalibration, it returns this error about the read group:

    SAM/BAM file SAMFileReader{/dati1/Analysis_NGS/prova_s8/prova_GATK_localrealignment.bam.sorted.bam} is malformed: The input .bam file contains reads with no read group.

    So I tried these options with an arbitrary string, and the program runs:

    --default_platform Illumina --default_read_group HWUSI-EAS703

    I don't know if this is correct, or if I have to rerun bwa with the patch and use these options:

    -i read group identifier (ID)
    -m read group sample (SM), required if ID is given
    -l read group library (LB)
    -p read group platform (PL)

    Any suggestions?
    Thanks a lot,
    ME

  • #2
    Hi,

    I had this issue, and after patching bwa to the latest update I ran sampe with

    -r STR read group header line such as `@RG\tID:foo\tSM:bar' [null]

    I specify ID, SM, LB and PL. So far no trouble with GATK on the downstream files.



    • #3
      It depends!

      If you have one lane/group, using the GATK default_read_group is as good as using the patched bwa.
      But if you are analyzing multiple lanes/groups, the patched bwa should be used. GATK recalibration requires a different group name for each group in order to analyze the groups separately, and default_read_group assigns only one default group name to all reads. This can affect the GATK recalibration scores.
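To illustrate the multiple-lane case, here is a minimal shell sketch that builds a distinct @RG string for each lane. The helper name make_rg, the sample and library names, and the file names in the comment are placeholders, not taken from this thread; the bwa call itself is left as a comment.

```shell
# Build one @RG header string per lane so each lane gets its own read group.
# All names below are placeholders.
make_rg() {
    # $1 = read group ID, $2 = sample, $3 = library, $4 = platform
    # \\t in the printf format prints a literal backslash followed by t
    printf '@RG\\tID:%s\\tSM:%s\\tLB:%s\\tPL:%s' "$1" "$2" "$3" "$4"
}

for lane in 1 2; do
    rg=$(make_rg "sampleA_lane${lane}" sampleA libA illumina)
    printf '%s\n' "$rg"
    # The string would then be passed to the patched bwa, for example:
    #   bwa sampe -r "$rg" ref.fa lane${lane}_1.sai lane${lane}_2.sai \
    #       lane${lane}_1.fastq lane${lane}_2.fastq > lane${lane}.sam
done
```

Both lanes share the same SM and LB here but get different IDs, which is what lets the recalibration treat them as separate groups.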




      • #4
        malformed @RG line

        Hi NGSer,
        I ran into the problem that my read group header is not accepted by bwa, and I don't know why.

        Here's my code:

        Code:
        bwa sampe -r @RG ID:ILLUMINA-52179E_0039_FC62HDBAAXX_1_1 SM:48_2 PL:illunmina ./testingscripts/chrY.fa ./testingscripts/48_2_1_KESC1_mymod.sai./testingscripts/48_2_2_KESC1_mymod.sai ./testingscripts/48_2_1_KESC1_mymod.fastq ./testingscripts/48_2_2_KESC1_mymod.fastq > ./testingscripts/chrYvs48_2_1_KESC1_mymod_48_2_2_KESC1_mymod.sam
        Any help would be nice. I also tested with an explicit \t between the tags, but with no success. Thanks in advance, Aicen



        • #5
          You should separate the tags with tabs instead of spaces; just put \t between the tags,
          like:
          Code:
          bwa sampe -r "@RG\tID:ILLUMINA-52179E_0039_FC62HDBAAXX_1_1\tSM:48_2\tPL:illumina" ./testingscripts/chrY.fa ./testingscripts/48_2_1_KESC1_mymod.sai ./testingscripts/48_2_2_KESC1_mymod.sai ./testingscripts/48_2_1_KESC1_mymod.fastq ./testingscripts/48_2_2_KESC1_mymod.fastq > ./testingscripts/chrYvs48_2_1_KESC1_mymod_48_2_2_KESC1_mymod.sam
          Last edited by ulz_peter; 10-18-2011, 02:25 AM. Reason: forgot one tab



          • #6
            " - this was missing

            My string was missing the " at the start and end, and I needed to write \\t in my script so that the command-line statement came out right. Thanks for your help.
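A small sketch of why the quotes matter, using a hypothetical helper count_args that only reports how many arguments the shell passes on. Without quotes the header is split at each space, so only "@RG" would reach -r; with quotes and \t it stays a single argument. The need for \\t is an assumption about the calling script: if the command is assembled inside another language's string, that language may consume one level of backslash before the shell sees it.

```shell
# Hypothetical helper: prints the number of arguments it receives
count_args() { echo "$#"; }

# Unquoted: the shell splits on spaces, so "@RG" alone would become the -r value
unquoted=$(count_args @RG ID:foo SM:bar PL:illumina)

# Quoted, with \t between tags: one argument containing literal backslash-t pairs
quoted=$(count_args "@RG\tID:foo\tSM:bar\tPL:illumina")

printf 'unquoted: %s argument(s)\nquoted: %s argument(s)\n' "$unquoted" "$quoted"
```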



            • #7
              With BWA output, I usually use Picard's AddOrReplaceReadGroups.jar before feeding the BAM file to GATK. You can provide all the necessary read group details to ensure GATK works.
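For reference, a rough sketch of what such a Picard call could look like, written as a dry run that only prints the command. The Picard tool is named AddOrReplaceReadGroups; the file names and read group values are placeholders, and the KEY=value option style shown is the older Picard syntax, so check it against the version you have installed.

```shell
# Dry run: print the Picard command rather than executing it.
# File names and read group values are placeholders.
in_bam=aligned.sorted.bam
out_bam=aligned.sorted.rg.bam

cmd="java -jar AddOrReplaceReadGroups.jar INPUT=$in_bam OUTPUT=$out_bam RGID=SAMPLE1_RG1 RGLB=SAMPLE1_LIB1 RGPL=illumina RGPU=SAMPLE1_RG1_UNIT1 RGSM=SAMPLE1"
printf '%s\n' "$cmd"
```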



              • #8
                bwa

                Well, at the moment it seems to work fine. Have you had any experience where bwa's -r option failed? I don't want to draw the short straw at the end of the work and realise that something didn't work with the RG header.

                I am using GATK for further variant calling, by the way.
                Last edited by Aicen; 10-18-2011, 10:52 AM.



                • #9
                  To have a bam that can be used in GATK, use this RG header (note the "\t")

                  bwa sampe -r @RG"\t"ID:SAMPLE1_RG1"\t"PL:illumina"\t"PU:SAMPLE1_RG1_UNIT1"\t"LB:SAMPLE1_LIB1"\t"SM:SAMPLE1
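As a quick check, the piecewise quoting above and a single quoted string give the shell the same value; a shortened header is used here for illustration.

```shell
# Each "\t" quoted separately, as in the command above
a=@RG"\t"ID:SAMPLE1_RG1"\t"PL:illumina

# The whole header in one pair of double quotes
b="@RG\tID:SAMPLE1_RG1\tPL:illumina"

# Both hold literal backslash-t pairs, so the comparison succeeds
[ "$a" = "$b" ] && printf 'same string: %s\n' "$a"
```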



                  • #10
                    Originally posted by ulz_peter
                    You should separate the tags with tabs instead of spaces; just put \t between the tags,
                    like:
                    Code:
                    bwa sampe -r "@RG\tID:ILLUMINA-52179E_0039_FC62HDBAAXX_1_1\tSM:48_2\tPL:illumina" ./testingscripts/chrY.fa ./testingscripts/48_2_1_KESC1_mymod.sai ./testingscripts/48_2_2_KESC1_mymod.sai ./testingscripts/48_2_1_KESC1_mymod.fastq ./testingscripts/48_2_2_KESC1_mymod.fastq > ./testingscripts/chrYvs48_2_1_KESC1_mymod_48_2_2_KESC1_mymod.sam
                    Hello ulz_peter,

                    I tried writing the code as you suggested, but without success.

                    @HWI-ST741:204: D0TEJACXX:8:1101:1107:1901 1:N:0: BC=ACGTAA SAMPLE=1 LENGTH=64 MEAN_QUAL=38.6

                    This is a read header from my sample. Can you please help me write the code for it?

                    Thanks in advance.
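One possible way to derive read group fields from a read name like the one above, assuming it follows the usual Illumina layout instrument:run:flowcell:lane:tile:x:y and that the space after "204:" is a paste artefact. Using flowcell.lane for ID and PU is a common convention rather than a requirement, and the SM and LB values here are hypothetical.

```shell
# Read name from the post, with the space after "204:" removed (assumed paste artefact)
header='@HWI-ST741:204:D0TEJACXX:8:1101:1107:1901 1:N:0:'

name=${header%% *}   # keep the part before the first space
name=${name#@}       # drop the leading @

# Split on ":" into instrument, run, flowcell, lane, ...
old_ifs=$IFS
IFS=:
set -- $name
IFS=$old_ifs
flowcell=$3
lane=$4

# SM and LB are hypothetical placeholders; take the real values from your sample sheet
rg="@RG\tID:${flowcell}.${lane}\tSM:sample1\tLB:lib1\tPL:illumina\tPU:${flowcell}.${lane}"

# printf '%s' keeps each \t as two literal characters, the form passed to bwa -r in this thread
printf '%s\n' "$rg"
```

The resulting string could then be passed as bwa sampe -r "$rg" followed by the reference, .sai and FASTQ files, as in the earlier posts.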
