  • How to deal with this problem? Using ReorderSam on a BAM file

    ##### ERROR A USER ERROR has occurred (version 3.2-2-gec30cee):
    ##### ERROR
    ##### ERROR This means that one or more arguments or inputs in your command are incorrect.
    ##### ERROR The error message below tells you what is the problem.
    ##### ERROR
    ##### ERROR If the problem is an invalid argument, please check the online documentation guide
    ##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ##### ERROR
    ##### ERROR Visit our website and forum for extensive documentation and answers to
    ##### ERROR commonly asked questions
    ##### ERROR
    ##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ##### ERROR
    ##### ERROR MESSAGE: Lexicographically sorted human genome sequence detected in knownSites.
    ##### ERROR For safety's sake the GATK requires human contigs in karyotypic order: 1, 2, ..., 10, 11, ..., 20, 21, 22, X, Y with M either leading or trailing these contigs.
    ##### ERROR This is because all distributed GATK resources are sorted in karyotypic order, and your processing will fail when you need to use these files.
    ##### ERROR You can use the ReorderSam utility to fix this problem:
    ##### ERROR knownSites contigs = [chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chrM, chrX, chrY, chr9]
    ##### ERROR ------------------------------------------------------------------------------------------
    INFO 22:20:10,569 HelpFormatter - --------------------------------------------------------------------------------

    This error occurred when I ran BaseRecalibrator on the BAM file.
    I also used ReorderSam.jar to reorder the BAM file, but it still didn't work.
    java -jar /usr/local/software/picard-tools-1.105/ReorderSam.jar I=./PET19_malformed.bam O=./PET19_reorder.bam REFERENCE=/home/george/alignment/gatk_resource/ucsc.hg19.fasta CREATE_INDEX=True

  • #2
    Have you confirmed that the fasta file has the correct order?


    • #3
      I think the fasta file order is right, because if I use GATK directly to call SNPs on the same BAM file, everything seems to run OK. But if I use this file to run BaseRecalibrator, it reports this error.
      So it is very confusing.


      • #4
        Just because one program has no problem with the order does not mean that a different program will not have a problem with the order. The ERROR message is extremely clear (for a bioinformatics program) -- it tells you both the error and what to do about it.

        You say you ran ReorderSam.jar (as suggested) but the error persists. So we now need to do some heavier troubleshooting. As @dpryan asks, have you confirmed that your fasta file is in order? Run:

        grep '>' /home/george/alignment/gatk_resource/ucsc.hg19.fasta | more
        and show us the top several lines.

        It would also be nice to look inside your BAM file and see if column 3 is in correct order. I think that the following will help in this:

        samtools view ./PET19_reorder.bam | cut -f 3 | grep chr | uniq -c | more
        And show us the top several lines of that.
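
        Both checks can also be scripted rather than compared by eye. The sketch below is a minimal example, not part of GATK, Picard, or samtools: the function names are hypothetical, `known_sites.vcf` is a placeholder because the thread never names the knownSites file, and the VCF helper assumes `ID` is the first key in each `##contig` header line. Since the error text names knownSites, the VCF header is included alongside the BAM header.

        ```shell
        # Hypothetical helpers for comparing contig order between files.

        # Read a SAM header on stdin; print the @SQ sequence names in header order.
        sam_header_contigs() {
            awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }'
        }

        # Read a VCF on stdin; print the IDs of its ##contig header lines in order.
        # Assumption: ID is the first key inside <...>. Prints nothing if the
        # VCF has no ##contig lines.
        vcf_header_contigs() {
            sed -n 's/^##contig=<ID=\([^,>]*\).*/\1/p'
        }

        # Succeed if the contigs present in both lists appear in the same
        # relative order. $1 and $2 are files with one contig name per line.
        same_contig_order() {
            a=$(grep -Fxf "$2" "$1")
            b=$(grep -Fxf "$1" "$2")
            [ "$a" = "$b" ]
        }

        # Example usage (file names other than the reference and BAM are placeholders):
        #   cut -f 1 /home/george/alignment/gatk_resource/ucsc.hg19.fasta.fai > ref.contigs
        #   samtools view -H ./PET19_reorder.bam | sam_header_contigs > bam.contigs
        #   vcf_header_contigs < known_sites.vcf > vcf.contigs
        #   same_contig_order ref.contigs bam.contigs || echo "BAM order differs from reference"
        #   same_contig_order ref.contigs vcf.contigs || echo "VCF order differs from reference"
        ```

        Only contigs shared by both lists are compared, so an extra contig in one file does not by itself count as a mismatch.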


        • #5
          @westerman, thanks. Below are the results.
          $ /usr/local/software/samtools-0.1.13/samtools view ./301-10.rmdup.bam | cut -f 3 | grep chr | uniq -c | more
          3175288 chrM
          568694 chr1
          459726 chr2
          351916 chr3
          225789 chr4
          493266 chr5
          691892 chr6
          348049 chr7
          262928 chr8
          138590 chr9
          275147 chr10
          209938 chr11
          202256 chr12
          150772 chr13
          126969 chr14
          193823 chr15
          147602 chr16
          167425 chr17
          104782 chr18
          73551 chr19
          87570 chr20
          43880 chr21
          95365 chr22
          529116 chrX
          4396 chrY
          28 chr1_gl000191_random

          The hg19 reference is also in the right order.
          $ head -10 /media/Analysis/gatk_resource/ucsc.hg19.fasta.fai
          chrM 16571 6 50 51
          chr1 249250621 16915 50 51
          chr2 243199373 254252555 50 51
          chr3 198022430 502315922 50 51
          chr4 191154276 704298807 50 51
          chr5 180915260 899276175 50 51
          chr6 171115067 1083809747 50 51
          chr7 159138663 1258347122 50 51
          chr8 146364022 1420668565 50 51
          chr9 141213431 1569959874 50 51


          • #6
            Certainly looks good. The only thing that is the least bit strange is 'chr1_gl000191_random' in the BAM file. Unfortunately I have no more troubleshooting advice.


