  • problem with auto_annovar

    I am trying to analyze some exome data using the auto_annovar program. It works brilliantly until step 8. After step 8 it generates a file, but then it stops. During step 8 processing it says "fGrep: Writing output", and the final message is "fGrep : Write error". The step 8 file contains the variants but no gene names.
    I am not able to map the variants to the relevant genes. I checked my humandb database and it seems to contain all the relevant files. Can anyone please help me with this? Thanks a lot.
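    A write error from fgrep is reported when the output cannot be written, and one common reason (only a guess here, since the thread does not confirm the cause) is that the filesystem holding the output directory is full or over quota. A minimal sketch for checking free space before rerunning step 8; the helper name `free_gib` is made up for illustration:

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # "." is a placeholder; point it at the directory where the step files are written.
    print(f"free space: {free_gib('.'):.1f} GiB")
```

    If the reported value is close to zero, freeing space or writing the output elsewhere is worth trying before looking at the databases.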

  • #2

    I am experiencing an issue with the auto_annovar script too.
    I have been using annovar for a few months. Until now, I ran the script separately with different parameters: -geneanno, -regionanno, -filter, ...
    Now I would like to try the auto_annovar script, but I am stuck at the first step:

    perl -model recessive $dir/$file humandb -build hg19 -step 1
    NOTICE: the --ver1000g argument is set as '1000g2010nov' by default
    Error: the required database file humandb/hg19_ALL.sites.2010_11.txt does not exist. Please download it via -downdb argument by
    I cannot find out what this hg19_ALL.sites.2010_11.txt database is... I checked the script and it is not mentioned in it.
    I tried to download it:
    perl -downdb hg19_ALL.sites.2010_11.txt humandb -build hg19
    NOTICE: Downloading annotation database ... Failed
    WARNING: Some files cannot be downloaded, including
    perl -downdb ALL.sites.2010_11.txt humandb -build hg19
    NOTICE: Downloading annotation database ... Failed
    WARNING: Some files cannot be downloaded, including
    Does anyone know what this file is and where I can find it?


    ps: I am using the version of March 2012.


    • #3
      Try to include "-webfrom annovar" when downloading. Also, try "1000g2010nov" instead of "hg19_ALL.sites.2010_11.txt".


      • #4
        Thank you for your answer; your remark put me on the right track. I had downloaded 1000g2012apr rather than 1000g2010nov, so I only needed to change the name inside the script.
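        To see which 1000 Genomes release is actually present before editing the script or choosing a --ver1000g value, the humandb directory can be listed. A minimal sketch: the glob pattern is generalized from the file name in the error message (hg19_ALL.sites.2010_11.txt), and it is an assumption that other releases follow the same naming. The function name `list_1000g_files` is made up for illustration.

```python
from pathlib import Path


def list_1000g_files(humandb, build="hg19"):
    """List files in `humandb` that look like 1000 Genomes site files.

    The pattern "<build>_*.sites.*.txt" is an assumption based on the
    name hg19_ALL.sites.2010_11.txt from the error message.
    """
    pattern = f"{build}_*.sites.*.txt"
    return sorted(p.name for p in Path(humandb).glob(pattern))


if __name__ == "__main__":
    # "humandb" is the database directory used in the commands above.
    for name in list_1000g_files("humandb"):
        print(name)
```

        If the only file listed carries a different date than the one the script asks for, the release that was downloaded differs from the script's default.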

        I managed to run the first three steps, but I don't really understand the outputs. I get files for almost all the steps (step 4, 7, 8, 9) and I don't know why; maybe they are produced by default. Moreover, I cannot open the .genelist file, which seems to be the final file.

        perl -model recessive $dir/$file humandb -build hg19 -step 1-3
        NOTICE: the --ver1000g argument is set as '1000g2012apr' by default

        NOTICE: Running step 1 with system command <perl -geneanno -buildver hg19 -dbtype refgene -outfile GAR/mutect_file.step1 GAR/mutect_file humandb>
        NOTICE: Reading gene annotation from humandb/hg19_refGene.txt ... Done with 40514 transcripts (including 6759 without coding sequence annotation) for 23468 unique genes
        NOTICE: Reading FASTA sequences from humandb/hg19_refGeneMrna.fa ... Done with 63 sequences
        WARNING: A total of 271 sequences will be ignored due to lack of correct ORF annotation
        NOTICE: Finished gene-based annotation on 60 genetic variants in GAR/mutect_file
        NOTICE: Output files were written to GAR/mutect_file.step1.variant_function, GAR/mutect_file.step1.exonic_variant_function

        NOTICE: Running step 2 with system command <perl -regionanno -dbtype mce46way -buildver hg19 -outfile GAR/mutect_file.step2 GAR/mutect_file.step2.varlist humandb>
        NOTICE: Reading annotation database humandb/hg19_phastConsElements46way.txt ... Done with 5163775 regions
        NOTICE: Finished region-based annotation on 26 genetic variants in GAR/mutect_file.step2.varlist
        NOTICE: Output files were written to GAR/mutect_file.step2.hg19_phastConsElements46way

        NOTICE: Running step 3 with system command <perl -regionanno -dbtype segdup -buildver hg19 -outfile GAR/mutect_file.step3 GAR/mutect_file.step3.varlist humandb>
        NOTICE: Reading annotation database humandb/hg19_genomicSuperDups.txt ... Done with 51599 regions
        NOTICE: Finished region-based annotation on 24 genetic variants in GAR/mutect_file.step3.varlist
        NOTICE: Output files were written to GAR/mutect_file.step3.hg19_genomicSuperDups

        NOTICE: Running step 8 with system command <fgrep -f GAR/mutect_file.step8.varlist GAR/mutect_file.step1.exonic_variant_function | cut -f 2- > GAR/mutect_file.step8;cut -f 3- GAR/mutect_file.step8 > GAR/mutect_file.step8.temp;fgrep -v -f GAR/mutect_file.step8.temp GAR/mutect_file.step8.varlist > GAR/mutect_file.step8.temp1;fgrep -f GAR/mutect_file.step8.temp1 GAR/mutect_file.step1.variant_function >> GAR/mutect_file.step8;>

        NOTICE: a list of potentially important genes and the number of variants in them are written to GAR/mutect_file.genelist
        NOTICE: Consider filter out the list of dispensable genes from the GAR/mutect_file.genelist file to identify the final candidate gene list.
        What do you get and is the .genelist file the final result?
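        For anyone trying to understand the step 8 output, the fgrep/cut chain shown in the log can be read as four operations. The sketch below mirrors them on in-memory lists of lines. It is a reading of the logged shell command, not code taken from auto_annovar itself, and the function names and example rows are made up for illustration.

```python
def fgrep(patterns, lines, invert=False):
    """Keep lines containing any pattern as a fixed substring (like fgrep -f).

    With invert=True, keep the lines that contain none of them (fgrep -v -f).
    """
    return [ln for ln in lines if any(p in ln for p in patterns) != invert]


def cut_from(line, first_field):
    """Like `cut -f N-`: keep tab-separated fields from N onward.

    Lines without any tab are passed through unchanged, as cut does.
    """
    fields = line.split("\t")
    if len(fields) == 1:
        return line
    return "\t".join(fields[first_field - 1:])


def step8(varlist, exonic_lines, variant_lines):
    # 1. exonic_variant_function lines matching the variant list, first column dropped
    out = [cut_from(ln, 2) for ln in fgrep(varlist, exonic_lines)]
    # 2. the variant part of those lines (fields 3 onward)
    temp = [cut_from(ln, 3) for ln in out]
    # 3. variants from the list that got no exonic annotation
    temp1 = fgrep(temp, varlist, invert=True)
    # 4. for those, append the matching variant_function lines instead
    return out + fgrep(temp1, variant_lines)


if __name__ == "__main__":
    # Hypothetical rows, only to show the data flow.
    varlist = ["1\t100\t100\tA\tG", "2\t200\t200\tC\tT"]
    exonic = ["line1\tnonsynonymous SNV\tGENE1:NM_0001\t1\t100\t100\tA\tG"]
    variant = ["exonic\tGENE1\t1\t100\t100\tA\tG",
               "splicing\tGENE2\t2\t200\t200\tC\tT"]
    for row in step8(varlist, exonic, variant):
        print(row)
```

        Read this way, the step 8 file holds one annotated line per surviving variant: the exonic annotation when there is one, and the general variant_function line otherwise.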


        • #5
          I don't actually use the auto_annovar script, I just use the script and do my own filtering from that, so I'm not sure exactly what you should get.


          • #6
            Ok, thanks. I will try script since I don't really want to follow all the steps of

