  • snpEff eff with built genome database

    Hello,

    I am having trouble annotating my VCF file using a database that I built. It seems as though snpEff is trying to retrieve a prebuilt database from sourceforge.net instead. Can someone help me identify what my problem is?

    Here is my code
    java -Xmx4g -jar /pkgs/snpeff-4.3.1p-1/share/snpeff-4.3.1p-1/snpEff.jar eff -c /pkgs/snpeff-4.3.1p-1/share/snpeff-4.3.1p-1/snpEff.config Sorghum EMSmutboth.vcf > var.ann.vcf

    Here is the error I continue to receive:
    ERROR while connecting to http://downloads.sourceforge.net/pro..._3_Sorghum.zip
    java.lang.RuntimeException: java.lang.RuntimeException: File not found on the server. Make sure the database name is correct.
    at org.snpeff.util.Download.download(Download.java:178)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.downloadAndInstall(SnpEffCmdDownload.java:32)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.runDownloadGenome(SnpEffCmdDownload.java:86)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.run(SnpEffCmdDownload.java:72)
    at org.snpeff.SnpEff.run(SnpEff.java:1221)
    at org.snpeff.SnpEff.loadDb(SnpEff.java:515)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:998)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:981)
    at org.snpeff.SnpEff.run(SnpEff.java:1183)
    at org.snpeff.SnpEff.main(SnpEff.java:162)
    Caused by: java.lang.RuntimeException: File not found on the server. Make sure the database name is correct.
    at org.snpeff.util.Download.download(Download.java:127)
    ... 9 more
    java.lang.RuntimeException: Genome download failed!
    at org.snpeff.SnpEff.loadDb(SnpEff.java:516)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:998)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:981)
    at org.snpeff.SnpEff.run(SnpEff.java:1183)
    at org.snpeff.SnpEff.main(SnpEff.java:162)

    I built a database that seemed to work, so I am not sure what information I am missing. When I built the database I used the following command:
    java -jar location/to/snpEff.jar build -gff3 -v Sorghum

    Thank you for any help

  • #2
    Hello,

    What does your config file look like? I guess snpEff cannot find your database and so tries to download it.

    Be sure to link to your local database in the config file or use the -dataDir flag to specify the folder where the database is located.

    fin swimmer
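
A minimal sketch of the -dataDir suggestion above, assuming a POSIX shell. The helper name `db_installed` and the `/path/to/snpEff/data` location are placeholders, not part of snpEff. It only checks whether the database file that snpEff looks for is present; the actual `eff` call is left commented out so the snippet runs without snpEff installed.

```shell
# Hypothetical helper: db_installed DATA_DIR GENOME
# Returns 0 if the database file for GENOME exists under DATA_DIR.
db_installed() {
    [ -f "$1/$2/snpEffectPredictor.bin" ]
}

# Assumed locations; adjust to your installation.
DATA_DIR="${DATA_DIR:-/path/to/snpEff/data}"
GENOME="${GENOME:-Sorghum}"

if db_installed "$DATA_DIR" "$GENOME"; then
    echo "Found database for $GENOME in $DATA_DIR"
    # With the database in place, point snpEff at it explicitly:
    # java -Xmx4g -jar snpEff.jar eff -dataDir "$DATA_DIR" -v "$GENOME" EMSmutboth.vcf > var.ann.vcf
else
    echo "No snpEffectPredictor.bin for $GENOME in $DATA_DIR"
fi
```

If the file is missing, a download attempt like the one in the first post is the expected outcome, so it is worth running this check before `eff`.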

    • #3
      fin swimmer,

      Thank you for your reply.

      This is what I have in my config file:

      #Sorghum bicolor genome, version 3.0.1
      Sorghum.genome : Sorghum

      And this is the error I am getting when running snpEff eff:
      Reading configuration file '/home/pkgs/snpeff-4.3.1p-1/share/snpeff-4.3.1p-1/snpEff.config'. Genome: 'Sorghum'
      00:00:00 Reading config file: /home/pkgs/snpeff-4.3.1p-1/share/snpeff-4.3.1p-1/snpEff.config
      00:00:01 done
      00:00:01 Reading database for genome version 'Sorghum' from file '/home/pkgs/snpeff-4.3.1p-1/share/snpeff-4.3.1p-1/./data/Sorghum/snpEffectPredictor.bin' (this might take a while)
      00:00:01 Database not installed

      It seems that the snpEffectPredictor.bin file was never created, which leads me to believe my database wasn't built correctly. However, when I build the database I do not receive any major errors indicating that the build failed.

      Are you familiar with this problem?
      Thanks for replying to the original thread. I hope you know what is causing this error.
      htetre

      • #4
        Hello htetre,

        snpEff tries to find your database at this path:
        /home/pkgs/snpeff-4.3.1p-1/share/snpeff-4.3.1p-1/./data/Sorghum/snpEffectPredictor.bin
        As you can see, there is a '.' in the path. You can define an absolute path for data_dir in the config file, or use the -dataDir flag as I suggested in my first post, to point to the correct directory.

        fin swimmer
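
To see which directory a given config file points to, something like the following may help. It is a sketch under assumptions: the helper name `resolve_data_dir` is made up, the setting is assumed to appear as a `data.dir = <path>` line (the form used in the stock snpEff.config), and a relative value is assumed to be taken relative to the directory holding the config file, which is what the log in #3 suggests. A leading `~` is not handled.

```shell
# Hypothetical helper: resolve_data_dir CONFIG_FILE
# Prints the data directory named by the first "data.dir = ..." line.
resolve_data_dir() {
    config=$1
    config_dir=$(dirname "$config")
    # Strip the "data.dir =" prefix and keep the first matching value.
    value=$(sed -n 's/^[[:space:]]*data\.dir[[:space:]]*=[[:space:]]*//p' "$config" | head -n 1)
    case $value in
        /*) printf '%s\n' "$value" ;;               # absolute path: use as is
        *)  printf '%s\n' "$config_dir/$value" ;;   # relative: prefix the config's directory
    esac
}
```

Called as `resolve_data_dir /path/to/snpEff.config`, a relative setting such as `./data/` comes back as `/path/to/./data/`, the same shape as the path in the log output above.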

        • #5
          Thank you fin swimmer,

          I have changed that now, thanks to you, and it is definitely going to the correct path. However, I have determined that my database was never built correctly: I do not have the snpEffectPredictor.bin file. When I try to 'find' it anywhere on the cluster, it is not present.

          Have you heard of that problem? Executing the following:
          java -jar snpEff.jar build -gff3 -v Sorghum
          starts the program and it is finding my genes.gff file but it is not following through and creating a predictor file.

          Thank you
          htetre

          • #6
            Hello,

            I've never built a new database, so I cannot help here. Do you really need to build your own database? snpEff seems to have a prebuilt database for Sorghum.

            fin swimmer

            • #7
              Well, my VCF file is based on the most recent Sorghum genome assembly, and that assembly is not part of snpEff. But given the difficulty I am having building the genome database, I'm thinking of redoing the analysis with an older assembly version that is among the available snpEff databases. I was trying not to, but it seems the problem I am having is uncommon or unfamiliar, because I haven't been able to find any information on it.

              Thanks
              Hannah

              • #8
                Hello,

                I have found my problem, and it resides in the GFF file I have been using. It only contained the gene information, not the gene_exon information. Once I used the correct GFF file, the genome database was built.
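
A quick way to spot this kind of GFF problem before running `build` is to tally the feature types in column 3. The sketch below assumes a tab-separated GFF3 file and reads "gene_exon information" as exon or CDS rows; both helper names are made up for illustration.

```shell
# Hypothetical helper: gff_feature_counts FILE
# Prints "type<TAB>count" for each feature type, skipping '#' lines.
gff_feature_counts() {
    awk -F '\t' '!/^#/ && NF >= 9 { n[$3]++ } END { for (t in n) print t "\t" n[t] }' "$1" | sort
}

# Hypothetical helper: gff_has_exons FILE
# Returns 0 if the file contains at least one exon or CDS feature.
gff_has_exons() {
    awk -F '\t' '!/^#/ && ($3 == "exon" || $3 == "CDS") { found = 1 } END { exit !found }' "$1"
}
```

For example, `gff_feature_counts data/Sorghum/genes.gff` on a file that lists only `gene` rows would show a single feature type, which is a hint that transcript structure is missing.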
