  • How to import my own genome sequence into SeqMonk?

    I'm trying to start a new project and import my own genome sequence database into SeqMonk. However, I can't browse my computer to select it: SeqMonk only lets me download and import genomes from its own list.

    Is there any way I can import my own genome sequence database into SeqMonk?

  • #2
    Yes, when you download SeqMonk you will find a text file called CREATING_CUSTOM_GENOMES.txt explaining the procedure in great detail.

    Hope this helps.

    • #3
      I didn't find such a file. The download gave me only one file, which is the software itself: when I click it, the program starts.

      Could you please post the procedure to create custom genomes if possible?

      Thanks.

      • #4
        You can find the custom genome creation file if you download the Windows/Linux zip file. For your convenience I'll attach it here as well.
        Attached Files

        • #5
          Got the file. Thanks a lot.

          • #6
            Custom genome

            Slny,

            I am also trying to use SeqMonk with my custom genome but keep getting an error. Were you able to use SeqMonk with a custom genome?
            Thanks

            • #7
              Hi,
              Thanks for developing SeqMonk; it is a great tool for managing my sequencing data.
              I'm trying to create a folder with my custom genome, but I haven't been able to do it yet.
              Could you help me?
              My steps were:
              1) I downloaded the citrus genome from http://citrus.hzau.edu.cn/cgi-bin/gb2/gbrowse/orange/ in GenBank format.
              2) I converted the GenBank file into EMBL format with this script:

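              #!/usr/local/bin/perl
              # gb2embl.pl: convert a GenBank file to EMBL format with BioPerl.
              use strict;
              use warnings;
              use Bio::SeqIO;

              # Expect exactly two arguments: the GenBank input and the EMBL output.
              if (@ARGV != 2) { die "USAGE: gb2embl.pl <input.gb> <output.embl>\n"; }

              my $seqio  = Bio::SeqIO->new(-format => 'genbank', -file => $ARGV[0]);
              my $seqout = Bio::SeqIO->new(-format => 'embl',    -file => ">$ARGV[1]");

              # Copy every record from the input to the output.
              while (my $seq = $seqio->next_seq) {
                  $seqout->write_seq($seq);
              }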
              3) I changed the AC line.

              At this point the program returns the message:
              "no data was present in the imported genome"
              I don't understand which lines I should modify.

              Thank you all for your help!

              • #8
                Without being able to see one of the files you've created, it's difficult to know what's gone wrong.

                Can you put your genome files somewhere I can see them? If I can have a look at the files, I can figure out why SeqMonk isn't recognising them.

                • #9
                  Hi Simon,
                  Thanks for your quick reply.
                  I can show you the head of the .embl file for chr1:


                  ID unknown; SV 1; linear; unassigned DNA; STD; UNC; 28800734 BP.
                  XX
                  AC unknown;
                  XX
                  DT 22-Feb-2013
                  XX
                  XX
                  XX
                  FH Key Location/Qualifiers
                  FH
                  FT scaffold 1..196955
                  FT /name="scaffold_0255"
                  FT scaffold 196978..818715
                  FT /name="scaffold_0155"
                  FT scaffold 818738..1870313
                  FT /name="scaffold_0091"
                  FT scaffold complement(1870336..8756891)
                  FT /name="scaffold_0002"
                  FT scaffold 8756914..10191576
                  FT /name="scaffold_0067"
                  FT scaffold 10191599..12196287
                  FT /name="scaffold_0044"
                  FT scaffold 12196310..12455131
                  FT /name="scaffold_0224"
                  FT scaffold 12455154..12524877
                  FT /name="scaffold_0342"
                  FT scaffold 12524900..13254358
                  FT /name="scaffold_0131"
                  FT scaffold complement(13254381..13838699)
                  FT /name="scaffold_0162"
                  FT scaffold 13838722..14955534
                  FT /name="scaffold_0083"
                  FT scaffold complement(14955557..17624236)
                  FT /name="scaffold_0029"
                  FT scaffold 17624259..18164428
                  FT /name="scaffold_0166"
                  FT scaffold 18164451..19274573
                  FT /name="scaffold_0085"
                  FT scaffold complement(19274596..22480739)
                  FT /name="scaffold_0019"
                  FT scaffold 22480762..25121265
                  FT /name="scaffold_0030"
                  FT scaffold complement(25121288..26274302)
                  FT /name="scaffold_0081"
                  FT scaffold complement(26274325..28800734)
                  FT /name="scaffold_0033"
                  XX
                  SQ Sequence 28800734 BP; 8998530 A; 4599939 C; 4612033 G; 8991187 T; 1599045 other;
                  ctaaacccta aaccctaaac cctaaaccct aaaaacccta taccctaaat accctatacc 60
                  ctatacccta taccctatac cctaaaccct ataccctata aaccctatac cctaaaccct 120
                  ataccctata aaccctatac cccataccct ataccccata ccctataccc tataccccat 180
                  accctatacc ccatacccta aaccctataa accctaaacc ctataaaccc taaaccctat 240
                  aaaccccaaa ccataaaccc taaaacccaa aaccctaaaa ccctaaaccc ctaaacccta 300
                  aaccctaaac cctaaaaccc taaaccccta aaaccctaaa acgcaaaaac actaaaccct 360
                  aaaaccggaa aaccctaaac cctaaaccct aaaaccctaa accctaaacc ctaaacccta 420


                  This is the file before my attempts to change it.

                  • #10
                    OK. The only problem is that you need to adjust the AC line to the format described in the CREATING_CUSTOM_GENOMES.txt file. The one I used to test with was:

                    AC chromosome:Test:1:1:28800734:1

                    ...but change it to whatever assembly and genome name you actually want to use.
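
As a rough illustration of the AC-line change described above, here is a minimal Python sketch. It is not part of SeqMonk and is not taken from CREATING_CUSTOM_GENOMES.txt. The colon-separated layout is inferred only from the two AC lines quoted in this thread, and the meaning given to each field (assembly, name, start, end, strand) is an assumption. The function names `rewrite_ac_line` and `looks_like_chromosome_ac` are made up for this example. Note that the file in post #12 already has an AC line of this shape and still failed to import, so this check is only one of several things to look at.

```python
import re

# Assumed layout, inferred from the AC lines quoted in this thread:
#   chromosome:<assembly>:<name>:<start>:<end>:<strand>
AC_PATTERN = re.compile(r"^AC\s+chromosome:[^:\s]+:[^:\s]+:\d+:\d+:-?1\s*$")
# The ID line ends with the sequence length, e.g. "... 28800734 BP."
ID_LENGTH = re.compile(r"(\d+)\s+BP\.\s*$")


def looks_like_chromosome_ac(line):
    """Return True if an AC line has the colon-separated shape shown above."""
    return AC_PATTERN.match(line) is not None


def rewrite_ac_line(embl_text, assembly, name):
    """Return embl_text with its first AC line replaced by a chromosome-style one.

    The end coordinate is the sequence length read from the ID line.
    """
    lines = embl_text.splitlines()

    length = None
    for line in lines:
        if line.startswith("ID"):
            match = ID_LENGTH.search(line)
            if match:
                length = int(match.group(1))
            break
    if length is None:
        raise ValueError("could not read the sequence length from the ID line")

    for i, line in enumerate(lines):
        if line.startswith("AC"):
            # Keep whatever spacing the file already uses after the line code.
            spacing = re.match(r"AC(\s+)", line)
            sep = spacing.group(1) if spacing else "   "
            lines[i] = f"AC{sep}chromosome:{assembly}:{name}:1:{length}:1"
            return "\n".join(lines) + "\n"
    raise ValueError("no AC line found")


# Header lines copied from the chr1 file shown earlier in the thread.
header = (
    "ID unknown; SV 1; linear; unassigned DNA; STD; UNC; 28800734 BP.\n"
    "XX\n"
    "AC unknown;\n"
    "XX\n"
)
fixed = rewrite_ac_line(header, assembly="Test", name="1")
print(fixed.splitlines()[2])
```

With the header from post #9 and the values "Test" and "1", the rewritten line matches the one quoted in this reply.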

                    • #11
                      Yes, I got it!
                      I have my custom genome! Thanks so much again!

                      • #12
                        Hello everyone,

                        I have the same problem. Here is the beginning of my ".dat" file (for the first chromosome of the bug I'm interested in):

                        ID AM040264; SV 1; circular; genomic DNA; STD; PRO; 2121359 BP.
                        XX
                        AC chromosome:2308:genome:1:2121359:1
                        XX
                        PR Project:PRJNA16203;
                        XX
                        DT 22-NOV-2005 (Rel. 85, Created)
                        DT 15-JUN-2010 (Rel. 105, Last updated, Version 3)
                        XX
                        DE Brucella melitensis biovar Abortus 2308 chromosome I, complete sequence,
                        DE strain 2308
                        XX
                        KW complete genome.
                        XX
                        OS Brucella abortus 2308
                        OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; Brucellaceae;
                        OC Brucella.
                        XX
                        RN [1]
                        RP 1-2121359
                        RG Microbial Genomics Group, Lawrence Livermore National Laboratory, and the
                        RG Genome Analysis Group, Oak Ridge National Laboratory
                        RA Larimer F.;
                        RT ;
                        RL Submitted (21-JUN-2006) to the INSDC.
                        RL Larimer F., Oak Ridge National Laboratory, 1 Bethel Valley Road, Bldg 5700
                        RL A201 Oak Ridge, TN 37831, USA;
                        XX
                        RN [2]
                        RP 1-2121359
                        RX DOI; 10.1128/IAI.73.12.8353-8361.2005.
                        RX PUBMED; 16299333.
                        RG Microbial Genomics Group, Lawrence Livermore National Laboratory, and the
                        RG Genome Analysis Group, Oak Ridge National Laboratory
                        RA Chain P., Comerci D.J., Tolmasky M.E., Larimer F.W., Malfatti S.,
                        RA Vergez L.M., Aguero F., Land M.L., Ugalde R.A., Garcia E.;
                        RT "Whole-genome analyses of speciation events in pathogenic Brucellae";
                        RL Infect Immun 73(12):8353-8361(2005).
                        XX
                        DR MD5; a898c1e51a44dc700fa4f7a9333c982c.
                        DR EnsemblGenomes-Gn; BAB1_0014.
                        DR EnsemblGenomes-Gn; BAB1_0020.
                        DR EnsemblGenomes-Gn; BAB1_0021.
                        DR EnsemblGenomes-Gn; BAB1_0039.
                        ...

                        Any idea why it keeps telling me "no data present in the imported genome"?
                        Thanks a lot!

                        • #13
                          Hi Chris,

                          The information in this post is now out of date. You no longer need to make custom genomes manually; there's a nice graphical way to do it, as long as you have FASTA files, GTF/GFF files, or preferably both.

                          Simply go to File > New Project and then select "Build custom genome". You can then load your FASTA and annotation files, and it will create all of the genome files you need. It also has an option to create pseudochromosomes if your assembly is scaffold- or contig-based and you don't want to end up with tons of chromosomes listed.

                          Let me know if you have any problems with this, but hopefully it will prove to be a much simpler solution.

                          Cheers

                          Simon.
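
To make the pseudochromosome idea mentioned above more concrete, here is a small Python sketch of the general technique: joining scaffolds into one long sequence separated by runs of N and recording where each scaffold lands. This is only an illustration of the concept and is not how SeqMonk implements its option. The function name `build_pseudochromosome`, the `gap_length` parameter, and the scaffold names are all hypothetical. The chr1 file in post #9 shows the same kind of layout, with each scaffold recorded as a feature on one chromosome.

```python
def build_pseudochromosome(scaffolds, gap_length=100):
    """Join (name, sequence) pairs into one sequence separated by runs of N.

    Returns the joined sequence and a list of (name, start, end) placements
    in 1-based inclusive coordinates.
    """
    gap = "N" * gap_length
    pieces = []
    placements = []
    position = 0  # number of bases placed so far
    for index, (name, sequence) in enumerate(scaffolds):
        if index > 0:
            # Insert a spacer between consecutive scaffolds.
            pieces.append(gap)
            position += gap_length
        start = position + 1
        position += len(sequence)
        placements.append((name, start, position))
        pieces.append(sequence)
    return "".join(pieces), placements


# Toy input: two made-up scaffolds and a deliberately short gap.
scaffolds = [("scaffold_A", "ACGTAC"), ("scaffold_B", "GGCC")]
sequence, placements = build_pseudochromosome(scaffolds, gap_length=3)
print(sequence)
print(placements)
```

The placement list is what lets annotation on the original scaffolds be shifted onto the combined coordinates.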

                          • #14
                            OK, it worked! Thank you very much!
                            Actually, I have another downstream question (not sure this is the best place to ask...). I've uploaded my custom genome and I'm trying to import a small test dataset which looks exactly like the one shown in this video (at 2:05):
                            This video goes through the process of starting a new SeqMonk project, from downloading and importing an annotated genome, to importing mapped sequence data....


                            After assigning the different columns as needed, I try to import, but for each read it says:
                            "Location XXX-YYY was not an integer", no matter the size of the interval.
                            I'm a bit lost. Do you have any advice?

                            Thanks again

                            • #15
                              Are you setting the count column when you import the file? This is a new option that isn't shown in the video; it is only for datasets with an extra column saying how many times a particular position was seen. A bug in the last release gave the wrong error message when the count value was incorrect, which made the problem hard to track down. If you are setting the count column, could you try setting it to nothing (leave that selector blank) and see if that fixes it?
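
One way to check a delimited text file before importing it is to test whether the columns chosen for start, end, and (optionally) count really contain whole numbers. The Python sketch below does that. It is a generic check, not SeqMonk's importer; the function name `find_non_integer_fields`, the column numbers, and the sample rows are made up for illustration. It assumes, as one possible explanation, that a count selector pointed at a non-numeric column (such as strand) would fail on every row.

```python
import csv
import io


def find_non_integer_fields(text, start_col, end_col, count_col=None, delimiter="\t"):
    """Report (line_number, column_label, value) for fields that are not whole numbers.

    Column indices are 0-based. count_col is optional, mirroring the optional
    count selector described above.
    """
    columns = {"start": start_col, "end": end_col}
    if count_col is not None:
        columns["count"] = count_col

    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        for label, index in columns.items():
            value = row[index].strip() if index < len(row) else ""
            try:
                int(value)
            except ValueError:
                problems.append((line_number, label, value))
    return problems


# Made-up rows: chromosome, start, end, strand.
sample = "chr1\t100\t150\t+\nchr1\t200\t250\t-\n"
print(find_non_integer_fields(sample, start_col=1, end_col=2))
print(find_non_integer_fields(sample, start_col=1, end_col=2, count_col=3))
```

In this toy data the positions are fine on their own, and problems appear only once the strand column is treated as a count, which fits the suggestion to leave the count selector blank.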
