  • svj
    Junior Member
    • Jul 2012
    • 8

    Eukaryotic ORF finder

    Hi All,

    I am looking for a eukaryotic ORF finder (algorithm or source code). I am trying to build a training model for an unknown eukaryotic genome using GlimmerHMM, and I need to collect ORFs for the training set. I ran BLASTp against known protein sequences from the closest related eukaryote, but I am unable to build the training model with the resulting ORFs. The error I get from trainGlimmerHMM is:
    Training data created successfully! Check exons.dat and seqs for accuracy.


    Acceptor sites for training: 18292
    False acceptor sites for training: 853751
    Donor sites for training: 18219
    False donor sites for training: 672464


    ERROR 69: /GlimmerHMM/train/score exited funny: 35584


    If this is the right way to build a training model, can anyone help me with this error? If not, what should I do to build the training model instead? Should I look for acceptor and donor sites upstream and downstream of the ORFs I got from BLASTp?
  • dong01
    Junior Member
    • Jun 2011
    • 4

    #2
    Have you solved this problem?

    • hi-koike
      Member
      • Jul 2013
      • 13

      #3
      I would like to know if anyone has solved this problem.

      Thanks in advance,
      Hideaki

      • MVictoria
        Junior Member
        • Aug 2014
        • 5

        #4
        Hi!!

        Did you manage to solve this problem??

        I am getting a similar error:

        Simple Consensus = cgttgtggtggtgggggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtgg
        Markov Consensus = ggatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatg
        ******** Old Way = ctctgaggatgatgaggatgatgatgatgatgatgatgatgatgatgagatgatgatgatgatgatgatgatgatgatga
        Segmentation fault (core dumped)
        ERROR 69: /media/sdb1/genome_assembly/GlimmerHMM/train/score exited funny: 35584 at ./../trainGlimmerHMM line 445.
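A note on the number in that message, as a sketch under one assumption: that 35584 is a raw POSIX wait status, which is what Perl's system() returns and what trainGlimmerHMM (apparently a Perl script, judging by the "line 445" wording) seems to print. Under that assumption the high byte is the exit code: 35584 = 139 × 256, and 139 = 128 + 11 is how a shell reports a command killed by signal 11 (SIGSEGV). That would agree with the "Segmentation fault (core dumped)" line just above it. The helper name decode_wait_status is made up for illustration.

```python
import signal


def decode_wait_status(status: int) -> dict:
    """Split a 16-bit POSIX wait status into its parts.

    Assumption: `status` is the raw value a Perl system() call returns.
    """
    exit_code = (status >> 8) & 0xFF   # high byte: exit code of the child
    term_signal = status & 0x7F        # low 7 bits: signal that killed the child directly
    core_dumped = bool(status & 0x80)  # bit 7: core dump flag for a direct kill
    # A shell reports "command died from signal N" as exit code 128 + N.
    shell_signal = exit_code - 128 if exit_code > 128 else 0
    return {
        "exit_code": exit_code,
        "term_signal": term_signal,
        "core_dumped": core_dumped,
        "shell_signal": shell_signal,
    }


info = decode_wait_status(35584)
print(info)
print("looks like SIGSEGV:", info["shell_signal"] == int(signal.SIGSEGV))
```

If that reading is right, "exited funny" only says that the score program crashed; it does not say why.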


        And the log file:


        Code:
        more TrainGlimmM2014-08-18D15\:53\:24.log
            Training data created successfully! Check exons.dat and seqs for accuracy.
        
        
            Acceptor sites for training: 35581
            False acceptor sites for training: 412224
            Donor sites for training: 35572
            False donor sites for training: 410763
        The training files look like this:
        Code:
        1. mfasta
        
            >supercontig_01
            GATCATACAAATCATCCCCTTGGCCTCTGTTAGCCTTCTGCGATCTATCGTGCTCGGAGCAGCTGCAAGC
            CCCGCCAAGTGACAATCCGAAACGGACTCAATAAGATTTGGCGTTGTCGACTTCATTTCAGTTCCGCCGA
            CCTTCCAGCTGCAGCTATCGACTGTCGAAGCCGACCCTCCACGAGTCAAACAGATTGGAAACGATAATAA
            ACCGATCTCCCGAGATAAGAATGGCGCTTTGGTCAAACATGAAGGCGTGAGTGAACACTCTGCTGACTTC
            ATGTAAGTGAGGAGAATATCGCTAAATGTGATACGGACATGACATTAGACTTGCAACAGAAAGAATAATA
            CATGCAGGTCCGAGATGAACAACGAGACAAACCTTGTGTGGTGCTCAACATAGTTTGCTAATAGAAACGT
            GATTGACCGTCACATGGCTCCTTGACTGTCTAGATACATCCGGCTGATCATACTTTGTTCTAGTGTATCC
            ATGACGGAGAAAAGTGCATTTATGATTTTTATGATCGATCTGTTGAATGCCAATAGGCACTTGCGGCTGG
            CCGGCGGAATTGGAAAGGAGCAGGTAGCACTCAACATCAGAGGTGTAACAACCAGCGAACCCATTCAACG
            TTGGAGTCATTTATTGTTTATCTCCGCTCTAGTTTCAGTTTCCTCTCGCGACTTGCTTGTTTGTATCTGA
            GTAAGCACCCGATAATAAAGTAGTTGTCATCACTGGCTTGAAAAATCAAACAATTACTCGCATCTCGCGA
            GAAAGAACAGACTGCTCGTAACAAGCAAGCAAACGCCAAGCTCTTATTCAGATAACATTACTGGATCCCC
            TTCTGCTATCTGATTTATTTAGTGACTGGTCCCGGGCCCGAAGCCGCCACCCTGTGCCACCTCATTTTAA
        
        
        2. exon file
        
            supercontig_01 678584 678745
            supercontig_01 678804 678855
            supercontig_01 678924 679629
            supercontig_01 679711 679801
        
            supercontig_01 681196 681196
            supercontig_01 681108 681102
            supercontig_01 680978 680798
            supercontig_01 680562 680452
            supercontig_01 680342 680256
        
            supercontig_01 683416 683414
            supercontig_01 683197 682953
            supercontig_01 682896 682791
            supercontig_01 682737 682599
            supercontig_01 682548 682162
            supercontig_01 682111 681695
            supercontig_01 681579 681549
            supercontig_01 681489 681408
            supercontig_01 681372 681265
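One cheap thing to rule out before a multi-day training run is a malformed exon file. The sketch below is a sanity check, not a diagnosis of the crash. It assumes the layout shown above (sequence name, start, end; a blank line between genes; start greater than end for reverse-strand exons). The function names and the 700000 bp sequence length are hypothetical. Very short exons can be perfectly valid, so they are only flagged for a manual look.

```python
from typing import Dict, List, Tuple

Exon = Tuple[str, int, int]  # (sequence name, start, end), 1-based coordinates


def parse_exon_file(text: str) -> List[List[Exon]]:
    """Group exon lines into genes; blank lines separate genes."""
    genes: List[List[Exon]] = []
    current: List[Exon] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            if current:
                genes.append(current)
                current = []
            continue
        current.append((fields[0], int(fields[1]), int(fields[2])))
    if current:
        genes.append(current)
    return genes


def check_gene(exons: List[Exon], seq_lengths: Dict[str, int],
               min_len: int = 3) -> List[str]:
    """Return human-readable warnings for one gene (empty list if none)."""
    problems: List[str] = []
    strands = set()
    for name, start, end in exons:
        if name not in seq_lengths:
            problems.append(f"{name}: no such sequence in the FASTA file")
            continue
        if min(start, end) < 1 or max(start, end) > seq_lengths[name]:
            problems.append(
                f"{name} {start} {end}: outside sequence "
                f"(length {seq_lengths[name]})")
        length = abs(end - start) + 1
        if length < min_len:
            problems.append(f"{name} {start} {end}: exon is only {length} bp")
        if start != end:  # a 1 bp exon carries no direction information
            strands.add("+" if start < end else "-")
    if len(strands) > 1:
        problems.append("exons of one gene point in both directions")
    return problems


# A few lines copied from the exon file above.
sample = """\
supercontig_01 678584 678745
supercontig_01 678804 678855

supercontig_01 681196 681196
supercontig_01 681108 681102
supercontig_01 680978 680798
"""
lengths = {"supercontig_01": 700000}  # hypothetical; read real lengths from the FASTA

for number, gene in enumerate(parse_exon_file(sample), start=1):
    for problem in check_gene(gene, lengths):
        print(f"gene {number}: {problem}")
```

In real use the lengths dictionary would be filled from the multi-FASTA file, so that a name mismatch between the two files or a coordinate past the end of a contig shows up immediately.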

        Thanks in advance!!
        Victoria
        --
        M. Victoria Aguilar Pontes
        PhD student, Fungal Physiology

        CBS-KNAW FUNGAL BIODIVERSITY CENTRE
        Institute of the Royal Netherlands Academy of Arts and Sciences(KNAW)
        Fungal Molecular Physiology, Utrecht University
        [email protected]

        • hi-koike
          Member
          • Jul 2013
          • 13

          #5
          Hi Victoria,

          Which Linux distribution did you use?
          I tried to run the training on Ubuntu, but it failed.
          Then I ran it on CentOS and it worked.

          I am not sure of the reason, but that is how I got around the problem.
          Once the training had succeeded, I could run GlimmerHMM with the trained files on Ubuntu.

          Cheers,
          Hideaki

          • MVictoria
            Junior Member
            • Aug 2014
            • 5

            #6
            Hi Hi-koike,

            I am using a server running Ubuntu 12.04.5 LTS precise.

            I also tried training on another dataset, and after running for 4 days I got the same error. Any ideas??

            Thanks in advance,
            Victoria

            • hi-koike
              Member
              • Jul 2013
              • 13

              #7
              Hi Victoria,

              Can you run GlimmerHMM using already trained files?
              If you can, it might be the same problem I experienced.

              Can you get a computer running CentOS or Red Hat?

              I used an old computer that formerly ran Windows.
              CentOS is easy to install, and you can then install GlimmerHMM
              on that machine.

              You might need to install some libraries (I forgot the exact names,
              but you can find them by searching the web for the error message).
              In my case, the training finished on CentOS within a day.

              Cheers,
              Hideaki

              • MVictoria
                Junior Member
                • Aug 2014
                • 5

                #8
                Hi Hi-koike,

                I ran trainGlimmerHMM on our server (Ubuntu 12.04.5 LTS precise) with the trained files and with my own files, and I always got the same error (see my previous post).

                Now I am running trainGlimmerHMM on my own computer, which also runs Ubuntu 12.04.5 LTS precise, and there at least the trained files work. I am now waiting for the results with my own files, but this might take longer.

                As a backup plan, I am installing CentOS in VirtualBox, just in case.

                Thank you very much for your help.

                Victoria

                • MVictoria
                  Junior Member
                  • Aug 2014
                  • 5

                  #9
                  Hi Hi-koike,

                  As I said before, I got trainGlimmerHMM running with the example data on Ubuntu 12.04.5 LTS precise, but it crashes with my own files. It is always the same error.

                  Now I am running the example file on CentOS 7 and I get the same error. Do you remember which CentOS version you used??

                  Thanks

                  Victoria
                  Last edited by MVictoria; 08-22-2014, 05:20 AM.

                  • hi-koike
                    Member
                    • Jul 2013
                    • 13

                    #10
                    Hi Victoria,

                    I am sorry to hear that it did not run on CentOS either.

                    I am not sure which version of CentOS I used, because I am traveling
                    abroad. It might be CentOS 6, because I installed it in April.

                    I have succeeded in running it on two Red Hat machines and one CentOS
                    machine, but it failed on two Ubuntu machines.

                    On one Red Hat machine, it would not run because libstdc++ was not
                    installed.

                    If you got the same error there, your problem might be different from mine.
                    I am very sorry that I cannot help.

                    Best regards,
                    Hideaki
