  • Fungal Genome Annotation

    I have a fungal genome assembly that looks like this:
    >contig1
    ATTAAATATACCCCACAAAATAGAGACAGAGACACATATTAA
    >contig2
    ATATCGAGAGAGGGCGCGCGCCGCGCGGCCGCGAGGAGAGTATA
    >contig3
    ATGCGCGATAGAGCTATATCTATCTCTCTATATAGAGA

    The genome is approximately 50 MB.
    I would like to annotate it. Is there a simple, freely available server or tool that does that?

    I appreciate it.

    Best !
    Shashank

  • #2
    Unless most of your contigs are much longer and more complex than those, don't bother. Do you happen to know the N50? If not, you can calculate it with my assembly stats tool:

    stats.sh contigs.fasta

    ...just post the results in this thread.
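
For anyone unsure what the statistic measures, here is a minimal sketch, not the stats.sh implementation: sort the contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembly. The file name demo_contigs.fasta and the function name half_assembly_length are made up for the illustration. In the stats.sh report posted later in this thread, this length is the figure after the slash on the "N/L50" lines, and the figure before the slash is the number of contigs needed to get there.

```shell
# Hypothetical four-contig FASTA (lengths 10, 6, 4, 4), used only for illustration.
printf '>c1\nAAAAAAAAAA\n>c2\nCCCCCC\n>c3\nGGGG\n>c4\nTTTT\n' > demo_contigs.fasta

# Print the contig length at which the running total of lengths
# (longest first) reaches half of the total assembly size.
half_assembly_length() {
  # First awk: emit one length per FASTA record (handles multi-line sequences).
  awk '/^>/ { if (len) print len; len = 0; next }
       { len += length($0) }
       END { if (len) print len }' "$1" |
    sort -rn |
    # Second awk: walk the sorted lengths until half of the total is covered.
    awk '{ lens[NR] = $1; total += $1 }
         END {
           for (i = 1; i <= NR; i++) {
             running += lens[i]
             if (running * 2 >= total) { print lens[i]; exit }
           }
         }'
}

half_assembly_length demo_contigs.fasta   # prints 6
```

With a total of 24 bases, the longest contig alone (10) falls short of the halfway mark of 12, and adding the second (6) passes it, so the reported length is 6.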


    • #3
      Actually, I am new to the command line. Anyway, this is what I got when I ran the command:

      qiime@qiime-VirtualBox:~/Desktop/bbmap$ stats.sh 454AllContigs.fna > new.txt
      A       C       G       T       N       IUPAC   Other   GC      GC_stdev
      0.2355  0.2637  0.2626  0.2382  0.0000  0.0000  0.0000  0.5263  0.0399

      Main genome scaffold total: 10457
      Main genome contig total: 10457
      Main genome scaffold sequence total: 17.071 MB
      Main genome contig sequence total: 17.071 MB 0.003% gap
      Main genome scaffold N/L50: 2961/1.977 KB
      Main genome contig N/L50: 2960/1.977 KB
      Max scaffold length: 11.239 KB
      Max contig length: 11.239 KB
      Number of scaffolds > 50 KB: 0
      % main genome in scaffolds > 50 KB: 0.00%


      Minimum   Number          Number          Total           Total           Scaffold
      Scaffold  of              of              Scaffold        Contig          Contig
      Length    Scaffolds       Contigs         Length          Length          Coverage
      --------  --------------  --------------  --------------  --------------  --------
      All               10,457          10,457      17,071,168      17,070,688   100.00%
      50                10,457          10,457      17,071,168      17,070,688   100.00%
      100               10,457          10,457      17,071,168      17,070,688   100.00%
      250               10,004          10,004      16,994,544      16,994,072   100.00%
      500                9,421           9,421      16,780,800      16,780,339   100.00%
      1 KB               7,751           7,751      15,427,669      15,427,255   100.00%
      2.5 KB             1,668           1,668       5,688,859       5,688,774   100.00%
      5 KB                 113             113         693,645         693,638   100.00%
      10 KB                  4               4          42,301          42,301   100.00%
      Last edited by shashankgupta; 02-03-2015, 11:27 PM.


      • #4
        That's close - I think the problem is the spaces in the path. Try this:

        bash stats.sh in="../Jitender Fungal genome/454AllContigs.fna"

        That should work. If not, you can copy the assembly into the local folder like this:
        cp file destination
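
The quotes matter because the shell splits unquoted arguments at spaces, so a path like the one above would otherwise arrive as several separate words. A minimal sketch of the copy step, using a made-up directory and file name (both are assumptions, not the real paths from this thread):

```shell
# Hypothetical directory and file names, chosen only to demonstrate quoting.
mkdir -p "demo fungal genome"
printf '>contig1\nACGT\n' > "demo fungal genome/demo_contigs.fna"

# Quoted: the whole path reaches cp as a single argument despite the spaces.
cp "demo fungal genome/demo_contigs.fna" ./demo_contigs_local.fna

# Count the FASTA headers in the copy to confirm it arrived intact.
grep -c '^>' demo_contigs_local.fna
```

Once the file sits in a folder without spaces in its name, it can be passed to a tool without any quoting.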


        • #5
          P.S. I changed the command (as mentioned above) and I believe I got the expected result.


          • #6
            OK, that's not bad - most of the assembly is in fragments over 1900bp, which will give reasonable annotation. Unfortunately, if the genome is expected to be 50Mbp, you only assembled 17Mbp of it, or ~34%. If possible, I recommend trying different assemblers, different parameters, or different preprocessing to obtain the longest possible contigs and highest genome recovery possible before you start annotation.
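
The ~34% figure is simply the assembled size divided by the expected genome size. As a quick check, taking 17.071 MB from the stats output and assuming the 50 MB estimate from the first post is right (the variable names are illustrative):

```shell
assembled_mb=17.071   # "Main genome contig sequence total" from the stats output
expected_mb=50        # genome size estimated in the first post (an assumption)

# awk handles the floating-point division that plain shell arithmetic cannot.
recovery=$(awk -v a="$assembled_mb" -v e="$expected_mb" \
  'BEGIN { printf "%.1f", a / e * 100 }')

echo "Recovered ${recovery}% of the expected genome"
```

If the true genome is smaller than 50 MB, the recovery is correspondingly higher, so the estimate is worth double-checking before re-assembling.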

            I don't know of a good, simple, standalone tool. This is the JGI's standard procedure:



            ...but I'm not directly involved in the annotation, and it looks quite complicated, using lots of different programs. They should all be free, though.

            Edit: Looking through that in more depth, it does not really look possible to replicate outside of JGI. Hopefully someone else will have a suggestion. I will recommend to the fungal team that they package their annotation pipeline in a Docker container, but that may take a few years.
            Last edited by Brian Bushnell; 02-03-2015, 11:53 PM.


            • #7
              Sounds great!
              But I can't wait a few years.
              The stats.sh command I ran was on the genome file, which contains 17 MB.
              I tried the JGI procedure, but it looks very complicated to me.


              • #8
                If you do not mind uploading it to a server, you can use NCBI's eukaryotic annotation pipeline:



                If you just want to predict genes, go for Augustus:


                Or use MAKER for both predicting and annotating genes:


                • #9
                  While the JGI SOP is a nice writeup on freely available tools, you'd need to gather some command line experience or team up with a skilled bioinformatician to run and merge all the results from these tools, until they release a complete installation package...

                  As an alternative you may want to try Augustus (http://bioinf.uni-greifswald.de/augustus/) to predict genes - they also offer Web submissions to their servers, if you are not skilled running tools on the command line.

                  For the downstream functional annotation you could run InterProScan on the resultant CDS or peptides (http://www.ebi.ac.uk/interpro/interproscan.html). I'd use the download version, but it is also possible to use their servers (with some limits) via web submission. The command line version is quite straightforward to use and integrates many complementary prediction and comparative tools.


                  • #10
                    Well, I gave it a try.
                    I tried the Augustus server, but I think it limits the upload size: my file is approximately 17 MB, and the upload failed.

                    I downloaded AUGUSTUS, but I am not able to run it. In the tutorial I got stuck at point 3, i.e.:

                    3. set environment variable AUGUSTUS_CONFIG_PATH

                    > export AUGUSTUS_CONFIG_PATH=/my_path_to_AUGUSTUS/augustus/config/

                    The program requires that the environment variable AUGUSTUS_CONFIG_PATH is set to the config directory that contains the
                    configuration and parameter files. This is the directory 'augustus/config'. You probably want to add this line to a startup script (like ~/.bashrc).
                    Alternatively, you can specify this directory on the command line when you run augustus:
                    --AUGUSTUS_CONFIG_PATH=/my_path_to_AUGUSTUS/augustus/config/
                    You may want to add the path of the executable to the PATH environment variable or copy augustus into a common directory (e.g. /usr/bin/).


                    Thanks


                    • #11
                      Where did you install Augustus on your machine (i.e. under which path)? Then execute the "export" command as indicated by the tutorial, replacing "my_path_to_AUGUSTUS/augustus" with the installation path...


                      • #12


                        How do I upload to the server? Does it annotate fungal genomes?


                        • #13
                          Augustus is installed in

                          /root/Desktop/augustus.2.5.5


                          • #14
                            Originally posted by shashankgupta
                            http://www.ncbi.nlm.nih.gov/genome/a...n_euk/process/

                            How do I upload to the server? Does it annotate fungal genomes?
                            You need to submit your assembly first:


                            • #15
                              Originally posted by shashankgupta
                              Augustus is installed in

                              /root/Desktop/augustus.2.5.5
                              So did you try

                              Code:
                              export AUGUSTUS_CONFIG_PATH=/root/Desktop/augustus.2.5.5/augustus/config/
                              or

                              Code:
                              export AUGUSTUS_CONFIG_PATH=/root/Desktop/augustus.2.5.5/config/
                              (in case there is no "augustus" subfolder inside "augustus.2.5.5")
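
Rather than trying the two exports one after the other, the check can be scripted. The sketch below sets the variable to whichever candidate directory actually exists; the helper name first_existing_dir is made up for the illustration, and the two paths are simply the ones suggested above, so adjust them if Augustus was unpacked somewhere else.

```shell
# Print the first argument that is an existing directory; fail if none is.
first_existing_dir() {
  for candidate in "$@"; do
    if [ -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# The two candidate locations come from the post above.
if config_dir=$(first_existing_dir \
    /root/Desktop/augustus.2.5.5/augustus/config \
    /root/Desktop/augustus.2.5.5/config); then
  export AUGUSTUS_CONFIG_PATH="$config_dir/"
  echo "AUGUSTUS_CONFIG_PATH set to $AUGUSTUS_CONFIG_PATH"
else
  echo "Neither candidate directory exists; check where Augustus was unpacked." >&2
fi
```

The export only lasts for the current terminal session. As the tutorial quoted earlier notes, adding the working export line to a startup script such as ~/.bashrc makes it permanent.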
