  • Annotation for contigs from de novo assembly

    Hi,

    I want to annotate my assembled contigs (from a de novo assembly). I used BLASTX, but only 10-20% of the contigs got hits (e-value cutoff 1e-5). As a result, none of my differentially expressed contigs (genes) have an annotation. At the very least I want to know what kind of genes these are, e.g., signaling, transmembrane, etc.

    Thanks a lot!
    Victoria

  • #2
    I'd give Prokka a try.



    • #3
      Provided Victoria is working with a prokaryotic genome

      NCBI has a eukaryotic annotation pipeline: http://www.ncbi.nlm.nih.gov/genome/a...n_euk/process/ and a prokaryotic one: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ If I recall correctly, though, you will have to make the sequences public at some point if you use these.

      Other eukaryotic options (have not used myself):

      Pasa: http://pasa.sourceforge.net/
      Maker: http://www.gmod.org/wiki/MAKER



      • #4
        I think Blast2GO would also be useful



        • #5
          I've also had good experiences with Blast2GO; it doesn't require installation and is quite easy to use. Also, they have updated the rather ugly colours of their pie charts.



          • #6
            Hi,

            Thank you for your reply. As I understand it, Blast2GO (see the link below) just uses the BLAST results, so it won't annotate any more contigs than the BLASTX search I already ran. Is that correct?



            The organism I want to annotate is the protist Oxyrrhis marina.

            Thank you!
            Victoria



            • #7
              RAST annotation.
              Krishna



              • #8
                Hi Victoria, you could search several databases to increase your chances of getting an annotation. Which databases have you used? I don't have experience with protists, but in general a good start is to compare against GenBank and UniProt's Swiss-Prot and TrEMBL protein databases. Have you tried a less conservative e-value? Also try downloading annotated sequences from related species and comparing against them directly. This reference may help you:

                Background: Anopheles funestus is one of the primary vectors of human malaria, which causes a million deaths each year in sub-Saharan Africa. Few scientific resources are available to facilitate studies of this mosquito species and relatively little is known about its basic biology and evolution, making development and implementation of novel disease control efforts more difficult. The An. funestus genome has not been sequenced, so in order to facilitate genome-scale experimental biology, we have sequenced the adult female transcriptome of An. funestus from a newly founded colony in Burkina Faso, West Africa, using the Illumina GAIIx next generation sequencing platform.

                Methodology/Principal Findings: We assembled short Illumina reads de novo using a novel approach involving iterative de novo assemblies and “target-based” contig clustering. We then selected a conservative set of 15,527 contigs through comparisons to four Dipteran transcriptomes as well as multiple functional and conserved protein domain databases. Comparison to the Anopheles gambiae immune system identified 339 contigs as putative immune genes, thus identifying a large portion of the immune system that can form the basis for subsequent studies of this important malaria vector. We identified 5,434 1∶1 orthologues between An. funestus and An. gambiae and found that among these 1∶1 orthologues, the protein sequence of those with putative immune function were significantly more diverged than the transcriptome as a whole. Short read alignments to the contig set revealed almost 367,000 genetic polymorphisms segregating in the An. funestus colony and demonstrated the utility of the assembled transcriptome for use in RNA-seq based measurements of gene expression.

                Conclusions/Significance: We developed a pipeline that makes de novo transcriptome sequencing possible in virtually any organism at a very reasonable cost ($6,300 in sequencing costs in our case). We anticipate that our approach could be used to develop genomic resources in a diversity of systems for which full genome sequence is currently unavailable. Our An. funestus contig set and analytical results provide a valuable resource for future studies in this non-model, but epidemiologically critical, vector insect.


                Dave
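
One quick way to see what a less conservative e-value would buy is to re-filter an existing BLASTX result at several cutoffs. The sketch below is only an illustration and rests on assumptions not stated in the thread: that the search was saved in BLAST+ tabular format (`-outfmt 6`) with the default column order (query ID in column 1, e-value in column 11), and that it was run with a permissive enough `-evalue` for weaker hits to be present in the file. The function name and the example rows are made up.

```python
import csv
import io


def annotated_fraction(blast_tab, total_contigs, max_evalue=1e-5):
    """Return (set of contig IDs with a hit at or below max_evalue, fraction of all contigs).

    blast_tab is an open text handle on BLAST tabular output, assumed to use
    the default -outfmt 6 column order (e-value is the 11th column).
    """
    hits = set()
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits, len(hits) / total_contigs


# Made-up example rows standing in for a real BLASTX output file.
EXAMPLE = (
    "contig1\tP1\t45.0\t100\t55\t0\t1\t300\t1\t100\t1e-20\t90.0\n"
    "contig2\tP2\t30.0\t80\t56\t0\t1\t240\t1\t80\t1e-4\t45.0\n"
    "contig3\tP3\t28.0\t60\t43\t0\t1\t180\t1\t60\t0.5\t30.0\n"
)

strict_hits, strict_frac = annotated_fraction(io.StringIO(EXAMPLE), total_contigs=4)
loose_hits, loose_frac = annotated_fraction(
    io.StringIO(EXAMPLE), total_contigs=4, max_evalue=1e-3
)
print(f"1e-5: {strict_frac:.0%} annotated; 1e-3: {loose_frac:.0%} annotated")
```

Hits that only appear at the looser cutoff are weaker evidence, so they are better treated as leads to check than as firm annotations.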



                • #9
                  You can try the Trinotate pipeline. It combines several tools (TransDecoder to get plausible ORFs, Pfam, HMMER, SignalP, TMHMM, RNAmmer) to produce a fairly complete annotation report. The website gives a lot of detail on how to use it.
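
To give a feel for what the ORF-finding step does, here is a toy six-frame search for the longest ATG-to-stop open reading frame in a contig. It is a simplified illustration, not TransDecoder's actual method, and it assumes the standard stop codons (TAA, TAG, TGA), which may not be appropriate for every protist. All names in it are made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return the longest ATG...stop ORF (stop codon included) over all six frames.

    Returns an empty string if no complete ORF is found.
    """
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None  # look for the next ORF in this frame
    return best


example_contig = "CCATGAAATTTGGGTAACC"  # made-up sequence
print(longest_orf(example_contig))
```

The predicted ORFs, once translated, are what the protein-level tools in such a pipeline would then be run on.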



                  • #10
                    Run a gene prediction tool (e.g. Prodigal) over the contigs, feed the predicted proteins into InterProScan, and check whether you get anything interesting for your analysis.

                    It would also be good to know how long the contigs are.
                    There is not much point in trying to annotate anything considerably shorter than 900 bp.
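
Checking contig lengths takes only a few lines. The sketch below assumes the contigs are in a plain FASTA file; the 900 bp threshold is the rule of thumb from this post rather than a fixed standard, and the function names and example sequences are made up.

```python
import io


def read_fasta(handle):
    """Yield (contig_id, sequence) pairs from an open FASTA text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_by_length(records, min_len=900):
    """Split contig IDs into (long enough, too short) using min_len in bp."""
    long_ids, short_ids = [], []
    for name, seq in records:
        (long_ids if len(seq) >= min_len else short_ids).append(name)
    return long_ids, short_ids


# Made-up contigs standing in for a real assembly file.
EXAMPLE = ">contigA len=1000\n" + "ACGT" * 250 + "\n>contigB\nACGTACGT\nACGT\n"

keep, skip = split_by_length(read_fasta(io.StringIO(EXAMPLE)))
print("worth annotating:", keep)
print("probably too short:", skip)
```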
