Annotation of NGS data with CLC Bio


  • Annotation of NGS data with CLC Bio

    Hi there,

    Has anyone used CLC Bio (or can it be used) to annotate complete genomes or large contigs obtained from either de novo assembly or mapping to a reference genome? I'm talking about gene prediction, TFBS finding, GO annotation etc.

    Thanks!

  • #2
    Yes, a PDF on how to do this was posted previously. Note that for large mammalian genomes the files become very large with each annotation added. For human I usually just annotate the exons and SNPs.



    • #3
      Can someone please send the pdf or refer me to the link where I can find it?



      • #4
        I think this might be it: http://www.clcbio.com/index.php?id=1343



        • #5
          Originally posted by kopi-o
          I think this might be it: http://www.clcbio.com/index.php?id=1343
          Hi, thanks for your help. Unfortunately, that page is about uploading a GFF file. That means you have already annotated the genome, created a GFF file, and just want to load your annotations. What I am looking for is how to actually create such a file using CLC, if this can be done.

          Thanks again.



          • #6
            Can you give some examples of what you mean by annotate?
            Last edited by husamia; 08-17-2010, 03:44 PM. Reason: short



            • #7
              I would actually use Artemis instead:
              http://www.sanger.ac.uk/resources/software/artemis/

              I believe CLC bio can do the same, but I got lost in the menus.
              Artemis was written solely for this purpose, and it handles EMBL format, which might be useful for submission.

              But correct me if I am wrong: you are not really trying to annotate NGS data, but contigs derived from NGS data.
              http://kevin-gattaca.blogspot.com/



              • #8
                I have come across several very good tools designed specifically for annotation, but I've been told that CLC can give me the same annotation. Unfortunately, no one can tell me how to get it from CLC. I was hoping someone in the community has had some experience. Sorry, I realize this is moving a bit away from NGS.

                Any help would be much appreciated



                • #9
                  Oh no worries, just wanted to clarify the problem in case I am misunderstanding the question.
                  I would think the vendor would be able to answer your question better?
                  You might get an answer here, but getting info from the source would be much more direct and helpful.
                  You did pay for the software so support is obligatory
                  http://kevin-gattaca.blogspot.com/



                  • #10
                    Originally posted by KevinLam
                    You did pay for the software so support is obligatory
                    That's just the thing. I haven't bought it yet. I'm still trying to determine what percentage of our work could be done on it, and I have the trial version. The vendor has been helpful, but hasn't really answered my question so far. That's why I've gone to the community in the meantime. Will poke the vendor again today.



                    • #11
                      alternatively

                      Hey

                      We've been using Geneious Pro rather successfully to accomplish this for small contigs (I guess it depends on how much memory your computer has). It lets you annotate a sequence and export it as GFF. You can also import other GFFs as annotations only (i.e. a GFF file without a FASTA section, provided you have a Geneious document with the reference sequence).

                      Geneious is also prettier than CLC bio and much much cheaper...

                      cheers,
                      a
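
To illustrate the difference between a GFF with and without a FASTA section, here is a minimal Python sketch. It is not how Geneious or CLC handle the file internally. It only assumes the GFF3 convention that a `##FASTA` line marks the start of embedded sequences. The function name `split_gff3` and the example contig and features are made up for illustration.

```python
from typing import Dict, List, Tuple


def split_gff3(text: str) -> Tuple[List[List[str]], Dict[str, str]]:
    """Split GFF3 text into feature rows and any sequences after ##FASTA."""
    features: List[List[str]] = []
    sequences: Dict[str, str] = {}
    in_fasta = False
    current_id = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##FASTA"):
            # Everything after this directive is FASTA, not features.
            in_fasta = True
            continue
        if in_fasta:
            if line.startswith(">"):
                header = line[1:].split()
                current_id = header[0] if header else ""
                sequences[current_id] = ""
            elif current_id is not None:
                sequences[current_id] += line
            continue
        if line.startswith("#"):
            # Skip other directives and comments.
            continue
        columns = line.split("\t")
        if len(columns) == 9:
            # seqid, source, type, start, end, score, strand, phase, attributes
            features.append(columns)
    return features, sequences


# Hypothetical example data: one contig with a gene and its CDS.
EXAMPLE = "\n".join([
    "##gff-version 3",
    "contig1\tmanual\tgene\t3\t8\t.\t+\t.\tID=gene1",
    "contig1\tmanual\tCDS\t3\t8\t.\t+\t0\tID=cds1;Parent=gene1",
    "##FASTA",
    ">contig1 hypothetical contig",
    "ACGTA",
    "CGTAC",
])

features, sequences = split_gff3(EXAMPLE)
# An annotation-only GFF is the same file without the FASTA part.
annotation_only, no_sequences = split_gff3(EXAMPLE.split("##FASTA")[0])
```

With an annotation-only file the second return value is simply empty, which is why a separate document holding the reference sequence is needed in that case.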



                      • #12
                        CLC Bio is a bit buggy. You have to force it to import your reference genome as normal sequence data, NOT next gen sequence data. Once you do that, it is easy to annotate it using a GFF file. I tried to attach the PDF describing it, but this website gave an error message. If you private message me with your email address, I can email it to you.



                        • #13
                          When you mark a base, you can right-click on it and choose "Add Annotation".
                          To extract annotations, you have to install a plugin called "Extract annotations":

                          http://www.clcbio.com/index.php?id=873

                          There you will also find the "Annotate Sequence with GFF File" plugin.



                          • #14
                            I have the same question as husamia: what do you mean by "annotate"?
                            Labeling a genomic region as a "contig/cluster/whatever of reads/ESTs/whatever" is one thing; performing automatic gene structure prediction is a totally different exercise (although both can produce a GFF file). I do not believe that either CLC or Geneious does the latter.



                            • #15
                              You have to annotate your reference genome to know where the coding exons and SNPs are. After the assembly and SNP/DIP detection the software tells you if the variations are coding or known SNPs. You have to know this for mutation discovery.
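
The idea can be sketched roughly in Python: once exon coordinates and known SNP positions are available from the annotation, each detected variant can be checked against them. This is only an illustration of the logic, not what CLC does internally. The function name `classify_variants`, the single-contig assumption, and all coordinates are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive coordinates, as in GFF


def classify_variants(
    positions: Iterable[int],
    exons: List[Interval],
    known_snps: Set[int],
) -> Dict[int, Tuple[str, str]]:
    """Label each variant position as coding/non-coding and known/novel.

    Assumes all positions and exons are on the same reference sequence.
    """
    result: Dict[int, Tuple[str, str]] = {}
    for pos in positions:
        # A variant counts as coding if it falls inside any annotated exon.
        coding = any(start <= pos <= end for start, end in exons)
        result[pos] = (
            "coding" if coding else "non-coding",
            "known" if pos in known_snps else "novel",
        )
    return result


# Hypothetical annotation: two coding exons and one known SNP.
exons = [(100, 200), (300, 400)]
known_snps = {320}
calls = classify_variants([150, 250, 320], exons, known_snps)
```

Without the exon and SNP annotations on the reference there is nothing to compare the variant positions against, which is the point made above.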
