GFF3 annotation file


  • GFF3 annotation file

    Hi All,

    I would like to ask how to use a GFF3 annotation file with TopHat. My Bowtie index names the chromosomes "1", "2", "3", ... instead of "chr1", "chr2", "chr3", ..., so I could not upload the junctions to UCSC, because the names are case sensitive and have to match exactly.

    I just read in the TopHat manual that TopHat can be given an annotation file, but I don't know how to use it. So far I have simply run TopHat with "--solexa1.3-quals" and got a result. Should I use the annotation file before running this command?
    Can some experienced SEQers give me some hints?

    Really appreciate your help
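
One possible way around the naming mismatch is to leave the index alone and rename the chromosomes in the junction file just before uploading it. The snippet below is only a sketch under assumptions: the junctions are in a tab-separated BED file (such as TopHat's junctions.bed) with the chromosome in column 1, the index uses Ensembl-style names ("1", "2", ..., "X", "Y", "MT"), and UCSC expects "chrM" for the mitochondrion. The function name add_chr_prefix and the file names are made up for illustration.

```shell
# Add UCSC-style "chr" prefixes to column 1 of a BED file.
# Assumption: Ensembl-style names (1, 2, ..., X, Y, MT) in column 1.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^(track|browser|#)/ || NF == 0 { print; next }  # keep header and blank lines as they are
         $1 == "MT"   { $1 = "chrM"; print; next }        # mitochondrion is named chrM at UCSC
         $1 !~ /^chr/ { $1 = "chr" $1 }                   # prefix names that lack "chr"
                      { print }' "$1"
}

# Example (file names are placeholders):
# add_chr_prefix junctions.bed > junctions.ucsc.bed
```

Names that already start with "chr" are passed through unchanged, so running the function twice on the same file should be harmless.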

  • #2
    This depends on how you want to treat your data. Giving TopHat the annotation file will force it to look for the junctions contained therein, even if it would not have considered them otherwise. There is a gtf2gff3 script available online (google the term) that you can use to make a GFF3 file for hg18 from the hg18 knownGenes table (which is downloadable in GTF format).

    HTH,

    Shurjo



    • #3
      Hi shurjo,

      Thanks for your reply. I already have the mouse GFF3 file, Mus_musculus.NCBIM37.56.gff3, but I still have no clue when I should use it: before or after running TopHat? Sorry, I am a bit confused.

      Many thanks!



      • #4
        I am not sure what exactly you want, but if you:

        1) want to use a GFF file to find out about gene-expression, then tophat since version 1.0.12 says: "TopHat no longer calculates gene expression. Users interested in expression calculations should consider using Cufflinks for gene- and isoform-level expression calculations."

        or

        2) want to provide your own junctions, then search the manual for "Supplying your own junctions" and you'll see the "-G/--GFF <GFF3 file>" flag explained

        svl



        • #5
          Neither before nor after, but during the TopHat run :-). Use it with TopHat's -G option.

          Like so:

          tophat --mate-inner-dist 240 --mate-std-dev 25 ~/bin/bowtie/bowtie-0.12.1/indexes/hg18_inclusive 108971.read1.fa 108971.read2.fa -m 2 -p 4 -G /home/sensh/pipeline_test/GFF3/UCSC_knowngenes_hg18_tweaked.gff3
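
Before passing a GFF3 file with -G, it may be worth checking that the sequence names in its first column match the names in the Bowtie index, since annotation on differently named chromosomes ("chr1" vs. "1") cannot be matched up. A rough sketch: the helper name and the file names are hypothetical, and the list of index names is assumed to come from `bowtie-inspect -n <index_base>`, one name per line.

```shell
# Print GFF3 sequence names (column 1) that do not occur in the index.
# $1: file with one index sequence name per line
# $2: GFF3 annotation file
gff_names_missing_from_index() {
    awk 'NR == FNR { split($0, a, /[ \t]+/); idx[a[1]]; next }  # first file: keep first word of each name
         /^#/ || NF == 0 { next }                               # skip GFF3 comments and blank lines
         { split($0, f, "\t") }                                 # GFF3 columns are tab-separated
         !(f[1] in idx) && !seen[f[1]]++ { print f[1] }' "$1" "$2"
}

# Example (file names are placeholders):
# bowtie-inspect -n Mus_musculus.NCBIM37.56 > index_names.txt
# gff_names_missing_from_index index_names.txt Mus_musculus.NCBIM37.56.gff3
```

No output means every sequence name used in the annotation was found in the index. Each name that is printed would need renaming on one side or the other.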



          • #6
            Thanks Shurjo and svl!

            I just want to provide my own junctions. So I ran the following (the data file bic.txt, the index files and the GFF3 file are all in the same folder):

            tophat --solexa1.3-quals Mus_musculus.NCBIM37.56 bic.txt -G mus_musculus.NCBIM37.56.gff3

            But I got an error: Error: you must set the mean inner distance between mates with -r
            And my data is not paired-end data.

            Thanks in advance!



            • #7
              Originally posted by Wei-HD:
              tophat --solexa1.3-quals Mus_musculus.NCBIM37.56 bic.txt -G mus_musculus.NCBIM37.56.gff3
              Maybe you have to put all options before the index-base and reads. The manual says:

              Usage: tophat [options]* <index_base> <reads1_1[,...,readsN_1]> [reads1_2,...readsN_2]
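
Applied to the command from #6, that would mean moving -G and its file in front of the index base. The sketch below only assembles and prints the reordered command line; it has not been run against the poster's data. A plausible reading of the error, though not confirmed in the thread, is that with -G placed after the reads the trailing arguments were taken as a second reads file, which would make TopHat expect paired-end settings. The variable names are illustrative.

```shell
# Options first, then <index_base>, then the reads file(s).
INDEX=Mus_musculus.NCBIM37.56        # Bowtie index base, as posted in #6
READS=bic.txt                        # single-end reads file, as posted in #6
GFF=mus_musculus.NCBIM37.56.gff3     # annotation file, as posted in #6

tophat_cmd="tophat --solexa1.3-quals -G $GFF $INDEX $READS"
echo "$tophat_cmd"

# To actually run it (requires TopHat and Bowtie on the PATH):
# $tophat_cmd
```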
