  • jkozubek
    Member
    • Mar 2011
    • 18

    RNA biotypes

    Hello:

    This is my first post.

    I am a genetics student at UConn. I have Illumina RNA-Seq data and my PI wants me to determine the RNA distribution (tRNA%, rRNA%, snoRNA%). I have already determined the miRNA content, since we used the miRDeep software.

    I am thinking about aligning our sequencing data to a reference genome with Bowtie, and then somehow checking the alignments against the Ref Gene annotation file I got from UCSC. I am not sure how to compare the Bowtie output with the annotation file, however.

    I am thinking about just writing a Perl script that loops over the alignments to do this.

    What do people think? Have others done this and does this seem like the right way to go about things?

    Jim
  • ppgardne
    Member
    • Oct 2010
    • 13

    #2
    Sounds reasonable to me. There are probably existing overlap scripts available. Eval might be worth trying.
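A minimal sketch of what such an overlap script might look like, written in Python rather than Perl. It assumes the alignments and the annotation have already been parsed into (chrom, start, end) and (chrom, start, end, biotype) tuples with half-open coordinates; the function names, the bin size, and the toy data are all hypothetical and not taken from any existing tool.

```python
from collections import Counter, defaultdict

BIN_SIZE = 100_000  # hypothetical bin width used to avoid scanning every annotation


def build_index(annotations):
    """Group annotations by (chromosome, bin) so lookups only scan nearby features."""
    index = defaultdict(list)
    for chrom, start, end, biotype in annotations:
        for b in range(start // BIN_SIZE, (end - 1) // BIN_SIZE + 1):
            index[(chrom, b)].append((start, end, biotype))
    return index


def biotype_of(index, chrom, start, end):
    """Return the biotype of the first annotation overlapping [start, end)."""
    for b in range(start // BIN_SIZE, (end - 1) // BIN_SIZE + 1):
        for a_start, a_end, biotype in index.get((chrom, b), []):
            if start < a_end and a_start < end:  # half-open interval overlap
                return biotype
    return "unannotated"


def biotype_percentages(annotations, reads):
    """Percentage of reads assigned to each biotype."""
    index = build_index(annotations)
    counts = Counter(biotype_of(index, *read) for read in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bt: 100.0 * n / total for bt, n in counts.items()}


# Toy data, invented for illustration only.
annotations = [
    ("chr1", 100, 180, "tRNA"),
    ("chr1", 500, 2000, "rRNA"),
    ("chr2", 50, 120, "snoRNA"),
]
reads = [
    ("chr1", 110, 140),
    ("chr1", 600, 630),
    ("chr1", 1900, 1930),
    ("chr2", 300, 330),
]
percentages = biotype_percentages(annotations, reads)
print(percentages)
```

A read that overlaps several annotations is assigned to the first one found here; a real script would need an explicit rule for such ties and for reads that map to multiple locations.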

    • Jon_Keats
      Senior Member
      • Mar 2010
      • 279

      #3
      You could try TopHat (it uses Bowtie as its backend but also allows spliced reads to align). Then either count the reads in your regions of interest, or run Cufflinks with a GTF file and use the output FPKM values to get the proportion of each RNA species.

      • jkozubek
        Member
        • Mar 2011
        • 18

        #4
        I am working on TopHat. Do you know if TopHat maps ordinary reads that fall inside exons, as well as reads that span exon-exon junctions?

        Then it looks like I will have to write a script to compare the TopHat output against a UCSC annotation file of RNA types (otherwise known as Ref Gene).

        • Jon_Keats
          Senior Member
          • Mar 2010
          • 279

          #5
          TopHat will map both reads that sit exclusively inside exons and reads that cross exon-exon junctions. If you provide a GTF file to Cufflinks, the program that generates expression estimates from TopHat output, you can easily estimate the abundance of each RNA biotype, as the RNA types are encoded in the GTF file. If you look at my intro thread (http://seqanswers.com/forums/showthr...?t=4589&page=3 ) you should see the workflow that will get you most of the way to your desired result.
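As a rough sketch of the last step, the per-gene FPKM values could be summed by biotype along these lines. This assumes the GTF carries a gene_biotype attribute (Ensembl-style GTF files do; other sources may use a different key or none at all) and that the per-gene FPKM values have already been read from the Cufflinks output into a dictionary. The function names and example values are hypothetical.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_biotypes(gtf_lines, biotype_key="gene_biotype"):
    """Map gene_id -> biotype from GTF lines (biotype_key is an assumption)."""
    mapping = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs and biotype_key in attrs:
            mapping[attrs["gene_id"]] = attrs[biotype_key]
    return mapping


def fpkm_share_by_biotype(gene_fpkm, biotypes):
    """Percentage of the summed FPKM contributed by each biotype."""
    totals = defaultdict(float)
    for gene_id, fpkm in gene_fpkm.items():
        totals[biotypes.get(gene_id, "unknown")] += fpkm
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {bt: 100.0 * value / grand_total for bt, value in totals.items()}


# Invented example records; a real run would read these from files.
gtf_lines = [
    "\t".join(["chr1", "src", "exon", "101", "180", ".", "+", ".",
               'gene_id "G1"; gene_biotype "tRNA";']),
    "\t".join(["chr1", "src", "exon", "501", "2000", ".", "+", ".",
               'gene_id "G2"; gene_biotype "rRNA";']),
]
gene_fpkm = {"G1": 30.0, "G2": 60.0, "G3": 10.0}  # G3 has no biotype in the GTF

shares = fpkm_share_by_biotype(gene_fpkm, gene_biotypes(gtf_lines))
print(shares)
```

Because FPKM is normalized by transcript length, these shares are not the same as the fraction of reads per biotype; counting reads per region gives that number instead.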

          • jkozubek
            Member
            • Mar 2011
            • 18

            #6
            I used paraffin-archived samples for sequencing, and all of my RNA is in very tiny fragments (< 30 bp). Can you suggest an -r setting that would be appropriate for TopHat?

            • Jon_Keats
              Senior Member
              • Mar 2010
              • 279

              #7
              Did you do paired-end sequencing? I'd assume not if you only have 30 bp inserts, in which case the -r option does not apply, since it is not used for single-end reads. Moreover, with a 30 bp read length TopHat will most likely not give you any benefit over mapping the data directly to the genome with Bowtie or BWA: few exons are smaller than 30 bp, and the junction-finding step splits reads into 25 bp segments by default, so you only have one segment anyway. You could set that value to 12-15 bp, but at that point things will align everywhere.

              • jkozubek
                Member
                • Mar 2011
                • 18

                #8
                That's what I was thinking, but it said I am required to input an -r value. I guess I can just set it to 0.

                • Jon_Keats
                  Senior Member
                  • Mar 2010
                  • 279

                  #9
                  If these are single-end reads (i.e., not paired-end), the -r option is not needed; it is only needed for paired-end reads. I guess the manual should be more specific, with something like "There is no default, and this parameter is required for paired end runs but not single end runs"

                  versus the current text:

                  -r

                  This is the expected (mean) inner distance between mate pairs. For, example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.
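The arithmetic behind the manual's example can be sketched as follows: the inner distance is the fragment length minus the two read lengths. The function name is hypothetical and the snippet is only an illustration of that subtraction, not part of TopHat.

```python
def inner_distance(fragment_length, read_length):
    """Expected inner distance between mates: fragment minus both read lengths.

    A negative result would mean the two mates overlap each other.
    """
    return fragment_length - 2 * read_length


# The manual's example: 300 bp fragments sequenced with 50 bp from each end.
r_value = inner_distance(300, 50)
print(r_value)  # 200
```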
