  • RNA biotypes

    Hello:

    This is my first post.

    I am a genetics student at UConn. I have Illumina RNA-Seq data and my PI wants me to determine the RNA distribution (tRNA%, rRNA%, snoRNA%). I already determined the miRNA content since we used miRDeep software.

    I am thinking about Bowtie-ing our sequencing data to a reference genome, and then somehow checking it against this Ref Gene annotation file I got from UCSC. I am not sure how to interface the Bowtie output against the annotation file, however.

    I am thinking about just writing a Perl script and using a loop to do this.

    What do people think? Have others done this and does this seem like the right way to go about things?

    Jim

  • #2
    Sounds reasonable to me. There are probably existing overlap scripts available. Eval might be worth trying.
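The overlap idea from the first post can be sketched in a few lines. This is only a minimal illustration, written in Python rather than Perl, and it assumes the alignments and the annotation have already been reduced to simple tuples. The function name `biotype_percentages`, the tuple layout, and the "unannotated" label are all made up for the example; a real data set would need an interval index or an existing overlap tool rather than this linear scan.

```python
from collections import Counter, defaultdict


def biotype_percentages(reads, features):
    """Assign each read to the first overlapping feature and report percentages.

    reads:    iterable of (chrom, start, end), half-open coordinates
    features: iterable of (chrom, start, end, biotype), half-open coordinates
    Reads that overlap no feature are counted as "unannotated".
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, biotype in features:
        by_chrom[chrom].append((start, end, biotype))

    counts = Counter()
    for chrom, start, end in reads:
        label = "unannotated"
        for f_start, f_end, biotype in by_chrom[chrom]:
            # Two half-open intervals overlap when each starts before the other ends.
            if start < f_end and f_start < end:
                label = biotype
                break
        counts[label] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy data, invented purely for illustration.
features = [
    ("chr1", 100, 200, "tRNA"),
    ("chr1", 500, 700, "rRNA"),
    ("chr2", 50, 120, "snoRNA"),
]
reads = [
    ("chr1", 110, 140),
    ("chr1", 190, 220),
    ("chr1", 600, 630),
    ("chr2", 10, 40),
]

for label, pct in sorted(biotype_percentages(reads, features).items()):
    print(f"{label}\t{pct:.1f}%")
```

Taking the first overlapping feature is a simplification; reads that overlap several features of different types would need an explicit tie-breaking rule.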


    • #3
      Could try Tophat (bowtie backend but allows spliced reads to align). Then either count reads in your regions of interest or run Cufflinks with a GTF and use the output FPKM values to get the proportions of each RNA species.


      • #4
        I am working on Tophat. Do you know if Tophat maps normal reads inside exons as well as mapping exon-exon junction reads?

        Then, it looks like I will have to write a script to compare the Tophat output against a UCSC annotation file of RNA types (otherwise known as Ref Gene).


        • #5
          Tophat will map both reads that sit exclusively inside exons and those that cross exon-exon junctions. If you provide a GTF file to Cufflinks, the program that generates expression estimates from Tophat output, you can easily estimate the abundance of each RNA biotype, as the RNA types are encoded in the GTF file. If you look at my intro thread (http://seqanswers.com/forums/showthr...?t=4589&page=3), you should see the workflow that will get you most of the way to your desired result.
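As a rough sketch of the last step described above, the per-gene FPKM values can be summed by biotype once each gene ID is linked to a type from the GTF. Everything here is an assumption made for illustration: the GTF is assumed to carry an Ensembl-style `gene_biotype` attribute (a UCSC-derived file may not have one), the expression table is assumed to be tab-delimited with `gene_id` and `FPKM` header columns, and the function names are hypothetical. Note also that FPKM is length-normalized, so these shares are not the same thing as the fraction of reads.

```python
import csv
import io
import re
from collections import defaultdict


def gene_biotypes_from_gtf(gtf_text, attribute="gene_biotype"):
    """Map gene_id -> biotype using the attribute column (9th field) of a GTF."""
    biotypes = {}
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        # Attributes look like: gene_id "G1"; gene_biotype "rRNA";
        attrs = dict(re.findall(r'(\w+) "([^"]*)"', fields[8]))
        if "gene_id" in attrs and attribute in attrs:
            biotypes[attrs["gene_id"]] = attrs[attribute]
    return biotypes


def fpkm_share_by_biotype(tracking_text, biotypes):
    """Sum FPKM per biotype and express each sum as a percentage of the total."""
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(tracking_text), delimiter="\t")
    for row in reader:
        label = biotypes.get(row["gene_id"], "unknown")
        totals[label] += float(row["FPKM"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * value / grand_total for label, value in totals.items()}


# Toy inputs, invented for the example (not real annotation or expression data).
GTF = "\n".join([
    "\t".join(["chr1", "src", "exon", "100", "200", ".", "+", ".",
               'gene_id "G1"; gene_biotype "rRNA";']),
    "\t".join(["chr1", "src", "exon", "500", "570", ".", "-", ".",
               'gene_id "G2"; gene_biotype "tRNA";']),
])
TRACKING = "gene_id\tFPKM\nG1\t60.0\nG2\t30.0\nG3\t10.0\n"

shares = fpkm_share_by_biotype(TRACKING, gene_biotypes_from_gtf(GTF))
for label, pct in sorted(shares.items()):
    print(f"{label}\t{pct:.1f}%")
```

Genes missing from the annotation are grouped under "unknown" here so that they still count toward the total rather than silently disappearing.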


          • #6
            I used paraffin-archived samples for sequencing and all of my RNA is in very tiny bits (< 30 bp). Can you suggest an -r setting that would be appropriate for Tophat?


            • #7
              Did you do paired-end sequencing? I'd assume not if you only have 30 bp inserts, in which case the -r option is not appropriate, since it only applies to paired-end reads. Moreover, with a 30 bp read length, Tophat will most likely not give you any benefit over mapping the data directly to the genome with bowtie or bwa. Few exons are smaller than 30 bp, and the exon-junction finding function splits reads into 25 bp segments by default, so you only have one segment anyway unless you set that value to 12-15 bp, at which point things will align everywhere.


              • #8
                That's what I was thinking, but it said I am required to input an -r value. I guess I can just set it to 0.


                • #9
                  If these are single-end reads (i.e. not paired-end), the -r option is not needed; it is only needed for paired-end reads. I guess the manual should be more specific, with "There is no default, and this parameter is required for paired end runs but not single end runs"

                  Versus the current

                  -r

                  This is the expected (mean) inner distance between mate pairs. For, example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.
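The arithmetic in the manual's example can be written out as a small helper. This is just a sketch of the calculation described in the quoted text (fragment size minus both read lengths); the function name `inner_distance` is made up for illustration and is not part of Tophat.

```python
def inner_distance(fragment_size, read_length):
    """Expected inner distance between mates for a paired-end run.

    The two reads each cover read_length bases at the ends of the fragment,
    so the gap between them is whatever is left in the middle.
    A negative result means the two mates would overlap.
    """
    if fragment_size <= 0 or read_length <= 0:
        raise ValueError("fragment_size and read_length must be positive")
    return fragment_size - 2 * read_length


# The manual's example: 300 bp fragments with 50 bp read from each end.
print(inner_distance(300, 50))
```

For single-end reads there is no mate, so there is nothing to compute and the option does not apply.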
