  • DNAjunk
    Member
    • Jun 2009
    • 62

    Filter ribosomal RNA

    Hi

    Although ribosomal depletion is usually performed, some sequences with ribosomal content remain and get sequenced.

    How do you filter out reads (e.g. from 454 Titanium runs) with ribosomal content?

    Many thanks for advice!
  • shurjo
    Senior Member
    • Jan 2009
    • 132

    #2
    I usually align the raw data against a set of ribosomal sequences (and the mitochondrial genome) using ELAND, and use grep with the -v option to remove any reads that find a match. If you are using the standard Illumina pipeline you can configure Gerald to use ANALYSIS:eland_rna to do this automatically without having to do it step-by-step.

    • DNAjunk
      Member
      • Jun 2009
      • 62

      #3
      Thanks for your reply, shurjo.

      I am wondering where you get the set of ribosomal sequences (and the mitochondrial genome) from?

      Did you make it yourself or download from somewhere?

      • shurjo
        Senior Member
        • Jan 2009
        • 132

        #4
        Here are the gi numbers for the ribosomal sequences; you can download them from GenBank.

        gi|555853|gb|U13369.1|HSU1336 Human ribosomal DNA complete repeating unit

        gi|23898|emb|X12811.1| Human 5S DNA

        To this I would add the mitochondrial genome, which you can get from the UCSC Genome Browser site (unless you are interested in mitochondrial gene expression).

        HTH,

        Shurjo
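For readers who prefer to script the download, here is a minimal sketch that fetches the two accessions listed above as FASTA through NCBI's E-utilities `efetch` endpoint. The function names and the output file name are illustrative choices, not something from the post, and the mitochondrial genome would still need to be added separately.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accessions quoted in the post above.
RRNA_ACCESSIONS = ["U13369.1", "X12811.1"]


def efetch_url(accessions):
    """Build an efetch URL that requests the records as FASTA text."""
    query = urlencode({
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


def download_fasta(accessions, out_path):
    """Fetch the records and write them to out_path (needs network access)."""
    with urlopen(efetch_url(accessions)) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Calling `download_fasta(RRNA_ACCESSIONS, "rrna.fa")` (a hypothetical file name) would write both records into one FASTA file that can then be indexed by the aligner of your choice.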

        • ikim
          Member
          • Mar 2010
          • 13

          #5
          Hello, I was wondering how well conserved ribosomal RNAs and ribosomal proteins are. How relevant is it to use the human ribosomal units for matching waterlily datasets, for example?

          • cliffbeall
            Senior Member
            • Jan 2010
            • 144

            #6
            Reply to Ikim

            You want to use the closest species you can find. Try the SILVA database: http://www.arb-silva.de/ - they have an extensive collection of small and large subunit sequences and a taxonomic browser to find what you need.

            • carmeyeii
              Senior Member
              • Mar 2011
              • 137

              #7
              Originally posted by shurjo
              I usually align the raw data against a set of ribosomal sequences (and the mitochondrial genome) using ELAND, and use grep with the -v option to remove any reads that find a match. If you are using the standard Illumina pipeline you can configure Gerald to use ANALYSIS:eland_rna to do this automatically without having to do it step-by-step.
              Shurjo,

              How do you configure the grep command so that it strips the sequence IDs that mapped to the rRNA database, AND the following three lines that contain the sequence, the + and the quality string?

              Thanks!
              Carmen
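If the aim is to drop whole four-line FASTQ records, a short script is one alternative to grep. The sketch below assumes strict four-line records (no wrapped sequence lines) and that the IDs of reads that hit the rRNA set have already been collected into a Python set, in the same form as they appear after the `@` in the FASTQ header. The function name `filter_fastq` is hypothetical.

```python
def filter_fastq(fastq_lines, rrna_ids):
    """Yield the lines of every 4-line FASTQ record whose ID is not in rrna_ids."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # Header looks like "@read_id optional description";
            # the ID is the first whitespace-delimited token after "@".
            read_id = record[0][1:].split()[0]
            if read_id not in rrna_ids:
                yield from record
            record = []
```

Because the function is a generator over lines, it can be fed an open file handle and its output written with `writelines`, so the whole FASTQ file never has to sit in memory.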

              • carmeyeii
                Senior Member
                • Mar 2011
                • 137

                #8
                Hi everyone!

                I am analyzing some Illumina libraries that appear to have a lot of ribosomal RNA contamination.

                I'm using Bowtie to align the reads to a specific set of sequences only. Because the amount of rRNA contamination differs between samples, each sample maps a different percentage of its reads to that set, ranging from 0.3% to 1% (some map half of what others do).

                I wonder if the amount of rRNA contamination in a library preparation can affect the apparent expression level of a gene, even though one normalizes its counts against the total number of reads that mapped.

                What's your opinion on this subject?

                Carmen

                • swbarnes2
                  Senior Member
                  • May 2008
                  • 910

                  #9
                  Originally posted by carmeyeii
                  Shurjo,

                  How do you configure the grep command so that it strips the sequence IDs that mapped to the rRNA database, AND the following three lines that contain the sequence, the + and the quality string?

                  Thanks!
                  Carmen
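                  That's not how you do it. You have the .bam, which has the sequence and what it mapped to all on one line per read; you filter that. Since BAM is a binary format, convert it to text SAM first (for example with samtools view); after that you could do the filtering pretty easily with grep.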
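As a rough illustration of filtering on the alignment rather than on the FASTQ, the sketch below works on text SAM lines (for instance the output of `samtools view -h`). It keeps header lines and drops any alignment whose reference name, the third tab-separated column, is in a set of rRNA reference names. The function name and the reference names you pass in are assumptions; the real names depend on the FASTA headers used to build the index.

```python
def drop_rrna_alignments(sam_lines, rrna_refs):
    """Yield SAM header lines and alignments whose reference is not in rrna_refs."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines (@HD, @SQ, @PG, ...) pass through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 3 (index 2) is RNAME; unmapped reads carry "*" there.
        if len(fields) > 2 and fields[2] in rrna_refs:
            continue
        yield line
```

Matching on the reference-name column instead of grepping the whole line avoids accidentally dropping reads whose name or optional tags happen to contain the same string.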
