  • How can I remove DNA from an RNA dataset using bioinformatics?

    Dear all,

    I am new to bioinformatics and have only just started doing research in this area.
    I am now facing a big problem.

    My demonstrator BLASTed the whole dataset and told me that my RNA data contain DNA. He would now like me to find a tool to remove the DNA reads.

    My research is in metagenomics, using RNA rather than DNA.
    As I have no reference to filter out DNA that may come from different species, could anyone give me a hand?

    Are there any tools that remove DNA from RNA data with higher accuracy than simply removing the reads whose BLAST results indicate DNA features?

  • #2
    Perhaps someone else will chime in with a bright idea, but I seriously doubt you can distinguish reads coming from DNA from reads coming from RNA. Just randomly BLASTing stuff and saying, "Gee, this read falls in the middle of nowhere in all the genomes it matches...must just be DNA contamination", seems like a really bad idea. I certainly hope that your instructor/demonstrator/whatever did something vastly more clever than that, but I suspect not.

    Next time, just tell whoever is preparing the samples to DNase-treat things.



    • #3
      Perhaps hellingwyk is looking to retain reads that match rRNA and discard the rest.

      If that is so, look into getting the appropriate sequence databases (http://www.arb-silva.de/ or http://rdp.cme.msu.edu/) for the search.
      Last edited by GenoMax; 09-26-2012, 09:33 AM.
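One way this suggestion could be acted on, sketched under assumptions: BLAST the reads against an rRNA database such as SILVA with tabular output (`-outfmt 6`), then collect the IDs of reads that have a convincing hit. The function name and the identity and e-value cutoffs below are illustrative choices, not values given in this thread.

```python
def rrna_read_ids(blast_lines, min_identity=90.0, max_evalue=1e-5):
    """Return the set of read IDs with at least one acceptable rRNA hit.

    blast_lines: iterable of lines in BLAST tabular format (-outfmt 6), whose
    default columns are: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    The cutoffs are hypothetical defaults; tune them for your data.
    """
    keep = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular record
        qseqid = fields[0]
        pident = float(fields[2])
        evalue = float(fields[10])
        if pident >= min_identity and evalue <= max_evalue:
            keep.add(qseqid)
    return keep


if __name__ == "__main__":
    # Made-up records, purely to show the expected input shape.
    example = [
        "read1\tSSU_ref_A\t98.5\t100\t1\t0\t1\t100\t50\t149\t1e-40\t180",
        "read2\tSSU_ref_B\t75.0\t60\t15\t0\t1\t60\t10\t69\t0.5\t30.1",
    ]
    print(sorted(rrna_read_ids(example)))
```

Reads whose IDs are in the returned set would be retained as rRNA; everything else would be discarded, which only makes sense if rRNA really is the target of the study.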



      • #4
        Thank you very much for answering my question.

        However, I am focusing on all kinds of RNA from microbes and viruses in order to search for novel information.

        That's why I cannot easily find a reference to filter out DNA.



        • #5
          Overall, if you are looking for all kinds of RNA, then you need to trust your library preps and assume that all of the sequences you are seeing are RNA and not DNA. If reads map to regions that were not previously known to be transcribed, that doesn't mean they are DNA; it may mean that there is more transcription in your system than previously characterized. BLASTing to the genome and assuming that something is DNA because it hasn't been shown to be transcribed before is not evidence of contamination (although it could be, I suppose; hard to say without knowing more about your experimental setup).

          In eukaryotes, a "rough" way of doing this would be to look for evidence of unspliced transcripts and, if their numbers are extremely(!) high, to go back and redo the library preps.
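The rough eukaryotic check described above could look something like the following. This is only a sketch under assumptions: reads are already aligned, each gene is represented by a sorted list of exon intervals, and all function names are hypothetical. Some intronic signal is expected from pre-mRNA, so only a very high fraction would be suspicious.

```python
def classify_read(start, end, exons):
    """Label a read 'exonic' if it overlaps any exon, else 'intronic'.

    Coordinates are half-open [start, end). exons is a sorted list of
    (start, end) tuples for a single gene.
    """
    overlaps_exon = any(start < e_end and end > e_start
                        for e_start, e_end in exons)
    return "exonic" if overlaps_exon else "intronic"


def intronic_fraction(reads, exons):
    """Fraction of reads inside the gene body that touch no exon at all."""
    gene_start, gene_end = exons[0][0], exons[-1][1]
    # Only consider reads lying fully within the annotated gene body.
    within = [(s, e) for s, e in reads if s >= gene_start and e <= gene_end]
    if not within:
        return 0.0
    n_intronic = sum(classify_read(s, e, exons) == "intronic"
                     for s, e in within)
    return n_intronic / len(within)


if __name__ == "__main__":
    # Toy gene with two exons and four reads, purely illustrative.
    exons = [(100, 200), (500, 600)]
    reads = [(110, 160), (520, 570), (300, 350), (190, 240)]
    print(intronic_fraction(reads, exons))
```

This does not apply to the prokaryotic or viral part of a metatranscriptome, where there are essentially no spliced transcripts to compare against.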



          • #6
            I basically agree with dvanic. There is not going to be any reliable way of distinguishing DNA reads from RNA reads when they are mixed together in an RNA library. If this distinction is important to your analysis, then you need to verify that efforts were made during library construction to remove the DNA that will contaminate any RNA prep. In most cases this should include a DNase treatment of the purified RNA.

            If the RNA library is strand-specific, you could reassure yourself that this was the case by making sure that strand-specificity is reflected in the data. Not sure what a reasonable ratio of +strand to -strand is in a pure RNA library, but it should be pretty high, overall.

            --
            Phillip
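The strand-specificity check suggested above could be sketched as follows, assuming you already have, for each read that overlaps an annotated gene, the strand of the read and the strand of the gene. Depending on the library protocol, reads may be expected on the same or on the opposite strand as the gene, so this sketch simply asks whether one orientation dominates. The function names and the 0.9 threshold are hypothetical; no reference ratio is given in this thread.

```python
from collections import Counter


def strand_fractions(pairs):
    """Return (same_fraction, opposite_fraction) for the given reads.

    pairs: iterable of (read_strand, gene_strand), each '+' or '-'.
    """
    counts = Counter("same" if read == gene else "opposite"
                     for read, gene in pairs)
    total = counts["same"] + counts["opposite"]
    if total == 0:
        raise ValueError("no reads overlapping annotated genes")
    return counts["same"] / total, counts["opposite"] / total


def looks_stranded(pairs, threshold=0.9):
    """True if one orientation accounts for at least `threshold` of reads.

    The threshold is an illustrative assumption, not an established cutoff.
    """
    same, opposite = strand_fractions(pairs)
    return max(same, opposite) >= threshold


if __name__ == "__main__":
    # Toy data: 95 reads agree with the gene strand, 5 do not.
    pairs = [("+", "+")] * 95 + [("+", "-")] * 5
    print(strand_fractions(pairs), looks_stranded(pairs))
```

Double-stranded genomic DNA would be expected to contribute reads to both orientations roughly equally, so a strongly one-sided result argues against heavy DNA contamination, while a result near 50/50 in a supposedly stranded library would be a warning sign.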
