
  • HTS microRNA analysis: a brief summary

    Hi everybody,

    I'm working on HTS microRNA data (Illumina GAIIe). I'm opening this thread to collect ideas on how to analyze such data. I think it can help a lot of people.

    Don't hesitate to reply with your own way of analyzing miRNA data!

    I created a wiki page, since it's simpler for collaborative editing: microRNA Analysis

    Here is my start:

    1. Pre-Processing of raw data

    1.1 Trim the 3' adapter

    Before analyzing the data, the first thing to do is to trim the 3' adapter. You can find the adapter sequence here or here. In general, this is the sequence you have to trim: UCGUAUGCCGUCUUCUGCUUGU

    For this purpose, you can use:
    - trimLRPatterns from R (Biostrings package)
    - a BioPerl script
    - an alignment program (bowtie, soap, ...) to align the adapter to the 3' part of the read sequence
    - ...
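
A minimal sketch of the trimming step in Python, assuming exact matching only (dedicated trimmers usually tolerate mismatches). The function name `trim_3prime_adapter` and the `min_overlap` parameter are illustrative and do not come from any of the tools listed above. Sequenced reads are DNA, so the adapter is converted from U to T before matching.

```python
ADAPTER_RNA = "UCGUAUGCCGUCUUCUGCUUGU"
# Reads are reported as DNA, so match against the T version of the adapter.
ADAPTER = ADAPTER_RNA.replace("U", "T")


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=6):
    """Remove the 3' adapter (or a partial adapter running off the read end).

    Exact matching only; returns the read unchanged if no adapter is found.
    """
    for start in range(len(read)):
        tail = read[start:]
        n = min(len(tail), len(adapter))
        if n < min_overlap:
            break  # remaining tail is too short to call it an adapter
        if tail[:n] == adapter[:n]:
            return read[:start]
    return read


# Hypothetical 36-nt read: a 22-nt insert followed by the first 14 adapter bases.
insert = "TAGCTTATCAGACTGATGTTGA"
read = insert + ADAPTER[:14]
trimmed = trim_3prime_adapter(read)
```

The `min_overlap` threshold is a trade-off: a low value removes short adapter fragments at the read end but also risks clipping genuine bases that happen to look like the adapter start.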


    1.2 Filter reads on size

    Because mature microRNAs are roughly 17-25 nt long (most are about 22 nt), you can discard all the reads with a length < 15 after trimming.
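
As a sketch, the size filter is a one-pass selection on read length. The helper name `filter_by_length` and the optional `max_len` cutoff are assumptions added for illustration; only the minimum of 15 comes from the text above.

```python
def filter_by_length(reads, min_len=15, max_len=None):
    """Keep trimmed reads whose length is within [min_len, max_len]."""
    kept = []
    for read in reads:
        if len(read) < min_len:
            continue  # too short to be a plausible miRNA
        if max_len is not None and len(read) > max_len:
            continue  # optional upper bound, disabled by default
        kept.append(read)
    return kept


# Hypothetical trimmed reads of length 4, 15 and 22.
reads = ["ACGT", "A" * 15, "TAGCTTATCAGACTGATGTTGA"]
kept = filter_by_length(reads)
```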

    1.3 Filter on quality

    You can discard all the reads with poor quality. I consider a Phred score under 20 to be poor quality (a score of 20 corresponds to 99% base-call accuracy).
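
A small sketch of the quality filter, assuming the threshold is applied to the mean Phred score of each read. The function names are illustrative. The `offset` parameter is the ASCII offset of the FASTQ quality encoding: 33 for Sanger-style files, 64 for some older Illumina pipeline versions, so check which one your files use.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores."""
    return [ord(char) - offset for char in quality_string]


def error_probability(phred):
    """Probability that a base call is wrong: P = 10^(-Q/10)."""
    return 10 ** (-phred / 10)


def passes_mean_quality(quality_string, min_mean=20, offset=33):
    """True if the mean Phred score of the read is at least min_mean."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return False  # an empty read cannot pass
    return sum(scores) / len(scores) >= min_mean


# Q20 corresponds to a 1% error rate, i.e. 99% base-call accuracy.
p_error_q20 = error_probability(20)
```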

    1.4 Alignment to a reference genome

    Align the reads to a reference genome and discard the reads that don't match perfectly.
    You can use bowtie, soap, maq, ...
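
To show what "perfect match" means here, a toy sketch that keeps a read only if it occurs exactly in the reference on either strand. This is a plain substring search on a short made-up sequence, not a replacement for an aligner; for a real genome you would use one of the programs above with zero mismatches allowed. All names in the block are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def perfectly_matching_reads(reads, reference):
    """Keep reads found exactly in the reference, on either strand."""
    kept = []
    for read in reads:
        if read in reference or reverse_complement(read) in reference:
            kept.append(read)
    return kept


# Made-up reference and reads, for illustration only.
reference = "GGGACGTTTCCAAGG"
reads = ["ACGTTTCC", "CCTTGGAA", "ACGTAACC"]
kept = perfectly_matching_reads(reads, reference)
```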

    1.5 Filtering out other RNA species

    Other RNAs, such as snoRNA, tRNA, piRNA, ..., may be present in the reads. To discard these RNAs (or to analyze them later), you can match the reads against the Rfam database.


    2. Differential Expression Analysis

    Before the DE analysis step, you can align the reads against the miRBase database to find known miRNAs. miRanalyzer can do that.
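
A simplified sketch of this counting step, assuming a read counts as a known miRNA only when it is identical to a mature sequence over its full length. Real tools are more permissive (length variants, mismatches). The names `miR-A` and `miR-B` and their sequences are made up; in practice the dictionary would be loaded from the miRBase mature sequence file.

```python
from collections import Counter


def count_known_mirnas(reads, mature_sequences):
    """Count reads per known miRNA; also return the reads that matched nothing.

    mature_sequences maps miRNA name -> mature sequence (RNA or DNA alphabet).
    """
    # miRBase sequences use U; reads use T, so normalise to the DNA alphabet.
    seq_to_name = {seq.replace("U", "T"): name
                   for name, seq in mature_sequences.items()}
    counts = Counter()
    unmatched = []
    for read in reads:
        name = seq_to_name.get(read)
        if name is None:
            unmatched.append(read)
        else:
            counts[name] += 1
    return counts, unmatched


# Hypothetical mature sequences and reads.
mature = {"miR-A": "UAGCAGCACGUAAAUAUUGGCG", "miR-B": "UGGAAUGUAAAGAAGUAUGUAU"}
reads = ["TAGCAGCACGTAAATATTGGCG", "TAGCAGCACGTAAATATTGGCG",
         "TGGAATGTAAAGAAGTATGTAT", "ACGTACGTACGTACGTACGT"]
counts, unmatched = count_known_mirnas(reads, mature)
```

The `unmatched` list is what remains for the other steps, for example the search for novel miRNAs.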

    To compare different samples, you have to normalize them. Several methods exist. I don't know many methods for normalization and DE testing, so please complete this list:

    - edgeR: an R package for DE analysis
    - DESeq: another R package
    - T-test
    - ANOVA
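
To give an idea of what normalization does before DE testing, here is a sketch of median-of-ratios size factors, the scaling idea described in the Anders and Huber paper listed in this post. It is a plain-Python illustration written for this thread, not the code of the DESeq package, and the sample names and counts are made up.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts maps sample name -> {miRNA name -> raw read count}.
    miRNAs with a zero count in any sample are skipped, because their
    geometric mean across samples would be zero.
    """
    samples = list(counts)
    usable = [m for m in counts[samples[0]]
              if all(counts[s].get(m, 0) > 0 for s in samples)]
    if not usable:
        raise ValueError("no miRNA has non-zero counts in every sample")
    # Log of the geometric mean of each miRNA across samples.
    log_geo_mean = {
        m: sum(math.log(counts[s][m]) for s in samples) / len(samples)
        for m in usable
    }
    return {
        s: math.exp(median(math.log(counts[s][m]) - log_geo_mean[m]
                           for m in usable))
        for s in samples
    }


# Hypothetical counts: sample B was sequenced twice as deeply as sample A.
counts = {
    "A": {"miR-1": 10, "miR-2": 20, "miR-3": 30},
    "B": {"miR-1": 20, "miR-2": 40, "miR-3": 60},
}
factors = size_factors(counts)
```

Dividing each raw count by its sample's size factor puts the samples on a common scale. Using the median makes the factor robust to a few very highly expressed miRNAs, which matters for small RNA libraries where a handful of miRNAs can dominate the reads.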

    Some reading:

    - A scaling normalization method for differential expression analysis of RNA-seq data, Mark D Robinson and Alicia Oshlack
    - Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments, James H Bullard, Elizabeth Purdom, Kasper D Hansen and Sandrine Dudoit
    - Differential expression analysis for sequence count data, Simon Anders and Wolfgang Huber
    - Normalization strategies for microRNA profiling experiments: a 'normal' way to a hidden layer of complexity?, Meyer SU, Pfaffl MW, Ulbrich SE.



    3. Prediction of novel microRNA

    After alignment against the miRBase database, some reads may not match any miRNA in miRBase. They may be unknown miRNAs.

    Some methods exist to predict novel miRNAs:

    - miRDeep
    - miRTRAP
    - ...


    4. Prediction of miRNA Targets

    To predict the targets of miRNAs, for example the differentially expressed ones, several programs exist:

    - TargetScan
    - miRanda
    - PicTar
    - RNAhybrid
    - miTarget
    - DIANA-microT
    - ...


    General reading:

    - Next Generation Sequencing of miRNAs – Strategies, Resources and Methods, Susanne Motameny, Stefanie Wolters, Peter Nürnberg and Björn Schumacher

    Sorry for my English, I'm not very fluent in written English.

    I'm aware that this covers only a small part of the analysis process for HTS miRNA data.

    Don't hesitate to reply with your own way of analyzing miRNA data!

    Nicolas
    Last edited by NicoBxl; 08-27-2010, 01:04 AM.

  • #2
    commercial miRNA analysis pipeline

    Dear Nicolas,

    Please let me know what you think about the CLC bio miRNA analysis pipeline. We released it in version 4.0 in June. The feedback that we have had, especially for its speed and ease of use, has been really positive.

    You can download the GWB 4.0 from here: http://clcbio.com/index.php?id=1292
    (it is free to use for two weeks)

    and the small RNA tutorial is available here: http://www.clcbio.com/index.php?id=649

    You can respond to me via SeqAnswers, so everyone can learn from your observations, or feel free to e-mail me. I am at [email protected].

    Thanks for collecting all of this information. I look forward to hearing what you think.
    -Naomi


    • #3
      Hi Naomi,

      I have been testing the CLC pipeline extensively. I had a few questions for our local support. My main concern with the pipeline is that ambiguous reads get assigned randomly when I map my reads to known miRNAs from miRBase. That means I can't compare results if I run the same analysis twice. I also don't know which known miRNAs an ambiguous read maps to.

      Is there any chance that you can change this in subsequent releases?

      Thanks!
      Anelda


      • #4
        miRNA analysis - making it better!

        Hi Anelda-

        You are right. I looked at your record and I saw your e-mails with your recommendations for improving the miRNA functions. Thanks for passing that on. I am sure your feedback has been registered through your representatives, but I also passed them to my contact in the US, so you will be heard twice.

        Thanks for testing our software, and please keep the comments coming.
        -Naomi


        • #5
          Hi Nicolas:
          I have a question regarding normalization. If I want to do a simple normalization like RPM (reads per million), which library size should I use: all mappable reads (including mRNA, tRNA, snoRNA, etc., and sRNA) or only the reads mapped against miRBase?

          Thanks a lot

          Diego


          • #6
            In section 1.3 Filter on quality, are you saying to remove reads that include at least one low-quality nt?


            • #7
              Originally posted by jay2008
              In section 1.3 Filter on quality, are you saying to remove reads that include at least one low-quality nt?
              You can discard the reads with a mean quality under 20.


              • #8
                Originally posted by dzavallo
                Hi Nicolas:
                I have a question regarding normalization. If I want to do a simple normalization like RPM (reads per million), which library size should I use: all mappable reads (including mRNA, tRNA, snoRNA, etc., and sRNA) or only the reads mapped against miRBase?

                Thanks a lot

                Diego
                I think it's better to use mappable reads (mRNA, snoRNA, ...). Ask Simon Anders, who works on DESeq; he can probably help you better than I can.
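
To make the difference between the two denominators concrete, here is a small sketch of RPM with made-up numbers. The function name and the library sizes are hypothetical; the only formula involved is RPM = count * 1,000,000 / library size.

```python
def reads_per_million(counts, library_size):
    """Scale raw counts to reads per million of the chosen library size."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return {name: count * 1_000_000 / library_size
            for name, count in counts.items()}


# Hypothetical sample: 2,000,000 mappable reads in total,
# of which 1,000,000 map to miRBase.
counts = {"miR-A": 500, "miR-B": 50}
rpm_all_mappable = reads_per_million(counts, 2_000_000)
rpm_mirbase_only = reads_per_million(counts, 1_000_000)
```

With the same raw counts, the miRBase-only denominator gives values twice as large in this example. The choice matters most when the fraction of non-miRNA reads differs between samples, so use the same definition for every sample you compare.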


                • #9
                  Thanks NicoBxl,
                  before adapter trimming or after adapter trimming?


                  • #10
                    I think I did it after


                    • #11
                      I cannot find piRNA at http://rfam.janelia.org/.
                      Can you please give me more detailed info?

                      thanks
                      Yu


                        • #13
                          Thanks! But who is Simon Anders, and how can I contact him?

                          Originally posted by NicoBxl
                          I think it's better to use mappable reads (mRNA, snoRNA, ...). Ask Simon Anders, who works on DESeq; he can probably help you better than I can.


                          • #14
                            Originally posted by dzavallo
                            Thanks! But who is Simon Anders, and how can I contact him?
                            He's on this board.


                            • #15
                              Should we be aligning first to the genome and then to miRBase or the other way around? Or is it dependent on the reference genome used? I'm working with mouse samples and after mapping 16 million reads to the mouse genome I was left with only ~400,000 that actually mapped. I was later told that the mouse genome isn't well annotated and that only areas around the genes are mapped, and a lot of intergenic space is not.

                              I would think that I should just align to miRBase to find existing miRNAs, and then go and align to the mouse genome when I'm looking at potential new locations of miRNAs or for novel miRNAs. Does this make sense?
