
  • #16
    This is a very useful miRNA discussion. I have some experience with Illumina's flicker tool, but not much beyond that. mirTools did not work as well as I expected and has its shortcomings.

    My proposed workflow is:

    fastq -> adapter trimming -> alignment (novoalign?) (to the human genome or to a mirBase reference?) -> expression

    Feel free to add to this.


    • #17
      Hi all,

      I'm glad to see my post has received some excellent feedback. Since posting, I have gone on to develop a pipeline that uses a variety of tools. If you're looking for a nice, self-contained method for analyzing small-RNA transcriptome sequencing data, I have been pleased with miRanalyzer, miRexpress, miRtools, and DSAP. These tools are web-based except for miRexpress, which is command-line. They all handle read input, adapter trimming, filtering, alignment, annotation, and expression profiling, and some use different strategies for identifying novel miRNA candidates.

      I have also been developing (and am still developing) an in-house pipeline for this kind of analysis. The steps are essentially what bioinfosm diagrammed:

      fastq --> remove redundancy --> adapter trimming --> remove redundancy again --> filter out reads with low copy number (CN) --> filter out reads that align to the wrong organism --> alignment (bowtie, maq, novoalign) to the appropriate genome, to miRBase hairpins, or to known non-coding RNA, snoRNA, etc. --> annotate the aligned reads --> use the aligned reads and their associated copy numbers to derive expression profiles for the miRNAs. I've also started to implement some in-house algorithms for finding novel candidate miRNAs.
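
The "remove redundancy" and "filter out low CN" steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the in-house pipeline itself: the function names, the toy reads, and the `min_copies` threshold are all assumptions made for the example.

```python
from collections import Counter

def collapse_reads(reads):
    """Collapse identical read sequences into {sequence: copy number}."""
    return Counter(reads)

def filter_low_copy(counts, min_copies=2):
    """Keep only sequences seen at least min_copies times (hypothetical cutoff)."""
    return {seq: n for seq, n in counts.items() if n >= min_copies}

# Toy reads, made up for illustration only.
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "ACGTACGTACGTACGTAC",
    "TGAGGTAGTAGGTTGTATAGTT",
]

counts = collapse_reads(reads)                # unique sequences with copy numbers
kept = filter_low_copy(counts, min_copies=2)  # singletons are dropped
```

Keeping the copy number alongside each unique sequence is what lets the later expression step use counts even though only nonredundant sequences are aligned.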

      A paper I found extremely useful in addition to the great responses from this forum:

      The authors do a nice job of walking readers through the steps of analyzing sequencing data for small RNAs.


      • #18
        miRNA analysis

        Hi All,

        I'm attaching our recent publication, which may be of help to those who, like me, do not have a bioinformatics background.

        Background MicroRNAs (miRNAs) are 18–23 nucleotide non-coding RNAs that regulate gene expression in a sequence specific manner. Little is known about the repertoire and function of miRNAs in melanoma or the melanocytic lineage. We therefore undertook a comprehensive analysis of the miRNAome in a diverse range of pigment cells including: melanoblasts, melanocytes, congenital nevocytes, acral, mucosal, cutaneous and uveal melanoma cells. Methodology/Principal Findings We sequenced 12 small RNA libraries using Illumina's Genome Analyzer II platform. This massively parallel sequencing approach of a diverse set of melanoma and pigment cell libraries revealed a total of 539 known mature and mature-star sequences, along with the prediction of 279 novel miRNA candidates, of which 109 were common to 2 or more libraries and 3 were present in all libraries. Conclusions/Significance Some of the novel candidate miRNAs may be specific to the melanocytic lineage and as such could be used as biomarkers to assist in the early detection of distant metastases by measuring the circulating levels in blood. Follow up studies of the functional roles of these pigment cell miRNAs and the identification of the targets should shed further light on the development and progression of melanoma.

        We used miRanalyzer because it easily identified which of the miRNAs known in mirBase at the time (early 2009, I think) were present. More importantly, after removing unwanted reads, it mapped the remainder back to the genome to predict novel miRNAs. These are of course still only predictions, but filtering with another program (CID-miRNA) reduced the list of candidates considerably. Many of these have since been deposited in mirBase.

        Anyway, I hope this helps.




        • #19
          Hi Bioinfosm,
          I read about the Flicker utility but have not found much about it. Where can it be obtained? How does it compare to the FASTX toolkit?


          • #20
            Flicker comes from Illumina's ICOM download. How are you comparing it to FASTX, which I believe is a QC reporting toolkit?

            @mitchelS, thanks for sharing the paper. I could not get their Perl script to work, but I will code up my own and try out their tool!

            @quicksand21, thanks for a more inclusive flow-gram!


            • #21
              The FASTX toolkit also has utilities for adapter removal.

              Edit: by the way, has anyone seen a TC end in a large portion of the small RNA sequences after removing Illumina's adapter? FASTX_clipper seems to have removed the adapter, but I end up with a trailing TC pair, as in this example:

              original read:
              read after trimming:

              To answer my own question: we were unknowingly using the new Illumina adapters, so the ATC tail is actually part of the new adapter. This was also mentioned in another post.
              Last edited by dnusol; 10-07-2010, 05:45 AM.
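
A minimal sketch of why clipping with the wrong adapter sequence leaves a short tail on the reads. The sequences are toy values, not real Illumina adapters, and the sketch assumes purely for illustration that the newer adapter is the older one with a few extra bases at its 5' end. `clip_3prime_adapter` is a hypothetical helper and not how FASTX_clipper is implemented.

```python
def clip_3prime_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter, or a prefix of it that runs off the end of the read."""
    # Case 1: the full adapter occurs somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: only the start of the adapter fits at the read end (longest match first).
    for k in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read

# Toy sequences (assumptions for illustration, not real adapter sequences).
insert = "TGAGGTAGTAGGTTGTATAGTT"
old_adapter = "GGCATTCGAGTC"
new_adapter = "ATC" + old_adapter   # assumed relation between the two adapters

read = insert + new_adapter

clipped_with_old = clip_3prime_adapter(read, old_adapter)  # leaves the extra bases behind
clipped_with_new = clip_3prime_adapter(read, new_adapter)  # recovers the insert
```

Clipping with `old_adapter` returns the insert plus the leftover `ATC`, which is the kind of residual tail described above; clipping with the adapter that was actually ligated removes it.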


              • #22
                I have miRNA data and have aligned the nonredundant sequences from each of 3 samples to each of the 5 Arabidopsis chromosomes.
                Question: is this correct, or should I map all sequences (including the redundant miRNA sequences)?
                I took the bowtie output (SAM file), converted it to BAM, then sorted and indexed it,
                and visualized the result in IGV. I see many reads aligning at almost the same chromosomal locations in all 3 samples.
                I want to know how important this is, and how I can find the top locations (gene locations) where the largest number of these short reads map.
                Is there any software for statistical analysis of such data?
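
One simple way to find the locations with the most mapped reads is to bin alignment start positions from the SAM file and count them. The sketch below is a rough illustration only: the function name, the 100 bp bin size, and the toy SAM records are assumptions. It also counts one hit per SAM line, so if nonredundant sequences were aligned, each line would need to be weighted by its copy number.

```python
from collections import Counter

def count_hits_per_bin(sam_lines, bin_size=100):
    """Count mapped reads per (chromosome, bin start), using 1-based SAM positions."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 4:                  # flag bit 0x4 marks an unmapped read
            continue
        bin_start = (pos - 1) // bin_size * bin_size + 1
        counts[(chrom, bin_start)] += 1
    return counts

# Toy SAM records (made up); '*' stands in for sequence and quality.
sam_lines = [
    "@SQ\tSN:Chr1\tLN:30000000",
    "r1\t0\tChr1\t150\t255\t21M\t*\t0\t0\t*\t*",
    "r2\t16\tChr1\t160\t255\t21M\t*\t0\t0\t*\t*",
    "r3\t0\tChr1\t950\t255\t22M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

hits = count_hits_per_bin(sam_lines)
top = hits.most_common(1)   # the bin with the most reads
```

The top bins can then be compared against a gene annotation to see which features they overlap.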


                • #23

                  I am trying to use flicker on Illumina miRNA data that I downloaded from SRA.

                  I executed the following command:
                  perl ~/scripts/ --fastq=project_illu_rice/SRR062265.fastq --casava=/usr/local --contam=./AbundantSequences --genomic=./genome/ --mir=miRBase/mature.fa --precursor=miRBase/hairpin.fa --tagSum --summary --adaptor=TGGAATTCTCGGGTGCCAAGGT --species osa

                  It gave the following error:
                  INFO: Trying to open directory /home/bioinfo.corp/Desktop/test_rna2map/test_flicker/FLICKER_201221_15.18.39/contam/reference ...
                  INFO: ... success, will output file sizes to XML
                  Making index /home/bioinfo.corp/Desktop/test_rna2map/test_flicker/FLICKER_201221_15.18.39/contam/reference/oryza_filter_reference.fa.idx
                  Fastq header (@SRR062265.47:HWI-EAS-58_4_FC20AY9AAXX:3:1:911:683:length=35) not proper length: 7 != 10

                  Can anyone explain why it is giving this error?
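
One plausible reading of the "7 != 10" message, offered as an assumption rather than something confirmed from flicker's code, is that the tool splits the FASTQ header on colons and expects a fixed number of fields. The SRA header in the error has 7 colon-separated fields, whereas a CASAVA 1.8-style header has 10. The small check below illustrates the count; `header_fields` is a hypothetical helper and the CASAVA-style header is just a format example.

```python
def header_fields(header):
    """Split a FASTQ header line into its colon-separated fields."""
    return header.lstrip("@").split(":")

# Header taken from the error message above.
sra_header = "@SRR062265.47:HWI-EAS-58_4_FC20AY9AAXX:3:1:911:683:length=35"

# Illustrative CASAVA 1.8-style header (instrument:run:flowcell:lane:tile:x:y read:filter:control:index).
casava_header = "@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG"

n_sra = len(header_fields(sra_header))
n_casava = len(header_fields(casava_header))
```

If that reading is right, the headers of the SRA download would need to be rewritten into the layout the tool expects before running it.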


                  • #24
                    all.summary.hist.txt and all.summary.tagHist.txt are output files of flicker. There is one column representing HNA.

                    The manual explains it as: HNA (Hit Normalized Abundance) = raw tag count / number of spread (hits) to the database.

                    Can you please explain what 'number of spread (hits) to the database' means?

                    Secondly, 'all.summary.hist.txt' has another column, 'Normalized count = (Column2/total count)*1 million', where Column2 = HNA.
                    Does this mean the normalized count is the same as RPKM?
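
The two formulas quoted above can be put next to the usual RPKM definition to see where they differ. This is a sketch under stated assumptions: "hits" is taken to mean the number of database locations a tag matches, "total count" is taken to be the library total, and the numbers are invented. The normalized count is a per-million scaling only, while RPKM additionally divides by the feature length in kilobases, so the two are not the same quantity.

```python
def hna(raw_count, n_hits):
    """Hit Normalized Abundance: raw tag count divided by number of hits."""
    return raw_count / n_hits

def normalized_count(hna_value, total_count):
    """Per-million scaling as quoted from the manual: (HNA / total count) * 1 million."""
    return hna_value / total_count * 1_000_000

def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)

# Invented numbers for illustration.
tag_hna = hna(raw_count=200, n_hits=2)                    # tag matches 2 database entries
per_million = normalized_count(tag_hna, total_count=2_000_000)
per_kb_per_million = rpkm(tag_hna, feature_length_bp=22, total_mapped_reads=2_000_000)
```

For a 22 nt feature the RPKM value is far larger than the per-million count, because the length term divides by 0.022 kb.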


                    • #25
                      Can anyone suggest which of the following mapping tools is better suited to Illumina small RNA data analysis?

