
  • 48 SOLiD Small RNA

    Dear all,

    As per the title, I have 48 small RNA samples that I've run as part of a barcoded experiment. I've downloaded the primary data, which includes the csfasta and qual files for analysis.

    My interest is miRNAs and I understand this area is still very new and ever changing.

    I was wondering whether anyone has recommendations for analysis software. I've run the samples through CLC Bio, but having obtained raw miRNA counts, I am unsure how to proceed with normalising them (some of my libraries have very different total counts, as they come from different tissue pathologies).
    Moreover, CLC doesn't give me the aligned reads in a format I can easily manage downstream.

    I'm therefore looking for a complete solution from csfasta to normalised values...

    Any help would be much appreciated.


  • #2
    Here's a rough workflow for RNA transcripts using the free/open-source Tuxedo programs (bowtie, tophat, cufflinks). I'm not sure how well this would work for miRNAs, given that cufflinks is designed for transcripts and for estimating relative isoform abundances:
    1. split reads based on index barcode somehow (I assume there's SOLiD software that can do this)
    2. create/retrieve genome or transcriptome index in colour-space (e.g. using bowtie-build)
    3. map reads in colour-space using bowtie (-C, --sam options) or tophat (-C option, default output is sam/bam). Do this once for each sample to produce 48 sam files.
    4. run cufflinks on all sam files together to generate a consensus gtf file
    5. run cuffdiff on all sam files together with the gtf file created in the previous step

    Cuffdiff will produce normalised read counts and probabilities, but because it's designed with the assumption that the input data is from RNA transcripts, it may not produce correct results for miRNAs.
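Steps 2 and 3 above can be sketched as a small script that assembles one mapping command per sample. This is only an illustration: the reference name `genome.fa`, the index prefix `genome_cs`, and the `sampleNN.csfasta` file names are hypothetical, and it assumes a bowtie 1 build that still supports colour-space (check `bowtie --help` for your version). The `-f` flag tells bowtie the reads are in FASTA-style format.

```python
def mapping_commands(samples, index_prefix="genome_cs", reference="genome.fa"):
    """Build argument lists for the colour-space index and per-sample mapping.

    All file names here are placeholders, not names from the thread.
    """
    # Step 2: build a colour-space index from the reference.
    commands = [["bowtie-build", "-C", reference, index_prefix]]
    # Step 3: map each sample separately, writing one SAM file per sample.
    for sample in samples:
        reads = f"{sample}.csfasta"
        output = f"{sample}.sam"
        commands.append(
            ["bowtie", "-C", "--sam", "-f", index_prefix, reads, output]
        )
    return commands


samples = [f"sample{i:02d}" for i in range(1, 49)]  # 48 barcoded samples
commands = mapping_commands(samples)
for cmd in commands[:3]:
    print(" ".join(cmd))
```

Each argument list could be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.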


    • #3
      CLC can export to SAM format. This is necessary for downstream processing with other programs.

      You can obviously normalise for read count, i.e. to counts per million reads or similar. I'm not sure about any miRNA specific read normalisation.
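As a minimal sketch of the counts-per-million idea, assuming the raw counts are held as a dictionary per library (the miRNA names and numbers below are made up for illustration):

```python
def counts_per_million(counts, total=None):
    """Scale raw counts so that each library sums to one million.

    `total` defaults to the sum of the counts given; pass the total number
    of mapped reads instead if that is the denominator you prefer.
    """
    if total is None:
        total = sum(counts.values())
    if total <= 0:
        raise ValueError("library has no reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Two hypothetical libraries with the same composition but a 10x depth difference.
library_a = {"miR-21": 5000, "miR-155": 500, "let-7a": 4500}
library_b = {"miR-21": 50000, "miR-155": 5000, "let-7a": 45000}

cpm_a = counts_per_million(library_a)
cpm_b = counts_per_million(library_b)
print(cpm_a["miR-21"], cpm_b["miR-21"])
```

After scaling, the two libraries give identical values, which is the point of normalising for read count. It does not correct for differences in composition between libraries.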

      There is still some debate in the RNA-seq community about normalising for gene length, or in this case for miRNA length.


      • #4
        I haven't tried it but I've run across it from time to time: miRanalyzer
        Can be downloaded or run from a webserver. I think it uses DESeq for differential expression.

        miRanalyzer: a microRNA detection and analysis tool for next-generation sequencing experiments. Nucl. Acids Res. (2009) 37 (suppl 2): W68-W76.

        From the site:
        miRanalyzer is a free web-server tool for processing small-RNA data obtained with next generation sequencing platforms such as Illumina or SOLiD. The tool requires unique reads in read-count format (i.e., a list of sequences together with the number of times each has been sequenced in the experiment) which can be sequence space (Illumina) or color space (SOLiD). The main features of miRanalyzer are:

        * Mapping all reads against:
            o libraries of known mature microRNAs (including the mature-star libraries – the sequences which pair with the mature microRNAs in the secondary structure of the pre-microRNAs).
            o libraries of theoretically possible mature-star microRNAs which are currently not annotated in miRBase, i.e. have not been observed before.
            o other libraries of transcribed sequences, such as transcriptome and RFam to discard messenger and small non-coding RNAs.
        * Prediction of previously unknown microRNAs by data mining with models specifically trained for plant or animal microRNAs.
        * Detection of differentially expressed microRNAs (including newly predicted ones).
        * Identification of putative target genes for the differentially expressed, new microRNAs.
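The "read-count format" mentioned above (each unique sequence with the number of times it was seen) can be produced with a few lines of Python. This is a rough sketch, assuming one sequence per line in a FASTA/csfasta-style file; the example reads are invented, and the tab-separated output is an assumption, so check the miRanalyzer documentation for the exact layout it expects.

```python
from collections import Counter


def collapse_reads(lines):
    """Count how often each distinct read sequence occurs.

    Header lines (starting with '>') and comment lines (starting with '#',
    as found at the top of csfasta files) are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">") or line.startswith("#"):
            continue
        counts[line] += 1
    return counts


# Hypothetical colour-space reads for illustration only.
example = [
    "# comment line",
    ">read_1", "T0120312",
    ">read_2", "T0120312",
    ">read_3", "T3321000",
]
read_counts = collapse_reads(example)
for sequence, n in read_counts.most_common():
    print(f"{sequence}\t{n}")
```

For a real file, pass an open file handle instead of the list, e.g. `collapse_reads(open(path))`.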


        • #5
          I will make a slightly different suggestion. Consult a service provider specializing in NGS analysis who can help you reduce the time taken to analyze this data and get the data normalized using the various options of RPKM, etc.
          This can be especially important with SOLiD where the protocols for analysis may not be as straightforward as Illumina.
          There is a service provider section on seqanswers. Otherwise some notable vendors who do alignment software and other solutions could help you out. Shop around and you will definitely find a good partner to work with.


          • #6
            I suggest mapping the reads to the genome, converting the mapped reads to FASTA files, and then running miRDeep.
            Normalization here differs from RNA-seq or transcriptome analysis: you can normalize these libraries by sequencing depth.

