
**bioinfosm** · 09-29-2010, 11:42 AM

This is a very useful miRNA discussion. I have some experience with Illumina's Flicker tool, but not much beyond that. mirTools did not work as well as expected and has its shortcomings.

My working hypothesis for the workflow is:

fastq -> adapter trimming -> alignment (novoalign?) (to human genome or reference of mirBase?) -> expression

feel free to add to this..

**quicksand21** · 09-29-2010, 11:16 PM

Hi all,

I'm glad to see my post has received some excellent feedback. Since posting, I have gone on to develop a pipeline that uses a variety of tools. If you're looking for a nice, self-contained method for analyzing small-RNA transcriptome sequencing data, I have been pleased with miRanalyzer, miRexpress, mirTools, and DSAP. These tools are web-based except for miRexpress, which is command-line. They all handle read input, adapter trimming, filtering, alignment, annotation, and expression profiling, and some use different strategies for identifying novel miRNA candidates.

I have also been developing an in-house pipeline for this kind of analysis (it is still in progress). The basic steps are essentially what bioinfosm diagrammed:

fastq --> remove redundancy --> adapter trimming --> remove redundancy again --> filter out low-copy-number (CN) reads --> filter out reads that align to the wrong organism --> alignment (bowtie, maq, novoalign) to the appropriate genome, to miRBase hairpins, or to known non-coding RNA (snoRNA, etc.) --> annotate the aligned reads --> use the aligned reads and their associated copy numbers to derive expression profiles for the miRNAs. I've also started to implement some in-house algorithms for predicting novel candidate miRNAs.
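The collapsing and copy-number steps in the flow above can be sketched in Python. This is a minimal sketch under stated assumptions, not the in-house pipeline itself: the function names, the `min_copies` threshold, and the toy trimmer are hypothetical and chosen only for illustration.

```python
from collections import Counter


def collapse(reads):
    """Collapse identical sequences into {sequence: copy_number}."""
    return Counter(reads)


def recollapse_after_trimming(counts, trim):
    """Trim each unique sequence and merge the copy numbers of reads
    that become identical once the adapter is gone."""
    merged = Counter()
    for seq, copies in counts.items():
        trimmed = trim(seq)
        if trimmed:  # drop reads that are empty after trimming
            merged[trimmed] += copies
    return merged


def filter_low_copy(counts, min_copies=2):
    """Keep only sequences seen at least min_copies times (assumed threshold)."""
    return {seq: n for seq, n in counts.items() if n >= min_copies}


# Toy data: "X" stands in for adapter bases (hypothetical, for illustration).
reads = ["ACGTAAXX", "ACGTAAXX", "ACGTAAX", "TTTTGGXX"]
toy_trim = lambda seq: seq.rstrip("X")  # stand-in for a real adapter trimmer

unique = collapse(reads)
trimmed = recollapse_after_trimming(unique, toy_trim)
expressed = filter_low_copy(trimmed, min_copies=2)
print(expressed)  # {'ACGTAA': 3}
```

Collapsing before trimming only saves work; it is the second collapse that gives the copy numbers later used for expression profiling, because reads that differ only in how much adapter they carry merge at that point.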

A paper I found extremely useful in addition to the great responses from this forum:

http://bib.oxfordjournals.org/content/early/2009/03/30/bib.bbp019.abstract

The authors do a nice job of walking readers through the steps of analyzing sequencing data for small RNAs.

**mitchelS** · 09-30-2010, 05:06 PM

miRNA analysis

Hi All,

I'm attaching our recent publication, which may be of help to those who, like myself, do not have a bioinformatics background:

Characterization of the Melanoma miRNAome by Deep Sequencing

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0009685

Background: MicroRNAs (miRNAs) are 18–23 nucleotide non-coding RNAs that regulate gene expression in a sequence specific manner. Little is known about the repertoire and function of miRNAs in melanoma or the melanocytic lineage. We therefore undertook a comprehensive analysis of the miRNAome in a diverse range of pigment cells including: melanoblasts, melanocytes, congenital nevocytes, acral, mucosal, cutaneous and uveal melanoma cells.

Methodology/Principal Findings: We sequenced 12 small RNA libraries using Illumina's Genome Analyzer II platform. This massively parallel sequencing approach of a diverse set of melanoma and pigment cell libraries revealed a total of 539 known mature and mature-star sequences, along with the prediction of 279 novel miRNA candidates, of which 109 were common to 2 or more libraries and 3 were present in all libraries.

Conclusions/Significance: Some of the novel candidate miRNAs may be specific to the melanocytic lineage and as such could be used as biomarkers to assist in the early detection of distant metastases by measuring the circulating levels in blood. Follow up studies of the functional roles of these pigment cell miRNAs and the identification of the targets should shed further light on the development and progression of melanoma.

We used miRanalyzer because it easily identified which of the miRNAs known in miRBase at the time (early 2009, I think) were present, but more importantly it mapped the reads, after removing unwanted ones, back to the genome to predict novel miRNAs. These predictions are of course still only predictions, but filtering with another program (CID-miRNA) reduced the list of candidates considerably. Many of these have since been deposited in miRBase.

anyway I hope this helps,

cheers,

M.

**dnusol** · 10-04-2010, 01:10 AM

Hi Bioinfosm,
I read about the Flicker utility but have not found much about it. Where can it be obtained? How does it compare to the FASTX toolkit?

**bioinfosm** · 10-05-2010, 01:38 PM

Flicker is from Illumina's ICOM download. How are you comparing it to FASTX, which I believe is a QC reporting toolkit?

@mitchelS, thanks for sharing the paper. I could not get their perl script to work but will code up my own and try out their tool!

@quicksand21, thanks for a more inclusive flow-gram!

**dnusol** · 10-06-2010, 01:41 AM

The FASTX toolkit also has utilities for adapter removal.

Edit: by the way, has anyone seen a TC end on a large portion of the small RNA sequences after removing Illumina's adapter? FASTX_clipper seems to have removed the adapter, but I end up with a trailing TC, as in this example:

original read:
TGACTCGGAGCGAAGTGACGGATCTCGTATGCCGTCTT
read after trimming:
TGACTCGGAGCGAAGTGACGGATC

best

Dave

Edit:
I can answer my own question: we were using the new Illumina adapters without realizing it, so the ATC tail is actually part of the new adapter. This was also mentioned in another post.
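The effect described here can be reproduced with a small exact-match trimmer. This is only a sketch: both adapter sequences below are assumptions (believed to be the older Illumina small RNA 3' adapter and the newer one that starts with ATC; check them against your kit documentation), and the matching logic is a simplified stand-in that does not reproduce what FASTX_clipper actually does.

```python
def trim_3prime_adapter(read, adapter, min_overlap=5):
    """Remove a 3' adapter using exact matching only (no mismatches).

    Scans left to right for the first position where the rest of the read
    is either a prefix of the adapter or starts with the full adapter.
    """
    for start in range(len(read) - min_overlap + 1):
        tail = read[start:]
        if tail.startswith(adapter) or adapter.startswith(tail):
            return read[:start]
    return read  # no adapter found


READ = "TGACTCGGAGCGAAGTGACGGATCTCGTATGCCGTCTT"  # read from the post above
OLD_ADAPTER = "TCGTATGCCGTCTTCTGCTTG"     # assumed older 3' adapter
NEW_ADAPTER = "ATCTCGTATGCCGTCTTCTGCTTG"  # assumed newer 3' adapter

print(trim_3prime_adapter(READ, OLD_ADAPTER))  # TGACTCGGAGCGAAGTGACGGATC
print(trim_3prime_adapter(READ, NEW_ADAPTER))  # TGACTCGGAGCGAAGTGACGG
```

Trimming with the older sequence leaves the extra ATC on the insert, which matches the trimmed read shown in the post. Trimming with the newer sequence removes it.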

**konika** · 10-07-2010, 01:06 AM

Hi,
I have miRNA data and have aligned the nonredundant sequences from each of 3 samples to each of the 5 Arabidopsis chromosomes.
Question: is this correct, or should I map all sequences (including the redundant miRNA sequences)?
I took the bowtie output (SAM file), converted it to BAM, then sorted and indexed it,
and visualized it in IGV. I see many reads aligning at almost the same chromosome locations in all 3 samples.
I want to know how important this is, and how I can find the top locations (gene locations) where the largest number of these short reads map.
Is there any software for statistical analysis of such data?
Konika
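One rough way to find the locations with the most mapped reads is to count alignments per fixed-size window directly from the SAM text. The following is a minimal sketch using only the standard SAM column order (read name, flag, reference name, position); the function name, the 1000 bp window, and the toy records are assumptions for illustration, and this is not a substitute for a proper statistical package.

```python
from collections import Counter


def top_bins(sam_lines, bin_size=1000, top_n=5):
    """Count mapped SAM records per (chromosome, window start) and
    return the top_n most covered windows."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 4 or chrom == "*":  # flag bit 4 marks an unmapped read
            continue
        window_start = (pos - 1) // bin_size * bin_size + 1
        counts[(chrom, window_start)] += 1
    return counts.most_common(top_n)


# Toy SAM records (hypothetical, for illustration only).
sam = [
    "@HD\tVN:1.0",
    "r1\t0\tChr1\t150\t255\t21M\t*\t0\t0\t*\t*",
    "r2\t0\tChr1\t180\t255\t21M\t*\t0\t0\t*\t*",
    "r3\t16\tChr2\t5020\t255\t21M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(top_bins(sam))  # [(('Chr1', 1), 2), (('Chr2', 5001), 1)]
```

Each SAM record counts once here, so with nonredundant input the totals reflect the number of distinct sequences per window, not the number of sequenced reads, unless each record is weighted by its copy number.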

**Jayu** · 03-01-2012, 02:19 AM

Hi,

I am trying to use Flicker on Illumina miRNA data that I downloaded from SRA.

I executed the following command:
perl ~/scripts/flicker.pl --fastq=project_illu_rice/SRR062265.fastq --casava=/usr/local --contam=./AbundantSequences --genomic=./genome/ --mir=miRBase/mature.fa --precursor=miRBase/hairpin.fa --tagSum --summary --adaptor=TGGAATTCTCGGGTGCCAAGGT --species osa

It gave the following error:
INFO: Trying to open directory /home/bioinfo.corp/Desktop/test_rna2map/test_flicker/FLICKER_201221_15.18.39/contam/reference ...
INFO: ... success, will output file sizes to XML
Making index /home/bioinfo.corp/Desktop/test_rna2map/test_flicker/FLICKER_201221_15.18.39/contam/reference/oryza_filter_reference.fa.idx
Fastq header (@SRR062265.47:HWI-EAS-58_4_FC20AY9AAXX:3:1:911:683:length=35) not proper length: 7 != 10

Can anyone explain why it is giving this error?
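The "7 != 10" in the message looks like a count of colon-separated fields in the read header. A quick check in Python shows where the 7 comes from. How Flicker actually parses headers is not documented in this thread, so reading the 10 as a CASAVA 1.8-style header is an assumption, and the second header below is a hypothetical example, not real data.

```python
def colon_fields(header):
    """Split a FASTQ header line on ':' after dropping the leading '@'."""
    return header.lstrip("@").split(":")


# Header taken from the error message above.
sra_header = "@SRR062265.47:HWI-EAS-58_4_FC20AY9AAXX:3:1:911:683:length=35"

# Hypothetical CASAVA 1.8-style header (assumed to be what the tool expects).
casava18_style = "@INSTR:1:FLOWCELL:3:1:911:683 1:N:0:ATCACG"

print(len(colon_fields(sra_header)))      # 7
print(len(colon_fields(casava18_style)))  # 10
```

If that reading is right, the SRA-derived headers would have to be rewritten into the expected layout before the run, but this should be confirmed against the Flicker manual.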

**Jayu** · 03-01-2012, 03:06 AM

all.summary.hist.txt and all.summary.tagHist.txt are the output of Flicker. There is one column representing HNA.

The manual explains it as: HNA (Hit Normalized Abundance): Raw tag count/Number of spread (hits) to the database.

Can you please explain what 'Number of spread (hits) to the database' means?

Secondly, there is another column in 'all.summary.hist.txt', 'Normalized count = (Column2/total count)*1 million', where Column2 = HNA.
Does this mean the normalized count is the same as RPKM?
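The two formulas quoted from the manual can be written out next to the usual RPKM definition to compare them. This is a sketch with made-up numbers: reading "number of hits" as the number of database locations a tag matches, and "total count" as a library-wide total, are assumptions, since the thread does not say how Flicker defines them.

```python
def hna(raw_count, n_hits):
    """Hit Normalized Abundance: raw tag count divided by its number of hits."""
    return raw_count / n_hits


def normalized_count(value, total_count):
    """(value / total count) * 1 million, as quoted from the manual."""
    return value * 1_000_000 / total_count


def rpkm(count, total_count, feature_length_nt):
    """Reads per kilobase of feature per million reads (standard definition)."""
    return count * 1_000_000_000 / (total_count * feature_length_nt)


# Made-up numbers for illustration only.
tag_hna = hna(raw_count=300, n_hits=3)                # 100.0
per_million = normalized_count(tag_hna, 2_000_000)    # 50.0
with_length = rpkm(tag_hna, 2_000_000, feature_length_nt=22)

print(tag_hna, per_million, round(with_length, 1))
```

The quoted normalized count scales only by library size, so it is a per-million value. RPKM additionally divides by feature length in kilobases, so the two differ whenever that length is not exactly 1 kb.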

**Jayu** · 03-10-2012, 01:41 AM

Can anyone suggest which of the following mapping tools is best suited to Illumina small RNA data analysis?
Patman,
BLAST,
ELAND,
BWA,
Megablast
