Hi,
I don't understand how to convert aligned read information into GFF format (mine comes from bowtie; which output format should I use?), so I cannot proceed with the downstream scripts.
I also cannot produce SOAP2 output, because it skips all reads shorter than 28 nt.
Could anyone give some help?
Thanks very much!
Franklin
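For the bowtie-to-GFF conversion asked about above, here is a minimal sketch in Python. It assumes bowtie's default (non-SAM) tab-delimited output, where the first five columns are read name, strand, reference name, 0-based offset, and read sequence. The function name, the "bowtie" source label, the "read" feature type, and the `ID=` attribute are illustrative assumptions; the thread does not say which GFF fields miRTRAP actually reads, so check the miRTRAP instructions before relying on this layout.

```python
def bowtie_line_to_gff(line, source="bowtie", feature="read"):
    """Convert one line of bowtie's default (non-SAM) output to a GFF line.

    The attribute column ("ID=<read name>") is an assumption for
    illustration, not a documented miRTRAP requirement.
    """
    fields = line.rstrip("\n").split("\t")
    read_name, strand, chrom, offset, seq = fields[:5]
    start = int(offset) + 1      # bowtie offsets are 0-based; GFF is 1-based
    end = start + len(seq) - 1   # GFF end coordinates are inclusive
    attributes = f"ID={read_name}"
    return "\t".join(
        [chrom, source, feature, str(start), str(end), ".", strand, ".", attributes]
    )
```

Applying it to every line of the bowtie output and writing the results to one file per sample would give one GFF file per label.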
-
It looks like, for some reason, the program thinks your repeat regions file is given by the number "16714". Can you paste some of your config file?
It isn't strictly necessary to filter out repeat regions, but I would strongly recommend it to avoid false positives. You can get the data for this from UCSC, for example here:
or whatever works best for your preferred version of the genome. You may want to filter out simple repeats and transposon-associated repeats. The repeat regions file is just a simple tab-delimited file of chrom, start, stop:
<chrom> <start> <stop>
so you could convert the repeat data from UCSC to this format with a simple Perl script.
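As an alternative to the Perl script mentioned above, the conversion can be sketched in Python. This assumes the UCSC data was exported as a tab-delimited table with chrom, start, and stop in the first three columns (as in BED); the function name and the column-index parameters are hypothetical, and the indices need adjusting for other table layouts. UCSC BED coordinates are 0-based and half-open, and the thread does not state which convention miRTRAP expects, so the values are copied through unchanged.

```python
def to_repeat_regions(lines, chrom_col=0, start_col=1, stop_col=2):
    """Yield 'chrom<TAB>start<TAB>stop' strings from a tab-delimited table.

    Column positions are assumptions about the export; pass other
    indices if the table has a different layout.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines and header lines such as "#chrom" or "track ...".
        if not line.strip() or fields[0].startswith(("#", "track", "browser")):
            continue
        yield "\t".join((fields[chrom_col], fields[start_col], fields[stop_col]))
```

Opening the exported file, passing the file handle as `lines`, and writing each yielded string plus a newline to a new file produces the three-column repeat regions file.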
-
I am trying to use miRTRAP. When I set "repeatRegionsFile" to an empty file, I get this error:
could not open 16714.
Is "repeatRegionsFile" necessary? For the human genome, how can I get a repeatRegionsFile?
Thanks,
Yu
-
Originally posted by jay2008: There are several tools to predict miRs, such as miRDeep and MIReNA. What are the advantages of the different miR prediction tools?
Yu
I will say that miRTRAP takes into account a lot of information. You need to align the reads allowing many hits to the genome for each read, because the program uses this information in its prediction. Loci with reads that have many hits elsewhere in the genome (more than the maxHit parameter) are excluded. Furthermore, loci that are surrounded by such repetitive small RNAs are also filtered out. In my experience, miRTRAP has very few false negatives, but some false positives. miRDeep has very few false positives, but some false negatives. Depending on your purposes and your available data, either could work.
Another drawback is that miRTRAP takes a lot of RAM, and for large genomes you may need to split the genome up by chromosome.
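To illustrate the maxHit idea described above, here is a small Python sketch that counts genomic hits per read and flags reads exceeding a threshold. It is not miRTRAP's implementation; the function name and the `(read_id, chrom, start)` tuple layout are assumptions made for the example.

```python
from collections import Counter


def reads_over_max_hits(alignments, max_hits):
    """Return the set of read IDs with more than max_hits alignments.

    `alignments` is an iterable of (read_id, chrom, start) tuples, one per
    genomic hit. This layout is hypothetical, chosen only for illustration.
    """
    hits = Counter(read_id for read_id, _chrom, _start in alignments)
    return {read_id for read_id, count in hits.items() if count > max_hits}
```

This is also why the aligner has to report many hits per read: if it reports only one location per read, every read looks unique and no count-based filter can work.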
-
There are several tools to predict miRs, such as miRDeep and MIReNA. What are the advantages of the different miR prediction tools?
Yu
-
miRTRAP question
Hi everyone, this is Dave Hendrix. miRTRAP is my software, and I am happy to answer any questions. You can email me directly (my email is in the manuscript as a corresponding author). A description of the steps of miRTRAP is at:
These instructions have been updated to add more clarity. You can also download a more up-to-date version of the software.
In general, there should be error messages printed out if things don't work with the program. You can post those messages to this thread for more detail. I will attempt to answer these questions one-by-one.
1. "readListFile" - the aligned data in gff format (I've changed mine from Soap2 output to gff format)
The readListFile is a tab separated list of files, with a label and the file name, like this:
tissue1 tissue1_reads.gff
tissue2 tissue2_reads.gff
tissue3 tissue3_reads.gff
where the reads are size-selected (around 17-25 nt) sequencing data in GFF format. Each file name needs a full path if the file is not in the directory you are running the scripts from.
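Since a missing or mistyped path is an easy way to end up with empty output, a quick check of the readListFile before running the pipeline may help. The Python sketch below is not part of miRTRAP; the function name is hypothetical, and it assumes exactly two tab-separated fields per line, as in the example above.

```python
import os


def load_read_list(path):
    """Parse a readListFile into (label, gff_path) pairs.

    Returns (entries, missing), where `missing` lists the GFF paths that do
    not exist relative to the current working directory.
    """
    entries, missing = [], []
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            fields = line.split("\t")
            if len(fields) != 2:
                raise ValueError(
                    f"line {line_no}: expected 'label<TAB>file', got {line!r}"
                )
            label, gff_path = fields
            entries.append((label, gff_path))
            if not os.path.isfile(gff_path):
                missing.append(gff_path)
    return entries, missing
```

Run it from the same directory as the miRTRAP scripts, so that relative paths are resolved the same way.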
3. "repeatRegionsFile" - What's the difference from genomeFile? (With mask?)
The genome file is the actual fasta file of the genome. Each chromosome/scaffold should be a separate entry of the fasta file. The repeatRegionsFile is a list of the genomic coordinates in the form (chrom start stop) separated by tabs as in:
Scaffold_1631 1739 1818
Scaffold_1631 2189 2258
Scaffold_1631 4125 4178
Scaffold_1631 4369 4415
Scaffold_1631 4505 4588
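A repeat regions file in this layout can be sanity-checked with a few lines of Python. The sketch below is not part of miRTRAP and the function name is hypothetical; it only reports lines that do not look like chrom, start, stop separated by tabs.

```python
def check_repeat_regions(lines):
    """Yield (line_number, problem) for lines not shaped like chrom<TAB>start<TAB>stop."""
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            yield line_no, "expected 3 tab-separated fields"
            continue
        _chrom, start, stop = fields
        if not (start.isdigit() and stop.isdigit()):
            yield line_no, "start and stop must be non-negative integers"
        elif int(start) > int(stop):
            yield line_no, "start is greater than stop"
```

A common failure this catches is a file whose columns are separated by spaces instead of tabs.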
Please send any other questions my way, as I am interested in improving the explanation of the software. Also, in general it doesn't hurt to read through the main Perl module, miRTRAP.pm, to become more familiar with how it reads in files and processes them. Best wishes and good luck on your search for microRNAs.
Dave
-
Hi Yushan Hsiao,
I would like to ask whether your problem has been solved and whether the process went smoothly. If you are studying miRNAs, do you know of any good software for finding mirtrons (one type of miRNA)? I have run into some difficulties with this and hope you can help. Thanks very much.
Chong Chen
-
problem with miRTRAP
I have a simple question about using miRTRAP. I couldn't find anyone else on the internet facing the same problem. I hope someone can help.
My recent work is to discover novel mouse miRs with miRTRAP. I've checked the "Usage Table of Contents" on the miRTRAP website, but I still have some problems with the input data. According to the website, "config.txt" and "reads.txt" should be prepared, and "config.txt" should include the following input data:
1. "readListFile" - the aligned data in gff format (I've changed mine from Soap2 output to gff format)
2. "genomeFile" - the whole mouse genome in fasta format
3. "repeatRegionsFile" - What's the difference from genomeFile? (With mask?)
My first trial was to ignore the "repeatRegionsFile", but the output files of the command "printReadRegions.pl config.txt" are all 0 KB.
I guess there might be some mistakes in my understanding.
Could anyone help me?
Thanks a lot!