Thanks for your tutorial
-
Originally posted by flyyuan:
Thanks, Matt, for this nice guide. I am now trying to analyze some soybean RNA-seq data following this article. However, I am very new to this work. Could anybody give me some suggestions for the following problems?
1. I tried to use makeTranscriptDbFromBiomart to get the soybean annotation from the Phytozome database, but there seem to be many organisms in Phytozome. How can I select G. max, which is the one I need?
2. Bowtie maps the RNA-seq tags to the reference genes. What is the criterion for deciding whether a tag matches or does not match?
Thanks in advance!
Comment
-
I ran this command:
> txdb=makeTranscriptDbFromUCSC(genome='G.max', tablename='ensGene')
The output was:
Download the ensGene table ... OK
Download the ensGtp table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Everything was OK until this step:
Download and preprocess the 'chrominfo' data frame ... Error in download.file(url, destfile, quiet = TRUE) :
cannot open URL 'http://hgdownload.cse.ucsc.edu/goldenPath/G.max/database/chromInfo.txt.gz'
In addition: There were 50 or more warnings (use warnings() to see the first 50)
I don't know how to handle that.
Comment
-
I had a problem when running this workflow and was wondering if someone could help me solve it. It appears that the lengths of chromosomes 14 and 10 in the yeast genome differ by a single base between the two reference sequences used when generating the overlaps (read counting). Can anyone assist me with the issue below?
> counts=countOverlaps(tx_by_gene,reads)
Error in queryHits(findOverlaps(query, subject, maxgap = maxgap, minoverlap = minoverlap, :
error in evaluating the argument 'x' in selecting a method for function 'queryHits': Error in mergeNamedAtomicVectors(seqlengths(x), seqlengths(y), what = c("sequence", :
sequences chrXIV, chrX have incompatible seqlengths:
- in 'x': 784333, 745742
- in 'y': 784334, 745741
All help is greatly appreciated. Thanks.
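For anyone hitting the same error, a minimal base R sketch of how the disagreeing sequence lengths can be located. The two vectors are copied from the error message above; `find_seqlength_mismatches` is a hypothetical helper, not part of any package. With the real objects, the inputs would presumably be `seqlengths(tx_by_gene)` and `seqlengths(reads)`.

```r
# Sequence lengths copied from the error message above.
seqlengths_x <- c(chrXIV = 784333, chrX = 745742)
seqlengths_y <- c(chrXIV = 784334, chrX = 745741)

# Hypothetical helper: names present in both vectors whose lengths differ.
find_seqlength_mismatches <- function(x, y) {
  shared <- intersect(names(x), names(y))
  differs <- !is.na(x[shared]) & !is.na(y[shared]) & x[shared] != y[shared]
  shared[differs]
}

mismatched <- find_seqlength_mismatches(seqlengths_x, seqlengths_y)
print(mismatched)

# Signed difference (x minus y) for each mismatched sequence.
print(seqlengths_x[mismatched] - seqlengths_y[mismatched])
```

A one-base disagreement like this usually suggests that the annotation and the alignments were built from different releases or sources of the genome, so the likely remedy is to use the same reference for both rather than to patch the lengths.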
Comment
-
Matt,
I tried to run the prostate cancer data set described in the tutorial, following the same process and using the same tools. I get the error below when running bowtie:
Error: reads file does not look like a FASTQ file
Command: bowtie -v 3 --best --sam /usr/local/bowtie/indexes/hg19 s1.fa s1_test.sam
Any help would be great. I am trying to set up an NGS workflow like the one described in the paper.
vic
Comment
-
Originally posted by ssvictor:
Matt,
I tried to run the prostate cancer data set described in the tutorial, following the same process and using the same tools. I get the error below when running bowtie:
Error: reads file does not look like a FASTQ file
Command: bowtie -v 3 --best --sam /usr/local/bowtie/indexes/hg19 s1.fa s1_test.sam
Any help would be great. I am trying to set up an NGS workflow like the one described in the paper.
vic
It looks like bowtie is expecting FASTQ format, but your reads are in FASTA format.
Try adding -f:
bowtie -v 3 --best --sam /usr/local/bowtie/indexes/hg19 -f s1.fa s1_test.sam
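If it is unclear which format a reads file is in, the first record usually gives it away: FASTA headers start with ">" and FASTQ headers start with "@". A rough sketch in base R, where `guess_read_format` is a hypothetical helper and the file contents are made up for illustration:

```r
# Hypothetical helper: guess the read format from the first non-empty line.
guess_read_format <- function(path) {
  lines <- readLines(path, n = 50, warn = FALSE)
  lines <- lines[nzchar(trimws(lines))]
  if (length(lines) == 0) return("unknown")
  first_char <- substr(lines[1], 1, 1)
  if (first_char == ">") {
    "fasta"   # bowtie needs -f for this
  } else if (first_char == "@") {
    "fastq"   # bowtie's default input format
  } else {
    "unknown"
  }
}

# Tiny made-up FASTA file, just to exercise the helper.
fa <- tempfile(fileext = ".fa")
writeLines(c(">read1", "ACGTACGT"), fa)
print(guess_read_format(fa))
```

This only inspects the first record, so it is a quick sanity check before choosing the bowtie flag, not a format validator.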
Comment
-
Small code issue
Hi Matt,
First of all, fantastic tutorial, very thorough. I just wanted to let you know about a small error in the code segments in your latest update: a missing closing ')' in your prostate data example when building the TOC.
toc=data.frame(rep(NA,length(tx_by_gene))
Are you currently adding examples or interested in working on a technical report for other DEG software packages (DEGseq, myrna, etc)? I would be interested in working on such a tutorial or wiki.
Originally posted by MDY:
Hello,
I've written a guide to the analysis of RNA-seq data, for the purpose of differential expression analysis. It currently lives on our internal wiki that can't be viewed outside of our division, although printouts have been used at workshops. It is by no means perfect and very much a work in progress, but a number of people have found it helpful, so I thought it would be useful to have it somewhere more publicly accessible.
I've attached a pdf version of the guide, although really what I was hoping was that someone here could suggest somewhere where it could be publicly hosted as a wiki. This area is so multifaceted and fast-moving that the only way such a guide can remain useful is if it can be constantly extended and updated.
If anyone has any suggestions about potential hosting, they can contact me at [email protected]
Cheers
Matt
Update: I've put a few extra things on our local wiki, and seeing as people here seem to be finding this useful, I thought I'd post an updated version. I'm also an author on a review paper on differential expression using RNA-seq, which people who find the guide useful might also find relevant...
RNA-seq Review
Comment
-
Hello,
Sorry, but even after a while I am not able to figure out how this regular expression works:
new_read_chr_names=gsub("(.*)[T]*\\..*","chr\\1",rname(reads))
and how it can convert these chromosome names: "10.1-129993255", "11.1-121843856", "1.1-197195432", "12.1-121257530", "13.1-120284312"
into this new format:
"chr10", "chr11", "chr1", "chr12", "chr13"
I would be very grateful if someone could give me a more detailed explanation about it because I am not able to understand this regular expression.
Thanks in advance!
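A short annotated sketch of what the pattern does, run on the names from the question. It uses only base R; `old_names` and `new_names` are illustrative variable names standing in for `rname(reads)` and `new_read_chr_names`.

```r
# Chromosome names as they appear in the question above.
old_names <- c("10.1-129993255", "11.1-121843856", "1.1-197195432",
               "12.1-121257530", "13.1-120284312")

# Pattern pieces:
#   (.*)   capture group 1: everything before the dot, here the chromosome number
#   [T]*   zero or more literal "T" characters; matches nothing in these names
#   \\.    a literal dot (escaped, because a bare "." means "any character")
#   .*     the rest of the string, which is discarded
# The replacement "chr\\1" writes "chr" followed by whatever group 1 captured.
new_names <- gsub("(.*)[T]*\\..*", "chr\\1", old_names)
print(new_names)
# [1] "chr10" "chr11" "chr1"  "chr12" "chr13"
```

Because the pattern covers the whole string, the entire name is replaced by "chr" plus the captured prefix, which is why everything from the dot onwards disappears.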
Comment
-
Hi,
I find the script great and very helpful.
I don't understand the part with the GO enrichment.
I am trying to do it with drosophila genes, which I converted into entrez annotations.
Code:
> head(pwf)
       DEgenes bias.data         pwf
43072        0      2002 0.018493247
40191        0     14212 0.044941512
318077       0       493 0.009297322
32941        0      1918 0.017987442
42674        0      1182 0.012241218
42675        0       566 0.009468054
Code:
> GO.pvals <- goseq(pwf, "dm3", "refGenes")
Fetching GO annotations...
Error in getgo(rownames(pwf), genome, id, fetch.cats = test.cats) :
  Couldn't grab GO categories automatically. Please manually specify
Thanks
Assa
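The error message asks for the gene-to-category mapping to be supplied manually. A minimal sketch of building such a mapping in base R, assuming you already have a two-column table of Entrez gene IDs and GO terms from some other source. The gene IDs are taken from the `head(pwf)` output above, but the category labels are made-up placeholders, not real annotations.

```r
# Made-up gene-to-category table. Gene IDs come from head(pwf) above;
# the category labels are placeholders, not real GO annotations.
gene_go <- data.frame(
  gene = c("43072", "43072", "40191", "318077"),
  category = c("GO:placeholder1", "GO:placeholder2",
               "GO:placeholder1", "GO:placeholder3"),
  stringsAsFactors = FALSE
)

# Named list: one entry per gene, each holding that gene's categories.
gene2cat <- split(gene_go$category, gene_go$gene)
str(gene2cat)

# Assumed usage with the goseq package (not run here):
# GO.pvals <- goseq(pwf, gene2cat = gene2cat)
```

The names of the list need to match `rownames(pwf)`, so the mapping has to use the same ID type (here Entrez) as the pwf table.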
Comment