@rgejman
This issue has been resolved in v0.9.3.
-
Does anyone know how I can plot RNA-seq differential expression results from Tophat, Cufflinks and Cuffdiff for visualization?
-
Has anyone resolved this issue? I am still seeing disconnected transcripts in many genes using tophat v1.1.2 and cufflinks 0.9.2. Tophat is run without the -G option, so de novo transcripts are found and I DO see reads connecting junctions that are then not reflected in the cufflinks transcripts.gtf output file.
-
how do you make refFlat_RefSeq.gff for mm9
Can somebody tell me where to get the refFlat_RefSeq.gff for mm9? I have found GFF3 files for each chromosome (reference assembly, MGSCv37, of mouse build 37.1, in GFF3 format). Do these correspond to mm9? If so, do you have to combine the per-chromosome GFF3 files into one file, adding a chromosome column (chr1, chr2, etc.) to each one before merging?
Thanks
-
If anyone is interested, Cole is working on a new version (0.8.3) that will improve these results.
-
I'm pretty jealous of your nice results! I have played with cufflinks quite a bit and haven't seen a decent transcript like that anywhere in my data.
Is it possible I am not seeing such good results because I am using 50bp reads? I just don't know at this point. The tophat results certainly show a consistent enough level of junction reads that I would expect cufflinks to put the transcript together correctly (after all, scripture does a fine job).
To show the bam on UCSC you need to index it with samtools and then, as you suggest, upload it to a publicly viewable site. Then you just point the UCSC browser at your bam and it works! If you are using Picasa, you can probably (though I'm not sure) host your bam file on Google somewhere and point UCSC to that.
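For anyone who wants the steps spelled out, here is a minimal sketch of the indexing command and the custom-track line. The file name accepted_hits.bam, the track name, and the example.org URL are placeholders I made up, not anything from this thread. It also assumes the bam is coordinate-sorted and that the .bai index sits at the same URL as the bam with ".bai" appended.

```python
import shlex


def samtools_index_command(bam_path):
    """Build the shell command that writes <bam_path>.bai next to the BAM."""
    return "samtools index " + shlex.quote(bam_path)


def ucsc_bam_track_line(bam_url, name):
    """Build a UCSC custom-track line pointing at a publicly hosted BAM."""
    return 'track type=bam name="{}" bigDataUrl={}'.format(name, bam_url)


# Placeholder file name and URL, for illustration only.
index_cmd = samtools_index_command("accepted_hits.bam")
track_line = ucsc_bam_track_line(
    "http://example.org/rnaseq/accepted_hits.bam", "tophat_alignments"
)

print(index_cmd)   # run this in a shell before uploading the .bam and .bam.bai
print(track_line)  # paste this into the UCSC "add custom tracks" box
```

The functions only build the strings; nothing is executed, so you can check them before running anything against your own files.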
Unfortunately I have been looking at many genes and they all show exactly the same behaviour.
Can you tell me exactly what version of cufflinks you are using, and on what OS? For extra points I could share a small part of my sam file with you and would love to see if you get the same results on my data.
-
So, I looked at this same gene on UCSC along with the junctions from tophat, and fortunately I get the entire transcript connected.
I have 75bp reads sequenced to a depth of ~30 million.
I ran tophat with these parameters:
tophat -a 10 --coverage-search -p 4 -g 10 -G refFlat_RefSeq.gff -o s2_tophat mm9 ../s2.fastq
and cufflinks without the -G option.
Here is the UCSC image:
http://picasaweb.google.com/priyamsi...08504614142194
I have not explored many other genes systematically, but the 5 or so I have looked at so far are well connected.
What I don't understand is why there are two CUFF ids (CUFF.204951 and CUFF.204952) with such different FPKM and coverage values. Is the only difference between them that one is 3 bases longer?
HTML Code:
~/tophat/S2$ grep "Insr" transcripts.tmap
Insr ENSMUST00000091291 p CUFF.203963 CUFF.203963.1 100 1.082117 0.000000 2.612462 1.685393 89 CUFF.203963.1
Insr ENSMUST00000091291 p CUFF.203965 CUFF.203965.1 100 0.687917 0.000000 1.482256 1.071429 210 CUFF.203965.1
Insr ENSMUST00000139504 c CUFF.203967 CUFF.203967.1 100 2.363397 0.692223 4.034571 3.680982 163 CUFF.203967.1
Insr ENSMUST00000139504 j CUFF.204951 CUFF.204951.1 44 5.694980 1.526815 9.863144 8.869908 9073 CUFF.204952.2
Insr ENSMUST00000139504 j CUFF.204952 CUFF.204952.2 100 13.057124 8.871707 17.242540 20.336418 9076 CUFF.204952.2
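To compare the two "j" transcripts side by side, here is a rough sketch that parses the last two tmap rows above. The column names are my assumption about the cuffcompare .tmap layout (ref_gene_id, ref_id, class_code, cuff_gene_id, cuff_id, FMI, FPKM, FPKM_conf_lo, FPKM_conf_hi, cov, len, major_iso_id), so check them against the header line of your own file.

```python
# Assumed column order of a cuffcompare .tmap row; verify against your file's header.
TMAP_FIELDS = [
    "ref_gene_id", "ref_id", "class_code", "cuff_gene_id", "cuff_id", "FMI",
    "FPKM", "FPKM_conf_lo", "FPKM_conf_hi", "cov", "len", "major_iso_id",
]
FLOAT_FIELDS = ("FPKM", "FPKM_conf_lo", "FPKM_conf_hi", "cov")
INT_FIELDS = ("FMI", "len")


def parse_tmap_line(line):
    """Split one whitespace-separated tmap row into a dict with typed values."""
    parts = line.split()
    if len(parts) != len(TMAP_FIELDS):
        raise ValueError("expected %d columns, got %d" % (len(TMAP_FIELDS), len(parts)))
    row = dict(zip(TMAP_FIELDS, parts))
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


# The two class-code "j" rows copied from the grep output above.
lines = [
    "Insr ENSMUST00000139504 j CUFF.204951 CUFF.204951.1 44 5.694980 1.526815 9.863144 8.869908 9073 CUFF.204952.2",
    "Insr ENSMUST00000139504 j CUFF.204952 CUFF.204952.2 100 13.057124 8.871707 17.242540 20.336418 9076 CUFF.204952.2",
]
rows = [parse_tmap_line(line) for line in lines]

for row in rows:
    print(row["cuff_id"], row["FPKM"], row["cov"], row["len"])

# Length difference between the two transcripts, in bases.
length_diff = rows[1]["len"] - rows[0]["len"]
print("length difference:", length_diff)
```

This only restates the numbers from the table. It does not explain why the two ids get such different FPKM values.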
Maybe you should check whether tophat is picking up those junctions for this gene? Judging from your image, though, I can already see that a lot of your reads are crossing junctions. You should also look systematically to see how many genes exhibit this behavior; you may just be unlucky with this one.
-
Hi thinkRNA,
To run cufflinks, I used entirely default parameters. I used the pre-compiled 0.8.2 beta version for 64bit linux. I didn't provide a gtf of reference exons because I wanted to test the "de-novo" transcript assembly. I think that cufflinks should do this well according to the paper....
"High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks."
The intronic reads are more or less to be expected. The number of intronic reads varies with the genes, libraries, and disease type that we study. The disconnectivity of the identified transcripts is prevalent throughout my data set, in genes with high and low intronic levels.
To load the cufflinks output into UCSC you can just take your transcripts.gtf output file and load it directly as a custom track.
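The gtf loads as-is, but if you want the track to carry a readable label, an optional "track" line can be put in front of it. A small sketch follows. The track name, the description, and the single GTF record are made up for illustration and are not real cufflinks output.

```python
def with_track_header(gtf_text, name, description):
    """Prepend an optional UCSC custom-track line to GTF text."""
    header = 'track name="{}" description="{}" visibility=pack'.format(name, description)
    return header + "\n" + gtf_text


# Hypothetical single-record GTF, for illustration only.
example_gtf = (
    'chr8\tCufflinks\texon\t1000\t1200\t1000\t+\t.\t'
    'gene_id "CUFF.1"; transcript_id "CUFF.1.1";\n'
)

labelled = with_track_header(example_gtf, "cufflinks_denovo", "Cufflinks transcripts, default parameters")
print(labelled)
```

For a real file you would read transcripts.gtf, pass its contents through the function, and upload the result.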
I would be interested to hear how your data performs with this software.
thanks!
-
Originally posted by rcorbett:
Hi all,
I have 50bp paired illumina reads which I have aligned with tophat (default parameters).
The alignments look reasonable in IGV, or UCSC browser.
I have run scripture on the tophat output, and I get a list of isoforms that look reasonable, if not verbose.
However, when I run cufflinks I get very spotty connectivity.
I am trying to attach a screenshot, which shows at the top, the split alignments of tophat, then the predicted transcripts of scripture, and below that, before the reference gene annotations there is the cufflinks output.
Has anyone else seen cufflinks output that is disconnected like this? Any ideas on how to improve the results?
I have run scripture, and cufflinks on the same file.
(the screenshot attempt didn't work out)
The image I tried to attach has been posted here instead:
http://www.bcgsc.ca/downloads/rnaSeq...f3f_22ffe0.gif
Also, I noticed that there are a lot of reads landing in intronic regions. Is that to be expected?
Finally, can you please tell me which file you used to get the cuff.23.1 track on UCSC? I would like to see if I get similar disconnectivity in my data.
-
cufflinks funny output, scripture comparison
Hi all,
I have 50bp paired illumina reads which I have aligned with tophat (default parameters).
The alignments look reasonable in IGV, or UCSC browser.
I have run scripture on the tophat output, and I get a list of isoforms that look reasonable, if not verbose.
However, when I run cufflinks I get very spotty connectivity.
I am trying to attach a screenshot, which shows at the top, the split alignments of tophat, then the predicted transcripts of scripture, and below that, before the reference gene annotations there is the cufflinks output.
Has anyone else seen cufflinks output that is disconnected like this? Any ideas on how to improve the results?
I have run scripture, and cufflinks on the same file.
(the screenshot attempt didn't work out)
The image I tried to attach has been posted here instead: