@thorondor - One thing I've learnt for sure from all this is that I need to know much more about the wet-lab experiment as well. Yes, I will be assembling the reads into contigs.
@colindaven - Yes, I will have to learn R for that. I was just curious to know how one would calculate RNA-seq coverage, that's all.
The bottom line (I hope I've understood correctly) is that the number of singletons you get when assembling the reads, together with the coverage value, gives a basic first assessment of data quality.
Thanks everyone for helping!!
-
If I were you I wouldn't worry so much about these coverage claims. As others have suggested, just try looking at how many reads were aligned to each gene / exon / transcript and look at some summary boxplots.
I have found the Bioconductor package edgeR and typical R boxplots to be excellent for this. There's an excellent guide for edgeR on this site too.
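For readers who want to see what such a per-sample summary contains before learning R, here is a minimal sketch in plain Python. It is not edgeR: the `log2_cpm` function is a simplified counts-per-million transform with a hypothetical prior count, and the read counts are made-up illustration values.

```python
import math
import statistics

def log2_cpm(counts, prior=0.5):
    """Simplified log2 counts-per-million for one sample.

    A small prior count is added so genes with zero reads stay finite.
    This is an illustrative formula, not edgeR's exact cpm() calculation.
    """
    library_size = sum(counts)
    return [math.log2((c + prior) / library_size * 1e6) for c in counts]

def boxplot_summary(values):
    """Five-number summary, i.e. the values a boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

# Hypothetical reads-per-gene counts for two samples.
samples = {
    "sample_a": [0, 10, 50, 200, 4000],
    "sample_b": [2, 15, 40, 180, 3500],
}
for name, counts in samples.items():
    print(name, boxplot_summary(log2_cpm(counts)))
```

Comparing these summaries side by side across samples is roughly what the boxplots show at a glance.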
-
That might be so, but then you imply that the sample is perfectly normalized, which is not the case. ;-) I am also more into bioinformatics. ;-)
Still, it is recommended to trim your reads, and I doubt that your coverage is still around 20x after trimming. Or is it not your job to assemble them?
Only clustering the reads is not the best option; that's my new conclusion after some more thinking.
Better to trim first, then assemble, then map the reads back to your assembled contigs. Or assemble with Velvet: the contig IDs will contain the coverage of each contig, e.g. >NODE_xxxx_length_xxxx_cov_xx.xxxxxx
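As a rough illustration of reading coverage out of the contig IDs, the sketch below parses Velvet-style FASTA headers of the form `NODE_<id>_length_<length>_cov_<coverage>` and computes a length-weighted mean. The example headers are made up, and the function names are only illustrative. As far as I know, Velvet reports both length and coverage in k-mer units, so treat the numbers as relative rather than as base-level read depth.

```python
import re

# Velvet-style contig header: >NODE_<id>_length_<length>_cov_<coverage>
HEADER_RE = re.compile(r"^>NODE_(\d+)_length_(\d+)_cov_(\d+(?:\.\d+)?)")

def parse_velvet_headers(lines):
    """Yield (node_id, length, coverage) for each matching FASTA header."""
    for line in lines:
        m = HEADER_RE.match(line.strip())
        if m:
            yield int(m.group(1)), int(m.group(2)), float(m.group(3))

def weighted_mean_coverage(records):
    """Length-weighted mean of the per-contig coverage values."""
    records = list(records)
    total_len = sum(length for _, length, _ in records)
    if total_len == 0:
        return 0.0
    return sum(length * cov for _, length, cov in records) / total_len

# Made-up example lines standing in for a contigs FASTA file.
example = [
    ">NODE_1_length_1000_cov_10.000000",
    "ACGT",
    ">NODE_2_length_3000_cov_30.000000",
    "ACGT",
]
records = list(parse_velvet_headers(example))
print(records)
print(weighted_mean_coverage(records))
```

For a transcriptome the per-contig values are more informative than the single mean, since coverage tracks expression level.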
-
Thanks Jeremy, I have read the articles and I have a better idea of library normalization now.
@thorondor - If I am not wrong, a sequencer can be set up to produce reads at a required coverage, provided the sample preparation was done with the target of approximately 20x coverage in mind.
P.S. Do correct me; I am more into bioinformatics and know less about the wet lab.
And on what basis should I cluster my reads?
-
The coverage won't be homogeneous across your transcriptome.
You could cluster your reads, which will give you an idea about the coverage.
But how does your client know that he is sending you 20x coverage RNA-seq? It seems more like a wild guess.
-
By normalised RNA, I mean the use of a normalised RNA library. Not sure if this is the most appropriate reference, but it gives the idea and a starting point: "Construction and characterization of a normalized cDNA library". I think I remember reading about someone in here who has used that approach (though probably not the exact method from that reference), but I'm not sure.
The review "RNA-Seq: a revolutionary tool for transcriptomics" cites some work that investigates coverage. But again, it's not something that can be meaningfully applied to a single sequencing run.
-
Originally posted by Jeremy:
If you have normalised RNA maybe, and even then there is still a fairly large copy number difference between highly expressed and lowly expressed genes. Don't forget though that a different set of genes is expressed in each tissue type, so that method would give zero for many genes simply because they naturally are not present.
If it's not normalised then you are taking the average across genes that have expression differences of several fold, which isn't a very good indicator.
-
Hello,
I agree that looking for a 'transcriptome coverage' is not sensible since, in contrast to the genome, the transcriptome varies over time and tissue.
Maybe a more meaningful statistic for assessing the depth of transcriptome sequencing is a 'transcript detection threshold', i.e. what is the minimal expression level that my sequenced library can detect? So, if a typical (!?) human cell has approximately 300,000 mRNA molecules (see http://bionumbers.hms.harvard.edu/bi...r=3&hlid=43015), then with 3 million reads you can expect to assign ~10 reads to a transcript expressed at a level of 1 molecule/cell. (...I'm aware there is a lot of hand waving here.)
Does it make sense, at least in principle?
My 2p
Dario
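The back-of-the-envelope estimate in the post above can be written out as a small sketch. It assumes, as the post does, roughly 300,000 mRNA molecules per cell, and additionally that every molecule is equally likely to yield a read (ignoring transcript length and library bias). The function name is only illustrative.

```python
def expected_reads_per_transcript(total_reads, mrna_per_cell, copies_per_cell=1.0):
    """Expected number of reads for a transcript present at copies_per_cell,
    assuming every mRNA molecule is sampled with equal probability."""
    if mrna_per_cell <= 0:
        raise ValueError("mrna_per_cell must be positive")
    return total_reads * copies_per_cell / mrna_per_cell

# Figures from the post: 3 million reads, ~300,000 mRNA molecules per cell.
print(expected_reads_per_transcript(3_000_000, 300_000))  # 10.0
```

Turning the question around, the same ratio gives the number of reads needed to see a chosen minimum number of reads for a single-copy transcript.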
-
Originally posted by seidel:
Wouldn't the way to calculate average coverage for a transcriptome be to just take a description of all the known exons (e.g. from UCSC or Ensembl), and calculate the average depth across the features? Sure there may be novel exons, but as an estimate, average coverage across known features would give an average depth. No?
If it's not normalised then you are taking the average across genes that have expression differences of several fold, which isn't a very good indicator.
Last edited by Jeremy; 02-13-2011, 07:11 PM.
-
Wouldn't the way to calculate average coverage for a transcriptome be to just take a description of all the known exons (e.g. from UCSC or Ensembl), and calculate the average depth across the features? Sure there may be novel exons, but as an estimate, average coverage across known features would give an average depth. No?
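A minimal sketch of the calculation proposed here, assuming per-base read depths are already available (for example from an alignment) and that exon coordinates are 0-based and end-exclusive. The data structures, names, and toy numbers are hypothetical, not taken from UCSC or Ensembl.

```python
def mean_depth_over_features(depth, features):
    """Mean per-base depth over a set of intervals.

    depth: dict mapping sequence name -> list of per-base read depths
    features: iterable of (seq_name, start, end), 0-based, end-exclusive
    Overlapping features are counted more than once in this simple version.
    """
    covered_bases = 0
    total_depth = 0
    for seq_name, start, end in features:
        window = depth[seq_name][start:end]
        covered_bases += len(window)
        total_depth += sum(window)
    return total_depth / covered_bases if covered_bases else 0.0

# Toy data: one lowly covered exon and one highly covered exon.
depth = {"chr1": [0, 0, 5, 5, 5, 5, 0, 0, 100, 100]}
exons = [("chr1", 2, 6), ("chr1", 8, 10)]
print(mean_depth_over_features(depth, exons))
```

In the toy data the short, highly covered exon pulls the average far above the depth of the other exon, which is the concern raised elsewhere in the thread about averaging over genes with very different expression levels.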
-
Originally posted by Jeremy:
What platform was used? And was it paired-end sequencing?
You should be able to mine the information that you want out of the data that is available. A good way to gauge how well you have covered the transcriptome is by the number of singletons you obtain.
That was indeed useful information! Thanks. But while we can calculate coverage for a genome, can it not be calculated for RNA-seq reads from a particular tissue/cell type of an organism?
-
What platform was used? And was it paired-end sequencing?
You should be able to mine the information that you want out of the data that is available. A good way to gauge how well you have covered the transcriptome is by the number of singletons you obtain.
-
Dear Jeremy,
Well, I am not sure about that, as the wet-lab experiment was done by a different group of people. Say I assume it's normalised (I shall ask them, though :-( ), then what next?
Thanks..
-
No answers yet!! 99 viewers but no solution to this problem?
Please help..