**kopi-o** · 09-01-2012, 12:33 PM

I would say it's normal, yes. At least this kind of thing is what I typically observe.

**rebrendi** · 09-01-2012, 12:36 PM

Originally posted by kopi-o:

I would say it's normal, yes. At least this kind of thing is what I typically observe.

And did you conclude that all those transcripts have no expression, or is the signal just missing?

**kopi-o** · 09-01-2012, 12:44 PM

Well, of course, if the sequencing depth is very low you will get zero counts for transcripts that are really expressed. Also, discarding multi-mapping reads could lead to this sort of effect. But in general, I tend to assume most of the all-zero transcripts are really not expressed.

Perhaps I should go back to my existing RNA-seq data and plot the fraction of all-zero count genes against the sequencing depth. That might give a clue about when the fraction of zero-count genes starts to bottom out.
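A rough sketch of that check, assuming the counts are available as one list of per-gene read counts per sample (the `counts_by_sample` layout and the toy numbers are made up for illustration, and the summed counts are only a proxy for sequencing depth):

```python
def zero_fraction_by_depth(counts_by_sample):
    """Return (depth, zero_fraction) pairs sorted by depth.

    counts_by_sample: dict mapping sample name -> list of per-gene read counts
    (hypothetical layout; every list is assumed to cover the same genes).
    """
    points = []
    for sample, counts in counts_by_sample.items():
        depth = sum(counts)  # total counted reads, used as a proxy for depth
        zeros = sum(1 for c in counts if c == 0)
        points.append((depth, zeros / len(counts)))
    return sorted(points)


# Toy data, not real measurements.
toy = {
    "shallow": [0, 0, 3, 0, 7],
    "deep": [2, 0, 40, 1, 90],
}
curve = zero_fraction_by_depth(toy)
for depth, fraction in curve:
    print(depth, fraction)
```

Plotting the resulting pairs (depth on the x-axis, zero fraction on the y-axis) should show whether the fraction flattens out as depth increases.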

**rebrendi** · 09-01-2012, 12:51 PM

Originally posted by kopi-o:

Perhaps I should go back to my existing RNA-seq data and plot the fraction of all-zero count genes against the sequencing depth. That might give a clue about when the fraction of zero-count genes starts to bottom out.

Yes, that would be the best check. For one of the cell lines, I actually have two replicate experiments with 30,000 and 5,000 mapped reads. Both have ~8-9% of transcripts with zero reads.

**kopi-o** · 09-01-2012, 01:01 PM

30,000 and 5,000 mapped reads, respectively, seem awfully low. I am surprised you have as few as 8-9% zero-count transcripts, unless it is a bacterium or something, but you said it was a cell line. Are these human cell lines or some other species? And what transcript annotation (e.g. RefSeq) do you use? I use ENSEMBL and I suspect that in itself leads to a larger fraction of zero-count genes.

**rebrendi** · 09-01-2012, 01:25 PM

Originally posted by kopi-o:

30,000 and 5,000 mapped reads, respectively, seem awfully low. I am surprised you have as few as 8-9% zero-count transcripts, unless it is a bacterium or something, but you said it was a cell line. Are these human cell lines or some other species? And what transcript annotation (e.g. RefSeq) do you use? I use ENSEMBL and I suspect that in itself leads to a larger fraction of zero-count genes.

I am using Eldorado, which contains much more than RefSeq, so there is more noise. But I am getting non-zero expression for these 9% of transcripts in one cell line and zero expression in another, so this is not an annotation artifact. Sorry, I made a typo in my last post: I have 30 million and 5 million mapped reads in these two replicate experiments. What do you think?

**kopi-o** · 09-02-2012, 02:56 AM

OK,

(1) I checked my existing RNA-seq data, admittedly a small sample, but anyway. The most interesting data point is a study where we have 134 (human) biological replicates and up to 60M (paired) reads per sample. Even with this relatively deep probing, I find 23% of ENSEMBL genes with all-zero counts! (Again, it may be that ENSEMBL, which is relatively generous regarding inclusion, will systematically yield higher values.) For other organisms like Drosophila, the fraction is lower.

(2) If we forget about this zero-count business for a while, and just focus on your core problem, which is to distinguish truly expressed transcripts from truly non-expressed, I haven't found a better way to do it than the one outlined in this paper: http://www.ploscompbiol.org/article/...l.pcbi.1000598

Basically, one uses as controls a set of genomic regions for which there is no evidence of expression in any source. Then, by counting how many reads fall into these "gold standard negative" regions, one can calculate a false positive rate for a range of RPKM values. By finding a good compromise between a low false positive rate and a low false negative rate (calculated from annotated transcripts), one can get an estimate for an RPKM cutoff.
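To make the idea concrete, here is a minimal sketch, not the paper's actual procedure. It assumes RPKM values have already been computed for the negative-control regions and for the annotated transcripts, treats every annotated transcript as expressed when estimating the false negative rate, and picks the cutoff minimizing the sum of the two rates. That selection rule, the function names, and the toy numbers are all assumptions made for illustration:

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def choose_cutoff(negative_rpkms, transcript_rpkms, candidates):
    """Return (cutoff, fpr, fnr) minimizing fpr + fnr over the candidates.

    fpr: fraction of negative-control regions at or above the cutoff.
    fnr: fraction of annotated transcripts below the cutoff
         (simplification: all annotated transcripts count as expressed).
    """
    best = None
    for cutoff in candidates:
        fpr = sum(1 for v in negative_rpkms if v >= cutoff) / len(negative_rpkms)
        fnr = sum(1 for v in transcript_rpkms if v < cutoff) / len(transcript_rpkms)
        score = fpr + fnr  # assumed compromise criterion
        if best is None or score < best[0]:
            best = (score, cutoff, fpr, fnr)
    return best[1], best[2], best[3]


# Toy values, not real data.
negatives = [0.0, 0.0, 0.1, 0.2, 0.25]
transcripts = [0.05, 0.4, 1.5, 5.0, 20.0]
cutoff, fpr, fnr = choose_cutoff(negatives, transcripts, [0.1, 0.3, 1.0])
print(cutoff, fpr, fnr)
```

A different weighting of the two error rates would shift the chosen cutoff, so the criterion should be set according to which kind of error matters more for the analysis.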

**ETHANol** · 09-02-2012, 03:01 AM

You'll never be able to tell which genes are truly not expressed. That's how science works. We can only see what is, you can never see what isn't!!!!!

In this case you will always be able to say that, if you sequenced a little deeper, a given gene would show some expression.

**rebrendi** · 09-02-2012, 03:26 AM

Originally posted by kopi-o:

(2) If we forget about this zero-count business for a while, and just focus on your core problem, which is to distinguish truly expressed transcripts from truly non-expressed, I haven't found a better way to do it than the one outlined in this paper: http://www.ploscompbiol.org/article/...l.pcbi.1000598

Thank you, great article!

**rebrendi** · 09-02-2012, 03:27 AM

Originally posted by kopi-o:

(1) I checked my existing RNA-seq data, admittedly a small sample, but anyway. The most interesting data point is a study where we have 134 (human) biological replicates and up to 60M (paired) reads per sample. Even with this relatively deep probing, I find 23% of ENSEMBL genes with all-zero counts!

So were these all-zero in all 134 replicates, or just in some fraction of them?

**kopi-o** · 09-02-2012, 03:36 AM

In all 134.
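For anyone wanting to make the same distinction in their own data, a small sketch that separates genes with zero counts in every replicate from genes with zero counts in only some of them. The `count_matrix` layout (one row per gene, one column per replicate) and the toy values are assumptions for illustration:

```python
def zero_summary(count_matrix):
    """Return (all_zero_fraction, partly_zero_fraction).

    count_matrix: list of rows, one row per gene, each row holding that
    gene's read counts across replicates (hypothetical layout).
    """
    n_genes = len(count_matrix)
    all_zero = sum(1 for row in count_matrix if all(c == 0 for c in row))
    partly_zero = sum(
        1 for row in count_matrix
        if any(c == 0 for c in row) and any(c != 0 for c in row)
    )
    return all_zero / n_genes, partly_zero / n_genes


# Toy matrix: 4 genes x 3 replicates, not real data.
toy_matrix = [
    [0, 0, 0],
    [0, 5, 2],
    [3, 4, 1],
    [0, 0, 0],
]
all_zero_frac, partly_zero_frac = zero_summary(toy_matrix)
print(all_zero_frac, partly_zero_frac)
```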

RNA-seq results interpretation - help needed

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News