
**Joann** · 04-21-2011, 07:24 AM

another question

Originally posted by Jon_Keats

If you cannot purify the tumor cells to high purity (i.e. 85% or greater minimum) you are likely to end up with feelings similar to Simon's comment "they usually lead to nothing". Even with that, I can tell you that in our field, where we can robustly purify tumor cells by magnetic sorting to an average of 95% purity, the really good risk models only fell out in cohorts of 250-350 patients.

How many tumor cells contribute to each DNA or RNA sequencing sample in this type of comparison (risk model)?

**husamia** · 04-21-2011, 09:43 AM

I have a colleague who ran an RNA-seq experiment, and I analyzed the reads with a two-sample differential test on RPKM values. I have some comments about the design, which was Illumina 25 bp reads plus the adapter. First I had to remove the adapter and quality-trim, then align to the mouse genome and map to miRBase to get my "expression". The first issue that concerned me is the alignment of the 25 bp reads, which I believe was set to 80% match; when I used 90% match, more than half of the reads didn't align. With 80% match, 80% of the reads aligned, which is what I expected. That's 5 mismatches, which may be too high; however, I noticed high heterogeneity at the ends, and I have read reports of this in the literature. Does everyone use this cutoff for alignment? I believe it may not be stringent enough, since 25 bp with 5 mismatches may not represent the target well. Another concern is the assumption that coverage (depth) is a proxy for expression. I believe there is amplification bias here that may be overwhelming in some cases. The read depth also varied, which is another issue: what is a good depth range, and should I remove duplicate reads or not? If I see a difference, how many samples should I replicate to be confident that I am measuring a difference in expression and not amplification bias, or is it possible to make this stronger?
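
For reference, the arithmetic behind the match thresholds and the RPKM values mentioned above can be sketched as follows. This is only an illustration, not the pipeline used here: the function names are hypothetical, it assumes the match percentage is applied over the full read length, and it assumes the usual RPKM definition (reads per kilobase of feature per million mapped reads).

```python
def max_mismatches(read_length: int, min_identity_pct: int) -> int:
    """Largest mismatch count that still meets the identity threshold.

    Assumes the threshold applies to the whole read (hypothetical helper).
    Integer arithmetic avoids floating point rounding surprises.
    """
    # Ceiling division: number of matching bases required.
    required_matches = -(-read_length * min_identity_pct // 100)
    return read_length - required_matches


def rpkm(read_count: int, feature_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# 25 bp reads at the two thresholds discussed in the post.
mismatches_at_80 = max_mismatches(25, 80)  # 5
mismatches_at_90 = max_mismatches(25, 90)  # 2
```

For a 25 bp read, an 80% threshold tolerates 5 mismatches while a 90% threshold tolerates only 2, so the two settings differ more than the percentages alone suggest.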

**christinawu2008** · 05-01-2011, 10:06 PM

Originally posted by husamia

I have a colleague who ran an RNA-seq experiment, and I analyzed the reads with a two-sample differential test on RPKM values. I have some comments about the design, which was Illumina 25 bp reads plus the adapter. First I had to remove the adapter and quality-trim, then align to the mouse genome and map to miRBase to get my "expression". The first issue that concerned me is the alignment of the 25 bp reads, which I believe was set to 80% match; when I used 90% match, more than half of the reads didn't align. With 80% match, 80% of the reads aligned, which is what I expected. That's 5 mismatches, which may be too high; however, I noticed high heterogeneity at the ends, and I have read reports of this in the literature. Does everyone use this cutoff for alignment? I believe it may not be stringent enough, since 25 bp with 5 mismatches may not represent the target well. Another concern is the assumption that coverage (depth) is a proxy for expression. I believe there is amplification bias here that may be overwhelming in some cases. The read depth also varied, which is another issue: what is a good depth range, and should I remove duplicate reads or not? If I see a difference, how many samples should I replicate to be confident that I am measuring a difference in expression and not amplification bias, or is it possible to make this stronger?

There may be some artefacts in your data; you can detect them by comparing replicates and then discard them.

**christinawu2008** · 05-01-2011, 10:48 PM

Hi Simon,

Could you post the literature comparing gene expression (GE) measured by microarray and by RNA-Seq to support your views? I agree with you, but I'd like to see some published statistics on it.

Thank you!

**ymc** · 02-11-2014, 11:19 PM

Originally posted by Jon_Keats

RNAseq is currently best used for small-scale test vs control comparisons or time series. But that largely assumes you want to look at gene expression and transcript expression comparisons, where you need significant read depth for the latter. In your situation, I think that by limiting the analysis to "gene" expression you could generally replace Affy arrays, getting rid of their many inaccuracies, for around double the cost per sample, likely less depending on the vendor. Depending on the tissue, the cost could drop even more if you can multiplex, but the limitation will be relative gene expression. Take my situation of working on multiple myeloma, a plasma cell disease where the cell really is a factory producing immunoglobulin: we need to double the read count to get equal counts on the non-immunoglobulin genes compared to breast cancer, because ~50% of all the transcripts in the cell are immunoglobulin.

Hi Jon, can RNA-Seq be used to measure expression of the immunoglobulin genes? I don't see them (i.e. IGH*, IGK*, IGL*) in the genes.gtf file, so I can't see them in the output. Is there a genes.gtf file that contains them? Where can I download it?

Thanks a lot in advance.

**dpryan** · 02-12-2014, 01:08 AM

They're included in the Ensembl annotation. The second column of the GTF file for these starts with "IG_".
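
A minimal sketch of how one might pull those entries out of a GTF file, assuming the layout described above (tab-separated, with a second column that starts with "IG_") and assuming the attribute column carries a `gene_name "..."` entry. The function name and the example rows, including their coordinates, are made up for illustration.

```python
import re
from typing import Iterable, Set

# Assumes attributes look like: gene_id "..."; gene_name "IGHG1"; ...
GENE_NAME = re.compile(r'gene_name "([^"]+)"')


def ig_gene_names(gtf_lines: Iterable[str]) -> Set[str]:
    """Collect gene names from GTF lines whose second column starts with 'IG_'."""
    names: Set[str] = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or not fields[1].startswith("IG_"):
            continue  # malformed line, or not an immunoglobulin entry
        match = GENE_NAME.search(fields[8])
        if match:
            names.add(match.group(1))
    return names


# Invented rows for illustration only; coordinates are not real.
example_rows = [
    "\t".join(["14", "IG_C_gene", "exon", "100", "200", ".", "-", ".",
               'gene_id "G1"; gene_name "IGHG1";']),
    "\t".join(["1", "protein_coding", "exon", "300", "400", ".", "+", ".",
               'gene_id "G2"; gene_name "ACTB";']),
]
example_names = ig_gene_names(example_rows)  # {"IGHG1"}
```

With a real file, the same function could be called on an open file handle, e.g. `ig_gene_names(open("genes.gtf"))`, where the file name is whatever annotation you downloaded.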
