**john_mu** · 06-14-2010, 10:24 PM

Looking at that data, the reads don't look very short... They don't look like they are from a high-throughput sequencer.

For reads that long (and since you don't have many reads), your best bet is probably to align them with BLAT. They should align pretty confidently.

EDIT: actually, sorry, they are already annotated... I'm not too familiar with this, you'll have to wait for someone else.

**cswarth** · 06-14-2010, 10:36 PM

Originally posted by john_mu

For reads that long (and since you don't have many reads), your best bet is probably to align them with BLAT. They should align pretty confidently.

You're looking at the Riken database? Those aren't our reads; those are what I want to align our reads to.

We will be getting about 25 million reads from each sample, and there will be many samples. BLAT isn't an option for that many, right? In any case, I am searching for the right thing to align mouse cDNA reads against: something that at least has coding regions annotated and, if possible, tissue source and protein product annotations as well. I don't know if that is a reasonable expectation.

**john_mu** · 06-14-2010, 10:43 PM

oh... so you want to align your short reads against only the known annotated coding regions?

I don't think that is necessary though. If you align them to the entire genome with TopHat or SpliceMap, 90%+ of the reads will align to the coding regions anyway. RNA-seq is quite specific.

After the alignment, you can compare your results with known annotations and see what happened.
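A minimal sketch of that comparison step, assuming the alignments and the annotated coding regions have already been reduced to `(chromosome, start, end)` tuples in 0-based, half-open coordinates (as in BED). The function names and the toy data are hypothetical and only for illustration; they are not part of TopHat or SpliceMap. A spliced read should be passed in as its separate aligned blocks rather than as one span covering the intron.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Group (chrom, start, end) regions by chromosome, sorted and merged."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        merged = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= merged[-1][1]:
                # Overlapping or touching regions collapse into one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps(index, chrom, start, end):
    """Return True if [start, end) overlaps any annotated region on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged region that starts before the alignment ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def fraction_in_regions(index, alignments):
    """Fraction of alignments that overlap at least one annotated region."""
    alignments = list(alignments)
    if not alignments:
        return 0.0
    hits = sum(overlaps(index, c, s, e) for c, s, e in alignments)
    return hits / len(alignments)


# Toy data (made up for illustration only).
coding_regions = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 60)]
aligned_blocks = [
    ("chr1", 90, 101),   # overlaps the merged chr1 region
    ("chr1", 300, 350),  # starts exactly where the region ends: no overlap
    ("chr2", 55, 90),    # overlaps
    ("chr3", 1, 10),     # chromosome has no annotation
]

index = build_index(coding_regions)
print(fraction_in_regions(index, aligned_blocks))  # 0.5
```

For tens of millions of reads, a dedicated interval tool would likely be a better fit than pure Python; the sketch only shows the logic of the comparison.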

Is there any particular reason you want to align the reads only to known coding regions?

**cswarth** · 06-14-2010, 10:58 PM

Originally posted by john_mu

Is there any particular reason you want to align the reads only to known coding regions?

Frankly, I am doing it that way because that is what I was told to do by the PI. It makes sense if the Riken database is clean and has annotations that lead back to well-known gene names and protein products. And the details of our experimental setup guarantee that we will only see reads in coding regions.

But the more I look at the Riken database, the less I like this approach. I'll install Cufflinks and TopHat and see how those can help me align against the whole genome.

Thank you for the reply.

**john_mu** · 06-14-2010, 11:08 PM

oh I see.. well hope it works out! Good luck

Aligning mRNA against Riken database
