**kmcarr** · 04-20-2009, 10:28 AM

The typical protocol for sequencing RNA with 454 is to make ds cDNA, fragment it (nebulizer, Covaris, etc.), then use a standard genomic library prep kit from Roche. This means polishing (blunting) the ends and attaching the sequencing adapters in a non-directional manner. Thus the reads you get will be a mixture of both orientations.

**behoward** · 04-20-2009, 11:01 AM

Thanks! I guess I have to use q=dna, then.

The dataset I am looking at is a public 454 GS20 dataset from the paper "Sampling the Arabidopsis Transcriptome with massively parallel pyrosequencing" (Weber et al, Plant Physiology May 2007). Kmcarr, I think I remember from a previous post that you have some experience with this particular dataset.

Do you have any guess whether the original researchers used q=rna in the BLAT alignment? I remember they had about 11% of the reads that don't map to the genome. But if I use q=dna, I get a larger percent mapping to TAIR7.

Also, if I do use q=dna, I guess I will only want to 'count' a read once when it maps to a gene and its reverse complement. However, I would want to keep both matches when a read maps to multiple genes (say paralogs, or duplicate genes). I'm not sure how to tell these two cases apart... Anyone have any suggestions?

**kmcarr** · 04-20-2009, 12:10 PM

Originally posted by behoward:

> The dataset I am looking at is a public 454 GS20 dataset from the paper "Sampling the Arabidopsis Transcriptome with massively parallel pyrosequencing" (Weber et al, Plant Physiology May 2007). Kmcarr, I think I remember from a previous post that you have some experience with this particular dataset.
>
> Do you have any guess whether the original researchers used q=rna in the BLAT alignment? I remember they had about 11% of the reads that don't map to the genome. But if I use q=dna, I get a larger percent mapping to TAIR7.
>
> Also, if I do use q=dna, I guess I will only want to 'count' a read once when it maps to a gene and its reverse complement. However, I would want to keep both matches when a read maps to multiple genes (say paralogs, or duplicate genes). I'm not sure how to tell these two cases apart... Anyone have any suggestions?

Man! That dataset just won't die. When I said I had some familiarity with the data I was understating it a bit. I was one of the authors and performed all of the bioinformatics. I used the default BLAT settings for query and target type, i.e. both -q and -t set to dna. Also, BLAT will only output a single alignment for a read at a given location; it will not report both the forward and reverse alignment of a read, so you don't have to worry about that.

You are correct that you will find equally good alignments to paralogous genes. You will have to decide how you want to approach assigning or counting those reads.
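To make the counting choice concrete, here is a rough Python sketch of three possible policies for reads that hit more than one gene. None of this is from the paper; the function name `count_reads`, the policy names, and the gene IDs are all made up for illustration, and it assumes you have already worked out which gene(s) each read's alignments overlap.

```python
from collections import Counter

def count_reads(read_to_genes, policy="fractional"):
    """Tally reads per gene. `read_to_genes` maps a read ID to the gene IDs
    its retained alignments overlap (hypothetical input, built elsewhere)."""
    if policy not in {"fractional", "all", "unique"}:
        raise ValueError(f"unknown policy: {policy}")
    counts = Counter()
    for read_id, genes in read_to_genes.items():
        genes = set(genes)  # several hits to the same gene count once
        if not genes:
            continue
        if len(genes) == 1:
            counts[next(iter(genes))] += 1
        elif policy == "fractional":
            # split one read evenly across all genes it matches
            for gene in genes:
                counts[gene] += 1 / len(genes)
        elif policy == "all":
            # give every matched gene a full count
            for gene in genes:
                counts[gene] += 1
        # policy == "unique": multi-gene reads are simply dropped
    return counts

hits = {"r1": ["geneA"], "r2": ["geneA", "geneB"], "r3": ["geneA", "geneA"]}
print(count_reads(hits, "fractional"))
```

Which policy is appropriate depends on what you want to do with the counts downstream; the sketch only shows that the choice changes the numbers.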

You will also find many poor alignments of reads to the genome. You should play with the pslReps program to filter your initial BLAT output. pslReps is meant to retain only the best alignment if a query sequence aligns to multiple target locations. If there is a group of alignments which are equally good (or nearly so), they will all be retained.
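If it helps to see the idea behind that kind of filtering, here is a minimal Python sketch that keeps, for each read, the alignments scoring within a small fraction of that read's best hit. It is not pslReps and does not reproduce its scoring; the score used here (matches minus mismatches minus insert counts) is a simplification, and `keep_near_best` and `near_top` are names I made up. It assumes standard 21-column tab-separated PSL input.

```python
from collections import defaultdict

def psl_score(fields):
    """Simplified alignment score (an assumption, not pslReps' exact formula)."""
    matches, mismatches = int(fields[0]), int(fields[1])
    q_inserts, t_inserts = int(fields[4]), int(fields[6])
    return matches - mismatches - q_inserts - t_inserts

def keep_near_best(psl_lines, near_top=0.005):
    """Return PSL rows whose score is within `near_top` of the best per query."""
    by_query = defaultdict(list)
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # skip header and malformed lines: data rows start with an integer
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        by_query[fields[9]].append(fields)  # column 10 is qName
    kept = []
    for hits in by_query.values():
        best = max(psl_score(h) for h in hits)
        cutoff = best - near_top * abs(best)
        kept.extend(h for h in hits if psl_score(h) >= cutoff)
    return kept
```

Reading the file is then just `keep_near_best(open("reads.psl"))`, where `reads.psl` stands in for your own BLAT output. Ties between paralogs survive the filter, so you still have to decide how to count them.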

**behoward** · 04-25-2009, 12:17 PM

Well, thanks again

I guess I came to the right person! I suppose the good thing about a dataset that won't die is that you must get a ton of citations.

Cheers,
Brian

454 read orientation
