Hello,
My name is David Brohawn and I am new to RNA-Seq.
My advisor and I are interested in doing an RNA-Seq experiment to compare the transcriptomes of iPSC neurons we generate from both ALS patients and controls. Ultimately we would like to identify molecular phenotypes based on transcriptome expression profiles for different instances of ALS (much like how cancer researchers now identify underlying molecular phenotypes for different instances of a given cancer).
We are primarily interested in generating transcriptome profiles (involving both coding and non-coding RNA and novel transcripts), with a heavy interest in differential gene expression and less interest in mapping full transcript isoforms.
As I understand it, a greater number of short reads is best for assessing differential gene expression (SOLiD and Illumina look most amenable to this), while a smaller number of long reads is best for assessing isoforms (Roche and PacBio look most amenable to this).
I see the ENCODE project recommends “Experiments whose purpose is discovery of novel transcribed elements and strong quantification of known transcript isoforms… a minimum depth of 100-200 M 2 x 76 bp or longer reads is currently recommended.”
We plan on using Illumina TruSeq total RNA prep kits followed by sequencing on the Illumina HiSeq 2500. An Illumina rep quoted 187 million reads per lane as typical output for a 2 x 100 run. If this is true, I am thinking we multiplex our 20 total samples (10 cases and 10 controls) and run 11 total lanes, which would average out to just over 100 million reads per sample.
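As a quick sanity check on that arithmetic, here is a minimal Python sketch. It assumes the rep's figure of 187 million reads per lane holds and that reads split evenly across all 20 samples, which real multiplexed runs only approximate. The function names are my own and purely illustrative.

```python
import math

# Figures taken from the plan above; the per-lane yield is the rep's quote,
# not a measured value.
READS_PER_LANE = 187_000_000
LANES = 11
SAMPLES = 20


def reads_per_sample(reads_per_lane: int, lanes: int, samples: int) -> float:
    """Average reads per sample, assuming perfectly even pooling across lanes."""
    return reads_per_lane * lanes / samples


def lanes_needed(target_per_sample: int, samples: int, reads_per_lane: int) -> int:
    """Smallest whole number of lanes that reaches the target average depth."""
    return math.ceil(target_per_sample * samples / reads_per_lane)


depth = reads_per_sample(READS_PER_LANE, LANES, SAMPLES)
print(f"Average depth: {depth / 1e6:.2f} million reads per sample")
print("Lanes needed for 100 M per sample:",
      lanes_needed(100_000_000, SAMPLES, READS_PER_LANE))
```

Changing the target passed to `lanes_needed` shows how many lanes a lower depth would require under the same assumptions.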
We would then analyze the data with the Tuxedo Suite bioinformatics package (we may substitute STAR for TopHat and Bowtie), and visualize our data using CummeRbund.
We are considering purchasing a Linux-based machine or a Mac with these specs for processing:
CPU – 2 quad-core processors
Clock speed – 3.2 GHz
RAM – 24 GB
HDD – 8 TB (RAID array of four 2 TB drives)
I have been told the number of reads per sample may be overkill given our goals, but I am simply following ENCODE's recommendations. Do you all have any suggestions based on what I have reported?
Thanks for taking the time to read and respond!
Dave Brohawn