

500 million reads needed for RNA-Seq?!




  • 500 million reads needed for RNA-Seq?!

    Genome Res. 2011 May 2. [Epub ahead of print]
    RNA-sequence analysis of human B-cells.
    Toung JM, Morley M, Li M, Cheung VG.
    Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA;
    RNA-sequencing (RNA-seq) allows quantitative measurement of expression levels of genes and their transcripts. In this study, we sequenced complementary DNA fragments of cultured human B-cells and obtained 879 million 50-bp reads comprising 44 Gb of sequence. The results allowed us to study the gene expression profile of B-cells and to determine experimental parameters for sequencing-based expression studies. We identified 20,766 genes and 67,453 of their alternatively spliced transcripts. More than 90% of the genes with multiple exons are alternatively spliced; for most genes, one isoform is predominantly expressed. We found that while chromosomes differ in gene density, the percentage of transcribed genes in each chromosome is less variable. In addition, genes involved in related biological processes are expressed at more similar levels than genes with different functions. Besides characterizing gene expression, we also used the data to investigate the effect of sequencing depth on gene expression measurements. While 100 million reads are sufficient to detect most expressed genes and transcripts, about 500 million reads are needed to measure accurately their expression levels. We provide examples in which deep sequencing is needed to determine the relative abundance of genes and their isoforms. With data from 20 individuals and about 40 million sequence reads per sample, we uncovered only 21 alternatively spliced, multi-exon genes that are not in databases; this result suggests that at this sequence coverage, we can detect most of the known genes. Results from this project are available on the UCSC Genome Browser to allow readers to study the expression and structure of genes in human B-cells.

    PMID: 21536721 [PubMed - as supplied by publisher]
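
As a quick sanity check on the figures quoted in the abstract, the arithmetic can be sketched in a few lines of Python. The variable names are only illustrative; the read counts and read length are the ones stated above.

```python
# Figures taken from the abstract; names are illustrative only.
reads = 879_000_000          # total 50-bp reads from the deep B-cell run
read_length_bp = 50

total_bases = reads * read_length_bp
total_gb = total_bases / 1e9
print(f"{total_gb:.2f} Gb")  # 43.95 Gb, which the abstract rounds to 44 Gb

# The 20-individual data set: about 40 million reads per sample.
multi_sample_reads = 20 * 40_000_000
print(multi_sample_reads)    # 800000000 reads in total across samples
```

So the 20-sample data set holds roughly as many reads in total as the single deep run, just spread across individuals at about 40 million each.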

  • #2
    Sounds scary, doesn't it? Anthony Fejes commented on this paper on his blog:



    • #3
      From the paper: "Our data are available as the 'B-Cell Transcriptome (RNA-seq)' track on the UCSC Genome Browser."

      Anybody know where this is? What's the track label? What's the build?

      I looked but did not find it. Maybe it's not there yet or it's under a different track label.


      • #4
        I couldn't find the B-Cell Transcriptome track on the UCSC browser either. Maybe we should email the authors to ask?


        • #5

          It's at cgwb.nci.nih.gov under the hg19 tracks. I got impatient, aligned the SRA reads using a "gene,alt-splice/est" model, and posted the result under UPENN_BCELL_NG_RNA, near the bottom. I figure somebody needed to make hg19 RNA-seq data publicly available and usable.

          Try this URL:

          You can view the reads using Bambino here:
          Last edited by Richard Finney; 06-04-2012, 10:17 AM. Reason: correct URL


          • #6
            Thank you, Dr. Finney!


            • #7
              No prob. Lemme know how I can improve the track (like auto-normalize, right now it's not normalized between samples).
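
For readers wondering what normalizing between samples could look like, here is a minimal sketch using counts-per-million scaling. This is an assumption for illustration, not the method used for the UPENN_BCELL_NG_RNA track; the gene names, counts, and the `counts_per_million` helper are all hypothetical.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts so the sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical raw counts: same expression profile, 10x difference in depth.
sample_a = {"GENE1": 100, "GENE2": 300, "GENE3": 600}
sample_b = {"GENE1": 1000, "GENE2": 3000, "GENE3": 6000}

cpm_a = counts_per_million(sample_a)
cpm_b = counts_per_million(sample_b)

# After scaling, the two samples are directly comparable.
for gene in sample_a:
    print(gene, cpm_a[gene], cpm_b[gene])
```

Scaling by total mapped reads only corrects for depth; it does not account for gene length or for a few highly expressed genes dominating the total, so a real track might need something more robust.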