
  • Cancer research - RNA seq design

    Hi all,

    I am quite new to both bioinformatics (originally an MD) and to RNA-seq. I am currently doing a cancer research project where I have RNA-seq data from twelve brain tumors (from 12 different patients). The tumor type is the same, but there are several diagnostic subgroups present among these 12.

    My first project was to look for fusion gene events based on known chromosomal rearrangements. This paper is nearly finished. But now I would like to investigate the gene expression in these tumors.

    Now: 3 of these 12 tumors are "grade 1" tumors. These represent a quite distinct subgroup: they have a very characteristic histologic appearance compared to the others. They are cytogenetically normal (which the others are not). The prognosis for these grade 1 tumors is very good compared to the other subgroups. Furthermore, the three tumors stem from the same anatomic location. Based on current literature, it also appears that they differ from the others on the molecular genetic level.

    My question is: I would like to look at differential gene expression in these twelve tumors. What would be the best way to design this experiment? Forgive me if this is a stupid or naive question, but I am quite new to all of this... and hey, you've gotta start somewhere.

    So far I've come up with three alternatives:

    1) To buy commercially available normal human RNA from brain tissue (if I want 3 replicates, do I buy RNA from three different companies...?), and send these for sequencing. Then compare each of the 12 tumors to these normal RNA samples

    2) To acquire three RNA seq datasets from normal brain tissue (TCGA...? Or is this commercially available somewhere?). Then compare each tumor to this dataset

    3) To use the grade 1 tumors as "baseline" and then compare each of the remaining nine tumors to this group.

    ...Or maybe something completely different?

    I have a feeling that alternative 1 is the best one, but this will take about four months. Is there any way I can defend going with alternative 3?

    Hope this was clear, and if it isn't, I'll do my best to explain further.
    Last edited by thaleko; 02-12-2014, 07:55 AM.

  • #2
    My problem with 3) is that using the grade 1 tumors as the "control" or baseline for DGE assumes they are a viable baseline for comparison. But they are inherently an aberrant condition, not a true "wild type" condition, since they come from a known tumor type.

    So, I would not accept 3) as a basis for inferring differential expression from true wild-type, or normal, brain tissue (differential expression being a purely relative metric). However, it would be valid for telling you what is different between this particular (relatively) benign tumor type and the other, more aggressive tumor types.

    To say anything about any of the tumor samples in terms of DGE from non-tumorous brain tissue, though, you really must have truly normal brain tissue to compare them to.

    Your ideal comparison, to my mind, is DGE in the tumors relative to true, normal brain. Then you can look for genes that are unique or shared amongst any subset of the tumor samples. Without that normal brain baseline though, you cannot say which genes are different amongst any of the tumors and significantly different from normal brain.
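As a rough illustration of the "unique or shared" idea above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the gene names, the expression values, the pseudocount, and the fold-change cutoff are invented, and a plain log2 fold-change threshold stands in for a proper statistical test. A real analysis would use replicate-aware tools such as DESeq2 or edgeR rather than this shortcut.

```python
import math

# Hypothetical mean expression values (arbitrary normalized units).
# All numbers and gene names are invented for illustration only.
normal = {"GENE_A": 100.0, "GENE_B": 100.0, "GENE_C": 100.0, "GENE_D": 100.0}
grade1 = {"GENE_A": 400.0, "GENE_B": 100.0, "GENE_C": 25.0, "GENE_D": 110.0}
aggressive = {"GENE_A": 400.0, "GENE_B": 800.0, "GENE_C": 100.0, "GENE_D": 90.0}


def log2_fold_change(sample, baseline, pseudocount=1.0):
    """Per-gene log2(sample / baseline), with a pseudocount to avoid division by zero."""
    return {
        gene: math.log2((sample[gene] + pseudocount) / (baseline[gene] + pseudocount))
        for gene in baseline
    }


def changed_genes(sample, baseline, min_abs_log2fc=1.0):
    """Genes whose absolute log2 fold change meets the (assumed) cutoff."""
    lfc = log2_fold_change(sample, baseline)
    return {gene for gene, value in lfc.items() if abs(value) >= min_abs_log2fc}


# Each tumor group is compared to the same normal-brain baseline.
grade1_vs_normal = changed_genes(grade1, normal)
aggressive_vs_normal = changed_genes(aggressive, normal)

# Set operations give the shared and group-specific changes.
shared = grade1_vs_normal & aggressive_vs_normal
grade1_only = grade1_vs_normal - aggressive_vs_normal
aggressive_only = aggressive_vs_normal - grade1_vs_normal

print("shared:", sorted(shared))
print("grade 1 only:", sorted(grade1_only))
print("aggressive only:", sorted(aggressive_only))
```

With these toy numbers, one gene changes in both tumor groups, one only in the grade 1 group, and one only in the aggressive group. The shared/unique split is only meaningful because both groups are measured against the same normal baseline.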

    As far as 2) goes, that could work, if you can get data for the specific brain tissue you want to compare. If these are a highly localized type of brain tumor, you'd want normal tissue from the same region, keeping everything else the same if at all possible. If they are not a particularly localized form of tumor, then you could broaden your choice of a suitable normal brain candidate sample.

    If going for 1), which is clearly the optimal choice here, then all you need are samples from different donors. I would prefer to work with one company, and see if they can provide samples from 3 random donors (one company, so all the preps are handled, presumably, the same way).

    So, ultimately, it comes down to what you are willing to live with in terms of the limits of your conclusions. Your option 3) would let you discuss differences between the aggressive and the benign tumor types, but you have no way of knowing how any of that relates to normal brain gene expression (so you may be chasing red herrings with the results). I just think that not going for option 1) really constrains your interpretation of your results (so, the pragmatist in me would say, the difference between a really good publication in the end, versus a mediocre one).
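The "red herring" risk of option 3) can be sketched with toy numbers. The Python below is a hedged illustration only: the gene names, expression values, pseudocount, and cutoff are all invented assumptions, and a simple log2 fold-change threshold stands in for a real differential expression test.

```python
import math

# Hypothetical mean expression values; every number here is invented.
normal = {"GENE_A": 100.0, "GENE_B": 100.0, "GENE_C": 100.0, "GENE_D": 100.0}
grade1 = {"GENE_A": 400.0, "GENE_B": 100.0, "GENE_C": 25.0, "GENE_D": 110.0}
aggressive = {"GENE_A": 400.0, "GENE_B": 800.0, "GENE_C": 100.0, "GENE_D": 90.0}


def changed_genes(sample, baseline, min_abs_log2fc=1.0, pseudocount=1.0):
    """Genes whose absolute log2(sample / baseline) meets the (assumed) cutoff."""
    changed = set()
    for gene in baseline:
        ratio = (sample[gene] + pseudocount) / (baseline[gene] + pseudocount)
        if abs(math.log2(ratio)) >= min_abs_log2fc:
            changed.add(gene)
    return changed


# Option 3): grade 1 tumors as the baseline.
vs_grade1 = changed_genes(aggressive, grade1)
# Option 1) or 2): normal brain as the baseline.
vs_normal = changed_genes(aggressive, normal)

# Altered relative to normal in the aggressive group, but invisible against grade 1
# because the grade 1 tumors carry the same change.
missed_with_grade1_baseline = vs_normal - vs_grade1
# Flagged against grade 1 only because the grade 1 tumors themselves deviate from normal.
driven_by_baseline = vs_grade1 - vs_normal

print("missed with grade 1 baseline:", sorted(missed_with_grade1_baseline))
print("driven by the baseline itself:", sorted(driven_by_baseline))
```

In this toy data, one gene is elevated in both tumor groups and so disappears when grade 1 is the baseline, while another is flagged only because the grade 1 tumors are the ones that differ from normal. Without normal brain data, the two situations cannot be told apart.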
    Michael Black, Ph.D.
    ScitoVation LLC. RTP, N.C.


    • #3
      Thanks for your valuable feedback @mbblack! Yes, I was thinking along those lines too. I'll have a chat with my supervisor and see where we end up going.


      • #4

        You might want to check out the genomics reference library that is freely available in GenePool. It is a growing resource that we're making freely available to the community, and it currently contains the gene-level mRNA-Seq data + sample metadata for the TCGA and GTEx projects. That amounts to over 7,000 samples of RNA-Seq data covering 25 cancer indications (primary tumor samples, adjacent normal samples, metastases) as well as healthy controls (1,600 samples available in the GTEx RNA-Seq data). I think you'll find some of the answers you are looking for by focusing in on the brain cancer TCGA projects and the healthy brain tissue sequenced as part of the GTEx project.

        For more information about GenePool's growing genomics reference library that is freely available to the community, check out:

        GenePool is making genomics data management, analysis, and sharing easier!
        Products @

