Hi all,
I am quite new to both bioinformatics (originally an MD) and to RNA-seq. I am currently doing a cancer research project where I have RNA-seq data from twelve brain tumors (from 12 different patients). The tumor type is the same, but there are several diagnostic subgroups present among these 12.
My first project was to look for fusion gene events based on known chromosomal rearrangements. This paper is nearly finished. But now I would like to investigate the gene expression in these tumors.
Now: 3 of these 12 tumors are "grade 1" tumors. These represent a quite distinct subgroup: they have a very characteristic histologic appearance compared to the others. They are cytogenetically normal (which the others are not). The prognosis for these grade 1 tumors is very good compared to the other subgroups. Furthermore, the three tumors stem from the same anatomic location. Based on current literature, it also appears that they differ from the others on the molecular genetic level.
My question is: I would like to look at differential gene expression in these twelve tumors. What would be the best way to design this experiment? Forgive me if this is a stupid or naive question, but I am quite new to all of this... and hey, you've gotta start somewhere.
So far I've come up with three alternatives:
1) To buy commercially available normal human RNA from brain tissue (if I want 3 replicates, do I buy RNA from three different companies...?), and send these for sequencing. Then compare each of the 12 tumors to these normal RNA samples.
2) To acquire three RNA-seq datasets from normal brain tissue (TCGA...? Or is this commercially available somewhere?). Then compare each tumor to these datasets.
3) To use the grade 1 tumors as "baseline" and then compare each of the remaining nine tumors to this group.
...Or maybe something completely different?
I have a feeling that alternative 1 is the best one, but this will take about four months. Is there any way I can defend going with alternative 3?
Hope this was clear, and if it isn't, I'll do my best to explain further.