Dear All,
I am relatively new to RNA-seq data analysis. I have done my share of reading, but theory and practice are two different things, so I keep running into small problems and questions for which I cannot find straightforward answers. Here is one; I hope you can help.
I have aligned my reads to the reference genome (hg19) with TopHat2, and now I want to use DESeq(2) to identify differentially expressed genes. Obviously a genome annotation file (in this case GFF3) is needed, and I wonder what the best solution is. Should I get the annotation file from, e.g., UCSC? Or should I generate one from the BAM files I have, thus converting BAM to BED to GFF3 (to GTF)? In the latter case, how do I deal with the fact that I have multiple BAM files, i.e. one per sample? I expect differential gene expression, so I guess that a GFF3 generated from a BAM file of condition 1 will differ from one generated from a BAM file of condition 2.
Thanks for your help.
Steven