Generation of BED/GTF/GFF-files

StevenVanLaere

Junior Member

Join Date: Jun 2014

Posts: 3
#1

07-04-2014, 12:00 AM

Dear All,

I am relatively new to RNA-seq data analysis. I have done my share of reading, but theory and practice are two different things, so I keep running into small problems and questions for which I cannot find straightforward answers. Here is one; I hope you can help.

I have aligned my reads to the reference genome (hg19) with TopHat2 and now I want to use DESeq(2) to identify differentially expressed genes. Obviously a genome annotation file (in this case GFF3) is needed, and I wonder what the best solution is: getting the annotation file from e.g. UCSC, or generating one from the BAM files I have, i.e. converting BAM to BED to GFF3 (to GTF)? In the latter case, how do I deal with the fact that I have multiple BAM files, one per sample? I expect differential gene expression, so I guess that a GFF3 generated from a BAM file of condition 1 will differ from one generated for condition 2.

Thanks for your help.

Steven
GenoMax

Senior Member

Join Date: Feb 2008

Posts: 7140
#2

07-04-2014, 12:25 PM

Where did you get your hg19 genome from? It would be best to get the GTF file from the same source.

One solution is to get the GTF file from iGenomes. The downloads are large, but the files (sequences, annotation, indexes) are all in sync as far as names etc. go, and this will save you time down the road by avoiding problems with annotations.
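
If you end up mixing files from different sources anyway, a quick sanity check is to compare the sequence names in the annotation with those in the reference. Below is a minimal sketch in plain Python, not part of any tool mentioned in this thread. It assumes a tab-separated GTF/GFF file and a FASTA index (.fai, as written by samtools faidx), both of which carry the sequence name in the first column. The function names and the small in-memory example are hypothetical, for illustration only.

```python
def gtf_seqnames(lines):
    """Collect the sequence names (column 1) used in GTF/GFF lines."""
    names = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def fai_seqnames(lines):
    """Collect the sequence names (column 1) from FASTA index (.fai) lines."""
    names = set()
    for line in lines:
        line = line.strip()
        if line:
            names.add(line.split("\t", 1)[0])
    return names


def compare_names(annotation_names, reference_names):
    """Return (names only in the annotation, names only in the reference)."""
    only_in_annotation = sorted(annotation_names - reference_names)
    only_in_reference = sorted(reference_names - annotation_names)
    return only_in_annotation, only_in_reference


if __name__ == "__main__":
    # Hypothetical example: the annotation uses "1", the reference uses "chr1".
    gtf_lines = [
        "#!example header\n",
        '1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
        'chrM\tsrc\texon\t10\t90\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n',
    ]
    fai_lines = [
        "chr1\t249250621\t6\t50\t51\n",
        "chrM\t16571\t254235646\t50\t51\n",
    ]
    missing, unused = compare_names(gtf_seqnames(gtf_lines), fai_seqnames(fai_lines))
    print("In annotation but not in reference:", missing)
    print("In reference but not in annotation:", unused)
```

With real files you would pass open file handles instead of the lists, e.g. open("genes.gtf") and open("genome.fa.fai") (file names assumed). Any name reported as present in the annotation but absent from the reference (typically "1" versus "chr1") means features on that sequence will not match your alignments when counting.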