I have generated a VCF file from a kinome capture experiment after all the filtering steps. It is a 9.5 MB VCF file from GATK and I don't know what to do with it. Basically, I would like to generate a file for each chromosome that has columns such as position, number of reads, how many times each allele is called, etc. I don't have any idea what program to use. I don't really care about annotating at this point; I just want to see the allele calls in a readable text file. Please help.
A VCF file is a plain-text file that already contains that information, largely separated into tab-delimited columns. You can open it in your favorite text viewer/editor (MS Word, Notepad, gvim, Preview.app). The easiest way to split the calls by chromosome is probably just to use grep (assuming you're using Linux or a Mac): run "grep chrX file.vcf > chrX.vcf" for each chromosome. I'm sure someone can think of a short one-line command with awk and sed to avoid having to grep for each chromosome, but you probably have a small number of chromosomes, so this won't be very labor intensive. Note that the result of the grep command won't really be a valid VCF file, since you'll miss much of the header, but that won't matter if you just want to go through it by hand.
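For what it's worth, the awk approach mentioned above might look something like this. It is only a minimal sketch, not part of the original reply: example.vcf and its three records are made-up data, and the output files are simply named after whatever appears in the CHROM column. Unlike a plain grep, it also copies the header lines into each per-chromosome file.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Tiny made-up VCF (hypothetical records) so the sketch runs as-is.
printf '##fileformat=VCFv4.1\n' > example.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> example.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tDP=20\n' >> example.vcf
printf 'chr11\t200\t.\tC\tT\t60\tPASS\tDP=31\n' >> example.vcf
printf 'chr1\t300\t.\tG\tA\t45\tPASS\tDP=12\n' >> example.vcf

# Lines starting with '#' are header lines: remember them.
# Every other line is a record: write it to <CHROM>.vcf, and put the
# saved header at the top the first time a chromosome is seen.
awk -F'\t' '
  /^#/          { header = header $0 "\n"; next }
  !($1 in seen) { seen[$1] = 1; printf "%s", header > ($1 ".vcf") }
                { print > ($1 ".vcf") }
' example.vcf

ls *.vcf
```

To use it on real data, replace example.vcf with your own file and drop the printf lines. Some awk implementations limit how many output files can be open at once, which could matter for references with a very large number of contigs.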
Originally posted by shawpa: Thanks! I get that this is probably a stupid question, but if I type "grep chr1", it pulls chr1, 11, 12, 13 etc. What am I doing wrong?
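Nothing is wrong with the command as such: grep matches substrings, so the pattern chr1 also matches any line containing chr11, chr12 and so on (and any header line that mentions chr1). One way around this, sketched below with made-up data, is to compare the first column exactly with awk instead of searching the whole line. The file names here are hypothetical.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Tiny made-up VCF (hypothetical records), including a header line
# that mentions chr1.
printf '##fileformat=VCFv4.1\n' > example.vcf
printf '##contig=<ID=chr1,length=249250621>\n' >> example.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> example.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tDP=20\n' >> example.vcf
printf 'chr11\t200\t.\tC\tT\t60\tPASS\tDP=31\n' >> example.vcf
printf 'chr1\t300\t.\tG\tA\t45\tPASS\tDP=12\n' >> example.vcf

# Substring match: counts every line containing "chr1" anywhere,
# including the chr11 record and the ##contig header line.
grep -c 'chr1' example.vcf

# Exact match on the first (CHROM) column only: just the chr1 records.
awk -F'\t' '$1 == "chr1"' example.vcf > chr1_only.txt
cat chr1_only.txt
```

"grep -w chr1" is a shorter alternative that rejects chr11, but it can still pick up header lines such as the ##contig line above, so the column comparison is the safer of the two.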