I have a set of SNPs and a list of intervals on chromosomes. What is the best tool for determining how many SNPs fall in each interval? So far I have been writing my own code, but it is very inefficient and takes days to run. Is there a program that will do this?
-
If you have the chromosome co-ordinates of your SNPs, you could use Bedtools.
Save your list of SNPs in BED format (tab-separated) as SNP.bed:
chr1 100 101 rs1
chr1 105 106 rs2
chr1 110 111 rs3
chr1 5000 5001 rs_not_in_interval
chr2 100 101 rs4
chr2 105 106 rs5
chr2 110 111 rs6
chr2 120 121 rs7
chr2 400 401 rs_not_in_interval
Save your list of intervals in BED format (tab-separated) as Intervals.bed:
chr1 99 120
chr2 11 130
Then use bedtools intersectBed:
Code:
intersectBed -a SNP.bed -b Intervals.bed -wb > SNPs.in.intervals.bed
chr1 100 101 rs1 chr1 99 120
chr1 105 106 rs2 chr1 99 120
chr1 110 111 rs3 chr1 99 120
chr2 100 101 rs4 chr2 11 130
chr2 105 106 rs5 chr2 11 130
chr2 110 111 rs6 chr2 11 130
chr2 120 121 rs7 chr2 11 130
To go one step further and count how many SNPs are in each interval:
Code:
intersectBed -a SNP.bed -b Intervals.bed -wb | awk -F"\t" '{print $5" "$6" "$7}' | uniq -c
3 chr1 99 120
4 chr2 11 130
The first column gives the count of SNPs in each interval.
Last edited by rbagnall; 02-05-2014, 06:49 PM.
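If you want to sanity-check the counts on a small test set without bedtools, something like the following plain awk sketch should work. It recreates the two example files above as tab-separated BED and applies the usual half-open overlap test (a SNP overlaps an interval when its start is below the interval end and its end is above the interval start). The function name `count_snps_per_interval` and the `workdir` variable are made up for this illustration. Note that this compares every SNP against every interval, so it is only meant as a cross-check on small inputs, not as a replacement for bedtools on real data.

```shell
#!/bin/sh
# Hypothetical scratch directory for the example files (assumption, not from the post).
workdir=$(mktemp -d)

# Recreate the example SNPs and intervals from the post as tab-separated BED.
printf 'chr1\t100\t101\trs1\nchr1\t105\t106\trs2\nchr1\t110\t111\trs3\n' > "$workdir/SNP.bed"
printf 'chr1\t5000\t5001\trs_not_in_interval\n' >> "$workdir/SNP.bed"
printf 'chr2\t100\t101\trs4\nchr2\t105\t106\trs5\nchr2\t110\t111\trs6\n' >> "$workdir/SNP.bed"
printf 'chr2\t120\t121\trs7\nchr2\t400\t401\trs_not_in_interval\n' >> "$workdir/SNP.bed"
printf 'chr1\t99\t120\nchr2\t11\t130\n' > "$workdir/Intervals.bed"

# Hypothetical helper: $1 = intervals BED, $2 = SNP BED.
# Prints "count chrom start end" for every interval, including zero counts.
count_snps_per_interval() {
  awk -F'\t' '
    # First file: remember each interval (adding 0 forces numeric comparison).
    NR == FNR { chrom[FNR] = $1; istart[FNR] = $2 + 0; iend[FNR] = $3 + 0; n = FNR; next }
    # Second file: test each SNP against every stored interval (half-open overlap).
    {
      for (i = 1; i <= n; i++)
        if ($1 == chrom[i] && ($2 + 0) < iend[i] && ($3 + 0) > istart[i])
          count[i]++
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%d %s %s %s\n", count[i] + 0, chrom[i], istart[i], iend[i]
    }
  ' "$1" "$2"
}

count_snps_per_interval "$workdir/Intervals.bed" "$workdir/SNP.bed"

# With bedtools installed, the -c option of intersect should give a per-interval
# count directly (intervals as -a, SNPs as -b), zero-count intervals included:
#   bedtools intersect -a Intervals.bed -b SNP.bed -c
```

On the example data this prints one line per interval, with the count in the first column, in the same layout as the `uniq -c` output above. Unlike the `uniq -c` pipeline, it also lists intervals that contain no SNPs.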
-
That's what I usually do.