**pko** · 07-18-2009, 10:12 PM

BEDTools 2.1 fails to compile under Linux because of the following line in bedFile.cpp:

bedEntry.minOverlapStart = INT_MAX;

**quinlana** · 07-20-2009, 03:30 PM

Thanks for finding this, pko. Apparently my system, and those of other users, let me get away with omitting limits.h from that source file. I'll post a new version as soon as I get back from vacation. In the interim, if others face this problem, add the following to bedFile.h on line 12:

#include <limits.h>

Save, re-make and you should be good to go.
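For context, `INT_MAX` is a macro defined by `<limits.h>` (`<climits>` in C++). Some toolchains pull that header in indirectly through other includes, so the code compiles there and fails elsewhere. A minimal sketch of the pattern follows; `BedEntrySketch` and `resetMinOverlapStart` are hypothetical stand-ins and not the actual BEDTools definitions:

```cpp
#include <climits>   // defines INT_MAX; without it the macro may be undeclared
#include <limits>    // C++ alternative: std::numeric_limits<int>::max()

// Hypothetical stand-in for the BEDTools entry type; only the field
// named in the failing line is modelled here.
struct BedEntrySketch {
    int minOverlapStart;
};

// Initialise the field to the largest int value, as the failing line does.
inline void resetMinOverlapStart(BedEntrySketch &entry) {
    entry.minOverlapStart = INT_MAX;
}
```

Including the header explicitly in the file that uses the macro avoids relying on whatever another header happens to include.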

Apologies and thanks much for pointing this out.

Best,
Aaron

**dlepp** · 07-21-2009, 11:09 AM

I'm having some strange issues with complementBed - it appears to be highly sensitive to the convention used in the chromosome field. For example, this works:

Bed file:

chr21 32345 65443

genome file:

chr21 48099781

but this gives no output:

Bed file:

hr21 32345 65443

genome file:

hr21 48099781

Maybe there are some restrictions in the bed format that I'm unaware of? Haven't tested any of the other tools.

Thanks,

Dion

**quinlana** · 07-22-2009, 07:28 AM

complementBed

Hi dlepp,
Thanks for your post; this is a strange problem. I was able to recreate it as well. There is nothing that explicitly limits what can be used for the "chrom" field. The intent is that any string could be used. Oddly, it seems to be a problem with the C++ string tokenizing function I wrote, which is basically just lifted from a "best practices" book. To make things more odd, the following works (note h21 instead of hr21):

Bed file:

h21 32345 65443

genome file:

h21 48099781

I tried using other tokenizing methods and the problem persists. I am on vacation until early August and will fix it when I return. In the meantime, if you just use chr21 or 21, all should be well.
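To illustrate the stated intent that any string can serve as "chrom", here is a minimal sketch of whitespace tokenizing with `std::istringstream`. It is not the BEDTools tokenizer and does not explain the bug above; the names `tokenize` and `sameChrom` are hypothetical. It only shows the chromosome field being compared as an opaque string:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a line on runs of whitespace (spaces or tabs).
std::vector<std::string> tokenize(const std::string &line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (in >> field) {
        fields.push_back(field);
    }
    return fields;
}

// True when a BED line and a genome-file line name the same chromosome.
// The first field is compared verbatim, with no assumption about its format.
bool sameChrom(const std::string &bedLine, const std::string &genomeLine) {
    const std::vector<std::string> bed = tokenize(bedLine);
    const std::vector<std::string> genome = tokenize(genomeLine);
    return !bed.empty() && !genome.empty() && bed[0] == genome[0];
}
```

With this approach "chr21", "hr21" and "h21" are all treated alike, which is the behaviour the "chrom" field is meant to have.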

Thanks for pointing this out as it is a strange error that needs to be addressed.

Best,
Aaron

**quinlana** · 07-22-2009, 10:59 AM

BEDTools v2.1.1

Hi,
I have posted a new version (2.1.1) that addresses the issues that dlepp and pko have so kindly pointed out.

I've posted it to http://people.virginia.edu/~arq5x/bedtools.html and will update sourceforge soon.

Thanks again for letting me know of these problems.

Best,
Aaron

**ohofmann** · 08-15-2009, 07:25 PM

Aaron,

been trying bedTools for mapping SNPs to genomic features -- which often overlap. How does 'closestBed' handle these cases? E.g., two genes that overlap, and an SNP in the overlap region -- does it pick one gene at random? Amount of overlap is going to be identical in these cases.

Thanks!

**quinlana** · 08-17-2009, 05:54 AM

closestBed

Hi ohofmann,

Currently, in such situations, closestBed will return the first feature that occurs in the feature file. This works well for larger intervals (e.g. genes, not SNPs), but in the case you describe, it really isn't ideal.

My guess is that in this case, you'd prefer more control. For example:
a) return _all_ features that overlap with the SNP.
b) return the largest feature that overlaps with the SNP.
c) return the smallest feature that overlaps with the SNP.
d) randomly select a feature.

All of these options are quite easy to implement. I can likely implement them this week or early next week if it helps you. To be precise, cases a-d will only be invoked when there are multiple features in B that have 100% overlap with the interval in A (in your case, a SNP). Otherwise, only the closest (i.e. closest non-overlapping or most overlapping) feature will be reported.
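Options a) through d) could be sketched roughly as below. This is an illustration under assumptions, not closestBed code: the `Feature` struct, the `TiePolicy` enum and `resolveTies` are hypothetical names, and `hits` is assumed to already hold the features in B that fully overlap the interval in A.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical feature record; only its length matters for tie-breaking.
struct Feature {
    std::string name;
    int start;
    int end;
};

enum class TiePolicy { All, Largest, Smallest, Random };

inline int featureLength(const Feature &f) { return f.end - f.start; }

// Choose among tied features according to the requested policy.
// Returns every hit for All, otherwise a single-element vector.
std::vector<Feature> resolveTies(const std::vector<Feature> &hits,
                                 TiePolicy policy, std::mt19937 &rng) {
    if (hits.empty() || policy == TiePolicy::All) {
        return hits;
    }
    auto byLength = [](const Feature &a, const Feature &b) {
        return featureLength(a) < featureLength(b);
    };
    switch (policy) {
    case TiePolicy::Largest:
        return {*std::max_element(hits.begin(), hits.end(), byLength)};
    case TiePolicy::Smallest:
        return {*std::min_element(hits.begin(), hits.end(), byLength)};
    case TiePolicy::Random: {
        std::uniform_int_distribution<std::size_t> pick(0, hits.size() - 1);
        return {hits[pick(rng)]};
    }
    case TiePolicy::All:
        break;  // handled above
    }
    return hits;
}
```

When several features share the largest or smallest length, `std::max_element` and `std::min_element` return the first one in input order, which matches the current "first feature in the file" behaviour for those residual ties.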

Thanks for pointing this out.
Aaron

**ohofmann** · 08-17-2009, 08:21 AM

Aaron,

not sure it's worth the hassle -- just adding the information to the man page should be more than enough. My current workflow, using the mapping of SNPs to genes within a 25kb window as an example:

* Run windowBed on all SNPs (streamed) vs a gene file, +/- 25kb, printing out all hits
* Cut the overlapping gene regions out of the result file
* Sort/unique to remove duplicate genes (not sure how closestBed handles those, just in case), and likewise for SNPs (this ensures that SNPs without a gene within 25kb are dropped; they might otherwise end up mapped to genes a few megabases away)
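The window test and the sort/unique step in the list above can be sketched as follows. This is only an illustration: `Interval`, `withinWindow` and `sortUnique` are hypothetical names, coordinates are assumed to be 0-based half-open as in BED, and the window is assumed to extend the SNP by the same number of bases on each side.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical BED-style interval, 0-based half-open [start, end).
struct Interval {
    std::string chrom;
    long start;
    long end;
};

// True when b overlaps a after a is extended by `window` bases on each side.
bool withinWindow(const Interval &a, const Interval &b, long window) {
    if (a.chrom != b.chrom) {
        return false;
    }
    const long lo = std::max(0L, a.start - window);  // clamp at chromosome start
    const long hi = a.end + window;
    return b.start < hi && b.end > lo;
}

// Sort and drop exact duplicates, like `sort | uniq` on text lines.
std::vector<std::string> sortUnique(std::vector<std::string> lines) {
    std::sort(lines.begin(), lines.end());
    lines.erase(std::unique(lines.begin(), lines.end()), lines.end());
    return lines;
}
```

For a SNP at chr21:100000-100001 and a 25000-base window, a gene starting at 110000 passes the test while one starting at 130000 does not.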

Take those files as input for closestBed. If an SNP actually overlaps more than one gene it probably makes sense to return all since closest really isn't defined. Closest to .. the start of a gene (depends on strand)? The UTR? Etc.

All features is quite likely the only alternative that makes sense in this context.

Best, Oliver

**quinlana** · 08-18-2009, 05:14 AM

Hi Oliver,
I agree that returning all features either optionally or by default is best in this case. Such behavior would allow the user to "pipe" to a downstream Perl/AWK/Python/Ruby/VogueLanguageOfTheMonth in order to choose max, min, random, etc.

I'll try to knock this out in the next couple of weeks. Not hard, just difficult to find time at the moment.

Aaron

**ohofmann** · 08-18-2009, 07:08 AM

No rush at all, and thanks!

-- Oliver

BEDTools Version 2.1
