BEDTools Version 2.0

  • BEDTools Version 2.0

    Hi all,
    Version 2 of BEDTools has been released. These tools allow one to answer common questions about genomic features in BED format. Version 2 has two major improvements:

    1. Enforcing "strandedness". The previous version of BEDTools reported overlaps between BED features regardless of the strands of the two features. Now, with the "-s" option, all relevant utilities (e.g. intersectBed, mergeBed, windowBed, closestBed, etc.) will report overlaps ONLY if the features are on the same strand. By default, strand is ignored.

    2. Intersecting paired-end reads/SV calls to regular BED files. There is now a program called peIntersectBed that compares features (e.g. paired-end reads, SV calls, etc.) to a regular BED file (e.g. RefSeq genes). In order to do such comparisons, I have defined a new BEDPE format that is very similar to traditional BED formats. The new utility allows one to ask for:

    1. All cases where _either_ end of a BEDPE entry overlaps a BED file.
    2. All cases where _both_ ends of a BEDPE entry overlap a BED file.
    3. All cases where _neither_ end of a BEDPE entry overlaps a BED file.
    4. All cases where _one and only one_ (i.e. xor) end of a BEDPE entry overlaps a BED file.
    5. All cases where the "inner span" of a BEDPE entry overlaps a BED file.
    6. All cases where the "outer span" of a BEDPE entry overlaps a BED file.

    peIntersectBed is really useful for screening paired-end sequencing reads against genomic annotations.
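
The first four overlap modes listed above can be illustrated with a small awk sketch. This is not peIntersectBed and not its implementation; `classify_bedpe` is a hypothetical helper that compares each BEDPE record against a single query interval instead of a whole BED file. It assumes the first seven BEDPE columns are chrom1, start1, end1, chrom2, start2, end2 and name, with 0-based, half-open coordinates.

```shell
# Hypothetical helper, not part of BEDTools.
# Usage: classify_bedpe CHROM START END < file.bedpe
# Prints the record name and one label:
#   both    -> both ends overlap the query (also counts as "either")
#   xor     -> exactly one end overlaps    (also counts as "either")
#   neither -> no end overlaps
classify_bedpe() {
    awk -F'\t' -v OFS='\t' -v qc="$1" -v qs="$2" -v qe="$3" '
        # True when [s, e) on chromosome c overlaps the query interval.
        function hits(c, s, e) {
            return (c == qc && (s + 0) < (qe + 0) && (qs + 0) < (e + 0))
        }
        {
            a = hits($1, $2, $3)    # first end of the pair
            b = hits($4, $5, $6)    # second end of the pair
            if (a && b)      label = "both"
            else if (a || b) label = "xor"
            else             label = "neither"
            print $7, label
        }'
}

# Example: both ends of pairA fall inside chr1:150-550.
printf 'chr1\t100\t200\tchr1\t500\t600\tpairA\n' | classify_bedpe chr1 150 550
```

The "inner span" and "outer span" modes are left out of this sketch.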

    The source code for BEDTools Version 2.0 is posted on sourceforge at:
    https://sourceforge.net/projects/bedtools/

    Examples and high-level descriptions can be found here:
    http://people.virginia.edu/~arq5x/bedtools.html

    The USAGE_EXAMPLES document in the BEDTools package contains more detailed examples of common usage. If you have used Galaxy, many of the concepts should be familiar.

    All the best,
    Aaron
    Last edited by quinlana; 05-12-2009, 06:07 PM. Reason: typos

  • #2
    Hi Aaron,
    I am trying to compare two BED files. I started with the small example below to test the usage of the tool.
    Code:
    track name=pairedReads2 description="Clone Paired Reads2" useScore=1
    chr22   1000    5000    cloneA  960     +       1000    5000    0       2       567,488,        0,3512
    chr22   2000    6000    cloneB  900     -       2000    6000    0       2       433,399,        0,3601
    But I get the error as below:

    Code:
     ./mergeBed -i ../../chr22_data/test2.bed 
    Only one BED field detected: 1.  Verify that your files are TAB-delimited.  Exiting... 
    
    or 
    
     ./mergeBed -i ../../chr22_data/test1.bed 
    Unexpected number of fields: 1.  Verify that your files are TAB-delimited and that your BED file has 3,4,5 or 6 fields.  Exiting...
    How do I proceed further? My BED file has 12 columns because each line contains 2 blocks of sequence. Is it possible to use the tool for this kind of analysis? Please advise. Thanks.


    • #3
      Hi,
      BEDTools only supports tab-delimited BED files with a minimum of 3 (chrom, start and end) fields and a maximum of 6 (optionally adding name, score and strand).

      For example, if you extracted the first 6 columns of your example file, it could be merged as follows:
      Code:
      cut -f 1-6 test.bed | mergeBed -i stdin
      chr22    1000    6000
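
Since both error messages in the question report a single detected field, it can help to count tab-separated fields before running mergeBed. The following is a minimal sketch using awk only; `check_bed` is a hypothetical helper, not a BEDTools command, and the 3 to 6 field limit is taken from the reply above. It assumes that header lines start with "track", "browser" or "#".

```shell
# Hypothetical pre-flight check, not part of BEDTools.
# Usage: check_bed file.bed   (or pipe data on stdin)
# Reports every data line whose tab-separated field count is outside 3..6
# and returns a non-zero status if any such line is found.
check_bed() {
    awk -F'\t' '
        /^(track|browser|#)/ { next }    # skip header and comment lines
        NF == 0 { next }                 # skip empty lines
        NF < 3 || NF > 6 {
            printf "line %d: %d tab-separated field(s)\n", NR, NF
            bad = 1
        }
        END { exit bad }
    ' "$@"
}

# Example: a space-delimited line is seen as a single field.
printf 'chr22 1000 5000 cloneA\n' | check_bed || echo "file needs fixing"
```

A line that is delimited by spaces shows up with a field count of 1, which matches the symptom in the error messages.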
      I also note that you seem to be dealing with paired sequences. BEDTools has a utility (peIntersectBed) that will intersect paired-end features with normal BED files. The file format for paired-end BED entries can be found by using the "-h" option with peIntersectBed.

      Lastly, if you are using exactly version 2.0.0, there is a much newer version available here:
      http://code.google.com/p/bedtools/

      All the best,
      Aaron


      • #4
        I should also note that one can track the names of which entries were merged (separated by a semicolon) by using the "-names" option.

        From your example:

        Code:
        cut -f 1-6 test.bed | mergeBed -i stdin -names
        chr22    1000    6000    cloneA;cloneB
        This is undocumented in the help and I am changing this as we "speak".
        --Aaron


        • #5
          Hi Aaron,

          Thanks for your response. I have downloaded the most recent version and started using it.

          Code:
          ./mergeBed -n -i ../newdata/full.bed > /../newdata/merged.bed
          The above command works.

          When I add the -s option to enforce strand, I don't get any output.

          Code:
          ./mergeBed -n -s -i ../newdata/full.bed > /../newdata/merged.bed
          Without strand, it works fine. Even in the example you gave above, no strand info is printed in the output. Why is that?

          Basically, I am trying to remove duplicate records by merging them into one record.

          Thanks and Regards
          Last edited by seq_GA; 10-28-2009, 03:02 AM.


          • #6
            Hmm, it works as expected for me using Version 2.2.4. test.bed below is the same as your file above.

            __without__ the strand option, mergeBed ignores the fact that the two entries are on different strands and combines them:
            Code:
            cut -f 1-6 test.bed | mergeBed -i stdin -names
            chr22    1000    6000    cloneA;cloneB

            __with__ the strand option, mergeBed observes the fact that the two entries are on different strands and does not combine them:
            Code:
            cut -f 1-6 test.bed | mergeBed -i stdin -s
            chr22    1000    5000    +
            chr22    2000    6000    -
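
For readers without BEDTools at hand, the strand-aware behaviour shown above can be approximated with sort and awk. This is only a sketch of the idea, not mergeBed's actual algorithm; `merge_by_strand` is a hypothetical name. It assumes 6-column, tab-delimited BED input with the strand in column 6, and it also assumes that features that merely touch (start equal to the previous end) should be merged.

```shell
# Hypothetical sketch, not part of BEDTools.
# Usage: merge_by_strand file.bed   (or pipe data on stdin)
# Merges overlapping features only when they share chromosome AND strand.
merge_by_strand() {
    # Group by chromosome and strand, then order by start coordinate.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k6,6 -k2,2n "$@" |
    awk -F'\t' -v OFS='\t' '
        # Open a new interval when the chromosome or strand changes,
        # or when this feature starts beyond the current interval.
        $1 != c || $6 != st || ($2 + 0) > e {
            if (NR > 1) print c, s, e, st
            c = $1; s = $2 + 0; e = $3 + 0; st = $6
            next
        }
        ($3 + 0) > e { e = $3 + 0 }    # extend the current interval
        END { if (NR > 0) print c, s, e, st }
    '
}

# Example with the two clones from this thread (opposite strands, so no merge).
printf 'chr22\t1000\t5000\tcloneA\t960\t+\nchr22\t2000\t6000\tcloneB\t900\t-\n' |
    merge_by_strand
```

With both clones on the same strand, the same input would collapse to a single interval from 1000 to 6000.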


            • #7
              Hi Aaron,

              How would you like me to cite your tools if we use them in a publication?

              Thanks!
              Lizzy


              • #8
                Hi Lizzy,
                We are working on the manuscript, but until then, please cite it as: Aaron R. Quinlan and Ira M. Hall, unpublished (http://code.google.com/p/bedtools/).
                Thanks for asking and good luck with your manuscript.
                Aaron
