  • BEDTools Version 2.0

    Hi all,
    Version 2 of BEDTools has been released. These tools allow one to answer common questions about genomic features in BED format. Version 2 has two major improvements:

    1. Enforcing "strandedness". The previous version of BEDTools reported overlaps between BED features regardless of the strand of the two features. Now, with the "-s" option, all relevant utilities (e.g. intersectBed, mergeBed, windowBed, closestBed) will report overlaps ONLY if the features are on the same strand. By default, strand is ignored.

    2. Intersecting paired-end reads/SV calls to regular BED files. There is now a program called peIntersectBed that compares features (e.g. paired-end reads, SV calls, etc.) to a regular BED file (e.g. RefSeq genes). In order to do such comparisons, I have defined a new BEDPE format that is very similar to traditional BED formats. The new utility allows one to ask for:

    1. All cases where _either_ end of a BEDPE entry overlaps a BED file.
    2. All cases where _both_ ends of a BEDPE entry overlap a BED file.
    3. All cases where _neither_ end of a BEDPE entry overlaps a BED file.
    4. All cases where _one and only one_ (i.e. xor) end of a BEDPE entry overlaps a BED file.
    5. All cases where the "inner span" of a BEDPE entry overlaps a BED file.
    6. All cases where the "outer span" of a BEDPE entry overlaps a BED file.

    peIntersectBed is really useful for screening paired-end sequencing reads against genomic annotations.
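The six query types listed above can be sketched in a few lines of Python. This is only an illustration of the logic, not peIntersectBed's actual implementation. The `Interval` type, the `classify` function and the mode labels (`either`, `both`, `neither`, `xor`, `ispan`, `ospan`) are hypothetical names. Coordinates are assumed to be 0-based and half-open, as in BED, and the inner and outer spans are assumed to mean the gap between the two ends and the full extent covering both ends, respectively.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def hits_any(iv: Interval, features: List[Interval]) -> bool:
    return any(overlaps(iv, f) for f in features)


def classify(end1: Interval, end2: Interval,
             features: List[Interval], mode: str) -> bool:
    """Decide whether a paired entry (end1, end2) is reported for a mode.

    The mode names are illustrative labels, not confirmed tool options.
    """
    hit1 = hits_any(end1, features)
    hit2 = hits_any(end2, features)
    if mode == "either":
        return hit1 or hit2
    if mode == "both":
        return hit1 and hit2
    if mode == "neither":
        return not (hit1 or hit2)
    if mode == "xor":
        return hit1 != hit2
    if mode in ("ispan", "ospan"):
        # Assumption: spans are only defined when both ends share a chromosome.
        if end1.chrom != end2.chrom:
            return False
        if mode == "ispan":
            start = min(end1.end, end2.end)
            end = max(end1.start, end2.start)
        else:
            start = min(end1.start, end2.start)
            end = max(end1.end, end2.end)
        if start >= end:  # the two ends overlap, so there is no inner span
            return False
        return hits_any(Interval(end1.chrom, start, end), features)
    raise ValueError(f"unknown mode: {mode}")
```

For a pair whose first end touches a gene and whose second end lies beyond it, `either` and `xor` report the pair while `both` and `neither` do not.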

    The source code for BEDTools Version 2.0 is posted on SourceForge.


    Examples and high-level descriptions can be found here:


    The USAGE_EXAMPLES document in the BEDTools package contains more detailed examples of common usage. If you have used Galaxy, many of the concepts should be familiar.

    All the best,
    Aaron
    Last edited by quinlana; 05-12-2009, 06:07 PM. Reason: typos

  • #2
    Hi Aaron,
    I am trying to compare two BED files. I started with the small example below to test the usage of the tool.
    Code:
    track name=pairedReads2 description="Clone Paired Reads2" useScore=1
    chr22   1000    5000    cloneA  960     +       1000    5000    0       2       567,488,        0,3512
    chr22   2000    6000    cloneB  900     -       2000    6000    0       2       433,399,        0,3601
    But I get the error as below:

    Code:
     ./mergeBed -i ../../chr22_data/test2.bed 
    Only one BED field detected: 1.  Verify that your files are TAB-delimited.  Exiting... 
    
    or 
    
     ./mergeBed -i ../../chr22_data/test1.bed 
    Unexpected number of fields: 1.  Verify that your files are TAB-delimited and that your BED file has 3,4,5 or 6 fields.  Exiting...
    How do I proceed? I have a BED file with 12 columns because each line in the file contains 2 blocks of sequence. Is it possible to use the tool for this kind of analysis? Please advise. Thanks.



    • #3
      Hi,
      BEDTools only supports tab-delimited BED files with a minimum of 3 (chrom, start and end) fields and a maximum of 6 (optionally adding name, score and strand).

      For example, if you extracted the first 6 columns of your example file, it could be merged as follows:
      Code:
      cut -f 1-6 test.bed | mergeBed -i stdin
      chr22    1000    6000 
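The "Only one BED field detected" error above usually means the columns are not separated by tabs. As a rough sketch, assuming plain-text input, the same check and the same six-column trim can be done in Python. The function name `first_six_fields` is hypothetical, and skipping `track`, `browser` and `#` lines is an assumption of this sketch rather than documented BEDTools behaviour.

```python
def first_six_fields(lines):
    """Yield each BED line cut down to its first six tab-separated fields.

    Raises ValueError when a data line has fewer than three tab-separated
    fields, which typically means spaces were used instead of tabs.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Assumption: header and comment lines are skipped, not validated.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(
                f"line {line_no}: found {len(fields)} tab-separated field(s); "
                "is the file really tab-delimited?"
            )
        yield "\t".join(fields[:6])
```

Running this over a 12-column file gives lines that fit the 3 to 6 field limit described above, and a space-delimited file fails early with the offending line number.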
      I also note that you seem to be dealing with paired sequences. BEDTools has a utility (peIntersectBed) that will intersect paired-end features with normal BED files. The file format for paired-end BED entries can be found by using the "-h" option with peIntersectBed.

      Lastly, if you are using exactly version 2.0.0, there is a much newer version available here:
      http://code.google.com/p/bedtools.

      All the best,
      Aaron



      • #4
        I should also note that one can track the names of the entries that were merged (separated by a semicolon) by using the "-names" option.

        From your example:

        Code:
        cut -f 1-6 test.bed | mergeBed -i stdin -names
        chr22    1000    6000    cloneA;cloneB
        This is undocumented in the help and I am changing this as we "speak".
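For anyone curious about what the name tracking amounts to, here is a minimal Python sketch of merging overlapping entries while joining their names with a semicolon. It is not the mergeBed source. The function `merge_with_names` is a hypothetical name, and merging only on strict overlap (not on intervals that merely touch) is a choice made for this sketch.

```python
def merge_with_names(records):
    """Merge overlapping (chrom, start, end, name) records.

    Names of merged records are joined with ';' in sorted-start order.
    """
    merged = []
    for chrom, start, end, name in sorted(records, key=lambda r: (r[0], r[1])):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            # Overlaps the previous merged interval: extend it, append name.
            _, prev_start, prev_end, prev_names = merged[-1]
            merged[-1] = (chrom, prev_start, max(prev_end, end),
                          prev_names + ";" + name)
        else:
            merged.append((chrom, start, end, name))
    return merged
```

With the two clones from the example file, this gives a single record spanning 1000 to 6000 named `cloneA;cloneB`.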
        --Aaron



        • #5
          Hi Aaron,

          Thanks for your response. I have downloaded the latest version and started using it.

          Code:
          ./mergeBed -n -i ../newdata/full.bed > /../newdata/merged.bed
          The above command works.

          When I add the -s option to enforce strand, I don't get any output.

          Code:
          ./mergeBed -n -s -i ../newdata/full.bed > /../newdata/merged.bed
          Without strand, it works fine. Even in the example you have given above, no strand information is printed in the output. Why is that?

          Basically, I am trying to remove duplicate records and merge them into one record.

          Thanks and Regards
          Last edited by seq_GA; 10-28-2009, 03:02 AM.



          • #6
            Hmm, it works as expected for me using Version 2.2.4. test.bed below is the same as your file above.

            __without__ strand: this ignores the fact that the two entries are on different strands and combines them:
            Code:
            cut -f 1-6 test.bed | mergeBed -i stdin -names
            chr22    1000    6000    cloneA;cloneB

            __with__ strand: this observes the fact that the two entries are on different strands and does not combine them:
            Code:
            cut -f 1-6 test.bed | mergeBed -i stdin -s
            chr22    1000    5000    +
            chr22    2000    6000    -
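The difference between the two runs can be sketched in Python as follows. This is an illustration of the idea only, not how mergeBed is written: `merge_intervals` and its `same_strand` flag are hypothetical names standing in for the behaviour with and without "-s", and only strictly overlapping intervals are merged.

```python
from itertools import groupby


def merge_intervals(records, same_strand=False):
    """Merge overlapping (chrom, start, end, strand) records.

    With same_strand=False, strand is ignored and (chrom, start, end) is
    returned. With same_strand=True, only records on the same strand are
    merged and (chrom, start, end, strand) is returned.
    """
    def group_of(rec):
        return (rec[0], rec[3]) if same_strand else (rec[0],)

    ordered = sorted(records, key=lambda r: (group_of(r), r[1]))
    merged = []
    for key, group in groupby(ordered, key=group_of):
        cur_start = cur_end = None
        for _, start, end, _ in group:
            if cur_end is not None and start < cur_end:
                cur_end = max(cur_end, end)  # extend the open interval
            else:
                if cur_end is not None:
                    merged.append(key[:1] + (cur_start, cur_end) + key[1:])
                cur_start, cur_end = start, end
        merged.append(key[:1] + (cur_start, cur_end) + key[1:])
    return merged
```

Called on the two clones from test.bed, the default call returns one interval from 1000 to 6000, while `same_strand=True` returns the two original intervals, one per strand.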



            • #7
              Hi Aaron,

              How would you like me to cite your tools if we use them in a publication?

              Thanks!
              Lizzy



              • #8
                Hi Lizzy,
                We are working on the manuscript, but until then, please cite it as: (Aaron R. Quinlan and Ira M. Hall, unpublished: http://code.google.com/p/bedtools/).
                Thanks for asking and good luck with your manuscript.
                Aaron
