
  • Making a Constant-Step bedGraph File

    Hello all,

    I have two questions about making a bedGraph from an alignment file (BAM or SAM). I've used "bedtools genomecov" followed by "bedtools map" (with a genome-wide file of 200 bp windows) to make a constant-step bedGraph file. This seems somewhat clumsy and at times produces strange errors that appear to be related to chromosome order: although the variable-step bedGraph and the windows file are sorted in the same chromosome order, chromosomes 10 to 19 seem to get no mapping at all.

    So, first of all, what is the "right" way to get a constant-step bedGraph from a BAM/SAM file?

    And secondly, is there a way to do windowed mapping that takes partial overlaps with bins into account? E.g. if a read lies 72% in bin 1 and 28% in bin 2, bin 1 would get 0.72 added to its read count and bin 2 would get 0.28.

    The best option I've found so far is the -f option in "bedtools map", with which a read only "counts" when more than a certain fraction of it belongs to a window.

    Thank you in advance for any input.

  • #2
    1. It sounds like your files are not sorted in the order "bedtools map" requires (both are in the same, unsupported order). Probably karyotypic (chr1, chr2, chr3, ...) instead of lexical (chr1, chr10, chr11, ...)?
    See http://bedtools.readthedocs.org/en/l...tools/map.html


    2. If your reads are all the same length, you could pull out the read locations with "bedtools bamtobed", sort them, and then use "bedmap" (http://code.google.com/p/bedops/) to report the total number of read bases overlapping each window. Dividing that total by the read length (36 bp in this example) gives a fractional read count per window:
    bedmap --ec --delim '\t' --echo --bases windows.bed read_locations.bed | awk 'BEGIN{OFS="\t"}{print $1,$2,$3,$4/36}'

    The "--ec" flag turns on error checking. It is slower, but it tells you if the sort order is the problem instead of silently omitting chromosomes 10-19.
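The fractional counting asked about in the original post (a read that lies 72% in one bin adds 0.72 to that bin) can be sketched in plain Python. This is only an illustration of the arithmetic, not a description of what bedmap does internally. The function name `fractional_counts`, the 200 bp window size, and the input format (half-open `(start, end)` tuples on a single chromosome, with non-negative coordinates) are all assumptions made for the example.

```python
def fractional_counts(reads, window_size, n_windows):
    """Add each read's overlap fraction to every fixed-size window it touches.

    reads: iterable of half-open (start, end) intervals on one chromosome
           (hypothetical input format chosen for this sketch).
    Returns a list with one fractional read count per window.
    """
    counts = [0.0] * n_windows
    for start, end in reads:
        length = end - start
        if length <= 0:
            continue  # skip empty or malformed intervals
        first = start // window_size
        last = (end - 1) // window_size
        # Clip to the windows that actually exist.
        for w in range(first, min(last, n_windows - 1) + 1):
            w_start = w * window_size
            w_end = w_start + window_size
            overlap = min(end, w_end) - max(start, w_start)
            counts[w] += overlap / length
    return counts


# A 100 bp read at 128-228: 72 bp fall in window 0 and 28 bp in window 1.
counts = fractional_counts([(128, 228)], window_size=200, n_windows=2)
print(counts)
```

Unlike the fixed division by 36 in the awk step, this divides by each read's own length, so it would also cope with reads of mixed lengths.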



    • #3
      Eric, thank you for your answer, it was very helpful.

      Yes, I did suspect that sorting was a problem. However, no error messages were generated, and even with both the .bed and .bedGraph files sorted alphabetically, the mapping produced was plain wrong. Oh well.
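One way to double-check which order a file's chromosome column is really in, sketched in plain Python. The helper names (`chromosome_order`, `is_lexical`, `karyotypic_key`) are hypothetical, and "lexical" is assumed here to mean plain byte-wise string order, which is what Python's `sorted` gives for ASCII names; whether that matches the order a particular tool expects should be confirmed against that tool's documentation.

```python
import re


def karyotypic_key(chrom):
    """Sort key: numbered chromosomes numerically, then the rest (chrX, chrM, ...) by name."""
    match = re.fullmatch(r"chr(\d+)", chrom)
    if match:
        return (0, int(match.group(1)), "")
    return (1, 0, chrom)


def chromosome_order(bed_lines):
    """Return chromosome names as consecutive runs, in file order.

    A chromosome that reappears after another one shows up twice,
    which makes the later order checks fail, as they should.
    """
    runs = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom = line.split(None, 1)[0]
        if not runs or runs[-1] != chrom:
            runs.append(chrom)
    return runs


def is_lexical(chroms):
    """True if the chromosome runs are in plain string order (chr1, chr10, chr11, ..., chr2)."""
    return chroms == sorted(chroms)


# Hypothetical window lines in karyotypic order.
lines = ["chr1\t0\t200", "chr2\t0\t200", "chr10\t0\t200"]
order = chromosome_order(lines)
print(order, is_lexical(order), order == sorted(order, key=karyotypic_key))
```

Running the same check on both the windows file and the bedGraph shows whether they agree with each other and with the lexical order mentioned in #2.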

      "bedmap" approach did work like a charm.
