No announcement yet.

*** Per base coverage ***

  • Filter
  • Time
  • Show
Clear All
new posts

  • *** Per base coverage ***

    This seems like a silly question, but I can't quite find the answer for it anywhere. I have a list of single nucleotide position, ie,
    chr 1 72342
    chr 2 32912
    chr X 184343
    to which I would like to find their coverage, ie, how many reads cover that position and what are those bases.
    This is almost like calling SNP, but not quite because I just like to know the coverage even if no SNP is present. For this reason, I can't just use a SNP caller. Does anyone have a solution for this?
    Thanks so much.

  • #2
    I loaded all my short read data into an SQL database so that this information can be pulled out with a quick query.

    For example, to find how many reads cover position 72342 of chr 1, just type in the following query. This assumes 50bp reads.

    select count(*) from short_reads
    where (chromosome = 'chr1')
    and (start_position >= 72293)
    and (start_position <= 72342)

    To perform this query for many SNPs, a PERL script can be written.


    • #3
      The coverageBed function in bedtools ( should be able to do what you want.


      • #4
        you can use samtools pileup -c from on sorted indexed bam file of your alignment..this pileup consensus will give you read depth in each position and base composition in each also


        • #5
          Thanks, everyone. bedTools seems to work the best and it's fully documented so very easy to use! Much thanks to Aaron (the author) & Pete.


          • #6

            samtools view -b <in.bam> <chr>:<pos>-<pos> | samtools pileup - | grep <pos> | awk '{print $4}'
            Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
            Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
            Projects: U87MG whole genome sequence [Website] [Paper]