Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • StephaniePi83
    Member
    • Sep 2011
    • 52

    Extract reads sequence based on position from bam file

    Dear all,
    I'm trying to recover reads sequences from specific region in bam file.
    Reads were align using bowtie2.

    I have reads align to one specific sequence and I would like to get sequences of reads mapped to a specific region of 3 nucleotides, and count the occurence of each tri-nucleotide at this position.

    How would I achieve this ?

    Thanks for your help !
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    You can extract the reads in the region you are interested in using samtools view (e.g. samtools view -b reads.bam chr1:10420000-10421000 > subset.bam) and then use bedtools/bam-readcount.
    Last edited by GenoMax; 10-10-2016, 04:53 AM.

    Comment

    • StephaniePi83
      Member
      • Sep 2011
      • 52

      #3
      Thanks for your answer.
      Which command from bedtools ?
      What is bamcount ?

      Comment

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Sorry. Should be bam-readcount. Added a link to post above. When you say tri-nucloetide (you want to consider the counts as a trinucleotide or just counts for bases at those three positions). If latter then bam-readcount will work.

        Comment

        • StephaniePi83
          Member
          • Sep 2011
          • 52

          #5
          bamcount is not doing exactly what i'm asking, i don't want readcounts for each base at each position.
          I want the occurence of each tri-nucleotide at this specific site.
          For example :
          ACA = 5 reads
          GCA = 8 reads
          etc ....

          The problem is that there are INDELS in the alignment and I don't know how to deal with it.

          Comment

          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            If the indel spans that position there is nothing you can do. How deep is the pileup (not practical to count in a genome viewer (e.g. IGV)?

            Comment

            • StephaniePi83
              Member
              • Sep 2011
              • 52

              #7
              there are not that much reads mapping to the sequence of interest (maximum is 800 000 reads ).
              You are suggesting to do it with IGV ? How ?

              Comment

              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                This tool by Pierre Lindenbaum may be useful in this case: https://github.com/lindenb/jvarkit/wiki/SAM4WebLogo It should give you aligned fasta files.

                If it was a small number of reads I thought you could count them in IGV but sounds like that may not be the case.

                Comment

                Latest Articles

                Collapse

                ad_right_rmr

                Collapse

                News

                Collapse

                Topics Statistics Last Post
                Started by SEQadmin2, 06-09-2026, 11:58 AM
                0 responses
                22 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-05-2026, 10:09 AM
                0 responses
                28 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-04-2026, 08:59 AM
                0 responses
                39 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-02-2026, 12:03 PM
                0 responses
                61 views
                0 reactions
                Last Post SEQadmin2  
                Working...