A feature request for short read aligners

  • A feature request for short read aligners

    I find that the major short read aligners map reads poorly for highly polymorphic genes such as HLA or Cytochrome P450. This is expected: for alleles that differ substantially from the reference allele, the reads will contain too many mismatches.

    I think we can solve this problem by having the aligner read a dbSNP VCF that contains the possible alleles and their allele frequencies, and then treat bases that differ from the reference as matches instead of mismatches if the corresponding allele exceeds a certain frequency, e.g. 1%. I think this feature could greatly improve the percentage of reads mapped.
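
    To make the proposal concrete, here is a minimal Python sketch of the scoring rule for an ungapped alignment. It is only an illustration of the idea, not how any existing aligner works. The function name `count_mismatches`, the `known` dictionary (a hypothetical in-memory stand-in for a parsed dbSNP VCF) and the `min_af` parameter are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for a parsed dbSNP VCF:
# (reference position, alternate base) -> allele frequency.
KnownVariants = Dict[Tuple[int, str], float]


def count_mismatches(read: str, ref: str, ref_start: int,
                     known: KnownVariants, min_af: float = 0.01) -> int:
    """Count mismatches in an ungapped alignment, forgiving common known alleles."""
    mismatches = 0
    for offset, (read_base, ref_base) in enumerate(zip(read, ref)):
        if read_base == ref_base:
            continue
        # Look up the read base as a known alternate allele at this position.
        af = known.get((ref_start + offset, read_base), 0.0)
        if af >= min_af:
            continue  # common known allele: treat it as a match
        mismatches += 1
    return mismatches


# Toy data (made up for illustration): the read differs from the
# reference at positions 102 and 106.
ref = "ACGTACGT"
read = "ACCTACTT"
known = {(102, "C"): 0.12, (106, "T"): 0.001}

plain = count_mismatches(read, ref, 100, {})      # no variant information
aware = count_mismatches(read, ref, 100, known)   # with allele frequencies
```

    In this toy case the variant at position 102 is above the 1% cutoff and is forgiven, while the rare one at position 106 still counts as a mismatch. A real implementation would also need to handle indels and multi-base alleles, which this sketch ignores.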

    Can any short read aligner authors add this feature? Or is it already available in some aligners?

    Thanks!

  • #2
    It can be done in http://seqanswers.com/wiki/GSNAP and probably some other aligners.

    • #3
      Originally posted by kopi-o
      It can be done in http://seqanswers.com/wiki/GSNAP and probably some other aligners.
      Thanks for pointing it out. I will give it a try.

      Do you know whether GSNAP gives a higher mapping percentage than bwa?

      • #4
        Originally posted by ymc
        I find that the major short read aligners map poorly for highly polymorphic genes like HLA or Cytochrome P450. This is expected because for alleles that differ a lot from the reference allele, there will be too many mismatches.
        You can set the maximum number of mismatches in every aligner to a number of your choice, but keep in mind that raising the acceptable number of mismatches will increase runtime, sometimes dramatically!

        However, I like the idea of adding some database information to the mapping algorithms!
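
        As a rough illustration of the runtime remark above, the sketch below counts how many distinct sequences lie within a given number of substitutions of a read. Real aligners use indexes and pruning and do not enumerate this set, so this is only an upper-bound style intuition for why the search space grows quickly. The function name `hamming_neighborhood_size` is made up for this example.

```python
from math import comb


def hamming_neighborhood_size(read_length: int, max_mismatches: int) -> int:
    """Number of DNA sequences within max_mismatches substitutions of a read.

    For each i, choose i positions out of read_length and substitute each
    with one of the 3 other bases.
    """
    return sum(comb(read_length, i) * 3 ** i
               for i in range(max_mismatches + 1))


# Growth of the candidate space for a 100 bp read as the mismatch
# limit is raised from 0 to 5.
sizes = [hamming_neighborhood_size(100, k) for k in range(6)]
```

        Each extra allowed mismatch multiplies the candidate space by a large factor, which is one reason simply raising the mismatch limit gets expensive.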
