  • Shishir
    Member
    • Nov 2012
    • 22

    Clustering sequences based on sequence similarity

    I have 8000 protein sequences that I want to cluster based on similarity (not identity), selecting the longest sequence from each cluster as its representative. I checked several tools, such as HiFix, SiliX, and ClusTR, but could not find a suitable solution. I want to cluster the way CD-Hit does to reduce the dataset, but based on sequence similarity rather than sequence identity.
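    The selection step described above (keeping the longest sequence of each cluster) can be sketched independently of whichever tool produces the clusters. This is only a minimal illustration: the function name `longest_representatives`, the `sequences` dictionary, and the `cluster_of` mapping are hypothetical names, and the cluster assignments are assumed to come from some external similarity-based clustering run.

```python
from collections import defaultdict


def longest_representatives(sequences, cluster_of):
    """Pick the longest sequence in each cluster.

    sequences  -- dict mapping sequence ID to protein sequence (hypothetical input)
    cluster_of -- dict mapping sequence ID to cluster ID, assumed to come
                  from an external clustering tool
    Returns a dict mapping cluster ID to the ID of its longest member.
    """
    members = defaultdict(list)
    for seq_id, cluster_id in cluster_of.items():
        members[cluster_id].append(seq_id)

    # Length is the primary key; the sequence ID is a secondary key so that
    # ties between equally long sequences are resolved deterministically.
    return {
        cluster_id: max(ids, key=lambda s: (len(sequences[s]), s))
        for cluster_id, ids in members.items()
    }


if __name__ == "__main__":
    # Toy data, not real proteins.
    sequences = {"p1": "MKT", "p2": "MKTAYIAK", "p3": "MAA", "p4": "MA"}
    cluster_of = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}
    print(longest_representatives(sequences, cluster_of))
```

    With real data, `sequences` would be read from the FASTA file and `cluster_of` parsed from the clustering tool's output; the parsing depends on the tool and is left out here.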
  • rhinoceros
    Senior Member
    • Apr 2013
    • 372

    #2
    Perhaps USEARCH or, if you want something much more complicated, OrthoMCL.
    Last edited by rhinoceros; 07-09-2013, 02:58 AM.
    savetherhino.org


    • JackieBadger
      Senior Member
      • Mar 2009
      • 385

      #3
      So you want to group by functional similarity?
      You can do this based on the physicochemical properties of the amino acids. You translate each amino acid into 5 different metrics and then use Discriminant Analysis of Principal Components to cluster based on these properties. Described here: http://www.biomedcentral.com/1471-2148/12/68

      I can provide you with file formats/tips if need be.
      Last edited by JackieBadger; 07-09-2013, 03:55 AM.


      • Shishir
        Member
        • Nov 2012
        • 22

        #4
        Thanks, it seems interesting and I plan to use it later, as I am currently not focused on functional similarity!
        Originally posted by JackieBadger
        So you want to group by functional similarity?
        You can do this based on the physicochemical properties of the amino acids. You translate each amino acid into 5 different metrics and then use Discriminant Analysis of Principal Components to cluster based on these properties. Described here: http://www.biomedcentral.com/1471-2148/12/68
        I can provide you with file formats/tips if need be.


        • Shishir
          Member
          • Nov 2012
          • 22

          #5
          Thanks for the reply!

          Originally posted by rhinoceros
          Perhaps USEARCH or, if you want something much more complicated, OrthoMCL.
