
  • Smith-Waterman surprise

    Hello all, this is my first post on the seqanswers.com forum.

    I am studying various alignment cases of the Smith-Waterman algorithm.

    I tried aligning the following two sequences:
    >ref1
    CAGGCT СТAAAАААСААААААССAAG GACTCG
    >r1
    CAGGCT СТАААААCAAAAACCAAG GACTCG

    Surprisingly, http://www.ebi.ac.uk/Tools/psa/embos...ucleotide.html returns
    an alignment in which the original sequences have been altered:

    Code:
    ref1               1 CAGGCTAAA-----AAGGACTCG     18
                         ||||||.||     |||||||||
    read1              1 CAGGCTCAAAAACCAAGGACTCG     23

    The most surprising thing is that the flanks are intact, but part of the middle of each sequence has disappeared.

    The result from http://insilico.ehu.es/align/ differs, but it is also shorter than the source sequences:
    Code:
    CAGGCTAAAAAGG-ACTC-G----  24
    |||||| ||||   |         
    CAGGCTCAAAAACCAAG-GACTCG  24
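
    One thing worth ruling out when bases vanish from the aligned output is that the input contains characters that only look like A, C, G or T, such as Cyrillic look-alikes or stray spaces. This is an assumption, not something confirmed in this thread, and how a given web tool treats such characters may vary. A minimal Python sketch (the function name `non_acgt_positions` is hypothetical) that lists every character outside the plain ACGT alphabet:

```python
def non_acgt_positions(seq, alphabet="ACGT"):
    """Return (index, character, code point) for each character not in `alphabet`."""
    return [
        (i, ch, f"U+{ord(ch):04X}")
        for i, ch in enumerate(seq)
        if ch not in alphabet
    ]


# Hypothetical test string: positions 2 and 3 are Cyrillic look-alikes of C and T.
suspect = "CA\u0421\u0422GA"
for index, char, code_point in non_acgt_positions(suspect):
    print(index, char, code_point)
```

    An empty result means the sequence contains only the four expected Latin letters; anything else is worth retyping before aligning.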

  • #2
    Weird, you might just use R:

    Code:
    library(Biostrings)
    read <- DNAString("CAGGCTCTAAAAACAAAAACCAAGGACTCG")
    ref <- DNAString("CAGGCTCTAAAAAACAAAAAACCAAGGACTCG")
    pairwiseAlignment(ref, read, type="local") #Default parameters
    pairwiseAlignment(ref, read, type="local", gapOpening=-1) #Change gap penalties
    The first one will yield
    Code:
    CAGGCTCTAAAAAACAAAAAACCAAGGACTCG
    CAGGCTCTAAAAAC--AAAAACCAAGGACTCG
    and the second one
    Code:
    CAGGCTCTAAAAAACAAAAAACCAAGGACTCG
    CAGGCTCTAAAAA-CAAAAA-CCAAGGACTCG
    since I changed the default gap penalties. You can also play with the match and mismatch penalties more easily than with the cumbersome (and slow) web interfaces. Remember that this is a local alignment, so you aren't guaranteed to get the full sequences in the output (try increasing the gapExtension penalty to -5).
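
    To see why a local alignment need not span the full sequences, here is a rough, dependency-free Python sketch of Smith-Waterman. It is not what Biostrings or the web tools run: it assumes a simple linear gap penalty rather than separate gap opening and extension penalties, and the scores (match=2, mismatch=-1, gap=-2) are illustrative values chosen for this example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty (all scores are illustrative)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets an alignment restart anywhere: this is what makes it local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Only the shared core is reported; the dissimilar flanks are left out.
print(smith_waterman("AAAACGTAAAA", "CCCACGTCCC"))  # → (8, 'ACGT', 'ACGT')
```

    Because every cell is floored at 0, the traceback stops as soon as extending the alignment would no longer pay off, so flanking regions that do not match are simply not reported.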

    • #3
      Wow! This case and this information about R are great! Thank you both very much.
      To do is to be (Nietzsche) - To be is to do (Kant) - Do be do be do (Frank Sinatra)
