
  • indels using single end short reads!

    We had a sample with known 4bp deletion, but no tool would help me detect that...

    any suggestions?

    SSAHA supposedly does gapped alignment, but it gave me some 'novel' 1 or 2 base indels... not the one we know
    --
    bioinfosm

  • #2
    Originally posted by bioinfosm
    We had a sample with known 4bp deletion, but no tool would help me detect that...

    any suggestions?

    SSAHA supposedly does gapped alignment, but it gave me some 'novel' 1 or 2 base indels... not the one we know
    SOAP may do it. It seems that when you compile it, you specify the largest gap size that you can then ask for on the command line.

    "3) Maximum gap size
    -DMAXGAP=3
    Maximum size of a gap allowed in a read, then "-g" option during running should not exceed this definition."

    On the home page, they show 3 as an example, but 4 might work. I don't know how much it will slow down SOAP to allow it to try large gaps.

    I know it finds plenty of 2 bp insertions when I use -g 2.


    • #3
      Indels of 4 bases are on the border of what I would consider "sane" when aligning/assembling short sequences. E.g., a 36mer aligned against the same sequence with a 4-base deletion gives you a score ratio (= score/expected_score) of barely above 70%.
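
To make the score-ratio arithmetic concrete, here is a minimal sketch. The scoring scheme (a flat per-base match score and a flat per-base gap penalty) and the name `score_ratio` are assumptions for illustration only; the post does not say which scheme produced the 70% figure, and this is not MIRA's actual scoring.

```python
def score_ratio(read_len, gap_len, match=1.0, gap_penalty=2.0):
    """Score ratio (score / expected_score) for a read with one gap.

    Assumed (hypothetical) scheme: every read base scores `match`,
    and each gapped base costs `gap_penalty`. The expected score is
    that of a perfect, ungapped alignment of the whole read.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    expected = read_len * match
    score = expected - gap_len * gap_penalty
    return score / expected


if __name__ == "__main__":
    # Show how quickly the ratio drops for a 36mer as the gap grows,
    # under a few illustrative gap penalties.
    for penalty in (1.0, 2.0, 3.0):
        ratios = [round(score_ratio(36, g, gap_penalty=penalty), 3)
                  for g in range(0, 5)]
        print(f"gap_penalty={penalty}: {ratios}")
```

Whatever the real penalties are, the point carries over: on a 36-base read, each extra gapped base removes a large fraction of the achievable score, so a 4-base indel sits close to typical acceptance thresholds.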

      I normally allow only 1 or 2 errors in Solexa mapping assemblies, but I quickly hacked together a change that will allow you to find indels or base changes with up to 4 bases in a Solexa mapping assembly. Grab http://www.chevreux.org/tmp/mira_2.9...x86_64.tar.bz2
      and run the Solexa demo. Have a look at the results in gap4 and decide for yourself whether this would fit your needs.

      Warning: Work in progress. Works for me, but not necessarily for you

      Regards,
      B.


      • #4
        myrialign

        Maybe MyriAlign would be of use to you?
        Savannah is a central point for development, distribution and maintenance of free software, both GNU and non-GNU.


        • #5
          SOAP worked nicely on the data... Thanks to the person who shared his script to use soap results and generate indel calls

          I was able to see the 4bp known deletion in the sample

          Torst - are you the author of Myrialign? I will check it out as well
          --
          bioinfosm


          • #6
            Depending on your coverage, you can try assembling the reads, then simply blasting the contigs against the genome. I know of a few groups trying to do this, but I haven't heard of any success, so if you try it I'm curious how far you get.
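
Once a contig (or read) has a gapped alignment against the reference, pulling out the indels is mostly bookkeeping. A minimal sketch, assuming the alignment is available as two equal-length strings with `-` marking gaps; the function name `find_indels` and the input format are hypothetical and not tied to the output of any particular aligner:

```python
def find_indels(ref_aln, query_aln):
    """List indels in a pairwise alignment given as two gapped strings.

    Returns tuples (kind, ref_pos, length), where ref_pos is the 0-based
    index of the first deleted reference base (deletion) or of the
    reference base that follows the inserted bases (insertion).
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned strings must have equal length")

    indels = []
    ref_pos = 0  # index of the next ungapped reference base
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], query_aln[i]
        if q == "-" and r != "-":
            # Bases present in the reference but missing from the query.
            start, length = ref_pos, 0
            while i < n and query_aln[i] == "-" and ref_aln[i] != "-":
                length += 1
                ref_pos += 1
                i += 1
            indels.append(("deletion", start, length))
        elif r == "-" and q != "-":
            # Bases present in the query but missing from the reference.
            length = 0
            while i < n and ref_aln[i] == "-" and query_aln[i] != "-":
                length += 1
                i += 1
            indels.append(("insertion", ref_pos, length))
        else:
            if r != "-":
                ref_pos += 1
            i += 1
    return indels


if __name__ == "__main__":
    # Toy example: the query lacks 4 reference bases.
    print(find_indels("ACGTACGTTTGACCA", "ACGTA----TGACCA"))
```

With real data you would also want to require several independent reads or contigs supporting the same indel before calling it.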

            -mark


            • #7
              Aligning with Indels

              I've just finished a new aligner that will do indels up to 7bp. I don't have a web site for downloading it, but if you'd like to try it, email novoalign @ gmail.com and I'll send you a copy. It's also at least as fast as the best of the other aligners.


              • #8
                Originally posted by bioinfosm
                SOAP worked nicely on the data... Thanks to the person who shared his script to use soap results and generate indel calls

                I was able to see the 4bp known deletion in the sample
                Would said person be willing to share the scripts for using soap results? thanks in advance.


                • #9
                  Novoalign and novopaired will do gapped alignments and are a fair bit faster than SOAP.
                  I've just released V1.03. This update improves quality scores for novopaired and also fixes an illegal instruction fault reported by one user.
                  You can download it at www.novocraft.com
                  I've also changed the license terms so it's free for any non-profit, even if you don't publish in open journals.
                  Colin


                  • #10
                    Originally posted by ECO
                    Would said person be willing to share the scripts for using soap results? thanks in advance.
                    Sorry but I never noticed your message in the new posts!

                    Sure, I would be happy to share. I ran SOAP, and then used a Perl parsing script to get the results.

                    soap -a input -d reference -o prefix -s 10 -g 4

                    The parser is modified from Liu's script (BGI). You may PM me and I will mail it to you, but I would not want to put it up here.
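
For anyone scripting this, a minimal sketch of a wrapper that builds the same command line and checks the requested gap against the compile-time limit mentioned earlier in the thread (`-DMAXGAP`). The function name `build_soap_command` and the `compiled_max_gap` argument are hypothetical; only the flags already shown in the command above are used, and the `-s` value is passed through unchanged without assuming its meaning.

```python
def build_soap_command(reads, reference, out_prefix,
                       s_value=10, gap=4, compiled_max_gap=None):
    """Build the argument list for the soap call shown in the thread.

    compiled_max_gap is whatever value was given to -DMAXGAP when the
    binary was compiled (hypothetical bookkeeping on the caller's side);
    the SOAP notes quoted above say -g should not exceed it.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    if compiled_max_gap is not None and gap > compiled_max_gap:
        raise ValueError(
            f"-g {gap} exceeds compiled MAXGAP {compiled_max_gap}; "
            "recompile with a larger -DMAXGAP"
        )
    return [
        "soap",
        "-a", reads,        # input reads, as in the post
        "-d", reference,    # reference sequence, as in the post
        "-o", out_prefix,   # output, as in the post
        "-s", str(s_value),
        "-g", str(gap),     # largest gap size allowed in a read
    ]


if __name__ == "__main__":
    cmd = build_soap_command("input", "reference", "prefix",
                             compiled_max_gap=4)
    print(" ".join(cmd))
    # To actually run it (requires soap on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```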

                    sm
                    --
                    bioinfosm


                    • #11
                      Originally posted by bioinfosm
                      We had a sample with known 4bp deletion, but no tool would help me detect that...

                      any suggestions?

                      SSAHA supposedly does gapped alignment, but it gave me some 'novel' 1 or 2 base indels... not the one we know
                      Hi!

                      Glad to read that you managed the task. Is it from a mammalian genome? If so, would you be willing to share your data set with us (of course an NDA can be arranged)?
                      We would love to test our mapping on that challenge.

                      Klaus
