  • Too many mismatches?

    Hello guys,

    I've just been hit with my first SOLiD data...
    Reading the posts here, I already feel better seeing that other people are struggling as well.

    I'm trying to map the reads (75bp) to prokaryotic reference genomes and detect SNPs. Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment. I'm getting on average about 10 mismatches per read. Some have as low as 2 mismatches, but others have above 20. Because I was using ECC chemistry I did not think this would turn out so bad...

    My question is this: does it even make sense to try and detect SNPs if I have that many mismatches in my reads? Should I rather focus on getting the alignment to work in color-space?

    thanks for your help

  • #2
    I would focus on getting the color-space aligners to work. I would just use LifeScope because it knows what to do with the ECC data, and I think it has a few scripts for converting reference genomes to color space.

    If you just convert the reads from color space to base space, the ECC is useless.



    • #3
      Originally posted by BambooGarden:
      Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment.
      A very hesitant +1 for lifescope, because they know the most about colour-space.

      Are you aware that Bowtie (v1) can do colour-space alignment and has very similar input/output parameters to Bowtie2?

      How are you converting to base-space? If you're doing a naive conversion in the absence of a reference sequence (e.g. G1122330 = GTGAGCGG, regardless of error), then you're going to end up with plenty of rubbish sequence every time there's a colour error. At the risk of repeating myself too much, colour-space is not an intuitive way of representing sequence, and you'll save yourself a lot of pain and time by shifting to a different sequencing platform.
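To make the "naive conversion" concrete, here is a minimal Python sketch of reference-free decoding. It assumes the standard SOLiD two-base encoding (each colour is the XOR of the 2-bit codes of adjacent bases, with A=0, C=1, G=2, T=3) and a read made of a primer base followed by colours 0-3. The function name `decode_naive` is made up for illustration, and missing colour calls (`.`) are not handled.

```python
BASES = "ACGT"  # index = 2-bit code: A=0, C=1, G=2, T=3


def decode_naive(cs_read):
    """Translate a colour-space read (primer base + colours) to bases.

    No reference and no error correction: each base is derived from the
    previous base and the current colour, so one wrong colour changes
    every base after it.
    """
    prev = BASES.index(cs_read[0])
    out = [cs_read[0]]
    for colour in cs_read[1:]:
        prev ^= int(colour)  # next base code = previous code XOR colour
        out.append(BASES[prev])
    return "".join(out)


clean = decode_naive("G1122330")   # the example from the post
broken = decode_naive("G1132330")  # same read with one colour miscalled

print(clean)   # GTGAGCGG
print(broken)  # GTGCTATT
```

A single colour error (the third colour, 2 read as 3) changes all five bases downstream of it, which is one plausible reason a naively converted read shows many mismatches against the reference.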



      • #4
        I am using NGS plumbing to convert to base-space. I guess this is what you call a naive conversion because I'm not putting any reference sequence in at that point. What would be a software to convert with taking a reference sequence into account?

        Yeah, I agree. Definitely next time another platform. But for now I'll have to make do with this data somehow.

        Thanks for the help.



        • #5
          Originally posted by BambooGarden:
          What would be a software to convert with taking a reference sequence into account?
          Bowtie can do this, you just have to map the reads to your reference first (which is a bit of a chicken/egg thing). The base-space sequence reported by bowtie is corrected to match the reference sequence (but including any discovered SNPs).
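Why a reference helps can be sketched in a few lines of Python. In two-base encoding, a true SNP changes two adjacent colours in a consistent way, while a sequencing error typically changes a single colour. The code below is only an illustration of that property under the standard encoding (A=0, C=1, G=2, T=3); it is not Bowtie's actual algorithm, it assumes the read is already aligned without gaps, and the names `encode` and `classify_mismatches` are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(bases):
    """Colour-space encoding: one colour per pair of adjacent bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(bases, bases[1:])]


def classify_mismatches(ref, read_colours):
    """Split colour mismatches into SNP-like pairs and isolated errors.

    Returns (snp_like, error_like): snp_like holds reference base
    indices of putative SNPs, error_like holds colour indices of
    mismatches that no single-base change can explain.
    """
    ref_colours = encode(ref)
    n = min(len(ref_colours), len(read_colours))
    snp_like, error_like = [], []
    i = 0
    while i < n:
        if ref_colours[i] == read_colours[i]:
            i += 1
            continue
        # A single-base substitution changes colours i and i+1 but
        # leaves their XOR unchanged.
        pair_ok = (
            i + 1 < n
            and ref_colours[i + 1] != read_colours[i + 1]
            and ref_colours[i] ^ ref_colours[i + 1]
            == read_colours[i] ^ read_colours[i + 1]
        )
        if pair_ok:
            snp_like.append(i + 1)  # the base shared by colours i and i+1
            i += 2
        else:
            error_like.append(i)
            i += 1
    return snp_like, error_like


ref = "GTGAGCGG"
with_snp = encode("GTGCGCGG")        # A -> C at base index 3
with_error = [1, 1, 3, 2, 3, 3, 0]   # one miscalled colour

print(classify_mismatches(ref, with_snp))    # ([3], [])
print(classify_mismatches(ref, with_error))  # ([], [2])
```

An aligner working in colour space can use this distinction to correct isolated colour errors while keeping consistent two-colour changes as candidate SNPs, which is not possible once the reads have been naively converted to bases.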



          • #6
            Older versions of BWA worked with SOLiD data.
            Colorspace was disabled in 0.6.1; I don't know if it was re-enabled.
            As I remember, it required using the solid2fastq.pl program.
            The "bioscope" aligner was too aggressive in aligning reads; BWA did a better job of dropping and clipping reads that had mis-transitions in the middle of the reads.

            The newer "lifescope" (?) software may have improved the situation.

            I'd recommend getting an old copy of BWA and using it.



            • #7
              We did some testing comparing bioscope, lifescope, BWA, Bowtie and SHRiMP for color-space alignments and found that SHRiMP2 worked the best. All of these tests were done before my arrival, so I don't have all of the details, but generally SHRiMP2 seems to work well and maps in color space.
