  • Too many mismatches?

    Hello guys,

    I've just been hit with my first SOLiD data...
    Reading the posts here, I already feel better to see that other people are struggling as well

    I'm trying to map the reads (75bp) to prokaryotic reference genomes and detect SNPs. Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment. I'm getting on average about 10 mismatches per read. Some have as low as 2 mismatches, but others have above 20. Because I was using ECC chemistry I did not think this would turn out so bad...

    My question is this: does it even make sense to try and detect SNPs if I have that many mismatches in my reads? Should I rather focus on getting the alignment to work in color-space?

    thanks for your help

  • #2
    I would focus on getting the color aligners to work. I would just use lifescope because it knows what to do with the ECC data and I think it has a few scripts for converting reference genomes to color space.

    If you just convert the reads from color space to base space, the ECC is useless.
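
For readers wondering what converting a reference genome to color space amounts to, here is a minimal Python sketch. It is not one of the lifescope scripts; the function name `to_colours` is made up for illustration, and it assumes the sequence contains only A, C, G and T. It relies on the standard SOLiD two-base encoding, where each color is determined by a pair of adjacent bases.

```python
# Assumption: plain A/C/G/T input only (no N or IUPAC codes).
# With these 2-bit codes, the SOLiD colour of two adjacent bases
# is the XOR of their codes (e.g. A->C = 1, A->G = 2, A->T = 3).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colours(seq):
    """Return the colour string for a base-space sequence.

    A sequence of n bases yields n - 1 colours, one per adjacent pair.
    """
    codes = [BASE_CODE[base] for base in seq.upper()]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


print(to_colours("GTGAGCGG"))
```

A real conversion script would also have to deal with FASTA headers, line wrapping and ambiguous bases, which this sketch ignores.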

    • #3
      Originally posted by BambooGarden
      Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment.
      A very hesitant +1 for lifescope, because they know the most about colour-space.

      Are you aware that Bowtie (v1) can do colour-space alignment and has very similar input/output parameters to Bowtie2?

      How are you converting to base-space? If you're doing a naive conversion in the absence of a reference sequence (e.g. G1122330 = GTGAGCGG, regardless of error), then you're going to end up with plenty of rubbish sequence every time there's a colour error. At the risk of repeating myself too much, colour-space is not an intuitive way of representing sequence, and you'll save yourself a lot of pain and time by shifting to a different sequencing platform.
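
To make the point about naive conversion concrete, here is a small Python sketch. The function name `naive_decode` is hypothetical and this is not the code of any particular converter. It decodes a read from the primer base onwards and shows how a single colour error changes every base downstream of it. It assumes the read has no missing colour calls (no `.` characters).

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def naive_decode(cs_read):
    """Decode primer base + colours into bases, with no reference and no error handling."""
    prev = BASE_CODE[cs_read[0]]
    bases = [cs_read[0]]
    for colour in cs_read[1:]:
        # Each colour tells us how to get from the previous base to the next one.
        prev ^= int(colour)
        bases.append(CODE_BASE[prev])
    return "".join(bases)


clean = naive_decode("G1122330")      # the example read from the post
one_error = naive_decode("G1022330")  # same read, second colour misread as 0
mismatches = sum(a != b for a, b in zip(clean, one_error))
print(clean, one_error, mismatches)
```

Because every base depends on all the colours before it, one wrong colour early in a 75bp read is enough to produce dozens of apparent base mismatches.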

      • #4
        I am using NGS plumbing to convert to base-space. I guess this is what you call a naive conversion because I'm not putting any reference sequence in at that point. What would be a software to convert with taking a reference sequence into account?

        Yeah, I agree. Definitely next time another platform. But for now I'll have to make do with this data somehow.

        Thanks for the help.

        • #5
          Originally posted by BambooGarden
          What would be a software to convert with taking a reference sequence into account?
          Bowtie can do this, you just have to map the reads to your reference first (which is a bit of a chicken/egg thing). The base-space sequence reported by bowtie is corrected to match the reference sequence (but including any discovered SNPs).
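
As a rough illustration of why a reference helps (this is not Bowtie's actual decoder, and the function name and labels are invented for the example): once a read is aligned in colour space, an isolated colour mismatch is most likely a sequencing error, whereas a real single-base SNP changes two adjacent colours in a consistent way. With the usual 2-bit coding, "consistent" means the XOR of the two colours is the same in the read and in the reference.

```python
def classify_colour_mismatches(read_colours, ref_colours):
    """Map mismatching colour positions to 'candidate SNP' or 'likely colour error'.

    Both arguments are strings of colours (digits 0-3) that are already aligned.
    """
    read = [int(c) for c in read_colours]
    ref = [int(c) for c in ref_colours]
    n = min(len(read), len(ref))
    labels = {}
    i = 0
    while i < n:
        if read[i] == ref[i]:
            i += 1
            continue
        # A single-base change alters two adjacent colours but keeps their XOR.
        consistent_pair = (
            i + 1 < n
            and read[i + 1] != ref[i + 1]
            and (read[i] ^ read[i + 1]) == (ref[i] ^ ref[i + 1])
        )
        if consistent_pair:
            labels[i] = labels[i + 1] = "candidate SNP"
            i += 2
        else:
            labels[i] = "likely colour error"
            i += 1
    return labels


# Reference GTGAGCGG has colours 1122330; GTGCGCGG (A->C) has colours 1133330.
print(classify_colour_mismatches("1133330", "1122330"))
print(classify_colour_mismatches("1022330", "1122330"))
```

A real aligner also weighs quality values and handles indels and adjacent SNPs, none of which is attempted here.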

          • #6
            Older versions of BWA worked with "SOLiD".
            Colorspace was disabled in 0.6.1, I don't know if it was re-enabled.
            As I remember, it required using the solid2fastq.pl program.
            The "bioscope" aligner was too aggressive in aligning reads; BWA did a better job of dropping and clipping reads that had mis-transitions in the middle of the reads.

            The newer "lifescope" (?) software may have improved the situation.

            I'd recommend getting an old copy of BWA and using it.

            • #7
              We did some testing comparing bioscope, lifescope, BWA, Bowtie and SHRiMP for color-space alignments and found that SHRiMP2 worked the best. All of these tests were done before my arrival, so I don't have all of the details, but SHRiMP2 generally seems to work well and maps in color space.
