  • Too many mismatches?

    Hello guys,

    I've just been hit with my first SOLiD data...
    Reading the posts here, I already feel better seeing that other people are struggling as well.

    I'm trying to map the reads (75bp) to prokaryotic reference genomes and detect SNPs. Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment. I'm getting on average about 10 mismatches per read. Some have as low as 2 mismatches, but others have above 20. Because I was using ECC chemistry I did not think this would turn out so bad...

    My question is this: does it even make sense to try and detect SNPs if I have that many mismatches in my reads? Should I rather focus on getting the alignment to work in color-space?

    thanks for your help

  • #2
    I would focus on getting the color-space aligners to work. I would just use LifeScope, because it knows what to do with the ECC data, and I think it has a few scripts for converting reference genomes to color-space.

    If you just convert the reads from color-space to base-space, the ECC is useless.



    • #3
      Originally posted by BambooGarden
      Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment.
      A very hesitant +1 for lifescope, because they know the most about colour-space.

      Are you aware that Bowtie (v1) can do colour-space alignment and has very similar input/output parameters to Bowtie2?

      How are you converting to base-space? If you're doing a naive conversion in the absence of a reference sequence (e.g. G1122330 = GTGAGCGG, regardless of error), then you're going to end up with plenty of rubbish sequence every time there's a colour error. At the risk of repeating myself too much, colour-space is not an intuitive way of representing sequence, and you'll save yourself a lot of pain and time by shifting to a different sequencing platform.
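To make the "naive conversion" point concrete, here is a minimal Python sketch of reference-free decoding. It assumes the standard SOLiD di-base code (colour = XOR of the 2-bit codes of adjacent bases); the function name `naive_decode` is made up for illustration and is not taken from any particular tool. It shows how one colour error corrupts every base downstream of it.

```python
# Standard di-base colour code, assumed here: with A=0, C=1, G=2, T=3,
# the colour between two adjacent bases is the XOR of their codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def naive_decode(csread: str) -> str:
    """Decode primer base + colours into bases, with no reference and no error handling."""
    bases = [csread[0]]  # the leading primer base anchors the decoding
    for colour in csread[1:]:
        prev = BASE_TO_BITS[bases[-1]]
        bases.append(BITS_TO_BASE[prev ^ int(colour)])
    return "".join(bases)


good = naive_decode("G1122330")  # the example read from the post above
bad = naive_decode("G1132330")   # same read with a single error in the third colour

# Count base-space mismatches caused by that single colour error.
mismatches = sum(a != b for a, b in zip(good, bad))
print(good, bad, mismatches)
```

In this example a single wrong colour turns into five base mismatches, because every base after the error is decoded relative to a wrong predecessor. That would be consistent with seeing 10 to 20 mismatches per 75 bp read after a reference-free conversion.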



      • #4
        I am using NGS plumbing to convert to base-space. I guess this is what you call a naive conversion because I'm not putting any reference sequence in at that point. What would be a software to convert with taking a reference sequence into account?

        Yeah, I agree. Definitely next time another platform. But for now I'll have to make do with this data somehow.

        Thanks for the help.



        • #5
          Originally posted by BambooGarden
          What would be a software to convert with taking a reference sequence into account?
          Bowtie can do this; you just have to map the reads to your reference first (which is a bit of a chicken-and-egg thing). The base-space sequence reported by Bowtie is corrected to match the reference sequence, while still including any discovered SNPs.
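As a rough illustration of why a colour-space aligner can tell SNPs from sequencing errors, the Python sketch below encodes a reference into colours and lists the colour positions that change when one base is substituted. It assumes the standard di-base code; `encode_colours` and `colour_diffs` are hypothetical helper names, and this is not how Bowtie itself is implemented.

```python
# Standard di-base colour code, assumed here: A=0, C=1, G=2, T=3,
# and the colour between two adjacent bases is the XOR of their codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_colours(seq: str) -> str:
    """Encode a base-space sequence as colours (one colour per adjacent base pair)."""
    return "".join(
        str(BASE_TO_BITS[a] ^ BASE_TO_BITS[b]) for a, b in zip(seq, seq[1:])
    )


def colour_diffs(ref: str, sample: str) -> list[int]:
    """Return the colour indices at which two equal-length sequences differ."""
    ref_cs, sample_cs = encode_colours(ref), encode_colours(sample)
    return [i for i, (x, y) in enumerate(zip(ref_cs, sample_cs)) if x != y]


ref = "GTGAGCGG"
snp = "GTGATCGG"  # single substitution G->T at base index 4

print(encode_colours(ref))     # colours of the reference
print(encode_colours(snp))     # colours of the sequence carrying the SNP
print(colour_diffs(ref, snp))  # adjacent colour positions affected by the SNP
```

An isolated substitution changes the two colours on either side of the affected base, whereas a sequencing error typically changes only one colour. That difference is the kind of signal an aligner working in colour-space can exploit before reporting corrected base-space sequence.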



          • #6
            Older versions of BWA worked with SOLiD data.
            Colorspace was disabled in 0.6.1; I don't know if it was re-enabled.
            As I remember, it required using the solid2fastq.pl program.
            The "bioscope" aligner was too aggressive in aligning reads; BWA did a better job of dropping and clipping reads that had mis-transitions in the middle of the reads.

            The newer "lifescope" (?) software may have improved the situation.

            I'd recommend getting an old copy of BWA and using it.



            • #7
              We did some testing comparing BioScope, LifeScope, BWA, Bowtie and SHRiMP for color-space alignments and found that SHRiMP2 worked the best. All of these tests were done before my arrival, so I don't have all of the details, but generally SHRiMP2 seems to work well and maps in color-space.
