  • shuang
    Senior Member
    • Jul 2011
    • 100

    Protocol for SNP from Sanger sequences

    My project is to find SNPs in Sanger sequences. I've never done this before. Here are the steps I could think of to achieve this. Please suggest appropriate free tools/software for each step, and let me know if I have missed any steps.

    Step 1: quality trim an ABI file (output FASTA).

    Step 2: align to a reference genome (input FASTA, output SAM?)

    Step 3: parse the output file to retrieve SNPs?
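For Step 1, a minimal sketch of quality trimming, assuming the bases and their Phred scores have already been read from the trace file. It keeps the best-scoring contiguous segment, in the spirit of the modified Mott trimming approach. The function name `trim_by_quality` and the 0.05 error cutoff are illustrative assumptions, not part of any tool named in this thread.

```python
def trim_by_quality(seq, quals, cutoff=0.05):
    """Keep the contiguous segment that maximises sum(cutoff - error_prob).

    seq    -- base calls as a string
    quals  -- Phred scores, one int per base
    cutoff -- error probability a base must beat to add to the score
              (0.05 is an assumed default, chosen for illustration)
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")

    best_score = 0.0
    best_start = best_end = 0   # empty segment if nothing scores above zero
    run_score = 0.0
    run_start = 0

    for i, q in enumerate(quals):
        error_prob = 10 ** (-q / 10)       # Phred score -> error probability
        run_score += cutoff - error_prob
        if run_score <= 0:
            # The running segment is no longer worth keeping; restart after i.
            run_score = 0.0
            run_start = i + 1
        elif run_score > best_score:
            best_score = run_score
            best_start, best_end = run_start, i + 1

    return seq[best_start:best_end], quals[best_start:best_end]


trimmed_seq, trimmed_quals = trim_by_quality(
    "NNACGTACGTNN", [5, 5, 30, 30, 30, 30, 30, 30, 30, 30, 5, 5]
)
print(trimmed_seq)
```

Both low-quality ends are dropped in one pass, which suits Sanger reads where quality is typically poor at the start and the end of the trace.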
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    Why use FASTA? I'd use FASTQ so that the quality scores of the reads can be taken into account in the mapping and SNP calling. EMBOSS seqret can do this conversion (taking the sequence and quality scores as is from the ABI file), but I'd suggest using a base caller like TraceTuner, which seems to do a better job than the ABI default pipeline. I wrote a patch to get FASTQ from TraceTuner directly; I'm not sure if it has been integrated yet.

    For the mapping step, do you have DNA or RNA reads? And if RNA does your organism do gene splicing? If so, you'll want an intron/exon aware read mapper.
    Last edited by maubp; 07-28-2011, 12:13 PM. Reason: Added more details


    • DZhang
      Senior Member
      • Jun 2010
      • 177

      #3
      Mutation Surveyor or DNAStar should work for you, but both are commercial software.


      • gavin.oliver
        Senior Member
        • Jan 2010
        • 110

        #4
        I have the precise same problem (DNA-based) - can anyone recommend how to achieve this with open source tools?


        • DZhang
          Senior Member
          • Jun 2010
          • 177

          #5
          Hi gavin.oliver, there are other ways to do this, but it depends on your project scope. Can you share how many Sanger reads you have and how big the reference sequence is?


          • gavin.oliver
            Senior Member
            • Jan 2010
            • 110

            #6
            It will only be a single human gene.


            • DZhang
              Senior Member
              • Jun 2010
              • 177

              #7
              I assume you are doing exon sequencing via PCR. Any multiple alignment program (e.g., ClustalW) or BLAST should do if you do not have too many traces and are not expecting too many types of SNPs. More powerful programs, like MIRA or PolyPhred, may be too much for a one-time small project but can handle SNP detection extremely well.
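For a small project, once a read has been aligned to the reference (for example, by a pairwise or multiple alignment program), candidate substitutions can be read straight off the alignment. The sketch below assumes two gapped, equal-length strings using `-` for gaps. The function name `find_snps` is hypothetical, and the code ignores indel columns and `N` calls. It does not interpret the IUPAC ambiguity codes that heterozygous positions may produce.

```python
def find_snps(ref_aln, read_aln, gap="-"):
    """Return (1-based reference position, ref base, read base) per mismatch.

    ref_aln, read_aln -- gapped sequences of equal length from one alignment
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned sequences must have the same length")

    snps = []
    ref_pos = 0  # position in the ungapped reference
    for ref_base, read_base in zip(ref_aln.upper(), read_aln.upper()):
        if ref_base != gap:
            ref_pos += 1
        if ref_base == gap or read_base == gap:
            continue  # insertion or deletion column, not a substitution
        if "N" in (ref_base, read_base):
            continue  # uncalled base, no evidence either way
        if ref_base != read_base:
            snps.append((ref_pos, ref_base, read_base))
    return snps


print(find_snps("ACGT-ACGT", "ACCTTAC-T"))
```

Positions are reported in reference coordinates, so results from several reads or samples can be compared directly.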


              • gavin.oliver
                Senior Member
                • Jan 2010
                • 110

                #8
                Originally posted by DZhang
                I assume you are doing exon sequencing via PCR. Any multiple alignment program (e.g., ClustalW) or BLAST should do if you do not have too many traces and are not expecting too many types of SNPs. More powerful programs, like MIRA or PolyPhred, may be too much for a one-time small project but can handle SNP detection extremely well.
                The plan was actually to use Sanger sequencing to cover an entire 25 kb gene (40 samples). There doesn't seem to be a workable NGS solution on offer.
                Last edited by gavin.oliver; 08-11-2011, 05:35 AM.


                • DZhang
                  Senior Member
                  • Jun 2010
                  • 177

                  #9
                  You may look into BWA or Bowtie to align Sanger reads; both can handle long reads. Then proceed to SNP and small indel calls (e.g., with samtools). I would trim the Sanger reads at both ends first.

                  Before NGS emerged a few years ago, Sanger was the most popular way of obtaining sequence, so there are many tools to perform alignments and SNP calls. In your case, I believe Consed (free for academic use) or MIRA should work.


                  • gavin.oliver
                    Senior Member
                    • Jan 2010
                    • 110

                    #10
                    Originally posted by DZhang
                    You may look into BWA or Bowtie to align Sanger reads; both can handle long reads. Then proceed to SNP and small indel calls (e.g., with samtools). I would trim the Sanger reads at both ends first.

                    Before NGS emerged a few years ago, Sanger was the most popular way of obtaining sequence, so there are many tools to perform alignments and SNP calls. In your case, I believe Consed (free for academic use) or MIRA should work.
                    Thanks a lot. Do you think the Sanger reads could be converted to FASTQ to work with BWA etc?


                    • DZhang
                      Senior Member
                      • Jun 2010
                      • 177

                      #11
                      I am not aware of any program that can convert the quality scores in a Sanger trace to FASTQ. But you may simply convert the Sanger reads from FASTA to FASTQ. One tool I know is from the MAQ package; it is a simple Perl script.
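A FASTA-to-FASTQ conversion without real quality data amounts to attaching a placeholder score to every base. The sketch below is a standalone illustration of that idea, not the MAQ script itself. The function name `fasta_to_fastq_placeholder` and the default score of 20 are assumptions. The placeholder scores carry no information about the actual trace, so a mapper or SNP caller will treat every base as equally reliable.

```python
def fasta_to_fastq_placeholder(fasta_text, placeholder_q=20):
    """Convert FASTA text to FASTQ text with one fixed Phred score per base.

    placeholder_q -- assumed constant quality, encoded as Sanger FASTQ
                     (ASCII offset 33)
    """
    qual_char = chr(placeholder_q + 33)

    records = []
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)   # sequence may be wrapped over several lines
    if name is not None:
        records.append((name, "".join(chunks)))

    return "".join(
        f"@{rec_name}\n{seq}\n+\n{qual_char * len(seq)}\n"
        for rec_name, seq in records
    )


print(fasta_to_fastq_placeholder(">read1\nACGT\nAC\n"), end="")
```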


                      • gavin.oliver
                        Senior Member
                        • Jan 2010
                        • 110

                        #12
                        Originally posted by DZhang
                        I am not aware of any program that can convert the quality scores in a Sanger trace to FASTQ. But you may simply convert the Sanger reads from FASTA to FASTQ. One tool I know is from the MAQ package; it is a simple Perl script.
                        I'll give it a look - thanks again!


                        • maubp
                          Peter (Biopython etc)
                          • Jul 2009
                          • 1544

                          #13
                          Originally posted by gavin.oliver
                          Thanks a lot. Do you think the Sanger reads could be converted to FASTQ to work with BWA etc?
                          Yes, if you have ABI files for your "Sanger" capillary sequence reads you can convert them to FASTQ.

                          You can use EMBOSS seqret as described here. Note that if you have the new EMBOSS 6.4.0 release, you'll want the patch for this bug:



                          You will also be able to use the next release of Biopython to convert ABI to FASTQ (and other formats).


                          You can also use a base caller like TraceTuner which should give slightly better base predictions than the default ABI pipeline. I wrote a patch to offer FASTQ output from TraceTuner, but by default you can get FASTA + QUAL, and convert that to FASTQ.
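The FASTA + QUAL to FASTQ step can be sketched in plain Python as below, assuming the two files share record names and the QUAL file holds whitespace-separated integer Phred scores. The function names are hypothetical, and scores are written in the Sanger FASTQ encoding (ASCII offset 33). Biopython also provides a paired FASTA/QUAL reader in `Bio.SeqIO.QualityIO` for those who prefer a library.

```python
def parse_fasta_like(text):
    """Map each record name to its list of payload lines (FASTA or QUAL)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # identifier is the first word
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return records


def fasta_qual_to_fastq(fasta_text, qual_text):
    """Merge FASTA and QUAL text into Sanger-encoded FASTQ text."""
    seqs = {n: "".join(lines) for n, lines in parse_fasta_like(fasta_text).items()}
    quals = {
        n: [int(x) for x in " ".join(lines).split()]
        for n, lines in parse_fasta_like(qual_text).items()
    }

    out = []
    for name, seq in seqs.items():
        scores = quals.get(name)
        if scores is None or len(scores) != len(seq):
            raise ValueError(f"missing or mismatched qualities for {name}")
        # Cap at 93 so the encoded character stays within printable ASCII.
        encoded = "".join(chr(min(q, 93) + 33) for q in scores)
        out.append(f"@{name}\n{seq}\n+\n{encoded}\n")
    return "".join(out)


print(fasta_qual_to_fastq(">read1 demo\nACGT\n", ">read1\n10 20\n30 40\n"), end="")
```

Unlike a placeholder-quality conversion, this keeps the per-base scores from the base caller, which is what lets the mapper and SNP caller weight each base.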


                          • gavin.oliver
                            Senior Member
                            • Jan 2010
                            • 110

                            #14
                            Originally posted by maubp
                            Yes, if you have ABI files for your "Sanger" capillary sequence reads you can convert them to FASTQ.

                            You can use EMBOSS seqret as described here. Note that if you have the new EMBOSS 6.4.0 release, you'll want the patch for this bug:



                            You will also be able to use the next release of Biopython to convert ABI to FASTQ (and other formats).


                            You can also use a base caller like TraceTuner which should give slightly better base predictions than the default ABI pipeline. I wrote a patch to offer FASTQ output from TraceTuner, but by default you can get FASTA + QUAL, and convert that to FASTQ.
                            Brilliant stuff - thanks a lot
