  • How to use SOAP?

    I can't understand how to use it!

    I don't know:
    in "soap -a", is -a a database FASTA or your query FASTA?

    Who can help me?

    I'm only an undergraduate student!

    -a <str> your query file,
    -d <str> reference sequences file, is a database file


      But when I fetch a sequence from my database to use as the query file and run
      soap -a query file -d mydatabase file -o ......
      I get no result!

      So I swapped the database and the query file, like this:
      soap -a mydatabase file -d query file -o......
      and got 6 sequences in the result!

      I can't understand!


        Your query should be one or more short reads. The database should contain one or more very large sequences, e.g. contigs, chromosomes, etc.

        So:

        soap -a <short-reads> -d <contigs/chromosome>

        For -a, sequences in FASTA or FASTQ format are accepted. The database should be in FASTA only.
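
        If it is unclear which file belongs where, a quick length check can help. The following is only a rough sketch: the helper names (fasta_lengths, roles_look_swapped, soap_command) are made up for illustration, it reads FASTA only (not FASTQ), and it uses nothing beyond the -a, -d and -o options mentioned in this thread.

```python
def fasta_lengths(path):
    """Return the length of every sequence in a FASTA file."""
    lengths, current = [], None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif line and current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def roles_look_swapped(reads_path, reference_path):
    """True if the longest 'read' is longer than the longest reference sequence."""
    longest_read = max(fasta_lengths(reads_path), default=0)
    longest_ref = max(fasta_lengths(reference_path), default=0)
    return longest_read > longest_ref


def soap_command(reads_path, reference_path, output_path):
    """Build the argument list: short reads go to -a, the large reference to -d."""
    return ["soap", "-a", reads_path, "-d", reference_path, "-o", output_path]
```

        If roles_look_swapped returns True, the two files have probably been given the wrong way round.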


          We commonly use ELAND to align Illumina data. For some projects, though, the read length ELAND supports is too short (32 bp maximum). I need to align 40 bp fragments, so I tried SOAP. The problem is that I couldn't get SOAP to take the quality data (*prb files) into account. It only works with the default '40'. I found nothing about this in the official SOAP documentation or here, apart from a mention of FASTQ. So my questions are:
          1. Is it possible to directly attach a set of *.prb files to the SOAP alignment process?
          2. If not, how to convert *.prb and *.seq files to fastq? Are some tools available?

          Thanks in advance!
          Slava, MPIMG Berlin.


            2. If not, how to convert *.prb and *.seq files to fastq? Are some tools available?
            Yup, that's what you need to do. Maq may come with a script that'll do the job. Otherwise, it's pretty straightforward to write one yourself.

            I don't think that the alignment part actually pays attention to the quality scores, though. I think they only come into play in the output columns where the SNPs are listed: there, the alternate letter is given with its quality score.

            So if you aren't looking for SNPs, it might not matter much.
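
            For anyone writing the converter themselves, here is a minimal sketch. It rests on several assumptions that should be checked against your own files: each *_seq.txt line holds lane, tile, x, y and the read (with "." for an uncalled base); each *_prb.txt line holds four Solexa-scaled scores (A, C, G, T) per cycle; the score of the called base is the highest of the four; and the output should use Phred scores with an ASCII offset of 33. The function names are hypothetical, and which quality encoding SOAP expects is not something this thread establishes.

```python
import math


def solexa_to_phred(q):
    """Convert a Solexa-scaled score to the nearest integer Phred score."""
    return int(round(10 * math.log10(10 ** (q / 10.0) + 1)))


def prb_line_to_quals(line):
    """Assumption: 4 scores per cycle; keep the highest one for each cycle."""
    values = [int(v) for v in line.split()]
    if len(values) % 4:
        raise ValueError("prb line does not hold a multiple of 4 values")
    return [max(values[i:i + 4]) for i in range(0, len(values), 4)]


def seq_prb_to_fastq(seq_path, prb_path, out_path, offset=33):
    """Pair each seq line with its prb line and write one FASTQ record."""
    with open(seq_path) as seqs, open(prb_path) as prbs, \
            open(out_path, "w") as out:
        for seq_line, prb_line in zip(seqs, prbs):
            fields = seq_line.split()
            if not fields:
                continue
            lane, tile, x, y, bases = fields[:5]
            bases = bases.replace(".", "N")  # uncalled bases become N
            quals = prb_line_to_quals(prb_line)
            if len(quals) != len(bases):
                raise ValueError("read and quality lengths differ")
            qual_str = "".join(chr(solexa_to_phred(q) + offset) for q in quals)
            out.write("@%s_%s_%s_%s\n%s\n+\n%s\n"
                      % (lane, tile, x, y, bases, qual_str))
```

            Pass offset=64 instead if the downstream tool expects that ASCII offset.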
            


              Maybe you can try ZOOM. It handles *_seq.txt and *_prb.txt automatically.

                Does anybody ever use SOAP in color space?

                - First alignment: a bank against itself
                /opt/soap_1.11/soap -a test.fa -d test.reference.fa -s 8 -o soap.out
                -> OK. All sequences matched.

                - Second alignment: a bank against itself, in color space
                /opt/soap_1.11/soap -a test.csfasta -d test.reference.csfasta -s 8 -o soap.out
                -> 0 alignments!

                Any ideas?


                  Originally posted by nservant
                  Does anybody ever use SOAP in color space?
                  How did you convert your data to color space? I usually use the script provided with MAQ to generate pseudo-base reads. Besides, you need a special reference. It is meaningless to have SOAP try to align against both strands of a typical reference (because of color space!). Just obtain the reference sequence in reverse order and then concatenate the "+" and "-" versions into one FASTA file. When starting SOAP, specify the -n 1 option to prevent the program from looking for matches on the "complementary" strand.
                  Cheers, Slava
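
                  A minimal sketch of that reference preparation, assuming the reference is already in whatever encoding the reads use and that plain reversal (no complementing) is what is wanted, as described above. The function names and the "_fwd"/"_rev" suffixes are made up for illustration.

```python
def read_fasta(path):
    """Return a list of (header, sequence) pairs from a FASTA file."""
    records, name, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            elif line and name is not None:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_both_orientations(in_path, out_path, width=60):
    """Write every sequence as given, followed by the same sequence reversed."""
    with open(out_path, "w") as out:
        for name, seq in read_fasta(in_path):
            parts = name.split()
            ident = parts[0] if parts else "seq"  # drop any description text
            for label, s in ((ident + "_fwd", seq), (ident + "_rev", seq[::-1])):
                out.write(">%s\n" % label)
                for i in range(0, len(s), width):
                    out.write(s[i:i + width] + "\n")
```

                  The resulting file would then be passed to -d, together with the -n 1 option mentioned above.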


                    I made a mistake! I don't remember why, but I was convinced SOAP could work in color space.
                    It does not seem to be able to.

                    Just to answer your question, Slava: I use the program from ABI to convert my sequences from base space to color space.


