
  • Make my own blast DB

    Hi everyone

    I have a question about how to make my own BLAST database from the results of my Ref-seq work, before the annotation step. I mean, not like this:

    >CL1Contig1
    ACGGGGGAGGCACCATTATTTGGGCTGCAGACAACAAACTGAAATTCTGGCGGCCCGA

    I want it like this, with the annotation of the sequence:
    >CL1Contig1 Nascent polypeptide associated complex alpha
    ACGGGGGAGGCACCATTATTTGGGCTGCAGACAACAAACTGAAATTCTGGCGGCCCGA

    Is there a BioPerl script that could help with this, or any other solution?

    Note: I know that I have to use formatdb or makeblastdb to make the DB.

    Thank you all

  • #2
    Any suggestions?

    Please help me with that

    • #3
      It sounds like you are asking for help in generating a FASTA file with a useful description line (which you can then turn into a BLAST database). How are you making the FASTA file at the moment?

      • #4
        using perl

        I'm using a Perl script, but I can't get the original query (cDNA); instead I'm getting the protein query (blastx). I don't want the protein sequence.

        Code:
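        #!/usr/bin/perl
        use strict;
        use warnings;
        use Bio::SearchIO;

        # Open the BLAST report; use -format => 'blastxml' if the file is BLAST XML output.
        my $report_obj = Bio::SearchIO->new(-format => 'blast',
                                            -file   => 'C:\blast-2.2.25+\Lib3_consensus_dbUp.xml');
        while ( my $result = $report_obj->next_result ) {
            while ( my $hit = $result->next_hit ) {
                while ( my $hsp = $hit->next_hsp ) {
                    # Report only HSPs below the e-value cutoff.
                    if ( $hsp->evalue < 0.0001 ) {
                        # Prints the aligned query segment only, not the full query sequence.
                        print $result->query_name(), "\t", $hit->description(), "\n",
                              $hsp->seq_str('query'), "\n";
                    }
                }
            }
        }
        How can I put the symbol ">" before the query name?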

        • #5
          Has anyone tried to make a DB with the description + sequence?

          • #6
            Originally posted by detq182
            Has anyone tried to make a DB with the description + sequence?
            Of course. Just not in the way you are doing it. It is the weekend. The question you are posing is simple, yet so specific to how you are approaching it that I do not think anyone wanted to take the time over the weekend to try solving your problem. Especially when you post something like:

            How can I put the symbol ">" before the query name?
            Ah. Did you even try a
            Code:
            print '>'
            ???? People generally help others who show some initiative in solving their own problems.

            • #7
              I'm sorry if I didn't show any initiative in solving my problem. I'm in the middle of finals at college, and I started working through "Unix and Perl Primer for Biologists" just a few days ago. I'm new to this, with only two months of doing bioinformatics. If the question is stupid, I'm really sorry; I'm just starting.

              I hope that we are OK.

              • #8
                Hi detq182,
                Why not delimit your fasta header with a special character such as colon or vertical bar?
                For example:
                >header | supplemental-info

                This way, you can embed many annotations adjacent to your fasta header.
                If I'm not mistaken, EBI-GOA follows the above convention.
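
A minimal sketch of the delimiter idea in core Perl, in case it helps. The helper names `format_header` and `parse_header` are made up for this illustration and are not part of BioPerl; the ID and annotation are the ones from the first post. Keeping a space before the bar means the ID stays the first whitespace-delimited token on the header line.

```perl
use strict;
use warnings;

# Build a FASTA header line from an ID and any number of annotation fields,
# joined with " | " (hypothetical helper, not part of BioPerl).
sub format_header {
    my ($id, @annotations) = @_;
    return '>' . join(' | ', $id, @annotations);
}

# Split such a header back into its ID and annotation fields.
sub parse_header {
    my ($header) = @_;
    $header =~ s/^>//;                   # drop the leading ">"
    return split /\s*\|\s*/, $header;    # split on "|" and trim the spaces around it
}

my $header = format_header('CL1Contig1', 'Nascent polypeptide associated complex alpha');
print "$header\n";

my ($id, @fields) = parse_header($header);
print "id=$id, annotations=@fields\n";
```

Because the fields are split on the bar, an annotation that itself contains "|" would need a different delimiter.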

                • #9
                  thanks

                  I'm going to try that.

                  • #10
                    Originally posted by detq182
                    I'm using a Perl script, but I can't get the original query (cDNA); instead I'm getting the protein query (blastx). I don't want the protein sequence.
                    How can I put the symbol ">" before the query name?
                    This is a great start, but you will need to add a couple of steps if you are trying to add annotations to your original FASTA file of sequences. What I mean is that the HSP string for the query (or the hit) is not the entire sequence, just the part involved in the match. If you are only interested in the matched part, then just add

                    Code:
                    ">".
                    to the beginning of your print string (following the word "print" of course). Spaces outside of the quotes don't matter, but spaces inside the quotes are important. One more thing is that you will want to delimit your header with something other than a tab, as was previously suggested. That is as easy as replacing the "\t" in the print string with "|".
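
If the goal is full-length annotated sequences, one possible way to fill in the extra steps is to collect the descriptions first and then rewrite the headers of the original FASTA file. The sketch below uses core Perl only; `annotate_fasta` and `$desc_for` are hypothetical names, and it assumes the lookup was filled beforehand, for instance inside the Bio::SearchIO loop from `$result->query_name` and `$hit->description`.

```perl
use strict;
use warnings;

# Rewrite FASTA headers using a lookup of sequence ID -> description.
# $fasta_text is the content of the original FASTA file; $desc_for is a
# hash reference (hypothetical name) mapping query names to hit descriptions.
sub annotate_fasta {
    my ($fasta_text, $desc_for) = @_;
    my @out;
    for my $line (split /\r?\n/, $fasta_text) {
        if ($line =~ /^>(\S+)/ && exists $desc_for->{$1}) {
            # Keep the ID and append the description after a "|" delimiter.
            push @out, ">$1 | $desc_for->{$1}";
        }
        else {
            # Sequence lines and headers without a description pass through unchanged.
            push @out, $line;
        }
    }
    return join("\n", @out) . "\n";
}

my $fasta = ">CL1Contig1\nACGGGGGAGGCACCATTATTTGGGCTGCAGACAACAAACTGAAATTCTGGCGGCCCGA\n";
my %desc  = (CL1Contig1 => 'Nascent polypeptide associated complex alpha');
print annotate_fasta($fasta, \%desc);
```

The output keeps the complete sequences from the original file, so it can be written to a new FASTA file and passed to makeblastdb (for nucleotide contigs, with `-in` and `-dbtype nucl`).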
