  • detq182
    Member
    • Feb 2012
    • 20

    Make my own blast DB

    Hi everyone

    I have a question about how to make my own BLAST DB using the results of my Ref-seq work before the annotation step. I mean, not like this:

    >CL1Contig1
    ACGGGGGAGGCACCATTATTTGGGCTGCAGACAACAAACTGAAATTCTGGCGGCCCGA

    I want it like this, with the annotation of the sequence:
    >CL1Contig1 Nascent polypeptide associated complex alpha
    ACGGGGGAGGCACCATTATTTGGGCTGCAGACAACAAACTGAAATTCTGGCGGCCCGA

    Is there a BioPerl script that could help with this, or any different solution?

    Note: I know that I have to use formatdb or makeblastdb to make the DB.

    Thank you all
  • detq182
    Member
    • Feb 2012
    • 20

    #2
    Any suggestions?

    Please help me with this


    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #3
      It sounds like you are asking for help in generating a FASTA file with a useful description line (which you can then turn into a BLAST database). How are you making the FASTA file at the moment?


      • detq182
        Member
        • Feb 2012
        • 20

        #4
        Using Perl

        I'm using a Perl script, but I can't get the original query (cDNA); instead I'm getting the protein query (blastx). I don't want the protein sequence.

        Code:
        #!/usr/bin/perl
        use Bio::SearchIO;
        $report_obj = new Bio::SearchIO(-format => 'blast',
                                        -file   => 'C:\blast-2.2.25+\Lib3_consensus_dbUp.xml');
        while( $result = $report_obj->next_result ) {
            while( $hit = $result->next_hit ) {
                while( $hsp = $hit->next_hsp ) {
                    if ( $hsp->evalue < 0.0001 ) {
                        print $result->query_name(),"\t",$hit->description(),"\n",$hsp->seq_str('query'),
                              "\n";
                    }
                }
            }
        }
        How can I put the symbol ">" before the query name?


        • detq182
          Member
          • Feb 2012
          • 20

          #5
          Has anyone tried to make a DB with the description + sequence?


          • westerman
            Rick Westerman
            • Jun 2008
            • 1104

            #6
            Originally posted by detq182
            Has anyone tried to make a DB with the description + sequence?
            Of course. Just not in the way you are doing it. It is the weekend. The question you are posing is both simple yet so specific to how you are approaching it that I do not think that anyone wanted to take the time over the weekend to try solving your problem. Especially when you post something like:

            How can I put the symbol ">" before the query name?
            Ah. Did you even try a
            Code:
            print '>'
            ???? People generally help others who show some initiative in solving their own problems.


            • detq182
              Member
              • Feb 2012
              • 20

              #7
              Originally posted by westerman
              Of course. Just not in the way you are doing it. It is the weekend. The question you are posing is both simple yet so specific to how you are approaching it that I do not think that anyone wanted to take the time over the weekend to try solving your problem. Especially when you post something like:



              Ah. Did you even try a
              Code:
              print '>'
              ???? People generally help others who show some initiative in solving their own problems.
              I'm sorry if I didn't show initiative in solving my problem. I'm in finals at college, and I started working through "Unix and Perl Primer for Biologists" just a few days ago. I'm new to this, with only two months of doing bioinformatics. If the question is stupid, I'm really sorry; I'm just starting out.

              Hope that we are OK.


              • phoss
                Member
                • Aug 2011
                • 12

                #8
                Hi detq182,
                Why not delimit your FASTA header with a special character such as a colon or vertical bar?
                For example:
                >header | supplemental-info

                This way, you can embed many annotations next to the ID in your FASTA header.
                If I'm not mistaken, EBI-GOA follows the above convention.
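A minimal sketch of taking such a header apart again, in plain Perl. The sub name `split_header` is made up for illustration, and it assumes the vertical bar never appears inside the ID or inside an annotation:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split a delimited FASTA header
# into its ID and any annotation fields that follow it.
sub split_header {
    my ($header) = @_;
    $header =~ s/^>\s*//;    # drop the leading ">" marker
    $header =~ s/\s+$//;     # drop trailing whitespace or a newline
    # Split on "|" and swallow the spaces around each bar
    return split /\s*\|\s*/, $header;
}

my ($id, @annotations) = split_header('>header | supplemental-info');
print "ID: $id\n";
print "Annotation: $_\n" for @annotations;
```

Keeping a space before the first bar means that tools which read the ID as everything up to the first whitespace should still see just `header`.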


                • detq182
                  Member
                  • Feb 2012
                  • 20

                  #9
                  Thanks

                  I'm going to try that


                  • SES
                    Senior Member
                    • Mar 2010
                    • 275

                    #10
                    Originally posted by detq182
                    I'm using a Perl script, but I can't get the original query (cDNA); instead I'm getting the protein query (blastx). I don't want the protein sequence.

                    Code:
                    #!/usr/bin/perl
                    use Bio::SearchIO;
                    $report_obj = new Bio::SearchIO(-format => 'blast',
                                                    -file   => 'C:\blast-2.2.25+\Lib3_consensus_dbUp.xml');
                    while( $result = $report_obj->next_result ) {
                        while( $hit = $result->next_hit ) {
                            while( $hsp = $hit->next_hsp ) {
                                if ( $hsp->evalue < 0.0001 ) {
                                    print $result->query_name(),"\t",$hit->description(),"\n",$hsp->seq_str('query'),
                                          "\n";
                                }
                            }
                        }
                    }
                    How can I put the symbol ">" before the query name?
                    This is a great start, but you will need to add a couple of steps if you are trying to add annotations to your original FASTA file of sequences. What I mean is that the HSP string for the query (or the hit) is not the entire sequence, just the part involved in the match. If you are only interested in the matching part, then just add

                    Code:
                    ">".
                    to the beginning of your print string (following the word "print" of course). Spaces outside of the quotes don't matter, but spaces inside the quotes are important. One more thing is that you will want to delimit your header with something other than a tab, as was previously suggested. That is as easy as replacing the "\t" in the print string with "|".
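If you want the whole contig rather than just the aligned part, one option is to collect the descriptions first and then rewrite the headers of the original FASTA file. Below is a minimal sketch in plain Perl, not a tested pipeline. The sub name `annotate_fasta` and the `%desc_for` hash are made up for illustration; in practice you would fill the hash from `$result->query_name()` and `$hit->description()` in the loop above, and read the FASTA text from your contig file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): add a description to each
# FASTA header while leaving the full sequence lines untouched.
#   $fasta_text - contents of the original contig FASTA file
#   $desc_for   - hash ref mapping contig ID => description
sub annotate_fasta {
    my ($fasta_text, $desc_for) = @_;
    my @out;
    for my $line (split /\n/, $fasta_text) {
        # The ID is the first word after ">" on a header line
        if ($line =~ /^>(\S+)/) {
            my $id = $1;
            # Headers without an annotation are kept exactly as they were
            $line = ">$id $desc_for->{$id}" if defined $desc_for->{$id};
        }
        push @out, $line;
    }
    return join("\n", @out) . "\n";
}

# Example data taken from the first post; normally built from the BLAST report
my %desc_for = ('CL1Contig1' => 'Nascent polypeptide associated complex alpha');
my $fasta = ">CL1Contig1\n"
          . "ACGGGGGAGGCACCATTATTTGGGCTGCAGACAACAAACTGAAATTCTGGCGGCCCGA\n";

print annotate_fasta($fasta, \%desc_for);
```

The printed text can be saved to a file and then passed to formatdb or makeblastdb as usual.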

