
  • How to get FASTA sequences from GI number

    Hey guys, I need help.

    I need to download a large number of FASTA sequences for a set of GI numbers.

    Is there any script to do this?

    I know I could do it with http://www.ncbi.nlm.nih.gov/sites/batchentrez , but I have too many sequences (it says to split the list if there are too many) and I really don't want to do it via a browser.


    Thank you

  • #2
    Did you mean to cross post this in Biostars? Maybe remove the question from this list or from that list.



    • #3
      Originally posted by bt27uk
      Did you mean to cross post this in Biostars? Maybe remove the question from this list or from that list.
      Since I've been stuck on this problem for a couple of days, I tried asking in different forums, with different people, in order to get an answer as soon as possible.

      I can't see the problem.

      If it is against any SEQanswers rules, I'll delete it.



      • #4
        cross posted on biostars: https://www.biostars.org/p/112410/



        • #5
          One way to do this would be to use the blastdbcmd command, which is part of the BLAST+ suite. You will need access to the nt BLAST database indexes (or to download them).

          You can put a list of your GI numbers in a file like so (one per line):

          Code:
          $ more gi_list.txt 
          4
          7
          78
          324
          
          $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
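
A GI list exported from another tool may contain blank lines, stray whitespace, Windows line endings, or duplicate IDs. The following is a minimal sketch for tidying such a list before handing it to -entry_batch. The helper name clean_gi_list and the file gi_list.clean.txt are hypothetical, not part of BLAST.

```shell
# clean_gi_list (hypothetical helper): read a GI list on stdin and print
# one numeric ID per line. It strips spaces, tabs, and carriage returns,
# drops blank or non-numeric lines, and removes duplicates while keeping
# the first-seen order.
clean_gi_list() {
    tr -d ' \t\r' | grep -E '^[0-9]+$' | awk '!seen[$0]++'
}

# Example usage (file names are placeholders):
#   clean_gi_list < gi_list.txt > gi_list.clean.txt
#   blastdbcmd -entry_batch gi_list.clean.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
```

Cleaning the list first is optional; it is mainly useful when the IDs were copied out of a spreadsheet or a Windows text file.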



          • #6
            Originally posted by GenoMax
            One way to do this would be to use the blastdbcmd command, which is part of the BLAST+ suite. You will need access to the nt BLAST database indexes (or to download them).

            You can put a list of your GI numbers in a file like so (one per line):

            Code:
            $ more gi_list.txt 
            4
            7
            78
            324
            
            $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
            Thank you very much.

            It is exactly what I was looking for!!!



            • #7
              Originally posted by fefe89
              Since I've been stuck on this problem for a couple of days, I tried asking in different forums, with different people, in order to get an answer as soon as possible.

              I can't see the problem.

              If it is against any SEQanswers rules, I'll delete it.
              As has been said before, cross-posting creates more work for the folks who are answering the questions.

              It is OK to cross-post, but please close out your post on all forums (cross-referencing the solution once you find one that you like).



              • #8
                Originally posted by GenoMax
                As has been said before, cross-posting creates more work for the folks who are answering the questions.

                It is OK to cross-post, but please close out your post on all forums (cross-referencing the solution once you find one that you like).
                OK. The other post has already been closed.



                • #9
                  Originally posted by fefe89
                  I need to download a large number of FASTA sequences for a set of GI numbers.

                  Is there any script to do this?
                  The recommended way to do this is with Eutils, a web service offered by NCBI.

                  Several threads about using Eutils already exist, both in this forum and on Biostars.



                  • #10
                    Originally posted by GenoMax
                    One way to do this would be to use the blastdbcmd command, which is part of the BLAST+ suite. You will need access to the nt BLAST database indexes (or to download them).

                    You can put a list of your GI numbers in a file like so (one per line):

                    Code:
                    $ more gi_list.txt 
                    4
                    7
                    78
                    324
                    
                    $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
                    Hi!

                    I'm a beginner in bioinformatics (and new to the forum) and I have the same problem as fefe89. Your answer (quoted above) seems entirely appropriate for my problem, but I have a very naive question (it seems simple, but I can't find an adequate answer on Google): how can I use the blastdbcmd command line if I don't want to download the (heavy) nt database to my own computer? Or am I forced to download nt locally before running the command?

                    Thank you in advance for your understanding (certainly a newbie question...)



                    • #11
                      Originally posted by ericaf
                      Hi!
                      how can I use the blastdbcmd command line if I don't want to download the (heavy) nt database to my own computer? Or am I forced to download nt locally before running the command?

                      Thank you in advance for your understanding (certainly a newbie question...)
                      If you don't want to download the BLAST database locally, take a look at the NCBI E-utilities option (referred to in one of the posts above). You will need to do some additional work to create the right query URLs.
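
As a rough illustration of what building those query URLs could look like, the sketch below turns a GI list into EFetch URLs, one per batch of IDs. The helper name efetch_urls is hypothetical, db=nuccore is an assumption for nucleotide records, and the default batch size of 200 is an arbitrary choice for this sketch. The function only prints URLs and does not contact NCBI.

```shell
# efetch_urls (hypothetical helper): read one ID per line on stdin and print
# one EFetch URL per batch of IDs, requesting plain-text FASTA.
# $1 = batch size (defaults to 200, an arbitrary choice for this sketch).
# db=nuccore is an assumption; adjust it for protein or other records.
efetch_urls() {
    awk -v n="${1:-200}" \
        -v base="https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi" '
        function print_url() {
            print base "?db=nuccore&rettype=fasta&retmode=text&id=" ids
            ids = ""; count = 0
        }
        NF == 0 { next }                          # skip blank lines
        {
            ids = (count == 0) ? $1 : ids "," $1  # comma-separated ID list
            count++
            if (count == n) print_url()           # batch is full
        }
        END { if (count > 0) print_url() }        # flush the last partial batch
    '
}

# Example usage (file names are placeholders; the pause keeps the request
# rate low, since NCBI limits how many requests per second it accepts):
#   efetch_urls 200 < gi_list.txt | while read -r url; do
#       curl -s "$url" >> seq_fasta_filename.fa
#       sleep 1
#   done
```

Batching keeps each URL to a manageable length. For very large lists, the local blastdbcmd route described earlier in the thread may still be the more practical option.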
