
  • How to get FASTA sequences from GI number

    Hey guys, I need help.

    I need to download a large number of FASTA sequences for a set of GI numbers.

    Is there a script to do this?

    I know I could do it with http://www.ncbi.nlm.nih.gov/sites/batchentrez , but I have too many sequences (it says to split the list if there are too many), and I really don't want to do it via the browser.


    Thank you

  • #2
    Did you mean to cross post this in Biostars? Maybe remove the question from this list or from that list.



    • #3
      Originally posted by bt27uk
      Did you mean to cross post this in Biostars? Maybe remove the question from this list or from that list.
      Since I've been stuck on this problem for a couple of days, I tried asking in different forums with different people in order to get an answer as soon as possible.

      I can't see the problem.

      If it is against any rules of SEQanswers, I'll delete it.



      • #4
        Cross-posted on Biostars: https://www.biostars.org/p/112410/



        • #5
          One way to do this would be to use the blastdbcmd command, which is part of the BLAST+ suite. You will need access to a local copy of the nt BLAST database (or will need to download it).

          You can put a list of your GI numbers in a file like so (one per line):

          Code:
          $ more gi_list.txt 
          4
          7
          78
          324
          
          $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
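
Before running the command above, it can help to normalize the ID list, since lists exported from spreadsheets or web pages often contain blank lines, Windows line endings, or duplicates. A minimal sketch, assuming the raw list sits in a hypothetical file called raw_gi_list.txt; the function name clean_gi_list is illustrative and not part of BLAST+:

```shell
#!/bin/sh
# clean_gi_list: hypothetical helper, not part of BLAST+.
# Reads a raw ID list from the file given as $1 and prints one numeric GI per line:
#   - strips carriage returns and leading/trailing whitespace
#   - drops empty and non-numeric lines
#   - removes duplicates while keeping first-seen order
clean_gi_list() {
    tr -d '\r' < "$1" |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -E '^[0-9]+$' |
        awk '!seen[$0]++'
}
```

It could then be used as `clean_gi_list raw_gi_list.txt > gi_list.txt` before passing gi_list.txt to `-entry_batch`.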



          • #6
            Originally posted by GenoMax
            One way to do this would be to use the blastdbcmd command, which is part of the BLAST+ suite. You will need access to a local copy of the nt BLAST database (or will need to download it).

            You can put a list of your GI numbers in a file like so (one per line):

            Code:
            $ more gi_list.txt 
            4
            7
            78
            324
            
            $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
            Thank you very much.

            It is exactly what I was looking for!!!



            • #7
              Originally posted by fefe89
              Since I've been stuck on this problem for a couple of days, I tried asking in different forums with different people in order to get an answer as soon as possible.

              I can't see the problem.

              If it is against any rules of SEQanswers, I'll delete it.
              As has been said before, it creates more work for the folks who are answering the questions.

              It is OK to cross-post, but please close out your post on all forums (cross-referencing the solution once you find one that you like).



              • #8
                Originally posted by GenoMax
                As has been said before, it creates more work for the folks who are answering the questions.

                It is OK to cross-post, but please close out your post on all forums (cross-referencing the solution once you find one that you like).
                OK. The other post has already been closed.



                • #9
                  Originally posted by fefe89
                  I need to download a large number of FASTA sequences for a set of GI numbers.

                  Is there a script to do this?
                  The recommended way to do this is with E-utilities (Eutils), a web service offered by NCBI.

                  There are already several threads about using Eutils, both in this forum and on Biostars.
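
For reference, a minimal sketch of what such a request could look like, using the efetch endpoint of E-utilities with FASTA output. The function name efetch_url is hypothetical, and db=nuccore is an assumption that the IDs are nucleotide records. The actual download line is commented out so the snippet has no side effects:

```shell
#!/bin/sh
# efetch_url: hypothetical helper that builds an E-utilities efetch URL
# for a comma-separated list of IDs passed as $1 (e.g. "4,7,78,324").
# db=nuccore assumes nucleotide records; protein IDs would need db=protein.
efetch_url() {
    base='https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi'
    printf '%s?db=nuccore&id=%s&rettype=fasta&retmode=text\n' "$base" "$1"
}

# Example download (needs network access, so it is commented out):
# curl -s "$(efetch_url 4,7,78,324)" > seq_fasta_filename.fa
```

The URL is built separately from the download so it can be inspected, or pasted into a browser, before fetching anything.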



                  • #10
                    Originally posted by GenoMax
                    One way to do this would be to use the blastdbcmd command, which is part of the BLAST+ suite. You will need access to a local copy of the nt BLAST database (or will need to download it).

                    You can put a list of your GI numbers in a file like so (one per line):

                    Code:
                    $ more gi_list.txt 
                    4
                    7
                    78
                    324
                    
                    $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
                    Hi!

                    I'm a beginner in bioinformatics (and new to the forum) and I have the same problem as fefe89. Your answer above seems entirely appropriate for my problem, but I have a very naive question (it seems simple, but I can't find an adequate answer on Google): how can I use the blastdbcmd command line if I don't want to download the (heavy) nt database to my own computer? Or am I forced to download nt locally before running the command?

                    Thank you in advance for your understanding (certainly a newbie question...)



                    • #11
                      Originally posted by ericaf
                      Hi!
                      how can I use the blastdbcmd command line if I don't want to download the (heavy) nt database to my own computer? Or am I forced to download nt locally before running the command?

                      Thank you in advance for your understanding (certainly a newbie question...)
                      If you don't want to download the BLAST database locally, take a look at the NCBI E-utilities option (referred to in one of the posts above). You will need to do some additional work to create the right query URLs.
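
Most of that additional work is splitting the ID list into batches and pausing between requests. A rough sketch under stated assumptions: batch_ids is a hypothetical helper, and the batch size of 200 and the one-second pause are guesses meant to stay well inside NCBI's usage limits, so check the current E-utilities guidelines before relying on them. The download loop is commented out because it needs network access:

```shell
#!/bin/sh
# batch_ids: hypothetical helper. Reads one ID per line from the file $1
# and prints comma-joined batches of at most $2 IDs, one batch per line.
batch_ids() {
    awk -v n="$2" '
        NF == 0 { next }    # skip blank lines
        { buf = (buf == "" ? $1 : buf "," $1); count++ }
        count == n { print buf; buf = ""; count = 0 }
        END { if (buf != "") print buf }
    ' "$1"
}

# Sketch of a download loop (commented out: needs network access).
# Batch size and pause length are assumptions, not NCBI-mandated values.
# batch_ids gi_list.txt 200 | while read -r ids; do
#     curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=${ids}&rettype=fasta&retmode=text" >> seq_fasta_filename.fa
#     sleep 1
# done
```

Each output line of batch_ids becomes the id= parameter of one efetch request, so a very long list turns into a series of short URLs.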
