  • Searching databases

    Dear all,

    I am new to whole-genome sequencing; I have never sequenced complete genomes, only small pieces of DNA.
    However, I have been sequencing a lot of DNA lately, and I realised I never learned how to search for multiple DNA sequences at the same time in, for example, the NCBI database.

    Normally I have only 1 or 2, at most 3, sequences, so I simply enter them manually and search for matches. But now I have 98 contigs/sequences. How can I enter them all at once, so that I don't need to enter 98 sequences one by one manually?

    There must be programs out there, but I am guessing they cost a lot of money?
    And since I am not good at writing my own programs, programming such a thing isn't much of an option, I am afraid.

    Or is there an option to enter more sequences at once on the NCBI website?

    Thanks in advance.

  • #2
    Originally posted by phillie
    Or is there an option to enter more sequences at once on the NCBI website?
    If you have sequence IDs for which you want to retrieve information, see if Batch Entrez suits you.

    If you have a file of sequences (e.g. FASTA) to BLAST, the BLAST web interface lets you upload that file and search different databases.

    If you can be more specific about what you want to achieve, maybe you can get better answers...



    • #3
      Hello dariober,

      What I have are simple FASTA files with nucleotide sequences.
      And what I do is pretty simple: I use the second link you gave me to search for matches in the databases.

      However, I want to know how I can make it easier for myself when I have, for example, 50 different files/contigs.

      At the moment I manually copy and paste each file into the NCBI search form, search for matches, check the matches, and repeat this for the next sequence (until now I have sequenced perhaps 1 or 2 contigs each week).
      But what about when I get 50 contigs at once? I don't want to repeat the search 50 times, once for each file.
      So I wonder if I can simply "load" the 50 files all at once...

      PS: On the NCBI website I can upload a file, and it says I can upload a list of sequences, but I don't seem to be able to do this. When I copy the contents of file 2 into file 1, it does not seem to work.
      Also, I would still need to open each file, copy the sequence, and paste it into another file (with all the sequences).
      So I wonder whether I can simply upload all the files at once, or something like that.

      ("Use the browse button to upload a file from your local disk. The file may contain a single sequence or a list of sequences. The data may be either a list of database accession numbers, NCBI gi numbers, or sequences in FASTA format")
      Last edited by phillie; 07-15-2012, 01:48 AM.


      • #4
        Originally posted by phillie
        So I wonder if I can simply "load" the 50 files all at once...
        Hi (I hope I'm not misunderstanding your question...). I don't think uploading more than one file at a time is possible, but the workaround is quite simple: concatenate all the files into a single one and upload this big file to BLAST.
        If you are on Mac, Linux, or Windows with Cygwin, this is very simple to do on the command line:
        cd /path/to/my/fastas ## Move to the directory with your FASTA files
        ## Concatenate files (assuming you have 3 files):
        cat myfile1.fasta myfile2.fasta myfile3.fasta > catfile.fasta
        ## Or, if all the files whose names end in "fasta" have to be concatenated
        ## (delete any old catfile.fasta first, since it also matches *.fasta):
        cat *.fasta > catfile.fasta
        Now just upload catfile.fasta to BLAST using "Browse" / "Upload file".
        (If you end up with different sequences that have the same name, I'm not sure how BLAST is going to handle it, though...)

        Good luck!
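
On the worry about duplicate sequence names: below is a minimal sketch, assuming a POSIX shell with awk (Mac, Linux, or Cygwin), that concatenates the files and prefixes every header line with the name of the file it came from, so the names stay distinct. The function name concat_fastas and the output name all_contigs.fa are made up for illustration, and it says nothing about how BLAST itself treats duplicate names.

```shell
#!/bin/sh
# concat_fastas DIR OUT  (hypothetical helper, not part of any BLAST tool)
# Joins every DIR/*.fasta file into OUT and rewrites each ">" header as
# ">filename_originalheader" so that sequence names stay unique.
# Choose an OUT that does not itself match DIR/*.fasta.
concat_fastas() {
  dir=$1
  out=$2
  : > "$out"                          # create or empty the output file
  for f in "$dir"/*.fasta; do
    [ -e "$f" ] || continue           # no matching files: nothing to do
    name=$(basename "$f" .fasta)
    # Strip Windows line endings, tag header lines, print every line.
    # awk ends each printed line with a newline, so a file that lacks a
    # final newline cannot run into the next file's header.
    awk -v name="$name" '
      { sub(/\r$/, "") }
      /^>/ { sub(/^>/, ">" name "_") }
      { print }
    ' "$f" >> "$out"
  done
}
```

Usage would look like `concat_fastas /path/to/my/fastas all_contigs.fa`, after which `grep -c '^>' all_contigs.fa` counts the sequences in the combined file before uploading it.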


        • #5
          Originally posted by dariober
          ... the workaround is quite simple: Concatenate all the files to a single one and upload this big file to blast.

          I tried this with some txt files (I just replaced .fasta with .txt, because I don't have the FASTA files on my home computer, only some txt files). However, it did not work: it created the new file, but the file is empty...

          The command prompt also says it does not recognize cat as an internal or external ...
          If I leave "cat" out of the command, the same thing happens: it just creates an empty text file...

