
  • BLAST+ and blasting against the NCBI database

    Hey guys, I'm trying to run blastx on a transcriptome against NCBI's database so I can get an annotation. I'm using an HPC with Trinotate on it, and I'm assuming that I've installed BLAST+ correctly.

    How do I BLAST against the database on NCBI's website? I'm aware that you can configure any database file for BLAST+ to search against, but what if I want to run a large number of RNA contigs (around 60,000) against NCBI's online database? I'm aware of the -remote option, but with it my computer defaults to using one core for the job. I assume that -remote lets my data run on NCBI's HPCs, but I don't want my job to get timed out.

    Any help? How do you approach using blastx on a large dataset?

    Thanks!

  • #2
    For that many input sequences you would want to run this on a local cluster by splitting your input into multiple files and then running the jobs in parallel. I don't think the "-remote" option is meant for huge datasets.

    You may want to consider searching against the Swiss-Prot/TrEMBL or RefSeq databases to limit your search space. Is this search for annotating a new transcriptome?
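
    As a rough illustration of the splitting step, here is a minimal Python sketch. It assumes a plain FASTA input; the function names, the round-robin distribution, and the chunk_NNN.fasta naming are all made up for the example, and existing tools could do the same job.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, seq_lines = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(seq_lines)
                header, seq_lines = line, []
            elif line:
                seq_lines.append(line)
    if header is not None:
        yield header, "".join(seq_lines)


def split_fasta(path, out_dir, n_chunks):
    """Distribute records round-robin into n_chunks files; return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: chunk_000.fasta, chunk_001.fasta, ...
    paths = [out_dir / f"chunk_{i:03d}.fasta" for i in range(n_chunks)]
    handles = [open(p, "w") for p in paths]
    try:
        for i, (header, seq) in enumerate(read_fasta(path)):
            handles[i % n_chunks].write(f"{header}\n{seq}\n")
    finally:
        for h in handles:
            h.close()
    return paths
```

    Each chunk can then be submitted as its own blastx job against a local database, for example as one task per chunk on the cluster's scheduler.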



    • #3
      Yes, it's a new transcriptome for Copidosoma. I've tried running this data against UniProt, but I keep getting low-quality hits for Drosophila and humans. Of course, I don't know what good output for this animal would look like. I'd expect good hits for Nasonia, assuming that the UniProt database has Nasonia vitripennis in it.

      So, in essence, I can either divide my file and run -remote, or use an existing database from an animal like Nasonia to get my annotations?



      • #4
        If Nasonia is the closest relative you can use, then search against this Ensembl protein set: ftp://ftp.ensemblgenomes.org/pub/met...tripennis/pep/ or this one from NasoniaBase: http://hymenopteragenome.org/nasonia...v1.2_pep.fa.gz The annotation appears to be there too: http://hymenopteragenome.org/nasonia...GSv1.2.gff2.gz

        The -remote option is probably not meant for 60K items. NCBI may ban your IP if you try to launch too many jobs.
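
        For the local route, a minimal sketch of the two BLAST+ commands involved (makeblastdb, then blastx) is below, written as Python helpers that only build the argument lists. The file names (nasonia_pep.fa, chunk_000.fasta), the database name, the e-value cutoff, and the thread count are placeholders rather than recommendations, and the downloaded protein FASTA is assumed to have been decompressed first.

```python
import shlex


def makeblastdb_cmd(protein_fasta, db_name):
    """Command to build a local protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", protein_fasta, "-dbtype", "prot", "-out", db_name]


def blastx_cmd(query, db_name, out_path, threads=4, evalue="1e-5"):
    """Command to search nucleotide queries against a local protein database.

    -outfmt 6 requests tabular output; threads and evalue are example values.
    """
    return [
        "blastx",
        "-query", query,
        "-db", db_name,
        "-out", out_path,
        "-outfmt", "6",
        "-evalue", evalue,
        "-num_threads", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the commands rather than running them.
    print(shlex.join(makeblastdb_cmd("nasonia_pep.fa", "nasonia_pep")))
    print(shlex.join(blastx_cmd("chunk_000.fasta", "nasonia_pep",
                                "chunk_000.blastx.tsv")))
```

        The lists can be passed to subprocess.run on a machine where BLAST+ is installed, or the printed lines can be pasted into a job script.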



        • #5
          Take a look at Blast2GO.
          The Pro version will speed up your search by using their cloud facilities, and you can use your own local databases as well.
