  • Generate .xml file from BLAST (Blast2Go)

    Hello,

    I need to generate an .xml file from BLAST starting with a list of genes (currently in .txt format). I need this to use it as input to Blast2Go.

    Do I require a script to do this, or is there an easier way to do this directly on the online BLAST website?

    Many many thanks....

  • #2
    When you run the BLAST search at NCBI, the results page gives you the option of saving the alignments as an XML file. See this post for a screenshot: http://seqanswers.com/forums/showpos...92&postcount=7

    • #3
      Thank you!

      Thanks so much for the reply!

      I'm actually unsure how to put my whole list into NCBI BLAST as well!

      If you know how to do that, I will hopefully be able to reach the results page you kindly provided the screenshot for! Thank you!!!!

      • #4
        The numbers you posted in this thread (http://seqanswers.com/forums/showthread.php?t=45711) will not work at NCBI. You will need to extract the protein sequences from this file: http://marinegenomics.oist.jp/genome...0.1.prot.fa.gz

        If the IDs you have match the ones in this protein FASTA file, you can use the faSomeRecords program from the Kent utilities to extract the protein sequences of interest (http://seqanswers.com/forums/showpos...0&postcount=13).

        You can then use them for the BLAST search at NCBI.

        • #5
          Thanks so much for your help!

          I am not sure how to parse out the proteins, though. I need to do this pretty fast (unfortunately). Is this a simple process? Are there any scripts online that I could borrow?

          This would make things much easier, because then I could just work with the proteins in BLAST.... Thanks again...

          • #6
            I posted the procedure in #4 above. Here is a step-wise version. You will need access to a Linux machine (or OS X) for this to work, though.

            1. Download the protein sequence file and then gunzip (uncompress) it.

            2. Put your IDs of interest in a plain text file, one ID per line.

            3. Run this program http://hgdownload.soe.ucsc.edu/admin.../faSomeRecords (this is the Linux version) as shown below.

            Code:
            $ faSomeRecords protein_file.fa yourlistFile output_with_proteins_of_interest.fa
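
If faSomeRecords cannot be run on your machine, the same kind of extraction can be sketched with awk. This is only a rough stand-in, not the Kent tool itself: the sample IDs and sequences below are made up, and the sketch assumes that each ID is the first word on the FASTA header line.

```shell
# Work in a scratch directory so nothing in your real folders is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Tiny stand-in for the uncompressed protein FASTA file (made-up IDs and sequences).
printf '>prot_001 example protein A\nMKTAYIAK\nQRQISFVK\n>prot_002 example protein B\nMSDNELLK\n>prot_003 example protein C\nMGGHHWLV\n' > protein_file.fa

# Stand-in for the list of IDs of interest: plain text, one ID per line.
printf 'prot_001\nprot_003\n' > yourlistFile

# First pass (NR==FNR) reads the ID list into an array.
# Second pass reads the FASTA: on each header line, take the first word
# without the ">" and decide whether to keep the record that follows.
awk 'NR==FNR { wanted[$1] = 1; next }
     /^>/    { id = substr($1, 2); keep = (id in wanted) }
     keep    { print }' yourlistFile protein_file.fa > output_with_proteins_of_interest.fa

# Count how many records were written (2 for this sample data).
grep -c '^>' output_with_proteins_of_interest.fa
```

With real data, replace the two printf lines with your own uncompressed FASTA file and ID list; the awk command itself stays the same.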

            • #7
              Thanks. Does MacOSX count as Linux?

              • #8
                Originally posted by PurplePancake:
                Thanks. Does MacOSX count as Linux?
                OS X is not Linux, but it is a certified variant of Unix, so the same procedure works.

                In that case, use the Mac version of the faSomeRecords program in step 3 above: http://hgdownload.soe.ucsc.edu/admin.../faSomeRecords

                • #9
                  Thanks so much! I did as you said, but get the error

                  "-bash: faSomeRecords: command not found"

                  I am *really bad* with Linux, and especially installation. I hope to take a course in two semesters so I can do this, because this happens a lot and I never figure it out...
                  Last edited by PurplePancake; 08-10-2014, 06:35 PM. Reason: clarity

                  • #10
                    Basically, all I did was download the faSomeRecords file and move the other two files into the Download folder. Then I typed:

                    faSomeRecords prot.fa geneList.rtf outProteins.fa

                    I know there are additional steps to do to "prepare the command"? But I always mess things up, and really freeze when it comes to installation....
                    Last edited by PurplePancake; 08-10-2014, 06:34 PM. Reason: clarity
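
A "command not found" error for a freshly downloaded program usually comes down to two things: the file has not been marked executable, and the shell does not look in the current directory unless the command is given with an explicit path such as ./ in front. A minimal sketch of both steps, using a made-up stand-in script called demo_tool rather than the real faSomeRecords binary:

```shell
# Work in a scratch directory so nothing in your real folders is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a freshly downloaded program; "demo_tool" is a made-up name.
printf '#!/bin/sh\necho "demo_tool ran with $# arguments"\n' > demo_tool

# A newly created or downloaded file is normally not executable yet.
if [ -x demo_tool ]; then echo "executable"; else echo "not executable yet"; fi

# Mark it executable (the same step applies to faSomeRecords after download).
chmod +x demo_tool

# The current directory is usually not on PATH, so the bare name is not found...
if command -v demo_tool >/dev/null 2>&1; then echo "found on PATH"; else echo "not on PATH"; fi

# ...but giving an explicit path with ./ runs it.
./demo_tool prot.fa geneList.txt outProteins.fa
```

For the real program, the equivalent would be to cd into the folder holding faSomeRecords, run chmod +x faSomeRecords once, and then call it as ./faSomeRecords.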

                    • #11
                      Hello,

                      I don't feel that I really changed much, but for some reason, there is no error any more. I just did this instead:

                      ./faSomeRecords prot.fa geneList.odt outProteins.fa

                      There is no error, but there is also nothing in outProteins.fa.

                      Does this mean there is not enough information in geneList to identify the proteins and genes?

                      Thanks!!
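
An empty output file generally means that none of the IDs in the list matched a header in the FASTA file. One likely contributor here is the list format: .rtf and .odt are word-processor formats, not plain text, so the list should first be saved as plain text with one ID per line (on OS X, textutil -convert txt geneList.rtf is one way to do that). The sketch below, with made-up sample data and a hypothetical geneList.txt, shows one way to check how many IDs in a list actually appear in the FASTA headers, assuming each ID is the first word of the header.

```shell
# Work in a scratch directory so nothing in your real folders is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up sample data standing in for prot.fa and a plain-text ID list.
printf '>prot_001 example A\nMKTAYIAK\n>prot_002 example B\nMSDNELLK\n' > prot.fa
printf 'prot_002\ngene_999\n' > geneList.txt

# Pull the first word of every header line (without ">") into its own file.
sed -n 's/^>\([^ ]*\).*/\1/p' prot.fa > fasta_ids.txt

# Whole-line (-x), fixed-string (-F) comparison of the two ID lists.
matched=$(grep -c -x -F -f geneList.txt fasta_ids.txt)
total=$(grep -c . geneList.txt)
echo "$matched of $total IDs found in prot.fa"

# List the IDs with no matching header, for a closer look.
grep -v -x -F -f fasta_ids.txt geneList.txt
```

If the count comes back as zero with real data, comparing a few lines of fasta_ids.txt against the list by eye usually shows whether the two files use different naming schemes.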

                      • #12
                        Also, is there any reason why nothing would match the proteins? I asked similar questions elsewhere and was told that, because coral is such an old lineage that branched off before mammals, some of this RNA-seq analysis can be difficult or impossible?

                        Okay, I will stop asking so many questions now.

                        • #13
                          The IDs you posted in the other thread do not seem to match the protein IDs in either of the two sets here: http://marinegenomics.oist.jp/genome...s?project_id=3 What file did you get your IDs from?
                          Last edited by GenoMax; 08-11-2014, 05:01 AM.
