  • BLASTP on linux

    Medtr3g064580.1 gi|356551984|ref|XM_003544304.1| 81.58 152 12 10 1873 2010 1777 1926 2e-20 111 12-T3G-1GT2-T-G1-GAG14AG7-T-C-A2-A-T1-ATG5T-TATC4-T1TC12GA5AC5CT2AC2TC35-G-A-A10

    Medtr3g064580.1 gi|356499059|ref|XM_003518314.1| 79.03 1865 237 105 1 1803 1 1773 0.0 1136 10T-2C-A-8TC5TC2GA8CT5TC12TG10CT11TC8TA1GA9GA8CA17CT29GA6CT4GA5AG8AG8TC1-A1C-5ATAC13TC17CA2TCAC1AT5AC20AC8AG3CA1CA2GA2GA3GT1TC3AG4CG2CT2TG2AC24AC13AC5AC5CT2TAGAGA15GA2TC11AG8TC2C-2T-2TATAC-3TA12GT17AG1AGTG9GC1AG8GC8AG10-T-C1-C9A-A-T-3GC3G-G-C-4AC2TC2TA14AT8CT5GATA1GA3TC10AG1AGTC4-A2A-3GAGT1CG1TA10GT5AG-G3G-2CG1ATAT2AT2T-2-A2AC4A-5-A-A1TG1-G1-C1AT4-C1-C3A-A-4GA6A-1-C2CG9GC3GA2TATA3AT4GC4TC4GC5-C-C2GTCA1CT-T10-C1A-7C-A-5CA4A-4TG3CT1CG1T-1A-1T-10CT22GA12AG9-G1C-9CT11AT2TC14TA2GCA-A-1T-1TA8GA9T-T-A-1TC3AC9-C1GAT-4-G1T-4AC27AT2TA2AT7CT6AG5GA5CT5AC8AG2TG1CA2TGTG9GA1-T-G1-T9GC2G-A-T-1AG4AG10TC2TC8AC1AC13G-1GC1CAT-G-5-G-G1-A3AGTC2CA5AG3A-T-T-1AC7GA11CG10-G1T-4TA2AG2TC2GA3CG1TG3-G2T-5CT5TA2AG8AG29CT-G2T-6GC1G-2G-2T-T-T-T-8G-C-1T-6GA4TC5TA2GA10CG4A-A-A-A-5ATG-G-2A-G-A-G-A-A-11T-1G-1A-11A-A-A-2TG6AG2AG-C1T-6TC1G-T-1A-2TC3-A-A1-C2-C-G-A7AT4GC2TC8TC5-T-G-G1AC4TC5TC2TC5TA3AG7-A-G-A11AG4AC1AT4AT3AG2TC10CT2C-1AGA-A-2TC1GA12TA3GT1TG2AT1GACT7CTAG2CG18AGCTAT11ATAG1ATAC7AC4A-2C-5TA1-T2ATAGAG2-C1-T2-C4-A1-T2-G-T12AGAG5AC1TA2-G1CTC-2-T-G-G4T-3-A2AG1A-2C-2TAC-8-T-A-A1TG8AT2TG5AGAG1TA1-G1AT1G-1-T4T-2AT7TC3AT14GA3AG1AC9TG1TG3GA3TA18-A1C-8

    Medtr3g064580.1 gi|357491760|ref|XM_003616120.1| 76.50 2068 280 143 1 2001 47 1975 0.0 937 23TA5TC11CT18TG7CT2C-3-T2AG13GA2GA6GA11CACG16CT11AG17GA11-A3A-2AT3TA4AG8-C3C-5ATAC1AT11TC2AG8AT5CG3AC1AT5AC23GT5AG5CA9GT3-A2A-4CA2CA2TA2AGAG23AC13AC5AC5CT2TAGAGA15GA5TA8AG2AG8C-2T-2T-TA2TC2GA1CT27AG1-G1T-21C-1-G6AG2G-G-4C-A-A-T-A-T-1T-T-A-1TC3G-TA1G-C-C-A-2C-A-1G-C-A-A-1T-C-10TC2TC6TC4C-1-G5TC1GA9GA4AG1AG5-A2A-3GAGT1CGCT16-ACA3-A-A4C-C-A-AT1CTAT2T-2-A2AC2C-AT4-TGA1CA1-G1-G1-C1AT4AC4AT7G-G-G-A-1TC2C-A-8GC4GA1TA4ATCTAG2GA4TC3ATGC4-C-C1-A2GACT1CT10-C1A-5AT8CT9TG3CT1CA1T-A-1T-23AT33TG2TA7CA11AG2-A5C-9TA2G-1ACTC2T-G-1GC1T-G-1A-2T-G-A-5TCT-1G-T-C-T-C-3T-T-C-C-1T-G-2TCGA13A-A-C-A-G-A-4TA6AG6AT2TAGC1-T1G-4AT8AG1AG6AG5AG2AC1AGCT2CT5TA2AC2T-2-A1GCGA6AC9GC2G-A-1C-5AG13TC8AC1AC3TC8TCGA1GA1CA4C-2A-T-2AGTC6AG5AG4TCG-1C-3G-1AT5AG3CTCTAG11AT2TA2AG2TA2GT3C-2-A2TGAG2GC2TG-A-G4-A3TCG-2-T3AG13AG6GA8TC4CA-G2T-5AT2G-2G-2T-T-T-T-8G-C-1T-5GT5TC5TC12ATCG2TA1AC5AT-T1C-3AG2GC1GC4A-1GC-A6T-1G-1A-1CA5AGAG2AC4TA4AGATAT3A-T-3TAG-3GATC2CT-G2-G1-C3-G4T-4A-5TA2-C9AGAG4AC1TC5GT11TA3AG7TG2-C-G-G3TC9AC6AT3A-1TA1GC-A12C-1AGA-A-9GA7TG3GT1TA2AT1GACA8-C1-T4C-A-2CA9AG1ATAGCTAT12AC1AT-C3G-16TGCA2GCAC6AT-C-C3-C4TA8CT3ATAG4GT2TG2-G1CTC-2-T-G-G12A-2CT3C-3A-7TC6AT2TG5-G-G-G4-G-G1-A1AT1G-1ATAT-C12-G1T-10GTCTAGAG4GA2CTAG11TC2TG2GA3TA1TC11AG4-A1C-8G-5-C8GCTC2GA2TA8-G2G-3T-1TG1G-C-T-3-A3C-T-3-C1G-2TA2A-2AG1AG3TATA8GA-T3GA3AGAGA-14AG1CT3AT3-A1T-5TA2TAG-2-T3AGTC3GT3-G2G-5AC8AC2TC4CT6TG10TCGA7AG4

    Medtr3g064580.1 gi|224079645|ref|XM_002305867.1| 81.72 465 70 13 1 459 1 456 2e-99 374 5AT20TC3AC1GT8CT2AC2TC8TC3TG4TC7AG9TC3G-2-A1GA9GA2GA35AG8CT2GA5GA2GA6CT1AG2GAGA1AC6TC1GA2AG5TC2-C3C-2CT2ATAC4TCAC1GC8AG8AG2AG2CT3AC7AT3TG16AC2GT2GC2AG3CA1CA2GA4A-1G-3-C-C1GA1C-1-G2CG2CT5AC23TC11AT2AC5A-1-G2CA1CA2TA17GA5TC8AGAC7TC2C-3C-C-1TACA1TA22TA2TA2CT3

    Medtr3g064580.1 gi|224135104|ref|XM_002327531.1| 79.77 514 85 17 1 509 1 500 7e-94 355 5AT4T-2C-A-8TC6AC1GT8CT2AC2TC12TG4TC17TC5CA1GA2GA6GA2GA35AG8CT2GA5GA2GA6CT4GTGA1AC6TC1GA2AG5TC2-C3C-2CT2ATAC4TCAC1GC8AG11AG2CT3AC4AT2AT3TG4AG2TC8AC2GA2GC2AG3CA1CA2GA5-C1GC1T-2GA1C-1-G2CT2CT2TA2AC23TC5AT5AC2AC6TG4CA2TA14TC2GA5TC2TG2TC3AC1AC5TC2C-2T-1C-1TACA3TC20TG5CT3AG1AG1T-1-G2GA4-C1T-3GATCGC2T-G-1T-5AG5GA5AT5

    Medtr4g012510.1 gi|357468330|ref|XM_003604402.1| 100.00 369 0 0 1 369 1 369 0.0 682 369
    I got the result above by running blastp on Linux.
    I don't know what the next step should be.

    Is there an online service or software that takes the blastp output file as input and reports the functions, pathways, or relationships of the genes?

    Alternatively, how should I process the blastp output file to extract insights or an interpretation from it?
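
As a first step in reading rows like those above: the long final column appears to be a BTOP (BLAST trace-back operations) string, which encodes the alignment itself. The Python sketch below assumes the usual BTOP convention, where a number is a run of identical positions and a two-character pair is one differing column (query residue first, subject residue second, with `-` marking a gap). `summarize_btop` is a hypothetical helper name, not part of BLAST.

```python
import re

# One BTOP token is either a run of digits (that many identical positions)
# or a two-character pair: query residue, then subject residue ('-' = gap).
BTOP_TOKEN = re.compile(r"(\d+)|([A-Za-z*-]{2})")


def summarize_btop(btop):
    """Count matches, mismatches and gap columns encoded in a BTOP string."""
    counts = {"matches": 0, "mismatches": 0, "query_gaps": 0, "subject_gaps": 0}
    pos = 0
    while pos < len(btop):
        token = BTOP_TOKEN.match(btop, pos)
        if token is None:
            raise ValueError(f"unexpected BTOP text at offset {pos}")
        if token.group(1):
            counts["matches"] += int(token.group(1))
        else:
            query_char, subject_char = token.group(2)
            if query_char == "-":
                counts["query_gaps"] += 1      # gap in the query
            elif subject_char == "-":
                counts["subject_gaps"] += 1    # gap in the subject
            else:
                counts["mismatches"] += 1
        pos = token.end()
    # Alignment length is the total number of columns of all four kinds.
    counts["length"] = sum(counts.values())
    return counts


summary = summarize_btop("369")
```

For the last row above, whose BTOP column is simply `369`, this reports 369 matches and no mismatches or gaps, consistent with its 100.00 percent identity over 369 positions.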


    This is my first time analyzing data.
    Please help me!
    Thank you in advance.
    Last edited by syintel87; 02-27-2013, 08:20 AM.

  • #2
    What version of blast did you use to get this output? Can you give us the command line you used?

    With blastp you are trying to identify genes/proteins from the database that are homologous (in terms of sequence and/or function) to your query. You would not be able to get pathway mapping/relationship information from a standalone blastp search.

    If you can provide some context as to what you are trying to do, then perhaps it would be easier for us to offer suggestions.



    • #3
      In reply to GenoMax (post #2):

      Code:
      blastp -query Mt_unat_splcds.fa -db ~/z_blast_ncbi/nr -evalue 0.001 -out ncbi_blast_Mt_unat.out -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore btop" 2>> nr.log &
      Thank you so much for your kind reply.
      I used the command above. The version that I used is "ncbi-blast-2.2.27+".
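
The `-outfmt` string in this command fixes the order of the 13 columns, so each output line can be read into named fields. Below is a minimal Python sketch, assuming the file is tab-separated as format 6 normally produces (the rows pasted in the first post show spaces only because of how the forum rendered them). `read_hits` and the two field-type sets are illustrative names, not part of BLAST.

```python
import csv
import io

# Column names copied from the -outfmt string in the command above.
COLUMNS = ("qseqid sseqid pident length mismatch gapopen "
           "qstart qend sstart send evalue bitscore btop").split()
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def read_hits(handle):
    """Yield one dict per alignment from tab-separated BLAST output."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        for name in INT_FIELDS:
            hit[name] = int(hit[name])
        for name in FLOAT_FIELDS:
            hit[name] = float(hit[name])
        yield hit


# One row from the first post, rewritten with tab separators.
example = io.StringIO(
    "Medtr4g012510.1\tgi|357468330|ref|XM_003604402.1|\t100.00\t369\t0\t0"
    "\t1\t369\t1\t369\t0.0\t682\t369\n"
)
hits = list(read_hits(example))
```

On the real file the same function could be called as `read_hits(open("ncbi_blast_Mt_unat.out", newline=""))`, after which hits can be sorted or filtered by `evalue`, `bitscore`, or `pident`.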

      My goal is to see the differentially expressed genes in plants (infected vs. uninfected) across five different time points.
      - The BLAST input file is a FASTA file made by extracting the CDS regions based on the annotation file.
      - The BLAST result that I posted previously shows only the beginning of the output.
      - I also have a list of hundreds of differentially expressed genes (obtained using htseq-count and edgeR). I extracted the rows corresponding to these genes from the BLAST result.
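
The row-extraction step in the last bullet can be sketched as follows. This Python example keeps, for each query on the differentially expressed list, only its highest-bitscore hit. It assumes the gene list holds IDs without the transcript suffix (for example `Medtr3g064580` rather than `Medtr3g064580.1`), which may not match the actual files. `best_hits_for_genes` is a hypothetical helper, and the one-gene list is made up for illustration; the IDs and bit scores come from the rows in the first post.

```python
def best_hits_for_genes(rows, gene_ids):
    """Keep the highest-bitscore hit for each query whose gene is in gene_ids.

    rows: iterable of (qseqid, sseqid, bitscore) tuples.
    gene_ids: set of gene IDs without the transcript suffix (assumption).
    """
    best = {}
    for qseqid, sseqid, bitscore in rows:
        # Assumption: 'Medtr3g064580.1' belongs to gene 'Medtr3g064580'.
        gene = qseqid.rsplit(".", 1)[0]
        if gene not in gene_ids:
            continue
        if qseqid not in best or bitscore > best[qseqid][1]:
            best[qseqid] = (sseqid, bitscore)
    return best


rows = [
    ("Medtr3g064580.1", "gi|356551984|ref|XM_003544304.1|", 111.0),
    ("Medtr3g064580.1", "gi|356499059|ref|XM_003518314.1|", 1136.0),
    ("Medtr3g064580.1", "gi|357491760|ref|XM_003616120.1|", 937.0),
    ("Medtr4g012510.1", "gi|357468330|ref|XM_003604402.1|", 682.0),
]
de_genes = {"Medtr3g064580"}  # made-up list for illustration
best = best_hits_for_genes(rows, de_genes)
```

Reducing each gene to one best hit gives a compact table of gene ID and closest database match, which is usually easier to annotate than the full set of alignments.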

      Even though I ran BLAST and got a result, I am lost about where to go next.
      Your tips will really help a lot. Thank you again!
      Last edited by syintel87; 02-27-2013, 12:42 PM.



      • #4
        Is this an NGS experiment (RNA-seq)? In that case you would want to do something like this: http://en.wikibooks.org/wiki/Next_Ge..._%28NGS%29/RNA

        Maybe I am missing something but I am not sure how you are going to find differentially expressed genes using blastp searches.

        EDIT: I think you added some additional information to the post above after I wrote this. Unfortunately, blastp is not going to help with mapping the genes onto pathways or building relationships between them.

        I suppose what you are looking for is a tool like DAVID (http://david.abcc.ncifcrf.gov/) which would be useful if you are working with an organism that is available there. There are other options that have been discussed on this forum before along with commercial tools like Ingenuity Pathway Analysis (IPA). Refer to this thread for other suggestions: http://seqanswers.com/forums/showthread.php?t=26992
        Last edited by GenoMax; 02-27-2013, 12:57 PM. Reason: Added clarification



        • #5
          In reply to GenoMax (post #4):

          The RNA-seq data were produced using Illumina sequencing.

          With the list of genes, I expected to see
          - whether they responded to the infection across the different time points,
          - what these genes' functions are,
          - whether these genes are related to each other,
          - what this implies biologically, beyond simply saying that differentially expressed genes were observed,
          - how and which kinds of genes work together.

          I am going to look into the tools that you recommended.
          Additionally, I have just started running Blast2GO, which provides functions such as BLAST, InterProScan, and pathways, among others. Do you think this might also be helpful?

          Thank you very much for your tips!



          • #6
            Originally posted by syintel87
            Code:
            blastp -query Mt_unat_splcds.fa -db ~/z_blast_ncbi/nr -evalue 0.001 -out ncbi_blast_Mt_unat.out -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore btop" 2>> nr.log &
            Thank you so much for your kind reply.
            I used the command above. The version that I used is "ncbi-blast-2.2.27+".

            My goal is to see the differentially expressed genes in plants (infected vs. uninfected) across five different time points.
            - The BLAST input file is a FASTA file made by extracting the CDS regions based on the annotation file.
            If I am reading your description correctly, your query FASTA consists of CDS regions and is therefore nucleic acid sequence, not amino acid (protein) sequence. You cannot use blastp to compare a nucleic acid query to a protein database.

            You could either...

            1) use blastx instead of blastp. blastx will translate your query in all 6 reading frames to amino acid sequence and compare those to the protein database, or...

            2) translate your CDS sequences from nucleic acid to amino acid and then use blastp to compare the resultant protein sequences to the protein database.

            Since you already seem to know the coding sequence, option #2 would probably be the better choice.
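
Option 2 can be sketched in a few lines of Python. This example assumes each CDS starts in frame 1 on the forward strand and uses the standard genetic code; `translate_cds` and `translate_fasta` are illustrative names. Codons containing ambiguous bases are written as `X`, and translation stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(seq):
    """Translate a CDS in frame 1, stopping at the first stop codon."""
    seq = seq.upper().replace("U", "T")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_fasta(lines):
    """Yield (header, protein) for each record in an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, translate_cds("".join(chunks))
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, translate_cds("".join(chunks))


records = list(translate_fasta([">a", "ATGGCC", "TAA", ">b", "ATGAAATGG"]))
```

Writing each `(header, protein)` pair back out as FASTA gives a protein query file that blastp can take with `-query`.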
