
  • Blast > parsing result in Excel

    Hi everybody,

    I have a BLAST (-m 1) result file that looks like this:

    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
    Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
    "Gapped BLAST and PSI-BLAST: a new generation of protein database search
    programs", Nucleic Acids Res. 25:3389-3402.

    Query= 132-291
    (59 letters)

    Database: Scrivania/orchidea/mature_mirBase.fa
    21,643 sequences; 470,608 total letters


    Score E
    Sequences producing significant alignments: (bits) Value

    mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b 28 0.031
    mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a 28 0.031
    gga-miR-1704 MIMAT0007596 Gallus gallus miR-1704 22 1.9
    gga-miR-1557 MIMAT0007414 Gallus gallus miR-1557 22 1.9
    mmu-miR-880-5p MIMAT0017266 Mus musculus miR-880-5p 22 1.9

    132_0 8 cagccgctcagattgatggtgcctacagccttgccagcccgctcagattgat 59
    12631 5 .............. 18
    12630 5 .............. 18
    7826 5 ........... 15
    7644 19 ........... 9
    5394 3 ........... 13
    5394 3 ........... 13
    BLASTN 2.2.21 [Jun-14-2009]

    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,

    I need to parse this into an Excel sheet with the following columns:

    1) ID 2) Name of the hit 3) E-value 4) Score 5) Species

    For example: 1) 132-291 2) mir2644b 3) 0.031 4) 28 5) Medicago truncatula

    Is it possible, starting from a big BLAST result file, to obtain an Excel sheet with 5 columns where every row holds the first hit for one query? Can anyone help me solve this problem? A little Perl script would also be fine.

    Thank you very much

  • #2
    Use -m 8 for tabular output and then import it into Excel.
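
A minimal Python sketch of this route: keep only the first hit reported for each query in a -m 8 file and write a CSV that Excel can open. It assumes the standard 12 tab-separated -m 8 columns and that the first line for a query is its best hit. The function names and the sample line are illustrative only (the sample reuses values from the report above, with the bit score decimal invented), and the tabular format carries no species column.

```python
import csv
import io

# The 12 standard columns of legacy `blastall -m 8` output, in order.
M8_COLUMNS = [
    "query", "subject", "pct_identity", "align_len", "mismatches", "gap_opens",
    "q_start", "q_end", "s_start", "s_end", "evalue", "bit_score",
]


def first_hits(lines):
    """Yield the first hit line seen for each query as a dict."""
    seen = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < len(M8_COLUMNS):
            continue  # not a complete tabular record
        row = dict(zip(M8_COLUMNS, fields))
        if row["query"] not in seen:
            seen.add(row["query"])
            yield row


def write_csv(rows, handle):
    """Write ID, hit name, E-value and bit score as CSV."""
    writer = csv.writer(handle)
    writer.writerow(["ID", "Hit", "E-value", "Score"])
    for row in rows:
        writer.writerow(
            [row["query"], row["subject"], row["evalue"], row["bit_score"]]
        )


if __name__ == "__main__":
    # Illustrative input only; a real run would read the -m 8 file instead.
    sample = (
        "132-291\tmtr-miR2644b\t100.00\t14\t0\t0\t8\t21\t5\t18\t0.031\t28.2\n"
        "132-291\tmtr-miR2644a\t100.00\t14\t0\t0\t8\t21\t5\t18\t0.031\t28.2\n"
    )
    out = io.StringIO()
    write_csv(first_hits(sample.splitlines()), out)
    print(out.getvalue())
```

For a real file, pass an open file handle to `first_hits` and another opened with `newline=""` to `write_csv`.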


    • #3
      I know the -m 8 view, but it gives me a different result from -m 1, with some information missing. So I am asking for a little script to handle the text file and parse it into Excel.
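
A rough Python sketch of what such a script could look like (Python rather than Perl here). It rests on several assumptions: each record has a single-line `Query=` header, the first line after "Sequences producing significant alignments" is the top hit, that line ends with the score and E-value, and the description follows the miRBase pattern "name accession Genus species ...". Plain-text reports can wrap or truncate these lines, so treat it as a stopgap. The function name is hypothetical.

```python
import csv
import io

SUMMARY_HEADER = "Sequences producing significant alignments"


def first_hits_from_text(lines):
    """Yield (query, hit, evalue, score, species) for each query's first hit."""
    query = None
    in_summary = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("Query="):
            query = stripped[len("Query="):].strip()
            in_summary = False
        elif stripped.startswith(SUMMARY_HEADER):
            in_summary = True
        elif in_summary and stripped:
            fields = stripped.split()
            in_summary = False  # only the first line after the header is used
            if len(fields) < 6:
                continue  # line does not match the assumed layout
            hit = fields[0]
            species = " ".join(fields[2:4])  # assumes a two-word species name
            score, evalue = fields[-2], fields[-1]  # kept as strings
            yield (query, hit, evalue, score, species)


if __name__ == "__main__":
    # Lines copied from the report posted above.
    report = """\
Query= 132-291
(59 letters)

Score E
Sequences producing significant alignments: (bits) Value

mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b 28 0.031
mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a 28 0.031
"""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["ID", "Hit", "E-value", "Score", "Species"])
    writer.writerows(first_hits_from_text(report.splitlines()))
    print(out.getvalue())
```

The E-value and score are left as strings so that values are written to the sheet exactly as BLAST printed them.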


      • #4
        You really don't want to try and use Excel for parsing plain text BLAST output. Parsing plain text BLAST output is annoying enough in a proper language like Perl or Python; BioPerl, Biopython and the NCBI don't recommend it. Rather, they recommend using the tabular output (simpler) or the XML output (richer).

        Note BLAST+ lets you request quite a lot of extra columns of information in the tabular output. If that still isn't enough, I would write a script (not using Excel) to parse the extra information from the XML BLAST output.
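
A minimal sketch of such a script using only the Python standard library, assuming BLAST XML output (`-m 7` in legacy blastall, `-outfmt 5` in BLAST+). It also assumes that `Hit_def` holds the full miRBase FASTA header ("name accession Genus species ..."), which depends on how the database was formatted; if the name ends up in `Hit_id` instead, the split below needs adjusting. The function name is hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def first_hits_from_xml(source):
    """Yield (query, hit, evalue, bit_score, species) for each query's first hit.

    `source` is a file name or a binary file object containing BLAST XML.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        query = elem.findtext("Iteration_query-def")
        hit = elem.find("Iteration_hits/Hit")  # first hit only
        hsp = hit.find("Hit_hsps/Hsp") if hit is not None else None
        if hsp is not None:
            words = (hit.findtext("Hit_def") or "").split()
            name = words[0] if words else ""
            # Assumes words 3 and 4 of the header are the species name.
            species = " ".join(words[2:4]) if len(words) >= 4 else ""
            yield (
                query,
                name,
                hsp.findtext("Hsp_evalue"),
                hsp.findtext("Hsp_bit-score"),
                species,
            )
        elem.clear()  # free memory; large reports have many iterations


if __name__ == "__main__":
    # Hand-written fragment for illustration, not real BLAST output.
    demo = b"""<BlastOutput><BlastOutput_iterations><Iteration>
    <Iteration_query-def>132-291</Iteration_query-def>
    <Iteration_hits><Hit>
      <Hit_def>mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b</Hit_def>
      <Hit_hsps><Hsp>
        <Hsp_bit-score>28.2</Hsp_bit-score>
        <Hsp_evalue>0.031</Hsp_evalue>
      </Hsp></Hit_hsps>
    </Hit></Iteration_hits>
    </Iteration></BlastOutput_iterations></BlastOutput>"""
    for row in first_hits_from_xml(io.BytesIO(demo)):
        print("\t".join(row))
```

Streaming with `iterparse` keeps memory use flat on big result files, and the tab-separated rows it prints can be redirected to a file and opened in a spreadsheet if needed.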

        In fact, you really shouldn't want to use Excel for Bioinformatics in the first place. One very nicely documented reason is here
        Last edited by maubp; 11-15-2011, 04:57 AM. Reason: Note about BLAST+ extra columns in tabular output; recommendation


        • #5
          Thank you very much. I did not know about this limitation; I'll read the paper.



          • #6
            I see you're opting for Perl, in which case using BioPerl to parse the BLAST text output is a very good idea:


            • #7
              Yes, indeed! Thank you very much!

