
  • Markiyan
    replied
    Just use a space-separated, hash-like list after the seq_id, like key="value"

    In this case I would use a hash-like list:

    >[asm_id.contig_id] key1="value1" key2="value2" and so on

    example:

    >SP_AS1.CO0001 gene="dnaA" product="replication initiation protein" colour="255 128 0" db_xref="234234,623461,123634"
    ATG.....

    The most critical bits are the ID format/structure and the consistent fields list.

    It can then be written out as CSV, EMBL, etc. files:

    ID,gene,product,colour,db_xref,seq,seq_aa
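
As a rough illustration of the key="value" defline idea above, here is a minimal Python sketch that parses such a defline into a dictionary and writes the fields out as CSV. The function names (`parse_defline`, `format_defline`, `records_to_csv`) are hypothetical, not part of any toolkit, and the sketch assumes values never contain a double quote. The `seq_aa` column is left out because producing it would need a translation step that is not shown here.

```python
import csv
import io
import re

# Matches key="value" pairs; values may contain spaces or commas,
# but (by assumption) never a double quote.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_defline(defline):
    """Split '>ID key="value" ...' into (seq_id, {key: value})."""
    parts = defline.lstrip(">").strip().split(None, 1)
    seq_id = parts[0] if parts else ""
    rest = parts[1] if len(parts) > 1 else ""
    return seq_id, dict(PAIR_RE.findall(rest))


def format_defline(seq_id, fields):
    """Build a defline from an ID and a dict of annotation fields."""
    pairs = " ".join(f'{key}="{value}"' for key, value in fields.items())
    return f">{seq_id} {pairs}".rstrip()


def records_to_csv(records, field_names):
    """Turn (defline, sequence) pairs into CSV text with a fixed field list."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", *field_names, "seq"])
    for defline, seq in records:
        seq_id, fields = parse_defline(defline)
        # Missing fields become empty cells so every row has the same columns.
        writer.writerow([seq_id, *(fields.get(name, "") for name in field_names), seq])
    return out.getvalue()
```

Keeping the field list fixed (as `field_names`) is what makes the round trip to CSV straightforward; the csv module quotes values such as the comma-separated db_xref list automatically.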

    PS: if you encounter problems with BLAST+, then use NCBI's legacy BLAST.

    See more at: blastedbio.blogspot.co.uk
    Last edited by Markiyan; 03-03-2016, 05:40 AM. Reason: update about blast+ issues.

  • GenoMax
    replied
    Since the exact format of your csv file may differ, see if you can find a local informatics person who can help. This should be a simple script (even an awk/sed solution may be enough).

    If you post a few example lines of your headers/annotation someone here can help.

  • What is the best/easiest way to apply annotation info to a fasta defline?

    I've been doing some de novo transcriptome work and using standalone BLAST and other programs to identify gene homologs, features, etc., and I would like to know the best way to add some of this information after the > in the fasta/fastq defline.

    I often have my BLAST results output to .csv so it's possible to do a lookup-append kind of script (which is what I usually do), but I want to know if there is an actual toolkit or more versatile script for attaching this information.

    What do you usually use for this?
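
For reference, the lookup-append approach described in this post can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, the column names (`qseqid` and whatever is passed in `fields`) are placeholders for whatever the csv actually contains, and only FASTA input is handled (FASTQ would need a record-aware reader, since quality lines can begin with ">"). If the csv has no header row, column names can be supplied through `fieldnames`.

```python
import csv


def load_annotations(csv_handle, id_column="qseqid", fieldnames=None):
    """Map each sequence ID to its first row in the csv (column -> value)."""
    annotations = {}
    for row in csv.DictReader(csv_handle, fieldnames=fieldnames):
        # Keep only the first row per ID, e.g. the first hit listed.
        annotations.setdefault(row[id_column], row)
    return annotations


def annotate_fasta(fasta_lines, annotations, fields):
    """Yield FASTA lines, appending key="value" pairs to matching deflines."""
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            seq_id = parts[0] if parts else ""
            row = annotations.get(seq_id)
            if row:
                extra = " ".join(
                    # Swap double quotes so the key="value" syntax stays parseable.
                    '{}="{}"'.format(name, row[name].replace('"', "'"))
                    for name in fields
                    if row.get(name)
                )
                if extra:
                    line = f"{line} {extra}"
        yield line
```

A typical use would be to open the csv and the FASTA file, call `load_annotations` on the first, and write each line produced by `annotate_fasta` to a new file; sequences without a matching row pass through unchanged.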
