  • C Linux command line use of gene_info.gz

    Hi.

    I am working with Affy microarray data in R and Bioconductor for a course, and for my project I am reanalyzing the U133A and U133B data published by the group on this website:

    NCBI's Gene Expression Omnibus (GEO) is a public archive and resource for gene expression data.


    I have a clear idea about partitioning the data based on my thesis work which was written in C and utilizes the UniProt text flatfile downloadable from this site:



    My program can be downloaded from:



    I also have a tarball (extractable with tar xvfz) at http://kayve.net/promog.tgz, but I have since tweaked promog.c and the Makefile. The tarball also contains write-ups, PowerPoint slides, various runs, and a lot of other material, including an old uniprot_sprot.dat file.

    I was attracted by this post:

    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


    specifically, the posting of gene_info.gz. I am not sure this is the data I need, and even if it is, I do not fully understand it. I was investigating the bioDBnet tools, but I am not sure they have what I want.



    I have found web-based tools and tutorials that suggest cut and paste and XML, but for me the path of least resistance is to write some C code to do what I want, since the algorithm I devised for my partitioning is not trivial. I partition all 432,660 proteins in the file, so I won't be cutting and pasting. I'm not really interested in running a bunch of low-performance object-oriented stuff; I just want to understand the data so I can modify my C programs.

    Does that gene_info.gz file contain conversion data from the Affy arrays I mentioned above to UniProt flatfile accession IDs, so that I can dovetail my partition results with the software I have written myself, or do I need to look to other data? What is the column header information, so that I can understand the data well enough to perform this task?
    *----------------------------------------------------------*
    Kayven Riese, MSCS,
    MS (Physiology and Biophysics)
    (415) 902 5513 cellular
    http://kayve.net
    Webmaster http://ChessYoga.org
    *----------------------------------------------------------*
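
A minimal sketch for answering the column-header part of the question from C, offered under assumptions rather than as a definitive reader: it assumes the decompressed gene_info file is tab-delimited text whose first line is a '#'-prefixed header naming the columns, which should be checked against the actual download. The function names split_tabs and print_columns and the 64-column limit are made up for illustration and are not taken from promog.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split one line in place on tab characters.
 * Stores up to max_fields pointers in fields[] and returns how many were
 * stored; any fields beyond max_fields are dropped.
 * Empty fields are preserved (strtok would silently merge them).
 * A trailing newline or carriage return is removed first. */
size_t split_tabs(char *line, char **fields, size_t max_fields)
{
    size_t n = 0;
    char *p = line;

    assert(line != NULL && fields != NULL);
    line[strcspn(line, "\r\n")] = '\0';

    while (n < max_fields) {
        fields[n++] = p;
        p = strchr(p, '\t');
        if (p == NULL)
            break;          /* last field on the line */
        *p++ = '\0';        /* terminate this field, step past the tab */
    }
    return n;
}

/* Print "index<TAB>name" for each column named in a header line.
 * ASSUMPTION: the header may start with '#', which is skipped so the
 * first column name prints cleanly. The line is modified in place. */
void print_columns(FILE *out, char *header)
{
    char *cols[64];         /* hypothetical upper bound on column count */
    size_t i, n;

    if (header[0] == '#')
        header++;
    n = split_tabs(header, cols, 64);
    for (i = 0; i < n; i++)
        fprintf(out, "%zu\t%s\n", i, cols[i]);
}
```

One way to use it: decompress on the command line (for example zcat gene_info.gz piped into the program), read the first line from stdin with fgets into a generously sized buffer, and pass it to print_columns. The same split_tabs call can then be applied to each data line, so a field can be looked up by the index printed for its column name.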
