  • How do I automate the graphing of these data?

    Hi, everybody,

    I have result files generated by blastn which then were sorted based on the second field. A typical file looks like:

    360 miR156a
    1 miR156a
    9 miR156a
    1 miR156a
    10 miR156a
    7 miR156a
    1 miR156a
    705 miR157a
    2 miR157a
    1 miR157a
    5 miR157a
    4 miR157a
    67 miR157a
    5 miR157a
    11 miR157a
    2 miR157a
    34 miR159
    3 miR162
    3 miR166a
    17 miR166a
    4 miR166a
    103 miR167a
    1 miR167a
    ... .....

    The first column is the deep-sequencing read count for each unique sequence. The second column is the miR ID that the sequence aligns to.
    I would like to:
    1)
    Sum the read counts for each miR ID (e.g. for miR156a, sum rows 1-7);
    Generate a bar graph showing the total read count for each miR ID.


    I have more than 20 files like this, so I would like to automate the process. R came to mind,
    but I have not used R before. Can you give me some tips or suggestions about which R packages or tools to use? (I can then learn those and figure it out.)


    2)
    If possible, generate a table that summarizes the total read counts from all 20 files.
    The table that I would like to have looks like this:

    miRID sample1 sample2 sample3 ......... sample 20
    miR156 103 300 450 .......... 33
    miR157 205 300 ..........
    miR167 .....
    .... .......


    Thanks a lot!!

    Jian
    Last edited by yangjianhunt; 06-29-2012, 09:14 AM.

  • #2
    For 1), the bar plot part is easy in R; just use barplot()!

    Summing the counts can be done in a lot of different ways. Here is one that is maybe a bit cryptic but will teach you the table() command. Assume you have the table you pasted in a text file called mirna.txt. Try to run the following in R, with the mirna.txt file in the current working directory:

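    m <- read.table("mirna.txt")   # V1 = read counts, V2 = miR IDs
    q <- table(m)                  # rows: distinct count values, columns: miR IDs
    # Weight each count value by how often it occurs, then sum per miR ID
    totcounts <- as.numeric(rownames(q)) %*% q
    barplot(totcounts)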
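
    To see why the table() trick gives the totals, here is a minimal sketch on a toy data frame. It assumes the first few rows of the example above, typed in by hand instead of read from mirna.txt; the data frame simply stands in for the result of read.table().

    ```r
    # Toy stand-in for read.table("mirna.txt"): first rows of the example file.
    m <- data.frame(V1 = c(360, 1, 9, 1, 705, 2, 1),
                    V2 = c(rep("miR156a", 4), rep("miR157a", 3)))

    # q is a contingency table: one row per distinct count value,
    # one column per miR ID, each cell = how often that value occurs.
    q <- table(m)
    print(q)

    # Multiplying the row labels (the count values) into the table
    # gives, per column, sum(value * occurrences) = total reads per miR ID.
    totcounts <- as.numeric(rownames(q)) %*% q
    print(totcounts)
    ```

    The result is a one-row matrix with one column per miR ID, which is why barplot() draws one bar for each ID.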

    There are of course more transparent ways of summing the counts, but I'm too lazy to type them out :-)
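
    One of those more transparent ways could look like the sketch below. It is only an illustration: the column names count and mir are made up, and the toy data frame stands in for the file (with a real file you would use something like read.table("mirna.txt", col.names = c("count", "mir"))).

    ```r
    # Toy data in the layout of mirna.txt (first rows of the example file).
    m <- data.frame(count = c(360, 1, 9, 1, 705, 2, 1),
                    mir = c(rep("miR156a", 4), rep("miR157a", 3)))

    # tapply() splits the counts by miR ID and sums each group;
    # the result is a named vector, one element per miR ID.
    totals <- tapply(m$count, m$mir, sum)
    print(totals)

    # aggregate() computes the same sums but returns a data frame.
    totals_df <- aggregate(count ~ mir, data = m, FUN = sum)
    print(totals_df)

    # One bar per miR ID; las = 2 turns the labels so long IDs stay readable.
    barplot(totals, las = 2, ylab = "Total read count")
    ```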

    • #3
      Thanks a lot, kopi-o.

      This looks awesome. I will try it out.

      Jian

      • #4
        solved

        I eventually used:
        list.files() to get all the files
        lapply() to apply the processing to each file
        read.table() to read a data frame from each file
        tapply(SeqCounts, miRNA, sum) to get the total count for each "class"
        write.table() with append=TRUE to write the data into a file
        paste() and cat() to write a name before each appended block
        barplot() to draw the plot
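
        Put together, the steps above might look roughly like the sketch below. This is not the exact script that was used: the file names, the temporary directory and the toy values are invented so that the example runs on its own, and instead of appending to one file with append=TRUE it builds the miR-by-sample table from part 2) of the question directly. IDs that do not occur in a sample are filled with 0, which is an assumption (NA may suit some analyses better).

        ```r
        # Two small toy files so the sketch is self-contained (made-up values).
        dir <- tempfile("mirna_demo")
        dir.create(dir)
        writeLines(c("360 miR156a", "1 miR156a", "705 miR157a"),
                   file.path(dir, "sample1.txt"))
        writeLines(c("10 miR156a", "34 miR159"),
                   file.path(dir, "sample2.txt"))

        # Find the input files and sum the read counts per miR ID in each one.
        files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
        per_file <- lapply(files, function(f) {
          d <- read.table(f, col.names = c("SeqCounts", "miRNA"))
          tapply(d$SeqCounts, d$miRNA, sum)
        })
        names(per_file) <- sub("\\.txt$", "", basename(files))

        # Build one table: a row per miR ID, a column per sample.
        ids <- sort(unique(unlist(lapply(per_file, names))))
        summary_table <- sapply(per_file, function(x) {
          v <- setNames(numeric(length(ids)), ids)  # start every ID at 0
          v[names(x)] <- x                          # fill in the IDs this sample has
          v
        })
        print(summary_table)

        # Save the table as tab-separated text with row and column headers.
        write.table(summary_table, file.path(dir, "summary.txt"),
                    sep = "\t", quote = FALSE, col.names = NA)

        # One bar chart per sample.
        for (s in colnames(summary_table)) {
          barplot(summary_table[, s], main = s, las = 2, ylab = "Total read count")
        }
        ```

        With real data, dir would point at the folder holding the sorted blastn result files, and the pattern would need to match their actual names.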

        It took me a couple of days to learn the basics of R, but it was fun and I hope it will be useful in the future.

        Again, thanks to kopi-o for pointing the way. I haven't learned how to use the table() function yet, but I feel confident that I can learn it now.
