Seqanswers Leaderboard Ad

**ECO** · 10-03-2011, 08:51 AM

This is cake in R...

Depending on how large the datasets are, read them into R. The example below (using read.table) is just for using the clipboard in MacOSX (highlight in excel, copy, then run that line). The equiv in windows is read.table("clipboard",sep='\t',header=TRUE).

Code:

[B]> df1 <- read.table(pipe("pbpaste"),header=TRUE)
> df1[/B]
  Gene Change    Padj
1    1    212 0.00060
2    2     99 0.09000
3    3     15 0.70000
4    4     59 0.45600
5    5    418 0.00001
6    6    566 0.00000
[B]> df2 <- read.table(pipe("pbpaste"),header=TRUE)
> df2[/B]
  Gene Statistic    FDR
1    1        28 0.0005
2    2        17 0.0070
3    3         3 0.8100
4    4         8 0.2800
5    5        48 0.0020
6    6       111 0.0000
[B]> df3 <- merge(df1,df2)
> df3[/B]
  Gene Change    Padj Statistic    FDR
1    1    212 0.00060        28 0.0005
2    2     99 0.09000        17 0.0070
3    3     15 0.70000         3 0.8100
4    4     59 0.45600         8 0.2800
5    5    418 0.00001        48 0.0020
6    6    566 0.00000       111 0.0000
[B]> df3[df3$Padj< 0.05 & df3$FDR < 0.05,][/B]
  Gene Change  Padj Statistic   FDR
1    1    212 6e-04        28 5e-04
5    5    418 1e-05        48 2e-03
6    6    566 0e+00       111 0e+00
[B]> write.csv(df3[df3$Padj< 0.05 & df3$FDR < 0.05,])[/B]
"","Gene","Change","Padj","Statistic","FDR"
"1",1,212,6e-04,28,5e-04
"5",5,418,1e-05,48,0.002
"6",6,566,0,111,0

**byou678** · 10-03-2011, 11:57 AM

Thanks ECO, I run the similar codes in R, too. And I can get the result like yours Just based on the Spreadsheets showed in this thread. Sorry for not telling more info about my data:But I have more Genes(37970) in Spreadsheet2 than Spreadsheet1 (17980). It supposed to have 17980 Genes left after merging [the 19990(37970-17980) Genes which only in Spreadsheet2 will not appear in the merged sheets]. Unfortunately, it messed up after merging. Any idea will be greatly appreciated.

Originally posted by ECO View Post

This is cake in R...

Depending on how large the datasets are, read them into R. The example below (using read.table) is just for using the clipboard in MacOSX (highlight in excel, copy, then run that line). The equiv in windows is read.table("clipboard",sep='\t',header=TRUE).

Code:

[B]> df1 <- read.table(pipe("pbpaste"),header=TRUE)
> df1[/B]
  Gene Change    Padj
1    1    212 0.00060
2    2     99 0.09000
3    3     15 0.70000
4    4     59 0.45600
5    5    418 0.00001
6    6    566 0.00000
[B]> df2 <- read.table(pipe("pbpaste"),header=TRUE)
> df2[/B]
  Gene Statistic    FDR
1    1        28 0.0005
2    2        17 0.0070
3    3         3 0.8100
4    4         8 0.2800
5    5        48 0.0020
6    6       111 0.0000
[B]> df3 <- merge(df1,df2)
> df3[/B]
  Gene Change    Padj Statistic    FDR
1    1    212 0.00060        28 0.0005
2    2     99 0.09000        17 0.0070
3    3     15 0.70000         3 0.8100
4    4     59 0.45600         8 0.2800
5    5    418 0.00001        48 0.0020
6    6    566 0.00000       111 0.0000
[B]> df3[df3$Padj< 0.05 & df3$FDR < 0.05,][/B]
  Gene Change  Padj Statistic   FDR
1    1    212 6e-04        28 5e-04
5    5    418 1e-05        48 2e-03
6    6    566 0e+00       111 0e+00
[B]> write.csv(df3[df3$Padj< 0.05 & df3$FDR < 0.05,])[/B]
"","Gene","Change","Padj","Statistic","FDR"
"1",1,212,6e-04,28,5e-04
"5",5,418,1e-05,48,0.002
"6",6,566,0,111,0

**ECO** · 10-03-2011, 01:06 PM

Just need to tell merge what to do...(note I added two extra genes into df1)

Code:

[B]> df1[/B]
  Gene Change    Padj
1    1    212 0.00060
2    2     99 0.09000
3    3     15 0.70000
4    4     59 0.45600
5    5    418 0.00001
6    6    566 0.00000
7    7    123 0.00000
8    8    125 0.04000
[B]> df2[/B]
  Gene Statistic    FDR
1    1        28 0.0005
2    2        17 0.0070
3    3         3 0.8100
4    4         8 0.2800
5    5        48 0.0020
6    6       111 0.0000
[B]> merge(df1,df2,on=Gene)[/B]
  Gene Statistic    FDR Change    Padj
1    1        28 0.0005    212 0.00060
2    2        17 0.0070     99 0.09000
3    3         3 0.8100     15 0.70000
4    4         8 0.2800     59 0.45600
5    5        48 0.0020    418 0.00001
6    6       111 0.0000    566 0.00000

edit: Looks like "on" isn't necessary...something else going on with your data? Merge is pretty smart by default...

**byou678** · 10-04-2011, 07:42 PM

Thanks again ECO. Yes, I like using R too and I have figured it out.

Originally posted by ECO View Post

Just need to tell merge what to do...(note I added two extra genes into df1)

Code:

[B]> df1[/B]
  Gene Change    Padj
1    1    212 0.00060
2    2     99 0.09000
3    3     15 0.70000
4    4     59 0.45600
5    5    418 0.00001
6    6    566 0.00000
7    7    123 0.00000
8    8    125 0.04000
[B]> df2[/B]
  Gene Statistic    FDR
1    1        28 0.0005
2    2        17 0.0070
3    3         3 0.8100
4    4         8 0.2800
5    5        48 0.0020
6    6       111 0.0000
[B]> merge(df1,df2,on=Gene)[/B]
  Gene Statistic    FDR Change    Padj
1    1        28 0.0005    212 0.00060
2    2        17 0.0070     99 0.09000
3    3         3 0.8100     15 0.70000
4    4         8 0.2800     59 0.45600
5    5        48 0.0020    418 0.00001
6    6       111 0.0000    566 0.00000

edit: Looks like "on" isn't necessary...something else going on with your data? Merge is pretty smart by default...

Topics	Statistics	Last Post
Expanding the Horizons of Cellular Research with the Single Cell Atlas by seqadmin Started by seqadmin, 04-25-2024, 11:49 AM	0 responses 19 views 0 likes	Last Post by seqadmin 04-25-2024, 11:49 AM
Genetic Variants and Diabetes Risk in Childhood Cancer Survivors by seqadmin Started by seqadmin, 04-24-2024, 08:47 AM	0 responses 18 views 0 likes	Last Post by seqadmin 04-24-2024, 08:47 AM
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 62 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 60 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM

Seqanswers Leaderboard Ad

Announcement

How to extract Common Genes from 2 spreadsheets

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News