**dpryan** · 02-16-2014, 06:57 AM

There are NAs in the data, and by default "cor()" returns NA for every pair that involves one, which is probably not what you want. See the "use=" option.
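
As a rough illustration of the "use=" option, here is a minimal sketch on a made-up matrix (`m` and its values are hypothetical, not the thread's data). It compares the default behaviour with pairwise and listwise deletion.

```r
# Toy matrix with one missing value (hypothetical data)
m <- cbind(a = c(1, 2, 3, 4, NA),
           b = c(2, 4, 6, 8, 10),
           c = c(5, 3, 4, 1, 2))

# Default (use = "everything"): pairs involving the column with an NA come out NA
cor_default <- cor(m)

# Pairwise deletion: each pair uses the rows where both columns are observed
cor_pairwise <- cor(m, use = "pairwise.complete.obs")

# Listwise deletion: rows containing any NA are dropped before correlating
cor_complete <- cor(m, use = "complete.obs")

print(cor_default)
print(cor_pairwise)
```

Which of the two deletion modes is appropriate depends on the data; pairwise deletion keeps more observations but can base different entries of the matrix on different rows.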

**sindrle** · 02-16-2014, 07:22 AM

Hi again!
I have updated the script. It now removes the many rows and columns with variance = 0.

There should not be any NAs anymore, but I'm not sure.
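
One way to check this directly, sketched on a small made-up matrix (`m` is hypothetical): test for NAs with `anyNA()`, find zero-variance columns, and confirm that the correlation matrix is NA-free once they are gone. A zero-variance column makes `cor()` return NA with a warning, even when the input itself has no NAs.

```r
# Toy matrix: column b is constant, so its variance is 0 (hypothetical data)
m <- cbind(a = c(1, 2, 3, 4),
           b = c(5, 5, 5, 5),
           c = c(2, 1, 4, 3))

# No missing values in the input itself
has_na <- anyNA(m)

# Flag columns with zero variance
zero_var <- apply(m, 2, var) == 0

# cor() warns "the standard deviation is zero" and yields NA for column b
cors <- suppressWarnings(cor(m))

# drop = FALSE keeps a matrix even if only one column survives
m_ok <- m[, !zero_var, drop = FALSE]
n_na_after <- sum(is.na(cor(m_ok)))

print(has_na)
print(n_na_after)
```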

Still, the heat map looks very black, and the row clusters seem hard to interpret.

Screen Shot 2014-02-16 at 16.20.05.pdf

```r
library(clusterGenomics)

# Read the table without its header line (the header is attached below)
infile <- "~/RNAseq/INFSTK-5010/Oblig1/NEJM_Web_Fig1data.txt"
data <- read.table(file = infile, header = FALSE, skip = 1, sep = "\t")
dim(data)

# Find lines whose field count differs from 295
test <- count.fields(infile, sep = "\t")
which(test != 295)

# Read the 295 column names from the first line and attach them
header <- scan(infile, "", n = 295)
colnames(data) <- header

# Drop row 7401 and column 275
fixdata <- data[-7401, -275]

# Numeric matrix without the first two columns; row names from column 1
x <- data.matrix(fixdata[, -1:-2], rownames.force = NA)
rownames(x) <- fixdata[, 1]

# Replace missing values with 0
x[is.na(x)] <- 0

# Remove zero-variance columns, then zero-variance rows
ind <- apply(x, 2, var) == 0
x <- x[, !ind]
ind <- apply(x, 1, var) == 0
x <- x[!ind, ]

# Average-linkage clustering on 1 - Pearson correlation
colclust <- hclust(as.dist(1 - cor(x, method = "pearson")), method = "average")
rowclust <- hclust(as.dist(1 - cor(t(x), method = "pearson")), method = "average")

# Reorder rows and columns by dendrogram order, then plot
z <- x[rev(rowclust$labels[rowclust$order]), colclust$labels[colclust$order]]
plotHeatmap(z, fast = TRUE)

# PART clustering of the columns and of the rows
res <- part(t(x), B = 10, Kmax = 10, minSize = 40, dist.method = "cor")
plotTreeCol(clust = colclust, groups = res$lab.hatK[colclust$order])
res2 <- part(x, B = 10, Kmax = 10, minSize = 40, dist.method = "cor")
plotTreeRow(clust = rowclust, groups = res2$lab.hatK[rowclust$order])

# Cut the column tree into 3 groups, and at height 2
groups <- cutree(colclust, k = 3)
groups2 <- cutree(colclust, h = 2)

# Count the columns where the PART and cutree labels differ
comparison <- cbind(res$lab.hatK, groups)
colnames(comparison) <- c("PART", "cutree")
test <- ifelse(comparison[, 1] == comparison[, 2], 1, NA)
table(is.na(test))["TRUE"]
```
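
A caveat on the last comparison: cluster label numbers are arbitrary, so two methods can find the same groups and still disagree label by label. The thread does not say whether PART and cutree number their clusters the same way, so the following is only a sketch on made-up points (`pts`, `labels_a`, `labels_b` are hypothetical). It shows how a contingency table exposes agreement that an element-wise comparison misses.

```r
set.seed(42)

# Two well-separated toy groups of 10 points each (hypothetical data)
pts <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
             matrix(rnorm(20, mean = 10), ncol = 2))

hc <- hclust(dist(pts), method = "average")
labels_a <- cutree(hc, k = 2)

# Same partition, but with the label numbers swapped (1 <-> 2)
labels_b <- 3 - labels_a

# Element-wise comparison: no label matches, although the groups are identical
n_equal <- sum(labels_a == labels_b)

# Cross-tabulation: each row has a single non-zero cell, so the partitions agree
agreement <- table(first = labels_a, second = labels_b)

print(n_equal)
print(agreement)
```

Applied to the script above, the analogous call would be `table(comparison[, "PART"], comparison[, "cutree"])`.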

**dpryan** · 02-16-2014, 10:52 AM

This is because ~95% of the values are no more than 10% away from 0 after being normalized. Do a "hist(x)" to see this.
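
To put a number on this, here is a minimal sketch with simulated values standing in for the normalized matrix (`x_toy` is hypothetical, and reading "10% away from 0" as an absolute value of at most 0.1 is an assumption). It measures the share of entries near zero and clips the matrix to quantile-based limits, which is one common way to keep a few extreme values from stretching the colour scale.

```r
set.seed(1)

# Hypothetical stand-in: 95% of values tightly around 0, 5% widely spread
vals <- sample(c(rnorm(950, sd = 0.03), rnorm(50, sd = 2)))
x_toy <- matrix(vals, nrow = 50)

# Share of entries within 0.1 of zero
frac_near_zero <- mean(abs(x_toy) <= 0.1)

# Histogram counts without drawing; drop plot = FALSE to see the figure
h <- hist(x_toy, breaks = 50, plot = FALSE)

# Symmetric limit from the 2.5% and 97.5% quantiles
lim <- max(abs(quantile(x_toy, c(0.025, 0.975))))

# Clip values outside [-lim, lim]; the matrix dimensions are preserved
x_clipped <- pmin(pmax(x_toy, -lim), lim)

print(frac_near_zero)
print(range(x_clipped))
```

Whether clipping is appropriate here, and whether the paper's figure used something similar, is not stated in the thread.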

I'd never heard of the "clusterGenomics" package before. I guess the treecutting part is interesting.

**sindrle** · 02-16-2014, 01:09 PM

Thanks for the input. That was what I thought, but it doesn't look like the one in the paper...

I also liked the tree cutting!


Troubleshot Heatmap
