Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Have anyone tried analysis in Baslan 2012 "Genome-wide CNV analysis of Single-cell"?

    Hi All,

    I have been trying to use the analysis flow suggested by the paper, but it seems like I am not getting it. Just a show of hand if anyone of you have successfully replicated the analysis workflow?


    Wilson

  • #2
    peaks <- function(series, span=3, ties.method = "first") {
    ### This peaks function from Petr Pikal https://stat.ethz.ch/pipermail/r-hel...ry/125241.html
    if((span <- as.integer(span)) %% 2 != 1) stop("'span' must be odd")
    z <- embed(series, span)
    s <- span%/%2
    v <- max.col(z, ties.method=ties.method) == 1 + s
    pad <- rep(FALSE, s)
    result <- c(pad, v, pad)
    result
    }

    args=commandArgs(TRUE)

    #df <- read.table("./SRR054616.hg19.50k.k50.nobad.varbin.data.txt", header=T)
    #dfs <- read.table("./SRR054616.hg19.50k.k50.nobad.varbin.short.txt", header=T)

    df <- read.table("hg19.50k.k50.nobad.varbin.data.txt", header=T)
    dfs <- read.table("hg19.50k.k50.nobad.varbin.short.txt", header=T)

    starts <- c()
    ends <- c()
    prevEnd <- 0
    len <- nrow(dfs)
    for (j in 1:len) {
    thisStart = prevEnd + 1
    thisEnd = thisStart + dfs$num.mark[j] - 1
    starts <- c(starts, thisStart)
    ends <- c(ends, thisEnd)
    prevEnd = thisEnd
    }

    amat <- matrix(data=0, nrow=1500000, ncol=1)
    counter <- 1
    for (j in 1len-1)) {
    for (k in (j+1):len) {
    N <- round((starts[j] - ends[j] + 1) * (starts[k] - ends[k] + 1)/1000)
    D <- abs(2^dfs$seg.mean[j] - 2^dfs$seg.mean[k])
    #cat(N, "\t")
    #cat(starts[j],"\t")
    #cat(ends[j],"\t")
    #cat(starts[k],"\t")
    #cat(ends[k],"\t")
    #cat(N,"\t")
    #cat(counter,"\t")
    #cat(D,"\t")
    #cat(len,"\n")
    if (N > 0) {
    amat[(countercounter+N-1)), 1] <- rep.int(D, N)
    counter <- counter+N
    }
    }
    }
    a3 <- amat[(1:counter),1]
    a3.95 <- sort(a3)[round(.95*counter)]
    a3d <- density(a3[which(a3 < a3.95)], n=1000)
    a3d$x[which(peaks(as.vector(a3d$y), span=3))]
    a3d$y[which(peaks(as.vector(a3d$y), span=3))]
    cn0 <- a3d$x[which(peaks(as.vector(a3d$y), span=59))][1]
    cn1 <- a3d$x[which(peaks(as.vector(a3d$y), span=59))][2]
    quantile(a3d$x[which(peaks(as.vector(a3d$y), span=3))],c(0.2,0.4,0.6,0.8,1))
    cn0
    cn1
    df$cn.ratio <- df$lowratio / cn1
    df$cn.seg <- df$seg.mean.LOWESS / cn1
    df$copy.number <- round(df$cn.seg)

    write.table(df, sep="\t", file=paste("sc.varbin.data.copynumber.txt", sep=""), quote=F, row.names=F)

    postscript("sc.wg.cn.density.ps", height=400, width=600)
    par(mar=c(5.1,4.1,4.1,4.1))
    plot(a3d, main="seg.mean difference density")
    dev.off()

    for (a in 1:24) {
    postscript(paste("sc.wg.cn.chr", a, ".ps", sep=""), height=400, width=600)
    par(mar=c(5.1,4.1,4.1,4.1))
    plot(df$cn.ratio[df$chrom==a], main=paste("SRR054616 chr", a, sep=""), xlab="Bin", ylab="Ratio", col="#CCCCCC")
    lines(df$cn.ratio[df$chrom==a], col="#CCCCCC")
    points(df$cn.seg[df$chrom==a], col="#0000DD")
    lines(df$cn.seg[df$chrom==a], col="#0000DD")
    points(df$copy.number[df$chrom==a], col="#DD0000")
    lines(df$copy.number[df$chrom==a], col="#DD0000")
    dev.off()
    }
    This is the part where the author tries to estimate copy number. It is kind of him to share his code, but it comes without any comment or explanation.

    Can anyone familiar with CNV field kind enough to explain what this R code is doing?

    Thank you.


    Wilson

    Comment

    Latest Articles

    Collapse

    • seqadmin
      Non-Coding RNA Research and Technologies
      by seqadmin




      Non-coding RNAs (ncRNAs) do not code for proteins but play important roles in numerous cellular processes including gene silencing, developmental pathways, and more. There are numerous types including microRNA (miRNA), long ncRNA (lncRNA), circular RNA (circRNA), and more. In this article, we discuss innovative ncRNA research and explore recent technological advancements that improve the study of ncRNAs.

      Nobel Prize for MicroRNA Discovery
      This week,...
      10-07-2024, 08:07 AM
    • seqadmin
      Recent Developments in Metagenomics
      by seqadmin





      Metagenomics has improved the way researchers study microorganisms across diverse environments. Historically, studying microorganisms relied on culturing them in the lab, a method that limits the investigation of many species since most are unculturable1. Metagenomics overcomes these issues by allowing the study of microorganisms regardless of their ability to be cultured or the environments they inhabit. Over time, the field has evolved, especially with the advent...
      09-23-2024, 06:35 AM

    ad_right_rmr

    Collapse

    News

    Collapse

    Topics Statistics Last Post
    Started by seqadmin, 10-02-2024, 04:51 AM
    0 responses
    104 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 10-01-2024, 07:10 AM
    0 responses
    112 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 09-30-2024, 08:33 AM
    1 response
    115 views
    0 likes
    Last Post EmiTom
    by EmiTom
     
    Started by seqadmin, 09-26-2024, 12:57 PM
    0 responses
    21 views
    0 likes
    Last Post seqadmin  
    Working...
    X