Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Show Overall Coverage

    Hi,

    is there a possibility to get an overview picture of the coverage? I can do this with IGV only for parts of a chromosome when I am zoomed in very deep.

    Further more, is there a possiblity to filter out regions or exons which don't reach a coverage threshold?
    Thanks
    Last edited by Firebird; 12-17-2010, 02:03 AM.

  • #2
    Hi,

    Are you familiar with R? I once got some R code I think from this forum (I neglected to write that down ).

    I don't know what files you have available but I had a samtools pileup so I ran:
    Code:
    cut -f 2,8 pileup.out > pileup.cov
    to extract the coverage for each position and then in R:
    Code:
    data <- read.table (file="pileup.cov", sep="\t", header=F)
    colnames(data) <- c("position", "coverage")
    depth <- mean(data[,"coverage"])
    window <- 101
    rangefrom <- 0
    rangeto <- length(data[,"position"])
    data.smoothed<-runmed(data[,"coverage"],k=window)
    png(file="coverage.png",width=1900,height=1000)
    plot(x=data[rangefrom:rangeto,"position"],y=data.smoothed[rangefrom:rangeto],pch=".", cex=1,xlab="bp position",ylab="depth",type="l")
    dev.off()
    Then check the png file for a quick coverage overview.

    I did this for a small insect genome. Perhaps you should split per chromosome if human.

    Hope that helps,
    cheers

    Comment


    • #3
      No, unfortunatly I'm not familiar with R.

      I can try it with the code u provided. Is it ok when I just change the path to my file?

      Thank
      Last edited by Firebird; 12-17-2010, 07:47 AM.

      Comment


      • #4
        SeqMonk can display an overview of a quantitation over the whole genome, for any of the quantitation types it supports. Is this anything close to what you're after?
        Attached Files

        Comment


        • #5
          bruins approach does not account for regions without coverage (pileup does not show them). You need to incorporate your reference genome to account for that.
          -drd

          Comment


          • #6
            Try Tablet:

            Homepage: Dan Bolser
            MetaBase the database of biological databases.

            Comment


            • #7
              @drio: true, but the plot can be useful to quickly see if there are peaks in the coverage and the overall depth of the coverage. An example. (sorry for the size...) It shows that there are some extreme peaks and that mostly the coverage is very low.


              @Firebird: I forgot to mention the cut command is a shell command. Er...

              Perhaps (if you're still interested in creating the quick plot and you have a samtools pileup file) you can split the pileup per chromosome and read in all columns. Then copy paste into R:
              Code:
              setwd("/path/to/your/workingdirectory")
              data <- read.table (file="pileup_chr1.cov", sep="\t", header=F)
              As for the path, I'm a Linux-person. If you use Windows you know better than me what the path should look like.
              Now I'm not so sure about the column naming and if R accepts less column names than there are columns. If it complains about it, just add the correct number of column names after column 8 coverage.
              Code:
              colnames(data) <- c("chr", "position", "3", "4", "5", "6", "7", "coverage")
              Code:
              depth <- mean(data[,"coverage"])
              window <- 101
              rangefrom <- 0
              rangeto <- length(data[,"position"])
              data.smoothed<-runmed(data[,"coverage"],k=window)
              png(file="coverage_chr1.png",width=1900,height=1000)
              plot(x=data[rangefrom:rangeto,"position"],y=data.smoothed[rangefrom:rangeto],pch=".", cex=1,xlab="bp position",ylab="depth",type="l")
              dev.off()
              You can close R after this, it has written a png file called coverage_chr1.png.

              Comment


              • #8
                Hi Simon,

                Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

                Can it be used to show RNA-seq coverage if so what would be the steps I will be following in Seqmonk, I have tophat out put bam file.
                Thanks.

                Comment


                • #9
                  Originally posted by honey View Post
                  Hi Simon,

                  Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

                  Can it be used to show RNA-seq coverage if so what would be the steps I will be following in Seqmonk, I have tophat out put bam file.
                  Thanks.
                  A lot of this is covered in the tutorial video we made for RNA-Seq analysis in SeqMonk, but the basic steps are:
                  1. Import the BAM file as single end (even if your data is paired end) and choose to split spliced reads
                  2. Use the feature probe generator to make probes over mRNA features, and split these into subfeatures (exons)
                  3. Use the base pair quantitation to quantitate your data, and ensure that you correct for probe length (so as to get comparable measures for probes of different lengths)


                  Your data will be now be quantitated by the read density over each exon, and you can zoom right out to see the pattern of expression across the currently selected chromosome. To see the pattern over the whole genome simply select the dataset you want to see in the data view (the folders in the top left), and the genome view in the top right will show you a coverage plot for the whole genome (similar to the one attached to an earlier note in this thread).

                  Hope this helps

                  Comment

                  Latest Articles

                  Collapse

                  • seqadmin
                    Current Approaches to Protein Sequencing
                    by seqadmin


                    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                    04-04-2024, 04:25 PM
                  • seqadmin
                    Strategies for Sequencing Challenging Samples
                    by seqadmin


                    Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                    03-22-2024, 06:39 AM

                  ad_right_rmr

                  Collapse

                  News

                  Collapse

                  Topics Statistics Last Post
                  Started by seqadmin, 04-11-2024, 12:08 PM
                  0 responses
                  30 views
                  0 likes
                  Last Post seqadmin  
                  Started by seqadmin, 04-10-2024, 10:19 PM
                  0 responses
                  32 views
                  0 likes
                  Last Post seqadmin  
                  Started by seqadmin, 04-10-2024, 09:21 AM
                  0 responses
                  28 views
                  0 likes
                  Last Post seqadmin  
                  Started by seqadmin, 04-04-2024, 09:00 AM
                  0 responses
                  53 views
                  0 likes
                  Last Post seqadmin  
                  Working...
                  X