  • Depth of Coverage

    I am using the following command to run DepthOfCoverage:

    java -jar /Software/GenomeAnalysisTK.jar -I test_R1R2aln.sam.readgrporg_ordered.bam --outputFormat csv -o test_Coverage_summary -T DepthOfCoverage -R /Run1/H.Sapiens/ucsc.hg19.fasta -L testexonbed.list

    Here -I test_R1R2aln.sam.readgrporg_ordered.bam is the BAM file with the aligned R1 and R2 reads for all samples, and -L testexonbed.list is the list of intervals.

    My output in Coverage_summary looks like this:

    Locus Total_Depth Average_Depth_sample Depth_for_sample1
    chrx:00004900 34 34.00 34
    chrx:00004909 34 34.00 34
    chrx:00004910 34 34.00 34
    chrx:00004911 34 34.00 34
    chrx:00004912 34 34.00 34
    chrx:00004913 34 34.00 34
    chrx:00004914 34 34.00 34

    1- What does it mean?
    2- How do I know which sample this data refers to?
    3- How do I tell my forward and reverse reads apart?

    Another output file, sample_interval_summary, looks like this:

    Target total_coverage average_coverage sample1_total_cvg sample1_mean_cvg sample1_granular_Q1 sample1_granular_median sample1_granular_Q3 sample1_%_above_15
    chrx:00004900-00005100 9690 46.81 9690 46.81 51 51 51 100.0
    chrx:00006100-00006200 7420 140.00 7420 140.00 141 141 141 100.0
    chrx:00006800-00007000 10660 65.00 10660 65.00 5 87 87 75.0
    chrx:00007000-00007200 23606 159.50 23606 159.50 149 153 153 100.0

  • #2
    I have another question related to my previous ones. If I want to calculate depth of coverage for multiple samples that are contained in one BAM file, can I give GATK that one file and have it report depth of coverage per sample? I feel like I need to give it a sample information file, but I can't find an option for it.


    • #3
      Hi Viberance,

      It's a little hard to tell from your output, I agree, especially because there's only one sample, so all the numbers are the same.

      Here's one of mine:

      Locus Total_Depth Average_Depth_sample Depth_for_Sample1
      1:10385451 2751 250.09 144

      1- What does it mean:
      Locus = the genomic coordinate
      Total_Depth = Cumulative depth across all samples (I had 12 samples)
      Average_Depth_sample = Average depth at this position from all samples
      Depth_for_sample1 = The depth for that sample

      2- How do I know what sample this data is referring to:
      It will name them based on the names of the BAMs. If you expected more than one, you probably need to work on the format of your list of input BAMs.

      3- How do I know my forward and reverse reads:
      You need a different tool - correct me if I'm wrong
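For what it's worth, strand is recorded per read in the SAM FLAG field: bit 0x10 (decimal 16) is set when the read aligns to the reverse strand. A minimal sketch of decoding it in shell; the function name strand_of_flag is made up for illustration, and the commented samtools lines assume samtools is installed and that the BAM file name is yours.

```shell
# Report the strand encoded in a SAM FLAG value.
# Bit 0x10 (16) set means the read maps to the reverse strand.
strand_of_flag() {
    if [ $(( $1 & 16 )) -ne 0 ]; then
        echo reverse
    else
        echo forward
    fi
}

# Counting reads per strand with samtools (assumes samtools is installed):
# samtools view -c -F 16 test_R1R2aln.sam.readgrporg_ordered.bam   # flag 16 not set
# samtools view -c -f 16 test_R1R2aln.sam.readgrporg_ordered.bam   # flag 16 set
```

Note that the -F 16 count also includes unmapped reads, since they do not have the reverse-strand bit set.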

      4- Multiple samples contained in one BAM file; can I give GATK one file and have it do depth of coverage by sample:
      I've never used multiple samples per BAM, but from what I understand, if the BAM is formatted right then it shouldn't be a problem. If you can't get it to work, split your BAM into multiple BAMs, one per sample.
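As far as I know, "formatted right" here means every read belongs to a read group whose SM field names its sample, since GATK reports depth per SM value found in the @RG header lines. A minimal sketch of building such a read group string for bwa mem's -R option; the helper name rg_string, reusing the sample ID for ID and LB, and PL:ILLUMINA are all assumptions for illustration, not something from this thread.

```shell
# Build a read-group header string for `bwa mem -R` from a sample ID.
# The output contains literal backslash-t sequences, which bwa converts to tabs.
rg_string() {
    sample="$1"
    printf '@RG\\tID:%s\\tSM:%s\\tPL:ILLUMINA\\tLB:%s' "$sample" "$sample" "$sample"
}

# Example use (assumes bwa and samtools are installed; file names are placeholders):
# bwa mem -t 12 -R "$(rg_string 0747)" ref.fasta s_R1.fastq.gz s_R2.fastq.gz > s.sam
# samtools view -H s.bam | grep '^@RG'   # list the read groups a BAM actually has
```

If the grep shows a single @RG line, or a single SM value, GATK can only report one sample for that BAM.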

      Hope that helps


      • #4
        Hi Shimbalama

        I think the problem is in my BAM file and the way I created it. I have about 80 samples as R1.fastq.gz and R2.fastq.gz files. I ran cat on these fastq.gz files to make one all80R1.fastq.gz and one all80R2.fastq.gz.

        I used BWA mem to create one BAM file from these paired-end files (all80R1.fastq.gz and all80R2.fastq.gz), but further down in my analysis the individual samples don't show up. Everything looks like the Coverage_summary output in my previous post.

        Thanks for the help


        • #5
          Thinking the problem is in my BAM file, I want to run BWA mem on the individual samples instead. I have R1 and R2 reads in fastq.gz format, and I want to run BWA mem paired-end, in parallel, on all the files so that each complementary R1/R2 pair produces one SAM file. Right now I am getting two SAM files, one from each read file.

          This is what I have come up with, but it's not doing what I need it to do:

          for i in find -maxdepth 2 -iname *fastq.gz -type f; do echo "bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ${i}_R1_001.fastq.gz ${i}_R2_001.fastq.gz > ${i}_R1_R2.sam"; done

          When it runs, the output looks like this:

          bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_001.fastq.gz ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R2_001.fastq.gz > ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_R2.sam

          bwa mem -t 12 H.Sapiens/ucsc.hg19.fasta ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_001.fastq.gz ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R2_001.fastq.gz > ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_R2.sam
          I understand the problem is in -iname, but how do I fix it?
          Thank you so much
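One likely cause: the -iname pattern matches both the R1 and the R2 file, so ${i} already holds a full file name and the suffixes get appended on top of it. A minimal sketch of one way around this, assuming the file names end in _R1_001.fastq.gz and _R2_001.fastq.gz and that paths contain no spaces: search for R1 files only, strip the suffix to get the shared prefix, and derive the R2 name from it. The function name print_bwa_commands is made up for illustration. Like the loop above, it only prints the commands.

```shell
# Print one paired-end bwa mem command per R1/R2 pair found under a directory.
# $1 = directory to search, $2 = reference FASTA.
print_bwa_commands() {
    root="$1"
    ref="$2"
    # Quote the pattern so the shell does not expand it before find sees it,
    # and match only R1 files so each pair is visited once.
    find "$root" -maxdepth 2 -type f -name '*_R1_001.fastq.gz' | sort |
    while IFS= read -r r1; do
        prefix="${r1%_R1_001.fastq.gz}"    # shared prefix of the pair
        r2="${prefix}_R2_001.fastq.gz"
        if [ ! -f "$r2" ]; then
            echo "skipping $r1: no matching R2 file" >&2
            continue
        fi
        echo "bwa mem -t 12 $ref $r1 $r2 > ${prefix}_R1_R2.sam"
    done
}

# Dry run from the directory that holds the Sample_* folders:
# print_bwa_commands . /H.Sapiens/ucsc.hg19.fasta
```

Once the printed commands look right, piping the output to sh runs them one after another. Since each command already uses 12 threads, running several at once only makes sense if the machine has cores to spare.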

