Why is an indel clearly visible in IGV, but not reported?

  • Why is an indel clearly visible in IGV, but not reported?

    I have a BAM file where there are several indels that look real when viewed in IGV, but for some reason they are not being called by any of the variant callers that I tried. Even using samtools mpileup with bcftools does not produce them. The coverage is over 1000x and the indels are fairly frequent, so this is not based on just a few poor quality reads.

    I know they are there. I can see them. All I want is the frequency. Why is this so difficult?

  • #2
    1000x plus is extremely high coverage.

    Are these highly repetitive regions, or do you have this coverage for the entire genome? Extremely high read depth (like 1000x) is typical of repetitive regions and leads to a lot of false positives, so variants at very high read depth are often filtered out.

    It would help if you posted exactly what commands and arguments you used to call variants. For instance, if you followed the examples given in the Samtools manual, you probably filtered out everything with a read depth over 100x.



    • #3
      This is targeted resequencing, so high coverage is expected. However, this is indeed a repetitive region.

      The command I used for mpileup (adjusting the depth cutoff):
      samtools mpileup -uf hg19.fa -d 100000 -l interval.bed sample.bam | bcftools view -vcg - > sample.vcf

      Here are the expanded and the collapsed IGV screenshots of one deletion:



      • #4
        Is this really "one deletion"? This is obviously a highly repetitive sequence, and it is easy to see that many of these reads may be misaligned. I don't think you can look at these alignments in IGV and be so confident that there is an indel and that the problem lies with the variant callers.



        • #5
          I meant a deletion in one specific location. Yes, it does seem like there are multiple deletions there. However, the reads reach the end of the repetitive region on both ends. Doesn't that add more confidence?

          Regardless, if they are showing up in the alignment as different, why can't they be quantified?
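If all that is wanted is a raw frequency at one position, it can be tallied straight from the alignments without a variant caller. Below is a minimal sketch in plain Python, assuming each read has already been reduced to a (0-based start position, CIGAR string) pair. The function names and the example reads are made up for illustration and are not taken from the BAM file in this thread. Note that it counts gaps exactly as the aligner placed them, so it does nothing about misaligned reads in a repeat.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = "MDN=X"  # operations that advance along the reference


def deletion_spans(pos, cigar):
    """Yield 0-based, half-open reference intervals deleted in one read."""
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "D":
            yield (ref, ref + length)
        if op in REF_CONSUMING:
            ref += length


def reference_end(pos, cigar):
    """Return the 0-based, exclusive reference end of the alignment."""
    return pos + sum(
        int(length) for length, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING
    )


def deletion_frequency(reads, site):
    """Return (reads with a deletion over `site`, reads spanning `site`)."""
    deleted = covering = 0
    for pos, cigar in reads:
        if not (pos <= site < reference_end(pos, cigar)):
            continue  # read does not span the site at all
        covering += 1
        if any(start <= site < end for start, end in deletion_spans(pos, cigar)):
            deleted += 1
    return deleted, covering


# Hypothetical reads: three carry a 6 bp deletion at 120-126, one does not,
# and one does not reach the site.
reads = [
    (100, "20M6D30M"),
    (105, "50M"),
    (110, "10M6D40M"),
    (90, "30M6D20M"),
    (200, "50M"),
]
deleted, covering = deletion_frequency(reads, 122)
print(deleted, covering, deleted / covering)  # 3 4 0.75
```

In practice the (position, CIGAR) pairs would come from the BAM file itself, and the count is only as trustworthy as the gap placement in each read.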



          • #6
            Originally posted by chadn737
            Is this really "one deletion"? This is obviously a highly repetitive sequence, and it is easy to see that many of these reads may be misaligned. I don't think you can look at these alignments in IGV and be so confident that there is an indel and that the problem lies with the variant callers.
            Does local realignment help in a situation like this?



            • #7
              Originally posted by xied75
              Does local realignment help in a situation like this?
              I tried with and without realigning. It does not make a significant difference.



              • #8
                Can you show a picture of this same region in expanded view, but slightly larger so that we can see more of the sequence to the right?

                Here's the thing. I could take the very bottom read in the expanded view, introduce a gap at the same spot where all the other gaps start, shift the rest of the read over by 6 bp, and it would match perfectly; now you would have a 6 bp deletion. It's no coincidence that these gaps appear right where that repetitive region starts.
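The point about shifting a gap inside a repeat can be checked with a few lines of Python. This is only a sketch under stated assumptions: the reference string (a `CAG` repeat with made-up flanks) and the helper names `delete`, `equivalent_deletion_starts`, and `left_align` are invented for illustration and are not the sequence in the screenshots.

```python
def delete(ref, start, length):
    """Return `ref` with `length` bases removed beginning at 0-based `start`."""
    return ref[:start] + ref[start + length:]


def equivalent_deletion_starts(ref, start, length):
    """List every start position whose deletion yields the same sequence."""
    target = delete(ref, start, length)
    return [
        s for s in range(len(ref) - length + 1)
        if delete(ref, s, length) == target
    ]


def left_align(ref, start, length):
    """Shift a deletion left while the resulting sequence stays the same."""
    # Moving the gap one base left is a no-op when the base entering the gap
    # on the left equals the base leaving it on the right.
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start


# Hypothetical reference: unique flanks around five copies of CAG.
ref = "ACGT" + "CAG" * 5 + "TTGA"

# A 6 bp deletion reported at position 7 ...
starts = equivalent_deletion_starts(ref, 7, 6)
print(starts)                 # [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(left_align(ref, 7, 6))  # 4
```

Every start from 4 to 13 produces exactly the same read sequence, so the aligner can place the gap anywhere in that range. This is why callers normalize indels to a single position (usually the leftmost) before counting them, and why gaps that look different in IGV may be the same event.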



                • #9
                  Originally posted by chadn737
                  Can you show a picture of this same region in expanded view, but slightly larger so that we can see more of the sequence to the right?

                  Here's the thing. I could take the very bottom read in the expanded view, introduce a gap at the same spot where all the other gaps start, shift the rest of the read over by 6 bp, and it would match perfectly; now you would have a 6 bp deletion. It's no coincidence that these gaps appear right where that repetitive region starts.
                  Here it is...



                  • #10
                    I am having the same trouble with samtools mpileup.

                    Did you get an answer to this? Can anyone comment?

                    The BAM alignment files were sorted and indexed.

                    Here is what I ran to generate the pileup for a small region of interest:

                    samtools mpileup -r 0:4,000,679-5,000,894 -uf e.coli.fa sorted.bam > sorted.region.mpileup

                    Then, to obtain SNPs and indels from the mpileup file, I ran:

                    bcftools view -bvcg contigalignsorted.region.mpileup > sorted.region.bcf

                    bcftools view sorted.region.bcf > sorted.region.vcf


                    However, when I view the BAM alignment in IGV for this region, I notice that the final VCF file produced by the commands above is missing a lot of SNPs and indels. I can post a snapshot of a missed SNP/indel if required.


                    What went wrong in the calls?



                    • #11
                      More importantly, it even reports SNPs that are not SNPs according to the BAM file loaded in IGV.


                      0 4009334 . A C 125 . DP=5;VDB=3.277706e-02;RPB=-1.291774e+00;AF1=0.5;AC1=1;DP4=1,1,2,1;MQ=59;FQ=74;PV4=1,0,1,0.38 GT:PL:GQ 0/1:155,0,101:99
                      0 4009336 . T A 17.1 . DP=5;VDB=6.080000e-02;RPB=1.291774e+00;AF1=0.5;AC1=1;DP4=2,1,1,1;MQ=59;FQ=20.1;PV4=1,1,0.14,1 GT:PL:GQ 0/1:47,0,154:50
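Before comparing records like these against IGV, it may help to unpack the DP4 field, which holds the counts of high-quality ref-forward, ref-reverse, alt-forward, and alt-reverse bases. The sketch below is a minimal example in plain Python; the function names are made up for illustration, and it assumes whitespace-separated VCF columns with INFO in the eighth column, as in the two records above.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict of key -> raw string value."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag-style keys without '=' map to ''
    return fields


def alt_fraction_from_dp4(vcf_line):
    """Return (ref count, alt count, alt fraction) from the DP4 INFO field."""
    columns = vcf_line.split()
    info = parse_info(columns[7])
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in info["DP4"].split(","))
    ref = ref_fwd + ref_rev
    alt = alt_fwd + alt_rev
    total = ref + alt
    return ref, alt, (alt / total if total else None)


record = (
    "0 4009334 . A C 125 . "
    "DP=5;VDB=3.277706e-02;RPB=-1.291774e+00;AF1=0.5;AC1=1;"
    "DP4=1,1,2,1;MQ=59;FQ=74;PV4=1,0,1,0.38 GT:PL:GQ 0/1:155,0,101:99"
)
print(alt_fraction_from_dp4(record))  # (2, 3, 0.6)
```

For the first record this gives 2 reference and 3 alternate bases, so the call rests on only five high-quality bases at that position.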

                      Position 4009334 is called as a SNP above, but it does not look like one in the IGV picture below:
                      Attached Files

