  • sfh838t
    Member
    • Apr 2014
    • 29

    variant calling in plant

    I am trying to get all variants of all types for a sequence I put together through an assembly step followed by consensus building using a reference. Now I am looking for variants. I used the samtools/bcftools/vcfutils steps, and the first time I accidentally ran them on contigs, which gave me a list containing only indels; I can verify these visually in IGV. Then I tried to correct this and used the actual reads, again with samtools. This time I got a list of SNPs only, which again I can locate in IGV.
    So now I am wondering what is going on. I was under the impression that samtools would find both indels and SNPs from reads. Would it be legitimate to report indels found using contigs in a write-up?
    The subject is a plant chloroplast sequence, and in the end I will need some locations I can use in the lab to find differences between two related species. I am learning this as I go, so any information, even links to further information, would be most appreciated.
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    Whether or not indels are found depends on the aligner (and, perhaps, ploidy). How did you align the reads, and how did you align the contigs?


    • sfh838t
      Member
      • Apr 2014
      • 29

      #3
      All my alignments are done using BWA, and the same file of reads was used for both GATK and samtools.
      I used the samtools mpileup/bcftools/vcfutils steps both times:
      samtools + reads = only SNPs
      samtools + contigs = only indels
      GATK UnifiedGenotyper + reads = only SNPs
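To check which variant types a given VCF actually contains (for example, to confirm the "only SNPs" versus "only indels" results listed above), a small script can tally records by type. This is a minimal sketch in plain Python, not part of samtools or GATK. The function names and the example records are hypothetical, and it assumes a tab-delimited VCF with REF in column 4 and ALT in column 5.

```python
from collections import Counter


def classify_record(ref, alt):
    """Classify one REF/ALT pair as 'snp', 'indel', or 'other'."""
    # Symbolic alleles (<DEL>, <*>), missing ALT (.), and spanning
    # deletions (*) are not simple sequence changes.
    if alt.startswith("<") or alt in (".", "*"):
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "snp"
    if len(ref) != len(alt):
        return "indel"
    return "other"  # same-length multi-base substitution


def count_variant_types(vcf_lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]
        for alt in fields[4].split(","):  # ALT may list several alleles
            counts[classify_record(ref, alt)] += 1
    return counts


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chloroplast\t100\t.\tA\tG\t50\tPASS\t.",
    "chloroplast\t250\t.\tT\tTACGA\t60\tPASS\t.",
    "chloroplast\t400\t.\tGCAT\tG\t45\tPASS\t.",
]
print(count_variant_types(example))
```

Running this on the VCF from each run (reads versus contigs) gives a quick side-by-side of what each call set contains before comparing positions in IGV.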


      • Brian Bushnell
        Super Moderator
        • Jan 2014
        • 2709

        #4
        Do you see indels in the reads when you look at the mapped bam in IGV? And how long are these indels?
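One way to answer this without scrolling through IGV is to scan the CIGAR strings of the mapped reads for insertion (I) and deletion (D) operations. The following is a minimal sketch, assuming the alignments have been exported as SAM text (CIGAR is the sixth tab-delimited column). The function names and the example alignment lines are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_lengths(cigar):
    """Return (insertion_lengths, deletion_lengths) for one CIGAR string."""
    insertions, deletions = [], []
    for length, op in CIGAR_OP.findall(cigar):
        if op == "I":
            insertions.append(int(length))
        elif op == "D":
            deletions.append(int(length))
    return insertions, deletions


def summarize_sam(sam_lines):
    """Count reads with at least one indel and collect all indel lengths."""
    reads_with_indel = 0
    all_lengths = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        insertions, deletions = indel_lengths(fields[5])
        if insertions or deletions:
            reads_with_indel += 1
            all_lengths.extend(insertions + deletions)
    return reads_with_indel, sorted(all_lengths)


# Hypothetical alignment lines, for illustration only.
example_sam = [
    "@SQ\tSN:chloroplast\tLN:150000",
    "read1\t0\tchloroplast\t100\t60\t50M5I45M\t*\t0\t0\t*\t*",
    "read2\t0\tchloroplast\t120\t60\t100M\t*\t0\t0\t*\t*",
    "read3\t16\tchloroplast\t130\t60\t30M3D70M\t*\t0\t0\t*\t*",
]
print(summarize_sam(example_sam))
```

If the aligner never opens gaps in the reads, the count of reads with indels will be zero and no downstream caller can report indels from them.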


        • sfh838t
          Member
          • Apr 2014
          • 29

          #5
          Yes, I can see both indels and SNPs in IGV. Most of the indels are 3-7 bp long, and the indels found are at different locations than the SNPs found.
          Is it legitimate to use contigs for calling variants?


          • sfh838t
            Member
            • Apr 2014
            • 29

            #6
            I looked at the contigs in IGV and saw both SNPs and indels. I have not looked at the reads.


            • sfh838t
              Member
              • Apr 2014
              • 29

              #7
              OK, I just looked at the reads and see the same insertion that I identified in the contigs. This is important for the project because this insertion is present in one possible parent but not the other.


              • Brian Bushnell
                Super Moderator
                • Jan 2014
                • 2709

                #8
                Calling indels from the contigs is probably a valid approach as long as these are homozygous events (I'm not really sure how chloroplast genomes work) and the assembly is good. Possibly someone who knows more about GATK or mpileup can comment on why they seem to be missing the indels in the reads.


                • sfh838t
                  Member
                  • Apr 2014
                  • 29

                  #9
                  I did get the GATK answer and now have a file with both SNPs and indels together. Working with plants, all bets are off; the samtools output might remain a mystery.
                  Probably a new question: does anyone know how to build a consensus from an alignment which DOES include indels?


                  • sarvidsson
                    Senior Member
                    • Jan 2015
                    • 137

                    #10
                    Did you try FastaAlternateReferenceMaker in GATK? You need called variants in a VCF file, however, not just an alignment. Read the documentation carefully; there are some limitations.
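To illustrate the general idea of building a consensus from called variants that include indels, here is a minimal sketch in plain Python. It is not how FastaAlternateReferenceMaker is implemented and does not reproduce its limitations. It assumes a haploid sequence, one ALT allele per site, non-overlapping variants, and VCF-style alleles with 1-based positions. The function name and the example data are hypothetical.

```python
def apply_variants(reference, variants):
    """Apply (pos, ref, alt) variants to a reference string.

    Variants are applied from right to left so that an indel never
    shifts the coordinates of a variant that is still to be applied.
    """
    seq = reference
    prev_start = len(reference)  # 0-based start of the last applied variant
    for pos, ref, alt in sorted(variants, key=lambda v: v[0], reverse=True):
        start = pos - 1           # convert 1-based VCF position to 0-based
        end = start + len(ref)
        if end > prev_start:
            raise ValueError(f"variant at {pos} overlaps a later variant")
        if seq[start:end] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        seq = seq[:start] + alt + seq[end:]
        prev_start = start
    return seq


# Hypothetical reference and variants, for illustration only.
reference = "ACGTACGTAC"
variants = [
    (2, "C", "T"),      # SNP
    (5, "A", "ATTT"),   # 3 bp insertion
    (8, "TAC", "T"),    # 2 bp deletion
]
print(apply_variants(reference, variants))
```

The REF check is a cheap safeguard: a mismatch usually means the VCF was called against a different reference than the FASTA being edited.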


                    • sfh838t
                      Member
                      • Apr 2014
                      • 29

                      #11
                      Thanks, I will try that.
