  • jebowers
    Member
    • Apr 2013
    • 19

    Vcf genotypes for RILs

    I was wondering if there is any way to change the assumptions used for genotype calling in .vcf files produced by samtools mpileup. I am working with a diploid organism, but the individuals are mostly homozygous recombinant inbred lines (or highly inbred lines) with only about 1% residual heterozygosity. The problem is that when a SNP is called at low coverage (1-3 reads) and only one allele is observed in a sample, the caller often assumes the individual is heterozygous if the observed allele is the less common allele.

    This happens because the caller assumes that all the loci in my individuals are in Hardy-Weinberg (HW) equilibrium, when in fact, due to the experimental design, they are nowhere close to HW equilibrium and most loci are going to be homozygous. Filtering on genotype-call quality reduces the problem but also discards much of the data.

    Of course, sequencing to a high depth would solve this problem with the existing tools, but when I expect >99% of individuals to be homozygous at each locus, that should not be necessary: one or two A reads should be enough to predict an AA genotype.
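
    To make the effect of the prior concrete, here is a minimal sketch of a genotype posterior at a biallelic site. It is not the model samtools/bcftools actually implements; it assumes a simple per-read error rate (`err`), independent reads, and Wright's inbreeding coefficient (`inbreeding_f`) to shift the genotype prior away from HW proportions. The function name, parameters, and the example values (minor allele frequency 0.1, F = 0.99) are all hypothetical choices made for illustration.

```python
def genotype_posteriors(n_ref, n_alt, alt_freq, inbreeding_f=0.0, err=0.01):
    """Posterior P(genotype | reads) at a biallelic site (illustrative only).

    n_ref, n_alt  : counts of reads showing the reference / alternate allele
    alt_freq      : population frequency of the alternate allele
    inbreeding_f  : Wright's F; 0.0 gives Hardy-Weinberg priors,
                    values near 1.0 give almost no heterozygotes
    err           : assumed probability that a read shows the wrong allele
    """
    p = alt_freq
    q = 1.0 - p
    f = inbreeding_f

    # Genotype priors with inbreeding: F moves mass from hets to homozygotes.
    priors = {
        "RR": q * q + f * p * q,
        "RA": 2.0 * p * q * (1.0 - f),
        "AA": p * p + f * p * q,
    }

    # Probability that a single read shows the alternate allele, per genotype.
    p_alt_read = {"RR": err, "RA": 0.5, "AA": 1.0 - err}

    # Prior times the likelihood of the observed read counts
    # (the binomial coefficient cancels when normalising).
    unnorm = {
        g: priors[g] * (p_alt_read[g] ** n_alt) * ((1.0 - p_alt_read[g]) ** n_ref)
        for g in priors
    }
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}


if __name__ == "__main__":
    # One read, showing the less common allele (frequency 0.1).
    for label, f in (("HW prior (F=0)", 0.0), ("inbred prior (F=0.99)", 0.99)):
        post = genotype_posteriors(n_ref=0, n_alt=1, alt_freq=0.1, inbreeding_f=f)
        best = max(post, key=post.get)
        print(label, "-> most probable genotype:", best, post)
```

    With these assumed values, a single read of the minor allele gives RA as the most probable genotype under the HW prior and AA under the inbred prior, which is the behaviour described above.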
  • chadn737
    Senior Member
    • Jan 2009
    • 392

    #2
    Could you post what arguments you are using? This is a question I am very interested in knowing the answer to.


    • jebowers
      Member
      • Apr 2013
      • 19

      #3
      I used bowtie2 for the alignments, followed by samtools to create sorted BAM files.

      samtools mpileup -BuDf Refseq.fa differentsorted.bam(100 separate files) | bcftools view -bvcg - > out.bcf

      bcftools view -N out.bcf > output.vcf

      I have also used the vcftools option --geno-depth on the .vcf file, but the results are all -1 (missing data).

      I have tried various other permutations, with similar results.


      • chadn737
        Senior Member
        • Jan 2009
        • 392

        #4
        That's what I figured... I wonder what would happen if you didn't use the -c argument when you run bcftools. The -c option invokes the -e option, which performs the test for Hardy-Weinberg equilibrium:

        Consensus/Variant Calling Options:
        -c Call variants using Bayesian inference. This option automatically invokes option -e.

        -d FLOAT When -v is in use, skip loci where the fraction of samples covered by reads is below FLOAT. [0]

        -e Perform max-likelihood inference only, including estimating the site allele frequency, testing Hardy-Weinberg equilibrium and testing associations with LRT.

        Maybe try this instead.
        Code:
        samtools mpileup -BuDf Refseq.fa differentsorted.bam(100 separate files) | bcftools view -bvg - > out.bcf
        I'd be interested to know how this affects the results. I have never run bcftools without the -c argument.

        PS. I see you're in Athens, GA. If you wouldn't mind, I'd like to ask you a few questions. I am starting a post-doc at UGA in Aug.
        Last edited by chadn737; 04-22-2013, 02:38 PM.


        • jebowers
          Member
          • Apr 2013
          • 19

          #5
          Thanks so much. I guess I have to re-run that two-week mpileup.


          • chadn737
            Senior Member
            • Jan 2009
            • 392

            #6
            Could you run it on one or two files first to test it? Two weeks is a long time to spend trying something new if you don't know what the result will be.


            • jebowers
              Member
              • Apr 2013
              • 19

              #7
              Yes, I'm planning on doing so. But right now our computer cluster is having disk issues, so I don't expect quick results.

              I think that I will have to use only the -b option (not -bvg) with bcftools view, as -v and -g both invoke -c.
              Last edited by jebowers; 04-22-2013, 03:22 PM. Reason: x


              • chadn737
                Senior Member
                • Jan 2009
                • 392

                #8
                You're right, my bad, I should have read that a bit more closely.


                • mxr1895
                  Junior Member
                  • Feb 2012
                  • 6

                  #9
                  Hi,
                  I got around a similar problem (I'm working with the yeast equivalent of RI lines) by using Freebayes, which has an option for ploidy. This allows you to genotype your RI samples as if they were haploids.
                  However, in my experience, low coverage will still result in poor genotype calls.
                  Cheers,

                  Miguel

