  • kjaja
    Member
    • Aug 2011
    • 58

    variant filtering

    Hi All,

    I have cases and controls that were exome sequenced, and I used GATK to call variants in all the cases and controls combined. This generated a single VCF file with all the variants. After removing the low-quality variants (scores < 30), I would like to keep the variants that are in the cases and not in my controls. Does anyone have ideas on the best way to handle this? The goal is to remove the variants found in the controls and see what is left.

    thanks
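One way to picture the filter described above is the minimal Python sketch below. It is not taken from any of the tools mentioned in this thread. It assumes a tab-delimited multi-sample VCF with GT as the first FORMAT sub-field, and it treats a missing genotype (`./.`) in a control as "not carrying the variant", which may be too lenient for real data. The names `carries_alt` and `case_only_variants` are hypothetical.

```python
def carries_alt(sample_field: str) -> bool:
    """True if the GT sub-field (assumed to come first) has a non-reference allele."""
    gt = sample_field.split(":", 1)[0]
    alleles = gt.replace("|", "/").split("/")
    # "0" is the reference allele and "." is a missing call; anything else is an ALT.
    return any(a not in ("0", ".") for a in alleles)


def case_only_variants(vcf_lines, cases, controls, min_qual=30.0):
    """Yield data lines with QUAL >= min_qual carried by at least one case and no control."""
    columns = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Map each column name (including sample names) to its index.
            columns = {name: i for i, name in enumerate(fields)}
            continue
        qual = fields[5]
        if qual == "." or float(qual) < min_qual:
            continue  # drop records with missing or low QUAL
        in_case = any(carries_alt(fields[columns[s]]) for s in cases)
        in_control = any(carries_alt(fields[columns[s]]) for s in controls)
        if in_case and not in_control:
            yield line
```

It could be called as `case_only_variants(open("combined.vcf"), cases=[...], controls=[...])`, where the file name and the sample lists are placeholders for your own data.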
  • rjohnp
    Member
    • Jan 2013
    • 16

    #2
    Hi,

    VCFtools has some useful tools for this sort of thing; see http://vcftools.sourceforge.net/perl...l#vcf-contrast. It looks like exactly what you're looking for.

    Hope that helps.


    • kjaja
      Member
      • Aug 2011
      • 58

      #3
      I have tried the VCFtools script vcf-contrast from the following link http://vcftools.sourceforge.net/perl...l#vcf-contrast
      but I am getting the following warning:

      Argument "." isn't numeric in numeric gt (>) at vcftools_0.1.10/perl/vcf-contrast line 144, <STDIN> line 120414.


      Any idea?

      thanks


      • rjohnp
        Member
        • Jan 2013
        • 16

        #4
        When I get that sort of message, it's generally caused by a trailing newline in the input file. You should still have an output? Do a line count on your VCF file; I suspect it is 120414 lines long. If the trailing line does prevent you from getting an output file, remove it in vim.
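The line-count check suggested above could be sketched in Python as follows. This is only an illustration: the function name `blank_line_numbers` is hypothetical, and the check tests the trailing-blank-line explanation only, so it does not rule out other causes of the warning.

```python
def blank_line_numbers(lines):
    """Return (total line count, 1-based numbers of empty or whitespace-only lines)."""
    total = 0
    blanks = []
    for number, line in enumerate(lines, start=1):
        total = number
        if not line.strip():
            blanks.append(number)
    return total, blanks
```

With a real file, something like `with open("combined.vcf") as handle: total, blanks = blank_line_numbers(handle)` would do (the file name is a placeholder). If `total` matches the line number in the warning and that number appears in `blanks`, the result is consistent with a trailing blank line.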


        • kjaja
          Member
          • Aug 2011
          • 58

          #5
          I was wondering whether this can be done using GATK?


          • AJERYC
            Member
            • Jan 2012
            • 26

            #6
            try kggseq

            java -Xms256m -Xmx1300m -jar kggseq.jar --no-resource-check --buildver hg19 --vcf-file yourfile.vcf --ped-file pedigree.ped --o-vcf --genotype-filter 2,4

            In the file pedigree.ped, give each sample's status: case (2) or control (1).
            The genotype filter options used here are 2 (homozygous variants present in both cases and controls) and 4 (heterozygous variants present in both cases and controls).


            • kjaja
              Member
              • Aug 2011
              • 58

              #7
              I have tried using vcf-contrast from VCFtools with the following command:

              vcftools_0.1.10/perl/vcf-contrast +sample1,sample2 -sample3 -n allAllsamples.vcf > insample1or2NOTsample3.vcf

              where I am looking for variants that could be in sample 1 OR sample 2 but not in sample 3. However, I found that some of the sample 1 variants in the output are also present in sample 3, and the same is true for sample 2.
              Any suggestions will be greatly appreciated.
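To pin down which output records are affected, a small check along the following lines may help. It is a hedged sketch rather than part of VCFtools: the function name `minus_sample_carriers` is hypothetical, and it assumes a tab-delimited VCF in which GT is the first FORMAT sub-field. It lists every record where a sample passed with the minus flag still carries a non-reference allele.

```python
def minus_sample_carriers(vcf_lines, minus_samples):
    """Return (chrom, pos, sample, genotype) for records where a 'minus' sample is non-reference."""
    hits = []
    columns = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Map each column name (including sample names) to its index.
            columns = {name: i for i, name in enumerate(fields)}
            continue
        for sample in minus_samples:
            gt = fields[columns[sample]].split(":", 1)[0]
            alleles = gt.replace("|", "/").split("/")
            # "0" is the reference allele and "." is a missing call.
            if any(a not in ("0", ".") for a in alleles):
                hits.append((fields[0], fields[1], sample, gt))
    return hits
```

Running it on the vcf-contrast output with `minus_samples=["sample3"]` gives a concrete list of positions that could be posted here or compared against the input file.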


              • evakoe
                Member
                • Jul 2012
                • 27

                #8
                I also noticed that vcf-contrast does not return the expected results. One still gets variants that are present in samples given with the minus flag. Does somebody know another program which has the same functionality? I think that GATK SelectVariants cannot be used for this purpose.
                Thank you.


                • krawitz
                  Member
                  • Feb 2010
                  • 35

                  #9
                  You could also try the following:
                  Upload your multi-sample VCF file to GeneTalk (www.gene-talk.de). Use the collection tool to assign all the cases the status "affected" and all the controls the status "unaffected". Then proceed with the inheritance filtering option "dominant". This will yield variants that are unique to the cases.


                  • evakoe
                    Member
                    • Jul 2012
                    • 27

                    #10
                    Hi Krawitz,
                    thanks for your reply. I had already considered this; I was just hoping for a more direct solution.


                    • MQ-BCBB
                      Member
                      • May 2009
                      • 25

                      #11
                      Since no one has mentioned SnpSift, I should say that it works very well for this.


                      • aggp11
                        Member
                        • Jun 2011
                        • 87

                        #12
                        Could you post 5-10 lines from your "combined" VCF file here? Try to include an example of the output you would like to see.
