  • intersect VCF files

    Hi:

    I like bedtools, but now GATK pipelines are producing VCF files and I want to intersect three of them.

    For instance I have 3 VCF files F1, F2 and F3.

    I want to intersect
    newfile = intersect(F1, F2)

    then I want to:

    newsecondfile = intersect(newfile, F3)

    How can I do this on VCF files? I tried vcftools, but it is not handy with all the gz files and Perl.

    I would like something BEDtools-like.

    Any suggestions?

    thanks
    Adrian
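
The two-step intersection described in this post can be sketched in plain Python. This is only a minimal sketch, not what bedtools or vcftools do internally: it assumes that two records are "the same variant" when CHROM, POS, REF and ALT match exactly, and the helper names `open_vcf`, `site_key` and `intersect` are made up for illustration.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def site_key(line):
    """Key a record by CHROM, POS, REF, ALT (an assumed definition of 'same variant')."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return (chrom, int(pos), ref, alt)


def intersect(lines_a, lines_b):
    """Return the header lines and records of A whose site also appears in B."""
    keys_b = {site_key(line) for line in lines_b
              if line.strip() and not line.startswith("#")}
    return [line for line in lines_a
            if line.startswith("#") or (line.strip() and site_key(line) in keys_b)]
```

With real files, `newfile = intersect(open_vcf("F1.vcf.gz"), open_vcf("F2.vcf.gz"))` followed by `newsecondfile = intersect(newfile, open_vcf("F3.vcf.gz"))` would mirror the two steps above (the file names are placeholders). The output keeps the INFO and sample columns of the first file only.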

  • #2
    IntersectBed claims to accept .vcf files. I'm pretty sure I've used it myself to do just that.

    What I've also done is use mpileup in samtools to take in multiple .bam files together. The downside is that it doesn't keep all the information for each sample together, but it at least gives you GT, PL and GQ values for each sample. You can filter by the GT or the PL to find SNPs that are or aren't in whatever combination of samples you want.
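
Filtering a multi-sample VCF by GT, as described above, could look roughly like the following. It is a sketch under assumptions: every record is assumed to list GT in its FORMAT column, a sample "has" a SNP when its genotype contains any non-reference allele, and the function names are hypothetical.

```python
import re


def carries_alt(gt):
    """True if a GT string such as '0/1' or '1|1' contains a non-reference allele."""
    return any(allele not in ("0", ".") for allele in re.split(r"[/|]", gt))


def select_sites(lines, present_in, absent_from=()):
    """Yield records where every sample in `present_in` carries an ALT allele
    and no sample in `absent_from` does."""
    samples = []
    for line in lines:
        if line.startswith("##") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        gt_index = fields[8].split(":").index("GT")  # assumes GT is present
        gts = {name: col.split(":")[gt_index]
               for name, col in zip(samples, fields[9:])}
        if (all(carries_alt(gts[s]) for s in present_in)
                and not any(carries_alt(gts[s]) for s in absent_from)):
            yield line
```

For example, `select_sites(lines, ["S1", "S2"], absent_from=["S3"])` would keep sites called in samples S1 and S2 but not in S3 (sample names are placeholders). Missing genotypes (`./.`) are treated as "not carrying the variant", which may or may not be what you want.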

    • #3
      There is also vcf-isec, part of the VCFtools Perl utilities.

      • #4
        Has anyone got vcf-isec to work?

        I bgzipped my VCFs and tabix-indexed them.

        Here I try it on the same VCF:

        vcf-isec -c 26530.snv.vcf.gz 26530.snv.vcf.gz

        but I get:
        Can't use string ("silent") as a HASH ref while "strict refs" in use at /net/home/leparc/bin/VCFtools/perl/Vcf.pm line 542.

        Also, why all the trouble with bgzipping and tabix indexing? It's a lot of hassle just to do something so simple.

        • #5
          I've successfully run vcf-isec to compare two related individuals:

          vcf-isec -n +2 -f file1.vcf.gz file2.vcf.gz | bgzip -c > file3.vcf.gz

          • #6
            Have you tried vcftools?

            • #7
              There is also vcfintersect: https://github.com/ekg/vcflib#vcfintersect

              It works with both BED files and VCF files, and can generate inverse intersections (allowing you to find things that are not in one file).
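
To illustrate what intersecting a VCF with a BED file and taking the inverse means, here is a small Python sketch. It is not how vcfintersect is implemented; the function names are hypothetical, and it only tests the POS base of each record (it ignores the length of REF) with a simple linear scan over the intervals.

```python
def read_bed(lines):
    """Collect BED intervals per chromosome as (start, end) pairs (0-based, half-open)."""
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def filter_by_bed(vcf_lines, regions, invert=False):
    """Keep records whose POS falls inside a BED interval,
    or outside all intervals when invert=True. Header lines pass through."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        if not line.strip():
            continue
        chrom, pos = line.split("\t")[:2]
        pos0 = int(pos) - 1  # VCF POS is 1-based; BED coordinates are 0-based
        inside = any(start <= pos0 < end for start, end in regions.get(chrom, ()))
        if inside != invert:
            yield line
```

Calling `filter_by_bed(vcf_lines, regions, invert=True)` gives the variants that are not in any of the BED regions. For large inputs an indexed or sorted-interval lookup would be a better choice than the linear scan used here.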

              • #8
                Hello,

                Would anyone happen to know how to merge a set of VCF files into one new file, keeping the candidates that are reported in at least 20% and at most 90% of the files?

                Thanks,
                Nino
                Last edited by Nino; 02-20-2014, 11:24 AM. Reason: forgot a word
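
One reading of this question is "keep the sites that occur in between 20% and 90% of the input files". Under that assumption, a minimal Python sketch would count in how many files each site appears. The function names are hypothetical, sites are matched on CHROM, POS, REF and ALT, and the result is a list of site keys rather than a fully merged VCF.

```python
from collections import Counter


def sites(lines):
    """Set of (CHROM, POS, REF, ALT) keys in one VCF; each site counts once per file."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        keys.add((chrom, int(pos), ref, alt))
    return keys


def sites_in_fraction(vcfs, min_frac=0.2, max_frac=0.9):
    """Return the sorted sites seen in at least min_frac and at most max_frac of the inputs."""
    per_file = [sites(vcf) for vcf in vcfs]
    counts = Counter(key for file_sites in per_file for key in file_sites)
    n = len(per_file)
    return sorted(key for key, count in counts.items()
                  if min_frac * n <= count <= max_frac * n)
```

Each element of `vcfs` can be any iterable of lines, for instance an open file handle. Writing the surviving sites back out as a merged VCF with combined sample columns is a separate step that this sketch does not attempt.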

                • #9
                  I can also suggest R. Convert your VCFs to tab-delimited files, and then intersect the positions where variants are called.
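
The same convert-then-intersect idea, shown here in Python rather than R as a rough sketch. The column choice (CHROM, POS, REF, ALT) and the function names are assumptions made for illustration.

```python
import csv
import io


def vcf_to_tab(vcf_lines, out):
    """Write CHROM, POS, REF, ALT of each record to `out` as tab-delimited rows."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["CHROM", "POS", "REF", "ALT"])
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        writer.writerow([chrom, pos, ref, alt])


def positions(tab_file):
    """Read a tab file written by vcf_to_tab and return its (CHROM, POS) pairs."""
    reader = csv.DictReader(tab_file, delimiter="\t")
    return {(row["CHROM"], int(row["POS"])) for row in reader}


# Tiny in-memory demo; real use would pass open file handles instead.
demo = io.StringIO()
vcf_to_tab(["#CHROM\tPOS\tID\tREF\tALT\n", "chr1\t200\t.\tC\tT\t50\tPASS\t.\n"], demo)
demo.seek(0)
print(positions(demo))
```

Once every file has been flattened this way, `positions(a) & positions(b) & positions(c)` gives the positions called in all three. Note that this matches on position only, so two different alleles at the same position count as shared.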
