  • intersect VCF files

    Hi:

    I like bedtools, but now that GATK pipelines are producing VCF files, I want to intersect three of them.

    For instance I have 3 VCF files F1, F2 and F3.

    I want to intersect
    newfile = intersect(F1, F2)

    and then:

    newsecondfile = intersect(newfile, F3)

    How can I do this on VCF files? I tried vcftools, but it is not handy with all the gz files and Perl.

    I'd like something BEDtools-like.

    Any suggestions?

    thanks
    Adrian

  • #2
    IntersectBed claims to accept .vcf files. I'm pretty sure I've used it myself to do just that.

    What I've also done is use mpileup in samtools to take in multiple .bam files together. The downside is that it doesn't keep all the information for each sample, but it at least gives you GT, PL and GQ values for each sample. You can filter by GT or PL to find SNPs that are or aren't in whatever combination of samples you want.
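As a rough illustration of the GT-based filtering described above, here is a minimal Python sketch. It assumes a tab-delimited multi-sample VCF whose FORMAT column contains GT; the function names `samples_with_variant` and `select_sites` and the 0-based sample indices are hypothetical and not part of samtools or any other tool.

```python
def samples_with_variant(vcf_line):
    """Return 0-based indices of samples whose GT has a non-reference allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; find where GT sits among its keys.
    gt_index = fields[8].split(":").index("GT")
    carriers = set()
    for i, sample in enumerate(fields[9:]):
        gt = sample.split(":")[gt_index]
        alleles = gt.replace("|", "/").split("/")
        # "0" is the reference allele and "." is a missing call.
        if any(a not in ("0", ".") for a in alleles):
            carriers.add(i)
    return carriers


def select_sites(vcf_lines, present_in, absent_from=()):
    """Keep records called in every sample of present_in and in none of absent_from."""
    want, avoid = set(present_in), set(absent_from)
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        carriers = samples_with_variant(line)
        if want <= carriers and not (avoid & carriers):
            kept.append(line)
    return kept
```

For example, `select_sites(lines, present_in=[0, 2], absent_from=[1])` keeps sites where the first and third samples carry a variant allele and the second does not. Filtering on PL or GQ would need an extra threshold check that is not shown here.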


    • #3
      There is also vcf-isec, which ships with the VCFtools Perl utilities.


      • #4
        Has anyone got vcf-isec to work?

        I bgzipped my VCFs and indexed them with tabix.

        Here I try it on the same VCF twice:

        vcf-isec -c 26530.snv.vcf.gz 26530.snv.vcf.gz

        but I get:
        Can't use string ("silent") as a HASH ref while "strict refs" in use at /net/home/leparc/bin/VCFtools/perl/Vcf.pm line 542.

        Also, why all the trouble with bgzipping and tabix indexing? It's a lot of hassle just to do something so simple.


        • #5
          I've successfully run vcf-isec to compare two related individuals:

          vcf-isec -n +2 -f file1.vcf.gz file2.vcf.gz | bgzip -c > file3.vcf.gz


          • #6
            Have you tried vcftools?


            • #7
              There is also vcfintersect: https://github.com/ekg/vcflib#vcfintersect

              It works with both BED files and VCF files, and can generate inverse intersections (allowing you to find things that are not in one file).
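For readers who want to see the logic spelled out, here is a minimal Python sketch of a two-file intersection, its inverse, and the chained three-file case from the original question. It is not how vcfintersect or any other tool is implemented; the helper names (`site_key`, `read_records`, `intersect`) are hypothetical, and it assumes sites match only when CHROM, POS, REF and ALT are identical, so multi-allelic or differently normalized records will not match.

```python
def site_key(line):
    """Identify a record by (CHROM, POS, REF, ALT); assumes tab-delimited VCF."""
    chrom, pos, _id, ref, alt = line.split("\t")[:5]
    return (chrom, int(pos), ref, alt)


def read_records(lines):
    """Drop header and blank lines, strip trailing newlines."""
    return [l.rstrip("\n") for l in lines if l.strip() and not l.startswith("#")]


def intersect(a_records, b_records, invert=False):
    """Return records of A whose site is in B (or is NOT in B when invert=True)."""
    b_keys = {site_key(r) for r in b_records}
    return [r for r in a_records if (site_key(r) in b_keys) != invert]
```

With `f1`, `f2` and `f3` as the record lists for F1, F2 and F3, `newfile = intersect(f1, f2)` followed by `newsecondfile = intersect(newfile, f3)` gives the sites shared by all three, and `intersect(f1, f2, invert=True)` gives the sites unique to F1 relative to F2. The output keeps the records as they appear in the first argument.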


              • #8
                Hello,

                Would anyone happen to know how to merge a set of VCF files into one new file, keeping only the candidates that are reported in at least 20% and at most 90% of the files?

                Thanks,
                Nino
                Last edited by Nino; 02-20-2014, 11:24 AM. Reason: forgot a word
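One way to read this question is as a counting problem: tally how many files report each site, then keep sites whose fraction of files falls in the requested range. The Python sketch below assumes that reading; the function names and the choice to key sites on (CHROM, POS, REF, ALT) are hypothetical, and it returns site keys rather than writing a merged VCF.

```python
from collections import Counter


def site_key(line):
    """Identify a record by (CHROM, POS, REF, ALT); assumes tab-delimited VCF."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return (chrom, int(pos), ref, alt)


def sites_in_fraction_range(files, min_frac=0.2, max_frac=0.9):
    """Return sorted site keys reported in min_frac..max_frac (inclusive) of files.

    `files` is a list in which each element is the list of lines of one VCF.
    """
    n = len(files)
    counts = Counter()
    for lines in files:
        # A set ensures each file counts a site at most once.
        keys = {site_key(l) for l in lines if l.strip() and not l.startswith("#")}
        counts.update(keys)
    return sorted(k for k, c in counts.items() if min_frac <= c / n <= max_frac)
```

With five input files, for instance, a site seen in one file (20%) or three files (60%) is kept, while a site seen in all five (100%) is dropped.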


                • #9
                  I can also suggest R. Convert your VCFs to tab-delimited files, then intersect the positions where variants are called.
