  • ashwatha
    Member
    • Jul 2011
    • 14

    BAM to VCF conversion

    Hi,

    Are there any tools to convert BAM files to VCF (Variant Call Format)? Alternatively, are there any tools to convert pileup files to VCF?

    Looking at the VCF specs, VCF is fairly similar to pileup format, so I can probably write a script for this conversion. I am just wondering if something already exists.

    Any pointers are much appreciated!

    thanks,
    Ashwatha.
  • ulz_peter
    Senior Member
    • Feb 2010
    • 219

    #2
    I can't see why you want to convert a BAM file to a VCF file. The BAM file stores the alignment, whereas the VCF file stores variants. In order to generate a VCF file you would need to do proper SNP calling (e.g. GATK, VarScan, ...). A direct conversion makes no sense to me...


    • ashwatha
      Member
      • Jul 2011
      • 14

      #3
      Hi Peter,

      I see what you mean - my question was not worded correctly. What I am looking for is a way to take a BAM file, and call variants on it and generate a VCF file, the way "samtools pileup" generates a pileup file out of a BAM file.

      If such a tool doesn't exist, I could also use something that can convert a pileup file generated using samtools pileup to a VCF file, considering that pileup files and VCF files contain similar data (at least for coordinates where there is a SNP or other variant).

      thanks,
      Ashwatha.
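A naive sketch of the pileup-to-VCF script idea mentioned above, to show what such a conversion involves. It assumes the plain six-column pileup layout (chromosome, position, reference base, depth, read bases, base qualities). The function names and the `min_alt_fraction` threshold are hypothetical, indels are skipped, and base qualities are ignored, so this is not a substitute for a real variant caller.

```python
import re
from collections import Counter


def count_alt_bases(read_bases):
    """Count mismatching base calls in the read-bases column of a pileup line."""
    counts = Counter()
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":
            # Read-start marker, followed by one mapping-quality character.
            i += 2
        elif ch in ("+", "-"):
            # Indel: sign, length digits, then that many bases. Skipped here.
            digits = re.match(r"\d+", read_bases[i + 1:])
            if digits is None:
                raise ValueError("malformed indel in pileup bases")
            i += 1 + len(digits.group()) + int(digits.group())
        else:
            if ch.upper() in ("A", "C", "G", "T"):
                counts[ch.upper()] += 1
            # '.', ',' (reference matches), '$', '*' and 'N' are ignored.
            i += 1
    return counts


def pileup_line_to_vcf(line, min_alt_fraction=0.2):
    """Return a minimal VCF data line for one pileup line, or None if no SNP."""
    chrom, pos, ref, depth, bases = line.rstrip("\n").split("\t")[:5]
    depth = int(depth)
    if depth == 0:
        return None
    counts = count_alt_bases(bases)
    if not counts:
        return None
    # Take the most frequent mismatching base as the candidate ALT allele.
    alt, alt_reads = counts.most_common(1)[0]
    if alt_reads / depth < min_alt_fraction:
        return None
    # Columns: CHROM POS ID REF ALT QUAL FILTER INFO (QUAL and FILTER left missing).
    return "\t".join([chrom, pos, ".", ref.upper(), alt, ".", ".", f"DP={depth}"])
```

For example, `pileup_line_to_vcf("chr1\t100\tA\t5\t..G,g\tIIIII")` yields a record with REF `A`, ALT `G` and `DP=5`, while a line whose reads all match the reference yields `None`. The fixed-fraction cutoff stands in for the statistical model a real caller would apply, which is why a direct conversion is only a rough approximation.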


      • ketan_bnf
        Member
        • Oct 2010
        • 59

        #4
        You can use the samtools mpileup command; visit this link:

        Also, you can use GATK for variant calling.
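A minimal sketch of the mpileup route, written as a small Python wrapper. It assumes a recent bcftools on the PATH, where `bcftools mpileup` piped into `bcftools call` is the current form of the samtools mpileup workflow. The function names and the file names in the usage note are placeholders, not anything from this thread.

```python
import subprocess


def build_mpileup_call(reference_fasta, bam_path, out_vcf):
    """Build the two argv lists for: bcftools mpileup | bcftools call."""
    # -Ou: uncompressed BCF on stdout, suited to piping; -f: reference FASTA.
    mpileup = ["bcftools", "mpileup", "-Ou", "-f", reference_fasta, bam_path]
    # -m: multiallelic caller; -v: variant sites only; -Ov: plain-text VCF output.
    call = ["bcftools", "call", "-mv", "-Ov", "-o", out_vcf]
    return mpileup, call


def run_mpileup_call(reference_fasta, bam_path, out_vcf):
    """Run the pipeline; raises RuntimeError if either stage fails."""
    mpileup, call = build_mpileup_call(reference_fasta, bam_path, out_vcf)
    producer = subprocess.Popen(mpileup, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(call, stdin=producer.stdout)
    # Close our copy of the pipe so the producer sees SIGPIPE if the consumer exits.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    if producer_rc != 0 or consumer_rc != 0:
        raise RuntimeError("bcftools pipeline failed")
```

Calling `run_mpileup_call("ref.fa", "sample.bam", "calls.vcf")` would write the called variants to `calls.vcf`, given a sorted BAM and a reference that matches the one used for alignment.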


        • ulz_peter
          Senior Member
          • Feb 2010
          • 219

          #5
          I'd recommend GATK as well:


          As mentioned in my post above, VarScan would be another option: http://varscan.sourceforge.net/


          • ashwatha
            Member
            • Jul 2011
            • 14

            #6
            Thanks, Ketan and Peter!


            • vd4mindia
              Member
              • May 2013
              • 40

              #7
              Thank you for the valuable thread. I have a few more questions and would appreciate some suggestions. I am new to GATK and want to use it to analyze my exome sequencing data, but I have been a bit lost reading all the blogs, comments, and technical forums. Below is what I have understood so far; please correct me and guide me through the procedure.

              I have downloaded the hg19 files from the UCSC browser and created the reference genome. Do I need to use the reference from the GATK repository instead, and then align my samples to it for downstream analysis?

              I also want to run GATK on my institute's cluster. If I am not wrong, I should create a directory for the latest GATK version and transfer all the necessary files via FileZilla into a cluster directory with the same name. I have already done this.

              The next step is to download the bundle from the repository, where I see two versions. Which one should I download, 2.5 or 2.3? Once I have the bundle, do I have to download anything else? As I understand it, my cluster should contain the jar file and the resource folder with the .java files, and the bundle version (2.5 or 2.3) should be downloaded into the main directory of the GATK version folder, with all the files in the bundle directory unzipped there. Is that right? Please let me know. After that I should be ready to use GATK for the downstream processes listed below:

              Identify target regions for realignment (Genome Analysis Toolkit) -> Realign BAM to get better indel calling (Genome Analysis Toolkit) -> Reindex the realigned BAM (SAMtools) -> Call indels (Genome Analysis Toolkit) -> Call SNPs (Genome Analysis Toolkit) -> View aligned reads in BAM/BAI (Integrative Genomics Viewer)

              Please let me know if this looks correct or not. The VCF files from 1000 Genomes (1kG) and dbSNP are already available in compressed form in the bundle repository on the GATK website. I am currently downloading them, and I can use them directly after unzipping them.
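The workflow listed above can be sketched as an ordered list of command lines. This assumes GATK 2.x-style invocations (`java -jar GenomeAnalysisTK.jar -T <walker>`), which is the version line discussed in this post; the realignment walkers are not part of GATK 4. The function name, the `prefix` parameter and the output file names are hypothetical, and optional arguments such as known-sites files are left out.

```python
def build_gatk_steps(gatk_jar, reference_fasta, bam_path, prefix):
    """Return the argv list for each step, in the order given in the post."""
    gatk = ["java", "-jar", gatk_jar]
    intervals = prefix + ".intervals"
    realigned = prefix + ".realigned.bam"
    return [
        # 1. Identify target regions for realignment.
        gatk + ["-T", "RealignerTargetCreator", "-R", reference_fasta,
                "-I", bam_path, "-o", intervals],
        # 2. Realign the BAM around indels.
        gatk + ["-T", "IndelRealigner", "-R", reference_fasta,
                "-I", bam_path, "-targetIntervals", intervals, "-o", realigned],
        # 3. Reindex the realigned BAM.
        ["samtools", "index", realigned],
        # 4. Call indels (-glm selects the genotype likelihoods model).
        gatk + ["-T", "UnifiedGenotyper", "-R", reference_fasta,
                "-I", realigned, "-glm", "INDEL", "-o", prefix + ".indels.vcf"],
        # 5. Call SNPs.
        gatk + ["-T", "UnifiedGenotyper", "-R", reference_fasta,
                "-I", realigned, "-glm", "SNP", "-o", prefix + ".snps.vcf"],
    ]
```

Each entry can be passed to `subprocess.run(step, check=True)` in turn. Steps 4 and 5 could also be merged into a single UnifiedGenotyper run with `-glm BOTH`; they are kept separate here only to mirror the list in the post. Viewing the result in IGV is an interactive step and is not scripted.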
