
  • extract a chromosome with vcftools

    I tried to extract Chr 1 data and recode a new vcf using the example in the basic documentation and got no data out. All I want to do is break the large input file into small files of one chromosome each.

    This is the log file:

    VCFtools - v0.1.10
    (C) Adam Auton 2009

    Parameters as interpreted:
    --gzvcf /Users/williamcassano/Desktop/Variations/PG0000652-BLD.genome.block.anno.vcf.gz
    --chr 1
    --out chr1
    --recode

    Using zlib version: 1.2.5
    Reading Index file.
    File contains 128054613 entries and 1 individuals.
    Filtering by chromosome.
    Chromosome: chr18
    Chromosome: chr3
    Chromosome: chr20
    Chromosome: chr17
    Chromosome: chr8
    Chromosome: chr19
    Chromosome: chr15
    Chromosome: chr10
    Chromosome: chr12
    Chromosome: chr16
    Chromosome: chrX
    Chromosome: chr14
    Chromosome: chr5
    Chromosome: chr22
    Chromosome: chr2
    Chromosome: chr6
    Chromosome: chr7
    Chromosome: chr1
    Chromosome: chr9
    Chromosome: chr13
    Chromosome: chr21
    Chromosome: chrM
    Chromosome: chrY
    Chromosome: chr4
    Chromosome: chr11
    Keeping 0 entries on specified chromosomes.
    Applying Required Filters.
    After filtering, kept 1 out of 1 Individuals
    After filtering, kept 0 out of a possible 0 Sites
    Error:No data left for analysis!

  • #2
    You may have to put '--chr chr1' instead of '--chr 1'. Your VCF file has its chromosomes named as 'chr$number'.
    Good luck
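A quick way to see which names a VCF actually uses is to list the distinct values in its first column. This is a minimal sketch with plain POSIX tools, not a vcftools feature; the file `example.vcf` and its records are made up so the snippet runs on its own.

```shell
# Build a tiny example VCF (hypothetical data) so the sketch is self-contained.
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t100\t.\tA\tG\t50\tPASS\t.\nchr1\t200\t.\tC\tT\t50\tPASS\t.\nchr2\t150\t.\tG\tA\t50\tPASS\t.\n' > "$tmpdir/example.vcf"

# Drop header lines (they start with '#'), keep column 1 (CHROM),
# and reduce the result to the distinct names.
chrom_names=$(grep -v '^#' "$tmpdir/example.vcf" | cut -f1 | sort -u)
echo "$chrom_names"
```

For a compressed file, something like `gzip -dc file.vcf.gz | grep -v '^#' | cut -f1 | sort -u` should do the same. Whatever it prints is the spelling to pass to `--chr`.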


    • #3
      I tried this vcftools command, but it gives the following problem.
      My input command:
      vcftools --gzvcf ExAC.r0.3.1.sites.vep.vcf.gz --chr chr21 --out chr21 --recode

      And the output looks like this:

      VCFtools - UNKNOWN
      (C) Adam Auton and Anthony Marcketta 2009

      Parameters as interpreted:
      --gzvcf ExAC.r0.3.1.sites.vep.vcf.gz
      --chr chr21
      --out chr21
      --recode

      Using zlib version: 1.2.8
      After filtering, kept 0 out of 0 Individuals
      Outputting VCF file...
      After filtering, kept 0 out of a possible 0 Sites
      File does not contain any sites
      Run Time = 0.00 seconds

      All I get as output is this log file.

      Could anyone please help me extract chromosome 21 from my input file into its own VCF, with the correct command line?

      I have looked at and tried all the vcftools commands, and even GATK.


      • #4
        I think you have the opposite problem from the original poster. They used --chr 1, but their VCF used chr1 instead of 1 as the chromosome name. You are using --chr chr21, but it looks like ExAC uses 21 instead of chr21 as the chromosome name.
        #CHROM POS ID REF ALT QUAL FILTER INFO
        1 13372 . G C 608.91 PASS
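If you do not want to guess the spelling each time, a small helper can check which form occurs in the CHROM column. This is only a sketch: `resolve_chrom` is a hypothetical function name, not part of vcftools or tabix, and the chromosome 21 record in the sample file is made up.

```shell
# Tiny sample file: the chromosome 1 record is the one quoted above,
# the chromosome 21 record is hypothetical.
tmpdir=$(mktemp -d)
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t13372\t.\tG\tC\t608.91\tPASS\t.\n21\t9411239\t.\tG\tA\t50\tPASS\t.\n' > "$tmpdir/sites.vcf"

# resolve_chrom FILE NAME
# Prints NAME the way FILE spells it ("21" or "chr21"), or returns
# non-zero if neither spelling occurs in the CHROM column.
resolve_chrom() {
    file=$1
    bare=${2#chr}    # strip a leading "chr" if there is one
    for candidate in "$bare" "chr$bare"; do
        # Appending "" forces a string comparison, so "21" never matches "021".
        if awk -v c="$candidate" '!/^#/ && ($1 "") == (c "") { found = 1; exit } END { exit !found }' "$file"; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

chrom=$(resolve_chrom "$tmpdir/sites.vcf" chr21)
echo "$chrom"
```

The printed value can then be used as the argument to `--chr`.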


        • #5
          Yes, you are right, but vcftools was still not giving any output.

          So I tried tabix:

          tabix -h file_name 21

          It gives a VCF file of chr 21, but I am not sure whether the file ends where it should.

          Is it right to extract it with this command?


          • #6
            Keep all chromosomes and remove scaffolds

            Hello,

            I am trying to do something similar: I want to keep the information for 10 chromosomes (Chr01-10) and remove the scaffold SNP information from my VCF file.

            This is the command I ran:
            vcftools --vcf original.vcf --out varCHR.vcf --chr Chr[01-10]

            And this is what I get:
            After filtering, kept 2 out of 2 Individuals
            After filtering, kept 0 out of a possible 166402 Sites
            No data left for analysis!
            Run Time = 0.00 seconds

            Can anyone help me with my problem?
            Thank you


            • #7
              You can also use BBMap's filtervcf.sh like this:

              Code:
              filtervcf.sh in=original.vcf out=varCHR.vcf contigs=Chr01,Chr02,Chr03,Chr04,Chr05,Chr06,Chr07,Chr08,Chr09,Chr10
              ...assuming those chromosome names are correct. Please let me know if that does not work.
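If you would rather stay with vcftools: as far as I know `--chr` matches one exact name and has no range or pattern syntax, so `Chr[01-10]` matches nothing, but the option can be given several times. The sketch below only builds and prints such a command. The names Chr01 to Chr10 and the output prefix `varCHR` are assumptions taken from the post above, and `--recode` is added because without it vcftools does not write a new VCF.

```shell
# Repeat --chr once per chromosome name (names assumed to be Chr01..Chr10).
chr_args=""
for n in 01 02 03 04 05 06 07 08 09 10; do
    chr_args="$chr_args --chr Chr$n"
done

# Print the command instead of running it, so it can be checked first.
cmd="vcftools --vcf original.vcf$chr_args --recode --out varCHR"
echo "$cmd"
```

Run the printed line once it looks right. The scaffolds are dropped because they are never named in a `--chr` option.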


              • #8
                Great tool for splitting vcf files


                • #9
                  vcftools can do this. Here is a command:
                  vcftools --chr chr1 --vcf myfile.vcf --recode --recode-INFO-all --out myfile.chr1

                  --out takes a prefix, so the result is written to myfile.chr1.recode.vcf. Check your VCF file first to see how the chromosomes are named (chr1 or 1), and use that form on the command line.
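For the original goal of breaking one large file into one file per chromosome, a single pass with awk is an alternative to running vcftools once per chromosome. This is a rough sketch and not a vcftools feature: the sample records are made up, each output file is named after the CHROM value, and every output gets a copy of the header.

```shell
# Small uncompressed example VCF (hypothetical data).
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t100\t.\tA\tG\t50\tPASS\t.\nchr2\t150\t.\tG\tA\t50\tPASS\t.\nchr1\t200\t.\tC\tT\t50\tPASS\t.\n' > "$tmpdir/example.vcf"

# Collect the header lines; the first time a chromosome is seen, start its
# file with the header; then write every record to its chromosome's file.
awk -v outdir="$tmpdir" '
    /^#/ { header = header $0 "\n"; next }
    {
        out = outdir "/" $1 ".vcf"
        if (!($1 in seen)) { printf "%s", header > out; seen[$1] = 1 }
        print > out
    }
' "$tmpdir/example.vcf"

ls "$tmpdir"
```

Here that leaves `chr1.vcf` and `chr2.vcf` next to the input. For a `.vcf.gz` input you could pipe `gzip -dc file.vcf.gz` into the same awk program, reading from standard input instead of a file.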
