  • capricy
    Senior Member
    • Apr 2012
    • 125

    snpeff : ERROR_CHROMOSOME_NOT_FOUND

    Hello,

    I have run into a problem annotating bacterial SNPs with snpEff: it reports ERROR_CHROMOSOME_NOT_FOUND. I called the SNPs with kSNP3.0.


    Here are several lines of my vcf file:
    ====
    ##fileformat=VCFv4.0
    ##Reference genome=GCA_000008865_1_ASM886v1_genomic
    ##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
    ##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT GCA_000006665_1_ASM666v1_genomic GCA_000008865_1_ASM886v1_genomic
    1 494154 AAAAAAACCG.AGCGCAAATA_R G A . . NS=205;AF=0.003 GT 0 0
    1 40998 AAAAAAAGCC.TCTTCGTCGC_F T C . . NS=196;AF=0.003 GT 0 0
    1 4531974 AAAAAAAGCG.AAATCTGGCA_F C T . . NS=212;AF=0.003 GT 0 0
    1 18983 AAAAAAATAG.GCTTCCAGGG_R G T . . NS=197;AF=0.003 GT 0 0
    1 1477420 AAAAAAATCC.GCTGCCGATA_R T C . . NS=200;AF=0.006 GT 0 0
    1 1013276 AAAAAACCCA.CAACCTTGAA_F T C . . NS=200;AF=0.003 GT 0 0
    1 254058 AAAAAACCCA.GGCGGGCGTT_R T C . . NS=206;AF=0.003 GT 0 0
    1 461873 AAAAAACCGG.AAACCGGACT_F A C . . NS=167;AF=0.003 GT 0 0
    1 2363022 AAAAAACCGG.CAGTTTGAGC_R T C . . NS=181;AF=0.006 GT 0 0
    1 494140 AAAAAACCGT.GCGTATTTGC_F G A . . NS=204;AF=0.003 GT 0 0
    ===

    Here are the sequence headers of my SNP-calling reference (GCA_000008865_1_ASM886):
    ===
    [login-node04 databaseWorkingO17]$ grep ">" GCA_000008865.1_ASM886v1_genomic.fna
    >BA000007.2 Escherichia coli O157:H7 str. Sakai DNA, complete genome
    >AB011549.2 Escherichia coli O157:H7 str. Sakai plasmid pO157 DNA, complete sequence
    >AB011548.2 Escherichia coli O157:H7 str. Sakai plasmid pOSAK1 DNA, complete sequence
    ===

    Here is the beginning of the snpEff database .bin file (downloaded from the snpEff database):
    ===
    [login-node03 Escherichia_coli_o157_h7_str_sakai]$ gunzip -c snpEffectPredictor.bin|head -20
    SnpEff 4.3
    CHROMOSOME 2 1 0 5498449 Chromosome false
    CHROMOSOME 3 1 0 92720 pO157 false
    CHROMOSOME 4 1 0 3305 pOSAK1 false
    GENOME 1 -1 0 2147483647 Escherichia_coli_o157_h7_str_sakai false Escherichia_coli_o157_h7_str_sakai Escherichia_coli_o157_h7_str_sakai 2,3,4
    EXON 7 6 725366 725474 EBG00001089957-1 false cccaaaagaaaaccctcaccgtcaggcggcgagggtttaactcacatgatgatactgactgttgctcactctttgaagtgatttgcgtcacattcagggaattcctcaa -1 1 cccaaaagaaaaccctcaccgtcaggcggcgagggtttaactcacatgatgatactgactgttgctcactctttgaagtgatttgcgtcacattcagggaattcctcaa RETAINED
    TRANSCRIPT 6 5 725366 725474 EBT00001692105 false 7 lincRNA false false false false false 1 -1 -1
    GENE 5 2 725366 725474 EBG00001089957 false 6 rnk_leader lincRNA
    EXON 10 9 2143588 2143965 BAB35560-1 false atggttaatcagaagaaagatcgtctgcttaacgagtatctgtctccgctggatattaccgcggcacagtttaaggtgctctgctctatccgctgcgcggcgtgtattactccggttgaactgaaaaaagtgttgtcggtcgacctgggagcactgacccgtatgctggatcgcctggtctgtaaaggctgggtagaaaggttgccgaacccgaatgataagcgcggcgtactggtaaaacttaccaccagcggcgcggcaatatgtgaacaatgccatcaattagttggccaggacctgcatcaagaattaacaaaaaacctgacggcggacgaagtggcaacacttgagcatttgcttaagaaagtcctgccgtaa 0 1 atggttaatcagaagaaagatcgtctgcttaacgagtatctgtctccgctggatattaccgcggcacagtttaaggtgctctgctctatccgctgcgcggcgtgtattactccggttgaactgaaaaaagtgttgtcggtcgacctgggagcactgacccgtatgctggatcgcctggtctgtaaaggctgggtagaaaggttgccgaacccgaatgataagcgcggcgtactggtaaaacttaccaccagcggcgcggcaatatgtgaacaatgccatcaattagttggccaggacctgcatcaagaattaacaaaaaacctgacggcggacgaagtggcaacacttgagcatttgcttaagaaagtcctgccgtaa RETAINED
    CDS 11 9 2143588 2143965 CDS_Chromosome_2143589_2143963 false 0
    TRANSCRIPT 9 8 2143588 2143965 BAB35560 false 10 protein_coding true false true false false 1 -1 -1 11
    GENE 8 2 2143588 2143965 BAB35560 false 9 ECs2137 protein_coding
    ===

    An online thread mentioned that this error occurs when the reference used for SNP calling is inconsistent with the snpEff database. Is there an easy way to modify the .vcf file to get around it?


    Thank you very much for your time.

    C.
  • gringer
    David Eccles (gringer)
    • May 2011
    • 845

    #2
    A warning flag for me is that the VCF file contains a chromosome name of "1" rather than the SNP call header names ("BA000007.2", etc).
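
If the mismatch is only in the naming, one possible workaround is to rewrite the first (CHROM) column of the VCF so that it uses the names stored in the snpEff database. The sketch below is a minimal illustration, not a tested fix for this dataset. It assumes the VCF is tab-delimited, and it assumes that kSNP's "1" refers to the main chromosome, which the database dump in the first post calls "Chromosome". The function name `rename_chromosomes` and the `NAME_MAP` dictionary are made up for this example. Renaming only helps if the positions in the VCF really are coordinates on that same sequence.

```python
def rename_chromosomes(vcf_lines, name_map):
    """Yield VCF lines with the CHROM column renamed according to name_map."""
    for line in vcf_lines:
        # Header lines (## metadata and the #CHROM row) pass through untouched.
        if line.startswith("#"):
            yield line
            continue
        # Assumes tab-delimited records, as the VCF format requires.
        chrom, sep, rest = line.partition("\t")
        # Names missing from name_map are left as they are.
        yield name_map.get(chrom, chrom) + sep + rest


# Assumed mapping: kSNP's "1" is taken to mean the main chromosome, which the
# snpEff database in this thread calls "Chromosome". Verify before relying on it.
NAME_MAP = {"1": "Chromosome"}

example = [
    "##fileformat=VCFv4.0\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "1\t494154\tAAAAAAACCG.AGCGCAAATA_R\tG\tA\n",
]
renamed = list(rename_chromosomes(example, NAME_MAP))
print("".join(renamed), end="")
```

For a real file, the same generator can be fed an open file handle and each yielded line written to a new file, which leaves the original VCF untouched.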


    • capricy
      Senior Member
      • Apr 2012
      • 125

      #3
      Do you mean the first column?

      This is indeed my first time handling a .vcf file, but I did read the documentation, and all the examples I have found have a chromosome number rather than a specific sequence name in the first column.

      Is something wrong with my understanding?

      I am actually confused by the snpEff database/Ensembl record: it starts with chromosome 2, not chromosome 1...
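
On the "chromosome 2" point: in the dump from the first post, the number right after CHROMOSOME looks like an internal record ID rather than a chromosome number. The GENOME line has ID 1 and ends with "2,3,4", which matches the IDs of the three CHROMOSOME lines. The names a VCF would need to match appear to be in the sixth field ("Chromosome", "pO157", "pOSAK1"). This reading is inferred from the dump alone, so treat it as an assumption. The sketch below pulls those names out; the function names are hypothetical, and the field layout is simply read off the lines shown above.

```python
import gzip


def chromosome_names(lines):
    """Return the sequence names found on CHROMOSOME lines of a .bin dump."""
    names = []
    for line in lines:
        fields = line.split()
        # Layout read off the dump: CHROMOSOME, id, parent id, start, end, name, flag.
        if len(fields) >= 6 and fields[0] == "CHROMOSOME":
            names.append(fields[5])
    return names


def chromosome_names_from_bin(path):
    """Read a gzip-compressed snpEffectPredictor.bin as text and list its names."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return chromosome_names(handle)


# The first lines of the dump quoted in the thread (GENOME line shortened).
dump = [
    "SnpEff 4.3",
    "CHROMOSOME 2 1 0 5498449 Chromosome false",
    "CHROMOSOME 3 1 0 92720 pO157 false",
    "CHROMOSOME 4 1 0 3305 pOSAK1 false",
    "GENOME 1 -1 0 2147483647 Escherichia_coli_o157_h7_str_sakai false",
]
print(chromosome_names(dump))  # ['Chromosome', 'pO157', 'pOSAK1']
```

Comparing this list with the distinct values in the VCF's first column shows which names snpEff would fail to find.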
