  • Genomic Positions off in db135.b37.vcf

    I noticed some indels with shifted positions in the db135.b37.vcf file I downloaded from

    ftp://[email protected]

    For example:

    19 51835892 rs11402251 T TG . PASS GENEINFO=VSIG10L:147645;GNO;RSPOS=51835893;SAO=0;SLO;SSR=0;VC=DIV;VP=050100000000000100000200;WGT=0;dbSNPBuildID=120
    19 52004791 rs67024588 G GC . PASS GENEINFO=SIGLEC12:89858;GNO;RSPOS=52004794;RV;S3D;SAO=0;SLO;SSR=0;VC=DIV;VP=050300000000000100000200;WGT=0;dbSNPBuildID=130

    The correct position appears in the "RSPOS=" tag, but the second field (POS) is off.

    Is this a bug or a feature???
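
    A quick way to see how far POS and RSPOS disagree is to parse both out of each record. This is only a minimal sketch, assuming whitespace-delimited records shaped like the two above; the function name `pos_rspos_offset` is made up for illustration and is not part of any VCF library.

```python
def pos_rspos_offset(vcf_line):
    """Return (rsid, POS, RSPOS, RSPOS - POS) for one record, or None if RSPOS is absent."""
    fields = vcf_line.split()
    pos = int(fields[1])   # column 2: POS
    rsid = fields[2]       # column 3: ID
    info = fields[7]       # column 8: INFO, semicolon-separated tags
    for entry in info.split(";"):
        if entry.startswith("RSPOS="):
            rspos = int(entry.split("=", 1)[1])
            return rsid, pos, rspos, rspos - pos
    return None


# The two records quoted in this post.
records = [
    "19 51835892 rs11402251 T TG . PASS GENEINFO=VSIG10L:147645;GNO;RSPOS=51835893;"
    "SAO=0;SLO;SSR=0;VC=DIV;VP=050100000000000100000200;WGT=0;dbSNPBuildID=120",
    "19 52004791 rs67024588 G GC . PASS GENEINFO=SIGLEC12:89858;GNO;RSPOS=52004794;"
    "RV;S3D;SAO=0;SLO;SSR=0;VC=DIV;VP=050300000000000100000200;WGT=0;dbSNPBuildID=130",
]

for rec in records:
    print(pos_rspos_offset(rec))
```

    For these two records the difference comes out as 1 and 3 respectively.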

  • #2
    I found something that I think was similar months ago and concluded it was a bug; I emailed them and one of them acknowledged it was a bug. I checked back a few weeks later and was told that he had passed the information on to the relevant person, but I haven't heard back since. This was at least a few months ago.

    It may have been completely different from what you just posted, but I would not be surprised if there are numerous errors where positions are off by 1 bp, or maybe a couple more for indels.

    • #3
      Oh I see. I think I will just fix that file with RSPOS positions. Thanks for your reply.

      • #4
        Originally posted by ymc:
        Oh I see. I think I will just fix that file with RSPOS positions. Thanks for your reply.

        I recall considering that and realizing it wasn't that easy. Make sure you spot-check a decent number of records after changing them to confirm they are correct (again, whatever problems I saw may very well have been fixed by now).

        • #5
          I am fixing this by hand now. How should I fix rs67024588? Are the alleles also wrong, given that the base at chr19:52004794 is C?

          19 52004794 rs67024588 C CC . PASS GENEINFO=SIGLEC12:89858;GNO;RSPOS=52004794;RV;S3D;SAO=0;SLO;SSR=0;VC=DIV;VP=050300000000000100000200;WGT=0;dbSNPBuildID=130

          Is this ok???
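
          One way to sanity-check a hand-edited record is to compare its REF allele with the reference base(s) at the new POS. The sketch below assumes a lookup function is available for the reference; here `fetch_base` and `toy_reference` are hypothetical stand-ins holding only the single forward-strand base mentioned above (C at chr19:52004794), not a real genome accessor.

```python
def ref_matches(chrom, pos, ref, fetch_base):
    """True if the REF allele equals the reference bases starting at 1-based pos."""
    observed = "".join(fetch_base(chrom, pos + i) for i in range(len(ref)))
    return observed.upper() == ref.upper()


# Hypothetical toy reference: only the one base stated in this post.
toy_reference = {("19", 52004794): "C"}


def fetch_base(chrom, pos):
    """Stand-in for a real reference lookup (assumption for illustration)."""
    return toy_reference[(chrom, pos)]


# Original REF allele (G) moved to RSPOS without changing the alleles:
print(ref_matches("19", 52004794, "G", fetch_base))
# Hand-edited record with REF changed to C:
print(ref_matches("19", 52004794, "C", fetch_base))
```

          With this toy lookup the first check fails and the second passes, which only confirms that REF agrees with the forward-strand base; it says nothing about whether the ALT allele is right.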

          • #6
            According to the website it's a G: http://www.ncbi.nlm.nih.gov/projects...gi?rs=67024588, so it depends on what strand you are annotating with.

            • #7
              But doesn't VCF always show the forward strand?

              It is a G only if it is on the reverse strand. And the base before the event is a C on the forward strand, right?

              • #8
                Ah, yes, you're probably right. My fault.
