  • 1000 genome SNP call

    I want to call SNPs from the original .bam files of the 1000 Genomes Project, such as the data available for download from:

    ftp://ftp.1000genomes.ebi.ac.uk/vol1...096/alignment/

    I used samtools to call SNPs from the BAM file named "HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam" in this folder.

    However, the pileup file is empty. I believe the SNP calling procedure is correct, so is the BAM file incomplete, or is there some other potential problem?

    Please give me some advice.

    Thank you very much

    Eric

  • #2
    Could you post the exact command you used for the SNP calling step?

    • #3
      Just a quick guess:
      The samtools homepage recommends SNP calling with the following commands:

      1. samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf
      2. bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf

      One possible cause of your problem is the reference sequence. The 1000 Genomes Project uses a reference whose chromosomes are named 1, 2, 3, 4, 5, 6, etc.
      If you use a UCSC-style reference with chr1, chr2, chr3, the names will not match the BAM file, and that could lead to an empty pileup file.
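
The naming mismatch described above can be spotted quickly. The following is only a minimal Python sketch: the two name lists are made-up illustrations (in practice they might come from the first column of the reference .fai and from the SN fields in the output of `samtools view -H`), and the helper names `naming_style` and `shared_contigs` are hypothetical.

```python
def naming_style(names):
    """Classify contig names: 'ucsc' if all start with 'chr', 'bare' if none do."""
    if not names:
        return "empty"
    prefixed = sum(1 for name in names if name.startswith("chr"))
    if prefixed == len(names):
        return "ucsc"
    if prefixed == 0:
        return "bare"
    return "mixed"


def shared_contigs(ref_names, bam_names):
    """Return the contig names that appear, character for character, in both lists."""
    return sorted(set(ref_names) & set(bam_names))


# Illustrative inputs only, not read from real files.
ref_names = ["chr1", "chr2", "chr3", "chrX"]   # UCSC-style reference
bam_names = ["1", "2", "3", "X"]               # 1000 Genomes-style BAM header

print("reference style:", naming_style(ref_names))
print("BAM style:", naming_style(bam_names))
print("shared names:", shared_contigs(ref_names, bam_names))
```

If the two styles differ and no names are shared, the pileup has nothing to report, which would be consistent with the empty output.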

      • #4
        Thank you very much for your help.

        Since I just want to call SNPs for each individual separately, the command I used is:

        samtools pileup -vcf ref.fa aln.bam > raw.pileup

        I think the reference sequence is the issue: I used UCSC hg19 for SNP calling. I will test whether the problem persists after changing the reference sequence.

        • #5
          I think you're right, the error is in the name of the reference sequence. Check to see the chromosome names in the alignment and reference are _exactly_ the same. We have had this problem frequently.
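
One way to make the "exactly the same" check concrete is to compare contig names and lengths from both sides. This is a hedged Python sketch, not a definitive tool: it assumes you have the BAM header as text (for example from `samtools view -H aln.bam`) and the contents of the reference .fai file. The sample strings use toy lengths and offsets, and `parse_sq_header`, `parse_fai` and `compare` are hypothetical helper names.

```python
def parse_sq_header(header_text):
    """Return {name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai(fai_text):
    """Return {name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            contigs[parts[0]] = int(parts[1])
    return contigs


def compare(ref, bam):
    """Report names missing on either side and shared names whose lengths differ."""
    common = set(ref) & set(bam)
    return {
        "only_in_reference": sorted(set(ref) - set(bam)),
        "only_in_bam": sorted(set(bam) - set(ref)),
        "length_mismatch": sorted(n for n in common if ref[n] != bam[n]),
    }


# Toy data for illustration; the lengths and offsets are not real genome values.
header_text = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:800\n"
fai_text = "chr1\t1000\t6\t50\t51\nchr2\t800\t1032\t50\t51\n"

report = compare(parse_fai(fai_text), parse_sq_header(header_text))
for key, names in report.items():
    print(key, names)
```

A length mismatch on a shared name would hint that the BAM was aligned against a different assembly, even though the names agree.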

          • #6
            mpileup generates only chr1 in a bcf file

            Originally posted by colindaven:
            I think you're right, the error is in the name of the reference sequence. Check to see the chromosome names in the alignment and reference are _exactly_ the same. We have had this problem frequently.
            Hi,
            I have a similar situation.
            I have a BAM file containing reads mapped across entire chromosomes.
            I ran mpileup:

            samtools mpileup -uf Homo_sapiens_assembly18.fasta s_4.merged.sorted.rmdup.bam | bcftools view -cvbg - > s_4.merged.sorted.rmdup.bam.raw.bcf

            After converting the result with bcftools view s_4.merged.sorted.rmdup.bam.raw.bcf > s_4.merged.sorted.rmdup.bam.raw.vcf,

            I realized that the .vcf file has SNP information for chr1 only.
            There is no chr2, chr3, ... or chrX.

            Do you have any idea about this problem?
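
To confirm which chromosomes actually appear in a VCF, the first column of the data lines can be tallied. A minimal Python sketch follows; the example records are invented for illustration, and `records_per_chrom` is a hypothetical helper name. With a real file you could pass an open file handle instead of the list.

```python
from collections import Counter


def records_per_chrom(vcf_lines):
    """Count VCF data records per chromosome, skipping header and blank lines."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom = line.split("\t", 1)[0]
        counts[chrom] += 1
    return counts


# Made-up records, only to show the shape of the input.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12060\t.\tC\tT\t30\t.\t.",
    "chr1\t12100\t.\tG\tA\t45\t.\t.",
]

counts = records_per_chrom(example_vcf)
for chrom, n in sorted(counts.items()):
    print(chrom, n)
```

If only chr1 shows up here as well, the gap is in the calling step rather than in how the VCF is being viewed.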

            As you suggested, I checked the chromosome names in both the reference sequence and the BAM file. The names are identical.

            Here is a part of my .fai derived from the reference sequence:
            chrM 16571 6 50 51
            chr1 247249719 16915 50 51
            chr2 242951149 252211635 50 51
            chr3 199501827 500021813 50 51
            chr4 191273063 703513683 50 51
            chr5 180857866 898612214 50 51
            :
            :

            Here is a part of my .bam file:
            HWI-EAS276_0022_FC70B81AAXX:4:91:4774:3269#0/1 0 chr1 12060 255 34M * 0 0 CTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATG BB=B=DD;DBBBBBDEEDD?ABABEDFEFGGG@G XA:i:0 MD:Z:34 NM:i:0

            :
            :
            HWI-EAS276_0022_FC70B81AAXX:4:29:16345:8488#0/1 16 chr2 34145 255 34M * 0 0 TCATAGTTCTGCTAGACTTCTCTGAGGTGAGCTA @IGDIGIHDIIIIHIIIIGIIIIIIIIHGIIIII XA:i:0 MD:Z:34 NM:i:0
            HW
            :
            :
            HWI-EAS276_0022_FC70B81AAXX:4:63:1614:2851#0/1 0 chr3 90279 255 34M * 0 0 TTTTATAAGGGGCTTTTCCCCCTTTGCTCAGCAC IIIIHIDIIIHHIIIIIIIIIIDIIIIHBHHHGH XA:i:0 MD:Z:34 NM:i:0


            Thank you

            Hee
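
Since the names match, another thing that may be worth ruling out is a stale or damaged .fai index. The sketch below, in Python, checks whether the offsets in the excerpt posted above are internally consistent. It relies on the .fai columns being name, length, offset, bases per line and bytes per line, and it assumes that each FASTA header is just ">" plus the name and that every sequence line, including the last, ends with a newline. The function names are hypothetical.

```python
FAI_EXCERPT = """\
chrM 16571 6 50 51
chr1 247249719 16915 50 51
chr2 242951149 252211635 50 51
chr3 199501827 500021813 50 51
chr4 191273063 703513683 50 51
chr5 180857866 898612214 50 51
"""


def parse_fai_rows(text):
    """Parse .fai lines into (name, length, offset, linebases, linewidth) tuples."""
    rows = []
    for line in text.splitlines():
        name, length, offset, linebases, linewidth = line.split()
        rows.append((name, int(length), int(offset), int(linebases), int(linewidth)))
    return rows


def expected_next_offset(row, next_name):
    """Predict where the next sequence starts, under the assumptions stated above."""
    _, length, offset, linebases, linewidth = row
    n_lines = (length + linebases - 1) // linebases          # ceiling division
    seq_bytes = length + n_lines * (linewidth - linebases)   # bases plus line endings
    header_bytes = len(">" + next_name + "\n")
    return offset + seq_bytes + header_bytes


def inconsistent_offsets(rows):
    """Return (name, predicted, recorded) for every offset that does not match."""
    problems = []
    for current, following in zip(rows, rows[1:]):
        predicted = expected_next_offset(current, following[0])
        if predicted != following[2]:
            problems.append((following[0], predicted, following[2]))
    return problems


problems = inconsistent_offsets(parse_fai_rows(FAI_EXCERPT))
print(problems)
```

An empty list means the recorded offsets agree with the assumed layout for the rows shown, which would point away from the index as the cause. It says nothing about rows that were left out of the excerpt.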
