
  • Combine 1000genomes bams to get better coverage?

    Hi all,

    I downloaded the bams from this 1000genomes ftp site:

    ftp://ftp.1000genomes.ebi.ac.uk/vol1...878/alignment/

    I only used the Illumina data for my application, but at about 20x its coverage is not high enough. I noticed that there are also BAMs from 454 and SOLiD. Can I use samtools merge to combine them into a single BAM with better overall coverage?

    Thanks!

    PS: I am not sure whether doing this will give me enough coverage even if it succeeds. Does anyone know of other places where I can download high-coverage human FASTQs or BAMs?
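
    A minimal sketch of the merge asked about above, assuming samtools is installed and that all three BAMs are coordinate-sorted and aligned to the same reference. The file names are hypothetical placeholders, not the actual names on the FTP site, and `build_merge_command` is a helper made up for illustration. The script only builds and prints the commands; nothing is executed.

```python
import shlex


def build_merge_command(output_bam, input_bams, overwrite=False):
    """Return the argument list for `samtools merge OUT IN1 IN2 ...`."""
    if len(input_bams) < 2:
        raise ValueError("need at least two input BAMs to merge")
    cmd = ["samtools", "merge"]
    if overwrite:
        cmd.append("-f")  # -f lets samtools overwrite an existing output file
    cmd.append(output_bam)
    cmd.extend(input_bams)
    return cmd


# Hypothetical file names, one per sequencing platform.
inputs = ["NA12878.illumina.bam", "NA12878.454.bam", "NA12878.solid.bam"]
merge_cmd = build_merge_command("NA12878.merged.bam", inputs)
# The merged BAM needs an index before most tools can use it.
index_cmd = ["samtools", "index", "NA12878.merged.bam"]

print(shlex.join(merge_cmd))
print(shlex.join(index_cmd))
# To actually run them: subprocess.run(merge_cmd, check=True)
```

    Merging raises the total read depth, but whether reads from mixed platforms suit a particular downstream caller is a separate question.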

  • #2
    It seems that the Broad Institute has BAMs for NA12878 at 40x internally. Is this data available to outsiders?



    • #3
      What are you trying to achieve? For variant calling, many callers can consider more than one BAM at once.



      • #4
        Originally posted by laura:
        What are you trying to achieve? For variant calling, many callers can consider more than one BAM at once.

        I am trying the now-unsupported HLA Caller from the GATK package.

        Supposedly you should get the following HLA calls if you use NA12878.bam from Broad and human_b36_both.fasta:
        ===============================================
        Locus A1 A2 Geno Phase Frq1 Frq2 L Prob Reads1 Reads2 Locus EXP White Black Asian
        A 0101 1101 -1229.5 -15.2 -0.82 -0.73 -1244.7 1.00 180 191 229 1.62 -1.99 -3.13 -2.07
        B 0801 5601 -832.3 -37.3 -1.01 -2.15 -872.1 1.00 58 59 100 1.17 -3.31 -4.10 -3.95
        C 0102 0701 -1344.8 -37.5 -0.87 -0.86 -1384.2 1.00 91 139 228 1.01 -2.35 -2.95 -2.31
        DPA1 0103 0201 -842.1 -1.8 -0.12 -0.79 -846.7 1.00 72 48 120 1.00 -0.90 -INF -1.27
        DPB1 0401 1401 -991.5 -18.4 -0.45 -1.55 -1010.7 1.00 64 48 113 0.99 -2.24 -3.14 -2.64
        DQA1 0101 0501 -1077.5 -15.9 -0.90 -0.62 -1095.4 1.00 160 77 247 0.96 -1.53 -1.60 -1.87
        DQB1 0201 0501 -709.6 -18.6 -0.77 -0.76 -729.7 0.95 50 87 137 1.00 -1.76 -1.54 -2.23
        DRB1 0101 0301 -1513.8 -317.3 -1.06 -0.94 -1832.6 1.00 52 32 101 0.83 -1.99 -2.83 -2.34
        ==============================================

        But if I use the three aforementioned BAMs and human_g1k_v37.fasta with updated HLA_EXONS.intervals, HLA_DICTIONARY.txt and HLA_POLYMORPHIC_SITES.txt, I get:

        =============================================
        Locus A1 A2 Geno Phase Frq1 Frq2 L Prob Reads1 Reads2 Locus EXP White Black Asian
        A 0101 1104 -1133.2 -40.7 -0.82 -6.00 -1173.9 1.00 133 138 177 1.53 -6.82 -7.31 -7.34
        B 0820 5601 -1156.2 -43.5 -6.00 -2.15 -1201.4 1.00 62 71 111 1.20 -8.30 -8.70 -8.15
        C 0102 0701 -1718.5 -150.9 -0.87 -0.86 -1871.5 1.00 46 106 155 0.98 -2.35 -2.95 -2.31
        DPA1 0103 0201 -1443.8 -4.8 -0.12 -0.79 -1451.4 1.00 43 19 62 1.00 -0.90 -INF -1.27
        DPB1 0401 1401 -1102.9 -35.2 -0.45 -1.55 -1139.0 1.00 41 9 52 0.96 -2.24 -3.14 -2.64
        DQA1 0105 0501 -1549.3 -26.2 -1.24 -0.62 -1582.4 1.00 145 57 202 1.00 -2.62 -1.94 -2.72
        DQB1 0203 0501 -1266.4 -145.1 -2.05 -0.76 -1413.4 1.00 33 73 127 0.83 -3.68 -2.80 -3.82
        DRB1 0101 0301 -1683.0 -279.3 -1.06 -0.94 -1965.9 0.83 20 41 96 0.64 -1.99 -2.83 -2.34
        DRB1 0120 0301 -1678.8 -279.3 -6.00 -0.94 -1963.3 0.17 20 41 96 0.64 -6.94 -7.15 -7.00
        ========================================

        The results are close but not identical. I suspect the reason is that the Broad NA12878.bam is 40x, whereas the combined BAM I used is about 35x.
        Last edited by ymc; 04-22-2012, 10:38 PM.
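
        To see exactly where the two runs disagree, here is a minimal sketch that parses both tables as posted above and compares the most probable call per locus. It assumes whitespace-separated columns with Locus, A1, A2 in the first three fields and Prob in the ninth, as in the header line. `top_calls` and `differing_loci` are names made up for this illustration and are not part of GATK.

```python
BROAD = """\
Locus A1 A2 Geno Phase Frq1 Frq2 L Prob Reads1 Reads2 Locus EXP White Black Asian
A 0101 1101 -1229.5 -15.2 -0.82 -0.73 -1244.7 1.00 180 191 229 1.62 -1.99 -3.13 -2.07
B 0801 5601 -832.3 -37.3 -1.01 -2.15 -872.1 1.00 58 59 100 1.17 -3.31 -4.10 -3.95
C 0102 0701 -1344.8 -37.5 -0.87 -0.86 -1384.2 1.00 91 139 228 1.01 -2.35 -2.95 -2.31
DPA1 0103 0201 -842.1 -1.8 -0.12 -0.79 -846.7 1.00 72 48 120 1.00 -0.90 -INF -1.27
DPB1 0401 1401 -991.5 -18.4 -0.45 -1.55 -1010.7 1.00 64 48 113 0.99 -2.24 -3.14 -2.64
DQA1 0101 0501 -1077.5 -15.9 -0.90 -0.62 -1095.4 1.00 160 77 247 0.96 -1.53 -1.60 -1.87
DQB1 0201 0501 -709.6 -18.6 -0.77 -0.76 -729.7 0.95 50 87 137 1.00 -1.76 -1.54 -2.23
DRB1 0101 0301 -1513.8 -317.3 -1.06 -0.94 -1832.6 1.00 52 32 101 0.83 -1.99 -2.83 -2.34
"""

MERGED = """\
Locus A1 A2 Geno Phase Frq1 Frq2 L Prob Reads1 Reads2 Locus EXP White Black Asian
A 0101 1104 -1133.2 -40.7 -0.82 -6.00 -1173.9 1.00 133 138 177 1.53 -6.82 -7.31 -7.34
B 0820 5601 -1156.2 -43.5 -6.00 -2.15 -1201.4 1.00 62 71 111 1.20 -8.30 -8.70 -8.15
C 0102 0701 -1718.5 -150.9 -0.87 -0.86 -1871.5 1.00 46 106 155 0.98 -2.35 -2.95 -2.31
DPA1 0103 0201 -1443.8 -4.8 -0.12 -0.79 -1451.4 1.00 43 19 62 1.00 -0.90 -INF -1.27
DPB1 0401 1401 -1102.9 -35.2 -0.45 -1.55 -1139.0 1.00 41 9 52 0.96 -2.24 -3.14 -2.64
DQA1 0105 0501 -1549.3 -26.2 -1.24 -0.62 -1582.4 1.00 145 57 202 1.00 -2.62 -1.94 -2.72
DQB1 0203 0501 -1266.4 -145.1 -2.05 -0.76 -1413.4 1.00 33 73 127 0.83 -3.68 -2.80 -3.82
DRB1 0101 0301 -1683.0 -279.3 -1.06 -0.94 -1965.9 0.83 20 41 96 0.64 -1.99 -2.83 -2.34
DRB1 0120 0301 -1678.8 -279.3 -6.00 -0.94 -1963.3 0.17 20 41 96 0.64 -6.94 -7.15 -7.00
"""


def top_calls(table_text):
    """Map each locus to the (A1, A2) pair with the highest Prob value."""
    best = {}
    for line in table_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 9 or fields[0] == "Locus":
            continue  # skip the header and any separator lines
        locus, a1, a2, prob = fields[0], fields[1], fields[2], float(fields[8])
        if locus not in best or prob > best[locus][1]:
            best[locus] = ((a1, a2), prob)
    return {locus: pair for locus, (pair, _) in best.items()}


def differing_loci(expected, observed):
    """Return the sorted loci whose top allele pair differs between runs."""
    return sorted(l for l in expected if expected[l] != observed.get(l))


broad, merged = top_calls(BROAD), top_calls(MERGED)
for locus in differing_loci(broad, merged):
    print(locus, broad[locus], "->", merged[locus])
```

        On the tables above this reports differences at A, B, DQA1 and DQB1. The top DRB1 call matches, although the merged run gives it a lower probability (0.83) and lists 0120 as an alternative.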



        • #5
          Hi ymc,

          I am also trying out the HLA Caller and have a question. You say you updated the file HLA_DICTIONARY.txt. How did you get an updated HLA_DICTIONARY.txt? All the allele sequences in the original HLA_DICTIONARY.txt have the same length, but in the IMGT/HLA database the alleles actually have different lengths. How do you handle that?

          Thanks.



          • #6
            Originally posted by glede:
            Hi ymc,

            I am also trying out the HLA Caller and have a question. You say you updated the file HLA_DICTIONARY.txt. How did you get an updated HLA_DICTIONARY.txt? All the allele sequences in the original HLA_DICTIONARY.txt have the same length, but in the IMGT/HLA database the alleles actually have different lengths. How do you handle that?

            Thanks.

            I only updated the positions. I don't know whether the allele sequences also need to be updated.
