
  • Illumina sequencing: Hits to chromosome Y from female sample?

    Hello all.

    I am doing analysis of Illumina GAII/GAIIx data. I am working with female samples.

    When I map the data, I see hits on chromosome Y.

    The hits match perfectly and uniquely. Some of the reads were 36 bases, others were 100 bases.

    To double-check the alignment, I ran BLAT with the full read sequence against hg18 (in the UCSC browser).

    I also ran BLAT against hg16, hg17, and hg19.

    I also did a BLAST search against hg18 (UCSC version) and NCBI Build 36.

    No matter what I ran, I got hits to chromosome Y.

    1. This has happened on 3 different samples (tumor cell line, blood sample, hESC female) prepped at different times
    2. This has happened in two different laboratories. One has a GAII and the other has a GAIIx.
    3. The samples were prepared and sequenced by different people. Two of the samples were prepared by females and sequenced by a female. The third sample was prepared by a female and sequenced by a male.
    4. One sample was ChIP-seq, one was paired-end genomic, one was paired-end exome (captured with NimbleGen's SeqCap)
    5. All three samples were mapped with ELAND. One of the samples was also mapped with MAQ

    #1 rules out contamination in the sample
    #2 rules out contamination in the lab, on the cluster station and on the GA
    #3 rules out contamination by a person
    #4 rules out an artifact of the type of sequencing
    #5 rules out an artifact of the type of mapping

    We know the samples are female with no Y chromosome.

    Has anyone run into this? Does anyone have any thoughts?


  • #2
    Pseudo-autosomal region? From UCSC: PAR1 & 2 are:
    chrY:1-2709520 and chrY:57443438-57772954
    chrX:1-2709520 and chrX:154584238-154913754
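
    A quick way to test this idea is to check whether each chrY hit overlaps one of the intervals quoted above. This is only a minimal sketch: the function name `in_par` is made up for illustration, the coordinates are copied from this post as-is, and they differ between assemblies, so they should be checked against the build the reads were mapped to.

```python
# Pseudoautosomal intervals on chrY exactly as quoted in the post above
# (treated here as 1-based, inclusive). Assumption: these match the assembly
# your reads were mapped to; verify before relying on them.
PAR_CHRY = [(1, 2709520), (57443438, 57772954)]


def in_par(chrom, start, end, pars=PAR_CHRY):
    """Return True if a chrY alignment spanning [start, end] overlaps a PAR."""
    if chrom != "chrY":
        return False
    # Two closed intervals overlap when each one starts before the other ends.
    return any(start <= p_end and end >= p_start for p_start, p_end in pars)


if __name__ == "__main__":
    # Hypothetical hits: (chromosome, start, end) for 36-base reads.
    hits = [("chrY", 150000, 150035), ("chrY", 10000000, 10000035)]
    for chrom, start, end in hits:
        label = "PAR" if in_par(chrom, start, end) else "outside PAR"
        print(chrom, start, end, label)
```

    Hits that fall outside both intervals cannot be explained by the pseudoautosomal regions alone.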



    • #3
      subject data

      Are you using anonymous samples, or is there the possibility of accessing samples from relatives?


      • #4
        X and Y also have a paralogous region that diverged a few million years ago.


        • #5
          Interesting. How many reads do you have which are mapping to Y?

          After pregnancy, Y chromosome DNA can be detected in some females. This isn't where I first heard about it, but here's a quick reference:

          Pathology. 2010 Feb;42(2):160-4.
          PCR detectable Y chromosome-specific DNA but no intact Y chromosome-bearing cells in polymyositis biopsies of two women with male offspring.
          Fitches AC, Yousem S, Cieply K, Stebbings S, Highton J, Hung NA.

          Department of Pathology, University of Otago, Dunedin School of Medicine, Dunedin, New Zealand.
          AIMS: Pregnancy-related and idiopathic adult polymyositis are inflammatory myopathies of unknown aetiology in which CD8 positive T cells are found in close association with the up-regulation of human leukocyte antigen (HLA) class I on affected muscle cells. A similar polymyositis can also occur in patients with chronic graft versus host disease, wherein graft lymphocytes may be involved in the myositis. We investigated whether polymyositis that was temporally related to pregnancy, contained Y chromosome-bearing cells or signals using polymerase chain reaction (PCR) in biopsies of lesional muscle from two women who had given birth to sons. Furthermore, if Y chromosome material was present, we investigated whether it was contained in the intact inflammatory cells (CD8 positive lymphocytes for example), fetal macrophages, or differentiated fetal stem cells engrafted in the lesional skeletal muscle, and thus whether fetal cells played a role in the pathogenesis of the myositis.

          METHODS: PCR analysis was used for the Y chromosome in lesional tissue and fluorescence in situ hybridisation (FISH) for intact cells carrying the Y chromosome.

          RESULTS: Small amounts of Y chromosome material were detected on second round PCR in fresh frozen tissue. No Y chromosome-bearing intact cells of lymphocytic, macrophage or muscle lineage were detected.

          CONCLUSION: Our results suggest that microchimeric fetal cells are not found in the lesional tissue of pregnancy-related polymyositis.

          PMID: 20085518 [PubMed - indexed for MEDLINE]


          • #6
            I see the opposite problem, heterozygous X SNPs from male samples.


            • #7
              I too ran into the same issue last month in some exome capture samples. The females all had some reads assembling to the Y chromosome but they were color coded (as viewed by CLC) as non-specific matches and were most likely assembled there randomly by CLC. There didn't seem to be any rhyme or reason to their placement and there were at most 2 reads overlapping a specific region. Males processed concurrently had reads specifically matching the Y chromosome as well as much higher coverage of the chromosome as a whole. After some low-level freaking out I decided it was merely an assembly artifact. Now I don't bother including the Y chromosome in the reference genome when assembling female samples.
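
              For anyone who wants to put numbers on this, here is a rough sketch that counts chrY alignments and separates confident placements from ambiguous ones. It assumes the alignments are available as SAM text (ELAND and CLC output would need converting first), and it uses mapping quality as a stand-in for "specific vs. non-specific"; the function name `count_chrom_hits` and the `min_mapq=20` cutoff are arbitrary choices for illustration, not something any of the tools above prescribe.

```python
def count_chrom_hits(sam_lines, chrom="chrY", min_mapq=20):
    """Count mapped reads on `chrom`, split by mapping quality.

    Returns (confident, ambiguous): reads with MAPQ >= min_mapq and reads
    below it. The threshold is an assumption; tune it for your aligner.
    """
    confident = ambiguous = 0
    for line in sam_lines:
        if line.startswith("@"):        # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:            # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:                  # 0x4 = read is unmapped
            continue
        if fields[2] != chrom:          # RNAME column
            continue
        if int(fields[4]) >= min_mapq:  # MAPQ column
            confident += 1
        else:
            ambiguous += 1
    return confident, ambiguous


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python count_y.py < sample.sam
    print(count_chrom_hits(sys.stdin))
```

              Running the same count on a male sample processed alongside gives a baseline for what genuine chrY coverage looks like.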


              • #8
                Pseudoautosomal regions are fine, as the X and Y copies are exactly identical and you can mask them out during alignment. The X-Y paralogous region is more challenging because the X and Y copies are very similar but not identical. This region is likely to fool most SNP callers into calling wrong SNPs. If you plot the dbSNP density along chrX, you will see a spike around the X-Y paralogous region. Many of those entries come from wrong SNP ascertainment caused by the paralog. On the other hand, the paralogous region is only ~0.1% of the human genome, so only ~0.1% of SNPs will be affected. If you do not care about that, you can ignore the issue entirely. But if you study chrX, excluding chrY when aligning female samples is important.

                On pseudoautosomal regions: if you are using the UCSC/NCBI version of build36, you will call no SNPs in the two regions. If you are using the Ensembl version, you will see a lot of hets. This makes biological sense.
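
                Dropping chrY from the reference for female samples, as suggested here and in the earlier post, can be done with a few lines of Python before building the aligner index. This is just a sketch under a couple of assumptions: the reference is a plain FASTA, the Y chromosome record is named `chrY` (UCSC style; other sources may use `Y`), and the helper name `drop_sequences` is made up for illustration.

```python
def drop_sequences(fasta_lines, exclude=("chrY",)):
    """Yield FASTA lines, skipping every record whose name is in `exclude`.

    The record name is taken as the first whitespace-delimited token
    after '>' on the header line.
    """
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            keep = name not in exclude
        if keep:
            yield line


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names):
    #   python drop_y.py < hg18.fa > hg18.noY.fa
    sys.stdout.writelines(drop_sequences(sys.stdin))
```

                The filtered file is then indexed and used only for the female samples; male samples keep the full reference.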


                • #9
                  lh3: could you please supply a bit more information? What are the chrX and chrY regions that are paralogous? Can you point me to a citation? Best, pfs


                  • #10
                    I do not know the exact coordinates. You can look them up in the UCSC self-alignment.


                    • #11
                      The link above requires registration.

                      The human Y chromosome is unable to recombine with the X chromosome, except for small pieces of pseudoautosomal regions at the telomeres (which comprise about 5% of the chromosome's length). These regions are relics of ancient homology between the X and Y chromosomes.


                      • #12
                        Actually, I like the following link better; it is far more descriptive of the homologous regions of X and Y.

