
  • KerryOdair
    Statistics for the next version 2.28 of our Y-Tree YFull (coming soon):
    440 SNPs, 83 subclades
    by haplogroups:
    A0: 1 SNP
    A1b1: 2 SNPs
    E: 11 SNPs, 2 subclades
    G: 14 SNPs, 1 subclade
    I1: 52 SNPs, 15 subclades
    I2: 64 SNPs, 10 subclades
    J1: 4 SNPs, 3 subclades
    J2: 48 SNPs, 1 subclade
    T: 4 SNPs, 3 subclades
    N: 8 SNPs, 9 subclades
    O: 6 SNPs, 1 subclade
    Q: 87 SNPs, 8 subclades
    R1a: 44 SNPs, 12 subclades
    R1b: 73 SNPs, 13 subclades
    R-M479: 27 SNPs, 4 subclades
    others: 5 SNPs, 1 subclade


  • KerryOdair
    Statistics for the next version (2.27) of the Tree (coming soon...):
    1192 SNPs, 205 subclades
    by haplogroups:
    A0: 106 SNPs
    A1b1: 32 SNPs, 2 subclades
    E: 70 SNPs, 9 subclades
    G: 89 SNPs, 1 subclade
    H: 1 SNP
    I1: 122 SNPs, 44 subclades
    I2: 273 SNPs, 29 subclades
    J: 36 SNPs, 8 subclades
    T: 49 SNPs, 6 subclades
    N: 18 SNPs, 5 subclades
    O: 185 SNPs, 27 subclades
    Q: 38 SNPs, 7 subclades
    R1a: 99 SNPs, 40 subclades
    R1b: 72 SNPs, 27 subclades
    others: 2 SNPs


  • KerryOdair
    Update from Maximus Centurion Y-Chr Sequence
    Interpretation Service

    YTree 2.25 (due date 15-20 October) coming soon...
    1724 SNPs, 138 subclades:
    C: 54 SNPs
    E: 226 SNPs, 6 subclades
    G: 83 SNPs
    H: 3 SNPs
    I1: 134 SNPs, 28 subclades
    I2: 193 SNPs, 25 subclades
    J: 84 SNPs, 1 subclade
    L: 53 SNPs, 1 subclade
    N: 163 SNPs, 16 subclades
    O: 1 SNP
    Q: 225 SNPs, 16 subclades
    R1a: 313 SNPs, 24 subclades
    R1b: 92 SNPs, 21 subclades
    T: 92 SNPs, 4 subclades


  • KerryOdair
    YTree v2.24 (in the process of calculating now...)
    Statistics: 571 SNPs, 127 subclades.
    By haplogroups:
    C: 2 SNPs
    E: 31 SNPs, 9 subclades
    G: 3 SNPs, 2 subclades
    H: 4 SNPs
    I1: 73 SNPs, 15 subclades
    I2: 143 SNPs, 28 subclades
    J: 62 SNPs, 17 subclades
    L: 51 SNPs
    N: 28 SNPs, 4 subclades
    Q: 57 SNPs, 6 subclades
    R1a: 40 SNPs, 24 subclades
    R1b: 79 SNPs, 22 subclades


  • KerryOdair
    Very nice presentation by Greg Magoon from Full Genomes:

    Y Chromosome Sequencing: Progress and Promise
    2014 International Genetic Genealogy Conference
    Washington, D.C. August 16, 2014
    Greg Magoon


  • KerryOdair
    Full Genomes launches Y Prime - a new Y chromosome sequencing product
    The following press release has been written by Full Genomes Corporation.

    Full Genomes Corporation (FGC) is announcing today the introduction of a new Y chromosome sequencing product, dubbed Y Prime. The Y Prime test leverages recent technology advances to economically sequence large portions of a male's Y chromosome, enabling advanced, high-resolution tracing of direct paternal line ancestry.

    FGC has worked with industry leaders to develop a new Y chromosome capture approach and has combined it with Illumina "next-gen" sequencing. The resulting data will be processed with the latest alignment algorithms to improve read mapping. The overall result is a cutting-edge product with Y chromosome coverage breadth that is close to that of FGC's original comprehensive Y sequencing product (now termed Y Elite), at a much lower cost. Additionally, the new product is priced lower than the leading competitor, while retaining a significant advantage in terms of quality and comprehensiveness.

    More information at link


  • KerryOdair
    Statistics for the YTree version 2.22 (coming soon):
    NEW: 485 SNPs, 23 subclades
    C: 195 SNPs
    I1: 24 SNPs, 3 subclades
    I2: 106 SNPs, 5 subclades
    J: 6 SNPs
    N: 14 SNPs, 6 subclades
    Q: 62 SNPs
    R1a: 37 SNPs, 7 subclades
    R1b: 14 SNPs, 2 subclades
    R-M479: 27 SNPs


  • KerryOdair
    Below are new statistics from the YFull folks. These are new results coming from Y sequence testing from FullGenomes and the BigY from FamilyTreeDNA. We are beginning to see the full impact of the data coming in, and this is just the beginning. This is a huge leap in our body of knowledge of the Y.

    Statistics for the next version 2.21 of our Y-Tree (coming soon):
    will be added 11987 SNPs, 246 new subclades
    Technical requirements: .FASTQ or .BAM file; coverage min 25X; read length min 100 bp
    by haplogroups:
    I1: will be added 406 SNPs, 9 new subclades
    I2: will be added 843 SNPs, 19 new subclades
    J: will be added 1763 SNPs, 16 new subclades
    N: will be added 753 SNPs, 34 new subclades
    O: will be added 1191 SNPs, 23 new subclades
    Q: will be added 402 SNPs, 10 new subclades
    R1a: will be added 281 SNPs, 28 new subclades
    R1b: will be added 595 SNPs, 75 new subclades
    R-M479: will be added 55 SNPs, 2 new subclades
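
The technical requirements quoted above (minimum 25X coverage, minimum 100 bp read length) can be checked roughly before submitting a file. The sketch below is only an illustration: it assumes mean coverage is defined as total sequenced bases divided by the length of the target region, and it treats the median read length as the "read length" of the run. The function name `check_requirements`, the toy read list and the 1000 bp target are all hypothetical and are not part of any YFull tool.

```python
from statistics import median


def check_requirements(read_lengths, target_bp, min_coverage=25.0, min_read_length=100):
    """Return (mean_coverage, typical_read_length, ok) for a list of read lengths.

    mean_coverage is total bases / target region length (an assumed definition);
    typical_read_length is the median read length.
    """
    if target_bp <= 0 or not read_lengths:
        raise ValueError("need a positive target length and at least one read")
    mean_coverage = sum(read_lengths) / target_bp
    typical_read_length = median(read_lengths)
    ok = mean_coverage >= min_coverage and typical_read_length >= min_read_length
    return mean_coverage, typical_read_length, ok


# Toy example: 300 reads of 100 bp over a hypothetical 1000 bp target region.
coverage, read_len, ok = check_requirements([100] * 300, target_bp=1000)
print(f"mean coverage {coverage:.1f}X, typical read length {read_len:.0f} bp, ok={ok}")
```

In practice the read lengths would come from the .FASTQ or .BAM file itself, and the target length would be whatever portion of the Y the test is designed to cover.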
    Last edited by KerryOdair; 06-22-2014, 07:16 AM.


  • KerryOdair
    In addition to my FullGenomes testing, I have added the following folders with my own autosomal testing to my Google Drive. I have also placed there an experimental E-M35 portion of the tree that will be updated as new discoveries are made by the E-M35 Haplozone group.

    1. Genographic 2.0 from National Geographic

    2. 23andMe Version 2.0 testing

    3. Family Finder autosomal testing from FamilyTreeDNA

    Link to google drive:
    Last edited by KerryOdair; 05-02-2014, 05:06 PM.


  • KerryOdair
    Test Results

    There is a folder that contains my results files from testing at FullGenomes. This is probably what you will want to look at. It shows all the output files supplied from the testing, so if you are curious about what this data looks like, it is available to view. If you have a Google login you should be able to view and download these files.

    I have placed my results files on Google Drive for viewing. There is also a .bam file of ChrY data that is 2.2 gigabytes. My complete file was 6.6 gigabytes and also contained ChrMt data and STR information; it was just too big to upload to Google Drive with my current system. A word of warning should you want to download the 2.2 gigabyte ChrY file: Google Drive has a 2.0 gigabyte maximum on file downloads. Some people have hit this restriction and others have not, depending on browser and OS type and version. If you first move the .bam file to your own Google Drive, the download seems to work every time.

    I have also given my .bam for interpretation and study to as well. I am also working with to create primers for my own private snps or family and clan panel.

    There are now two Y sequence tests available in the marketplace today. The tsunami of newly discovered SNPs has begun. This was my hope when I started this thread. I am glad to see that we are on the doorstep of these new discoveries based on this kind of DNA testing.

    Link to my personal FullGenomes results



  • KerryOdair
    This is an update on my FullGenomes data with dating of SNPs. I would like to again thank Steve Fix for helping me with my data.

    Over the past month I have been going over my E-M215 database and making a few minor corrections and improvements. None of these materially change the analysis I sent you last November, but in the process I have also integrated SNP dating using the Poznik “reliable regions”, which I think you might find interesting. In the corresponding spreadsheet I maintain the comparison using the average mutation rate of 1x10E-9/y. The results are easily extended if you wish to use Poznik’s mutation rate, Francalacci’s, etc.

    The updated analysis, even though it includes the Poznik regions, did not change my estimate of the effective coverage of the FullGenomes sequencing, which remains around 16 Mb. The biggest change I saw using Poznik was that the number of “individual” SNPs fell with respect to the 1KGP Ph 1 “reliable regions” estimate. This has the effect of moving the overall age estimates closer to the StrictMask method. Using Poznik, the age of your “individual” segment (TMRCA with HG01497) is 2 ky, which is consistent with what I concluded in my November analysis.

    Steve Fix
    28 Jan 2014
    Last edited by KerryOdair; 02-11-2014, 03:13 PM.


  • KerryOdair
    I would like to take this time to thank Steve Fix, who did an analysis of my data from FullGenomes. He has been extremely helpful in adding to my understanding of my results. Below are his comments and a thumbnail showing dating on major SNPs in the E-M35 group.


    I have completed my analysis of your FullGenomes test results. This report extends and updates what I had previously sent you on Nov 4th and is based on the SNP report provided to you by FullGenomes contained in the file
    In this report they identified 3690 SNP type variants and called 2925 of these positive with varying degrees of reliability. Of these they classified 141 as “private”.

    As outlined in my previous report, you are most closely related to the 1KGP sample HG01497 from the CLM (Colombian in Medellin) data set. As part of my analysis I have done bottom-up dating of your results using a mutation rate of 1x10E-9/y. I have done this using two methods: one based on the “reliable regions” of Y, similar to what Wei, Francalacci and others have used, and the other based on the 1KGP StrictMask. In general I have found the “reliable regions” method to be more volatile and less repeatable when comparing the 1KGP results to the Complete Genomics data. This analysis does not change that observation.

    Of the 3690 SNP variants in your results, 1187 fell within the reliable regions. Of those I can associate 573 with the E haplogroup, and 568 of those were called positive. Similarly, for the StrictMask I associate 172 SNP variants in your results with E, and all of those were called positive.

    There is an issue, however, as to whether all of the SNP variants listed should be classified as SNPs. I identified 30 positively called variants as associated with INDELs, including 2 which passed the StrictMask and 6 from the reliable regions. These were mostly from the individual SNP classified set (see my analysis summary spreadsheet), so you will need to take a closer look at your alignment file (.bam) to better understand and resolve these calls. If these are found to be SNPs, then my age estimates will increase by the appropriate amount (320y per StrictMask SNP and 112y per reliable region SNP). The locations of these variants can be found in my analysis summary spreadsheet.
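
The per-SNP increments quoted above (320y per StrictMask SNP, 112y per reliable region SNP) can be related to the 1x10E-9/y mutation rate with a small worked example. This is a sketch under an assumption that is not stated in the report: that one SNP is expected every 1/(rate x region length) years, so each increment implies an effective region length. The function names are hypothetical, and the upper-bound figures assume every disputed INDEL call turns out to be a real SNP on the same lineage.

```python
MU = 1e-9  # mutation rate per base pair per year, as used in the report


def implied_region_bp(years_per_snp, mu=MU):
    """Region length (bp) for which one SNP is expected every `years_per_snp` years.

    Assumes years_per_snp = 1 / (mu * region_length).
    """
    return 1.0 / (mu * years_per_snp)


def added_age_years(n_snps, years_per_snp):
    """Age added to an estimate if `n_snps` extra calls are accepted as SNPs."""
    return n_snps * years_per_snp


strict_bp = implied_region_bp(320)    # about 3.1 Mb
reliable_bp = implied_region_bp(112)  # about 8.9 Mb
print(f"StrictMask: ~{strict_bp / 1e6:.2f} Mb, reliable regions: ~{reliable_bp / 1e6:.2f} Mb")

# Upper bounds if all disputed INDEL-associated calls were real SNPs:
print(added_age_years(2, 320), "years (2 StrictMask calls)")
print(added_age_years(6, 112), "years (6 reliable-region calls)")
```

Under that assumption the two methods correspond to roughly 3.1 Mb and 8.9 Mb of sequence, which is why a single StrictMask SNP moves the age estimate almost three times as far as a single reliable-region SNP.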

    The analysis summary spreadsheet is attached and contains the estimates for the segments and branches of haplogroup E based upon your results. I have also included a segment-by-segment ratio of the number of SNPs found with respect to the total that I could classify for the two methods. In addition I have made an estimate of the implied coverage achieved in your results based on the two methods and the SNP total. The average is shown to be 16 Mb, which is about what one might have expected from today’s sequencing technology. This of course assumes a mutation rate which does not vary within the male-specific region. In this analysis I did not include the SNPs associated with the individual segment, as the total of 386 is not compatible with those observed in the more reliable regions. One would conclude that many of these are not real even though they were screened and come from locations which have produced reliable results elsewhere. Using the quality assessment of FullGenomes, the 386 individual SNPs have the following breakdown:
                    Total   StrictMask   Reliable Regions
    high quality       11            2                 11
    * quality          17            2                  4
    ** quality        278            0                 14
    *** quality        80            1                  1
    This would indicate that the majority of quality calls are from the previously identified reliable regions of Y. This further tends to confirm why using the reliable regions is required for bottom-up dating and why I prefer the more restrictive StrictMask, even though some people would argue that it is too strict. In looking at these results from FullGenomes I am not convinced that these criteria should be broadened.
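
The breakdown above can be tabulated to see how each quality tier is distributed. The sketch below only re-adds the numbers given in the table; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# (total, StrictMask, reliable regions) per FullGenomes quality tier, from the table above
breakdown = {
    "high quality": (11, 2, 11),
    "* quality": (17, 2, 4),
    "** quality": (278, 0, 14),
    "*** quality": (80, 1, 1),
}

total = sum(row[0] for row in breakdown.values())
in_strict = sum(row[1] for row in breakdown.values())
in_reliable = sum(row[2] for row in breakdown.values())
print(f"individual SNPs: {total}, in StrictMask: {in_strict}, in reliable regions: {in_reliable}")

# Share of each tier that falls inside the reliable regions
for tier, (n_total, _n_strict, n_reliable) in breakdown.items():
    print(f"{tier:>12}: {n_reliable}/{n_total} = {n_reliable / n_total:.1%} in reliable regions")
```

The tier totals add up to the 386 individual SNPs mentioned in the report, and every one of the 11 high quality calls lies inside the reliable regions, while only a small fraction of the lower tiers do.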

    As I have mentioned previously, these estimates appear consistent with what I have seen in the Complete Genomics sequenced samples and continue to be compatible with those of Karafet (2008). Yours is the first FullGenomes result I have looked at, as well as the first V12, but these are consistent with what I would expect. I expected the V12 dates, as I have observed with the V22 dates, to be slightly higher than the V13 and V65 dates. As more results become available over the next few months I will have more to say on this issue.

    In summary, it would appear that your TMRCA with HG01497 is somewhere around 1.6 to 3.3 ky ago. Since I find the StrictMask dating more consistent, I prefer the 2 ky estimate, assuming the 1x10E-9/y mutation rate.

    Steve Fix
    20 Nov 2013


  • KerryOdair
    This is a link to the latest Y-DNA SNP testing chart for testing companies.

    I would also like to state that I have received my results from FullGenomes testing. I am very satisfied with my results and have received all files as promised by the company. The files supplied are in the chart information. These include 9 data files and my bam file.

    Through this testing my terminal SNP has been identified based on the current Y tree, the 1000 Genomes Project and various other sources. My terminal SNP looks to be about 2100 to 2600 years old. Beyond my terminal SNP they have also identified 25 new mutations that are unique to me at this point, along with 4 unique indels. Based on current studies of the SNP mutation rate, or even assuming one SNP mutation every 75 to 90 years, these new mutations should get me into a genealogical time frame within paper records.
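
As a rough arithmetic check of the figures in this post, the 25 private mutations can be multiplied by the quoted spacing of one SNP every 75 to 90 years. This is only a back-of-the-envelope sketch: it assumes mutations accumulate at a constant rate along a single paternal line, and the function name is hypothetical.

```python
def lineage_span_years(n_mutations, years_per_snp_low=75, years_per_snp_high=90):
    """Time spanned by `n_mutations` if one SNP arises every 75 to 90 years."""
    return n_mutations * years_per_snp_low, n_mutations * years_per_snp_high


low, high = lineage_span_years(25)
print(f"25 private mutations span roughly {low} to {high} years")

# Average spacing implied by the 2100 to 2600 year age of the terminal SNP
for age in (2100, 2600):
    print(f"terminal SNP age {age} y -> one private SNP every {age / 25:.0f} years")
```

The two views are broadly compatible: 25 mutations at 75 to 90 years each span about 1875 to 2250 years, and spreading 25 mutations over 2100 to 2600 years gives one SNP every 84 to 104 years, so several of them should fall within the period covered by paper records.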

    There is a funded study by a citizen scientist to get samples from the A00 group of people in Cameroon. The hope is to identify one of these haplotypes and have the Y fully sequenced. This will hopefully give us the ancestral state of these early SNPs, which will help in positioning new SNPs on the tree.

    Exciting times for us in the genetic genealogy world.
    Last edited by KerryOdair; 11-13-2013, 10:55 AM.


  • syfo
    Efficient identification of Y chromosome sequences in the human and Drosophila genomes

    Notwithstanding their biological importance Y chromosomes remain poorly known in most species. A major obstacle to their study is the identification of Y chromosome sequences: due to its high content of repetitive DNA, in most genome projects the Y chromosome sequence is fragmented into a large number of small, unmapped scaffolds. Identification of Y-linked genes among these fragments has yielded important insights about the origin and evolution of Y chromosomes, but the process is labor intensive, restricting studies to a small number of species. Apart from these fragmentary assemblies, in a few mammalian species the euchromatic sequence of the Y is essentially complete, owing to painstaking BAC mapping and sequencing. Here we use female short read sequencing and k-mer comparison to identify Y-linked sequences in two very different genomes, Drosophila virilis and human. Using this method, essentially all D. virilis scaffolds were unambiguously classified as Y-linked or not Y-linked. We found 800 new scaffolds (totaling 8.5 Mbp), and four new genes in the Y chromosome of D. virilis, including JYalpha, a gene involved in hybrid male sterility. Our results also strongly support the preponderance of gene gains over gene losses in the evolution of the Drosophila Y. In the intensively studied human genome, used here as a positive control, we recovered all previously known genes or gene families, plus a small amount (283 kb) of new, unfinished sequence. Despite some ambiguity caused by misassembled segmental duplications, the vast majority of the human sequence could be reliably identified as Y or not Y-linked. Hence this method works in large and complex genomes and can be applied to any species with sex-chromosomes.
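
The k-mer comparison idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it simply assumes that a scaffold is flagged as Y-linked when most of its k-mers are absent from reads sequenced from a female, and the value of k, the 0.5 threshold, the function names and the toy sequences are all hypothetical choices made for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def kmers(seq, k):
    """Set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unmatched_fraction(scaffold, female_kmers, k):
    """Fraction of scaffold k-mers found in neither strand of the female reads."""
    scaffold_kmers = kmers(scaffold, k)
    if not scaffold_kmers:
        return 0.0
    unmatched = sum(
        1
        for km in scaffold_kmers
        if km not in female_kmers and reverse_complement(km) not in female_kmers
    )
    return unmatched / len(scaffold_kmers)


def classify_scaffolds(scaffolds, female_reads, k=5, threshold=0.5):
    """Map scaffold name -> True if it looks Y-linked (mostly unmatched k-mers)."""
    female_kmers = set()
    for read in female_reads:
        female_kmers |= kmers(read, k)
    return {
        name: unmatched_fraction(seq, female_kmers, k) > threshold
        for name, seq in scaffolds.items()
    }


# Toy data: one scaffold fully covered by the female reads, one not covered at all.
female_reads = ["ACGTTGCAAGGCT", "TTGCAAGGCTAAC"]
scaffolds = {"autosomal_like": "CGTTGCAAGGCTAA", "y_like": "GGGTCCCATATCG"}
print(classify_scaffolds(scaffolds, female_reads))
```

Real genomes would need a much larger k, memory-efficient k-mer counting and handling of repeats shared between the Y and other chromosomes, which is where the ambiguity mentioned in the abstract arises.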


  • KerryOdair
    Originally posted by Joann
    Kerry, do you expect to post your full Y sequence, and if so, where?
    Hello Joann,

    Yes, I do expect to post my full Y sequence and Geno 2.0 results publicly. I will also offer them to any academic endeavor for further study of the Y or E-M35 phylogeny. It is yet to be determined where the data will reside at this point. I may have an area with FullGenomes as a repository for the data. Assuming my sample has good DNA for replication, I should receive results the last week of July or first week of August.

    We have a nice quality assurance situation with my sample. In the E-M35 group we already have 52 tests completed for the Genographic 2.0. The 52 tests involve different subclades in E-M35. We already have 5 tests for V12, which happens to be my subclade. 19 tests have already been compared with each other and with the 1000 Genomes samples. New SNPs are being discovered and cataloged by individuals in our group and submitted for new SNP identification numbers. With this information we have a SNP analysis file along with an E-M35 tree that is fluid in its information based on new discoveries. There are 13,000 SNPs in the Geno 2.0 test.

    This information is going to be used by FullGenomes as quality assurance against my full Y sequence, which will be matched against 29,000 SNPs. This is over twice the number in the Geno 2.0 test.

    This is the beginning of some exciting times and maybe the beginning of a SNP molecular clock specific to E-M35.
    Last edited by KerryOdair; 06-10-2013, 03:44 PM.

