Can we sequence the Y Chromosome

KerryOdair replied

09-21-2011, 10:34 AM
Could atomic force microscopy (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3135514/) or laser microdissection techniques be viable options for isolating the Y chromosome at a reasonable cost?
KerryOdair replied

08-26-2011, 10:25 AM
This is why we need sequencing of the Y. The academic community cannot even make up its mind on the issue of STR dating. We need SNP dating via sequencing. Most of these studies use 15 STRs or fewer. In our projects we have large numbers of 67-marker tests, and people are upgrading to 111. Our databases give us better refinement, but we do not have the cross section of a scientific sample. Y sequencing will establish the tree by SNPs. In one company alone there are 213,000 Y-DNA tests. We are using many testing techniques that do not give us the complete answer. Hopefully someone will develop a technique for separating out the Y chromosome for sequencing. If they build the product, the people will come and buy.

A new paper by Cristian Capelli and George Busby at Oxford University.

DNA study deals blow to theory of European origins

http://www.bbc.co.uk/news/science-environment-14630012#story_continues_1

A recent idea that most European men can trace their ancestry to early farmers from the Near East has been dealt a blow in a new study.

"A new study deals a blow to the idea that most European men are descended from farmers who migrated from the Near East 5,000-10,000 years ago.

The findings challenge previous research showing that the genetic signature of the farmers displaced that of Europe's indigenous hunters. The latest work leans towards the idea that most of Europe's males trace a line of descent to stone-age hunters.

But the authors say more work is needed to answer this question.

Furthermore, they suggest that some of the markers on the Y chromosome are less reliable than others for estimating the ages of genetic lineages. On these grounds, they argue that current analytical tools are unsuitable for dating the expansion of R-M269. Indeed, Dr Capelli and his team say the problem extends to other studies of Y-chromosome lineages: dates based on the analysis of conventional DNA markers may have been "systematically underestimated", they write in Proceedings B"
haleaton replied

08-17-2011, 01:24 PM
Illumina HumanOmni5-Quad

Thanks for posting Ray Banks' DNA-Forums post--I also posted a link to these important posts there. You are doing important work.

I am new to this, but I was curious how applicable the Illumina HumanOmni5-Quad and its 500K custom SNPs are to the Y. Its default SNPs were based on 1K Genomes, which seems like a start on Ray's idea in a sense.

I agree there is value in a focus on Y sequencing, which may not fit current business models that focus on immediate medical and monetary metrics.

There is often inherent and hidden value in open source approaches--get the data public and let the community analyze it.
KerryOdair replied

08-17-2011, 01:13 PM
Originally posted by haleaton

It provides 4.3 million markers plus up to 500,000 custom markers, and the data set used for marker selection was the 1000 Genomes Dec 2010 release. Y coverage is tiny, but can the 500K custom markers be assigned to Y-DNA--particularly 1K Genomes SNPs and other SNPs in FTDNA SnpInfo?

Hello,

I am posting a reply from Ray Banks at DNA-FORUMS, who has been a wonderful contributor to the mining of the Y data from the 1K Genomes project. I see that you have been to that board as well. I have put his reply below for others to see, since it best answers your question.

There is a strong possibility that the next step beyond 23andMe will be a custom chip with Y SNPs on it. I am still holding out for full sequencing. As Ray has suggested, 30x coverage and read lengths longer than 100K BP would probably be necessary. Right now we are using brute force, with numbers of people, to look at the data. Sequenced data in libraries, using the full power of computing, would be a much better situation. The question is whether brute force will turn up enough Y SNPs to justify a chip design before sequencing becomes cheap enough for the man on the street. Isolating the Y and sequencing it is a better solution in the long run if the pricing can be justified. We are only talking 60 M BP on the Y versus 3 B BP on the genome. At this point the big stumbling block is probably isolation of the Y, not the cost of sequencing.
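To put those sizes side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the round figures quoted above (60 M BP for the Y, 3 B BP for the genome, 30x coverage). The function and variable names are made up for illustration, and real yields would depend on the platform and on how much of the Y can actually be mapped.

```python
# Round figures taken from the post above; everything else is illustrative.
Y_LENGTH_BP = 60_000_000          # approximate size of the Y chromosome
GENOME_LENGTH_BP = 3_000_000_000  # approximate size of the whole genome
COVERAGE = 30                     # the 30x depth suggested by Ray Banks


def raw_bases_needed(target_bp: int, coverage: int) -> int:
    """Raw sequenced bases needed to cover target_bp at the given depth."""
    return target_bp * coverage


# Share of the genome that the Y represents.
y_fraction = Y_LENGTH_BP / GENOME_LENGTH_BP

# Raw sequence output required for each target at 30x.
y_bases = raw_bases_needed(Y_LENGTH_BP, COVERAGE)
genome_bases = raw_bases_needed(GENOME_LENGTH_BP, COVERAGE)

print(f"Y as a share of the genome: {y_fraction:.0%}")
print(f"Raw bases for the Y at {COVERAGE}x: {y_bases:,}")
print(f"Raw bases for the genome at {COVERAGE}x: {genome_bases:,}")
```

On these numbers the Y is about 2% of the genome, so an isolated Y would need roughly one fiftieth of the raw sequence output, which is why isolation rather than sequencing looks like the bottleneck.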

"I may be in a minority on this, but my impression after extracting SNPs in multiple haplogroups and the perspective from the viewpoint of these haplogroups is that the shared SNPs that have been listed with Z numbers fill in important gaps in the Y tree. As best as I can tell, however, these are primarily SNPs that developed 3000-5000 yrs ago and are thus common enough to be found in the several hundred samples we have examined. Those Z items that define new subgroups -- through persistence of everyone -- will end up as routine tests. The many SNPs that are equivalent to category-defining SNPs are not likely to yield much except for a rare early branch. For example, the new E subgroup in which the Bantus of Kenya have samples is shared with Nigerians. This Kenyan group is dated as having arrived in Kenya with the Bantu expansion about 3000 yrs ago. The new Z723 major branch under L140 in haplogroup G is probably going to be dated to the 3000 yr period because all the various subgroups that might be encompassed by Z723 would be that old.

While these Z SNPs are most useful for the population geneticist, the SNPs I think genealogical researchers should be most interested in -- now that the major missing branches are identified -- are the SNPs that occur in a single person. There will be about 5,000 or more of these singletons in the samples. I have found items from recent WTY among them as well as other new SNPs that have not made it into the official trees yet. One of these (L640) I know -- because it matches another man at Family Tree -- will likely define a major subgroup originating 2000 to 3000 yrs ago. Without the confirmatory presence of L640 outside the 1000 Genomes Project, this SNP would not be noticed or qualify for a Z number because it was found in just one person (a singleton) in the 1000 Genomes data.

Family Tree's president has given me assurances they will make use of the singletons I have been extracting, but they do not announce new products until they are ready to offer them. So how they might be used would be speculative on my part. These singletons all lack Z numbers.

There are multiple ways in which all the new SNPs -- singletons or otherwise -- might be made available:
(1) allow customers to pay the costs of the process of adding a new SNP to the test list
(2) concentrate on singleton sites in a new version of WTY. Those sites with Z numbers in the database at the checked site would be checked automatically based on what I read about this as written by Thomas Krahn.
(4) create an Illumina chip with the perhaps 10,000 Y SNPs identified in the 1000 Genomes Project.
(5) some other product

He has given no hint as to how these new SNPs might be used. But I have been putting in many hours to get these singletons listed. The process is complete for G and Q samples. The process was begun for J this weekend, and all these spreadsheets are linked from the Variants tabs on Greg's spreadsheet. B, C, D and E will probably be finished by the end of next week. E is, I believe, the largest group in the samples.

It seems like the average man's sample has about 15 new SNPs found only in his sample. But this number has varied among haplogroups. Even being more conservative than this, it would seem to indicate that there are about 5000 singletons -- at a minimum.

It is possible that Family Tree will not follow through on what was communicated to me, but the fact that they did not say that they did not think they could make use of the singletons knowing the time commitment needed to extract these and understanding the significance of them offers some encouragement.

There may be 1500 SNPs among the singletons that would represent sizeable subgroups and would qualify as additions to the Y-trees.

I personally see the greatest potential in the Illumina chip concept, but I may not have a firm understanding of all the limitations of the chip or of the ability of any of the labs to implement such a product in a cost-effective manner."

Ray Banks

Last edited by KerryOdair; 08-17-2011, 01:16 PM.
haleaton replied

08-17-2011, 12:25 PM
Illumina HumanOmni5-Quad

It provides 4.3 million markers plus up to 500,000 custom markers, and the data set used for marker selection was the 1000 Genomes Dec 2010 release. Y coverage is tiny, but can the 500K custom markers be assigned to Y-DNA--particularly 1K Genomes SNPs and other SNPs in FTDNA SnpInfo?

Last edited by haleaton; 08-17-2011, 12:34 PM.
KerryOdair replied

08-15-2011, 10:38 AM
Here is an update on the 1K Genomes mining project for SNPs being done by researchers at DNA-FORUMS. They have isolated 1200+ Z SNPs, and 26 have been verified to date. This group of Z SNPs is more than twice the size of any other SNP development category. Hundreds of primers are now available through Family Tree DNA for testing these SNPs.

I am still searching for a cost-effective solution for sequencing the Y. The genetic genealogy community is beginning to be acknowledged by the academic community for the contributions we have made to advancing knowledge. We are a group of people motivated to advance our knowledge of the human genome and of the travels of our genetic populations.

This is a list from the ISOGG website that shows what the SNP letters stand for. It is a handy list if you have not seen it before.

SNPs development indicated by beginning letters:
DF = anonymous researcher using publicly available full-genome-sequence data, including 1000 Genomes Project data; named in honor of the DNA-Forums.org genetic genealogy community
IMS-JST = Institute of Medical Science-Japan Science and Technology Agency
KL = Key Laboratory of Contemporary Anthropology, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai, China
L = Thomas Krahn, MSc (Dipl.-Ing.) of Family Tree DNA's Genomics Research Center; snps named in honor of the late Leo Little
M = Peter Underhill, Ph.D. of Stanford University
N = The Laboratory of Bioinformatics, Institute of Biophysics, Chinese Academy of Sciences, Beijing
P = Michael Hammer, Ph.D. of University of Arizona
Page, PAGES or PS = David C. Page, Whitehead Institute for Biomedical Research
PK = Biomedical and Genetic Engineering Laboratories, Islamabad, Pakistan
S = James F. Wilson, D.Phil. at Edinburgh University
U = Lynn M. Sims, University of Central Florida; Dennis Garvey, Ph.D. Gonzaga University; and Jack Ballantyne, Ph.D., University of Central Florida
V = Rosaria Scozzari and Fulvio Cruciani, Università "La Sapienza", Rome, Italy
Z = G. Magoon, Richard Rocca, David F. Reynolds, Bonnie Schrack, Peter M. Op den Velde Boots, and an anonymous individual, independent researchers of 1000 Genomes Project data and Thomas Krahn, MSc (Dipl.-Ing.) of Family Tree DNA's Genomics Research Center, with support from the DNA-Forums.org genetic genealogy community

Ancient DNA reveals secrets of human history
Modern humans may have picked up key genes from extinct relatives.

Pääbo is amazed at how quickly the Neanderthal genome has been mined.

http://www.nature.com/news/2011/110809/full/476136a.html
KerryOdair replied

05-17-2011, 09:47 AM
Originally posted by SeqAA

You realize it's a pain in the .... to isolate y chromosomes?
I'm not sure what your gameplan is for sequencing just the Y chromosome. Platform really isn't your problem here, it's flow cytometry.

Yes, I understand the flow cytometry is the basic problem. Steven Quake at Stanford is still working on this issue. Here is the latest article on that front.

http://www.technologyreview.com/biomedicine/37204/?mod=related

There is also a startup company, Noblegen, that is developing a simplified version of nanopore genome sequencing technology. This may lead to getting the Y along with the rest of the genome at a cost affordable for the man on the street.

http://www.technologyreview.com/biomedicine/37551/?mod=chfeatured

Another person I am working with on this issue has had contact with a company in China. They are looking at the problem from two different angles. One is a chip for testing 10 to 25 million base pairs on the Y. The other, if we can get enough people together, is simply to use existing technology to sequence the Y and create the library. Pricing for the second solution would be about half the cost of current whole-genome sequencing by Illumina and Complete Genomics.
KerryOdair replied

05-17-2011, 09:30 AM
Originally posted by GregRM

Over the last several months, several of us at DNA-Forums have been mining the publicly-available next-gen ChrY data (mainly 1000 Genomes Project data, and some Complete Genomics data), similar to the approach that Keith suggested earlier in this thread. Here is the link to the section of the site where most of the discussion is occurring: http://dna-forums.org/index.php?/forum/143-1k-genomes/ . At this point, we have looked closely at haplogroups R and I, but we have preliminary classifications (based on known variants) for many of the other 1000 Genomes males with "low_coverage" data. So far, we have identified ~300 novel candidate variants (SNPs and short indels) that seem to be present in two or more samples under haplogroups R and I, and, with some outside help, 3 have been confirmed by conventional sequencing (more are in the pipeline; we're tracking our progress using this spreadsheet: https://spreadsheets.google.com/ccc?...thkey=CIOag_UD). In the process, we've also been discovering potential new phylogenetic structure to the Y-tree. An example showing a proposed phylogeny under R-U106 is here: http://www.box.net/shared/4x1fqatlqb . Kerry, if you're interested in joining the effort, feel free to chime in at DNA-Forums...there seem to be plenty of Haplogroup E samples to look at.

Hello,

I am sorry for not responding earlier, but times have been busy. I have read about the good work you are doing over at DNA-Forums, along with other subclades. Currently in the E-M35 project we are getting new results from some new Walk Through the Y (WTY) tests for our group, and these tests are turning up new SNPs. Our group was the first to test WTY, and the newer tests by Thomas are sequencing more base pairs than our original tests in 2009.

We currently have no one to dedicate time to the 1000 Genomes data. Hopefully someone will soon take up that cause and champion the effort in our group. I have brought this to the attention of the members of our project. Hopefully we will be able to put our full attention into this data soon.

Regards,
Kerry O'Dair
Guest replied

04-29-2011, 03:39 PM
You realize it's a pain in the .... to isolate y chromosomes?
I'm not sure what your gameplan is for sequencing just the Y chromosome. Platform really isn't your problem here, it's flow cytometry.
GregRM replied

04-29-2011, 02:00 PM
Analysis of next-gen ChrY data at DNA-Forums

Over the last several months, several of us at DNA-Forums have been mining the publicly-available next-gen ChrY data (mainly 1000 Genomes Project data, and some Complete Genomics data), similar to the approach that Keith suggested earlier in this thread. Here is the link to the section of the site where most of the discussion is occurring: http://dna-forums.org/index.php?/forum/143-1k-genomes/ . At this point, we have looked closely at haplogroups R and I, but we have preliminary classifications (based on known variants) for many of the other 1000 Genomes males with "low_coverage" data. So far, we have identified ~300 novel candidate variants (SNPs and short indels) that seem to be present in two or more samples under haplogroups R and I, and, with some outside help, 3 have been confirmed by conventional sequencing (more are in the pipeline; we're tracking our progress using this spreadsheet: https://spreadsheets.google.com/ccc?...thkey=CIOag_UD). In the process, we've also been discovering potential new phylogenetic structure to the Y-tree. An example showing a proposed phylogeny under R-U106 is here: http://www.box.net/shared/4x1fqatlqb . Kerry, if you're interested in joining the effort, feel free to chime in at DNA-Forums...there seem to be plenty of Haplogroup E samples to look at.
KerryOdair replied

03-04-2011, 07:45 AM
Affymetrix Releases Comprehensive Validated Data from Axiom™ Genomic Database

http://www.tradershuddle.com/20110303177524/Press-Releases/Affymetrix-Releases-Comprehensive-Validated-Data-from-the-Axiom%E2%84%A2-Genomic-Database.html

Written by TradersHuddle Staff
Thursday, 03 March 2011 03:05
SANTA CLARA, Calif.-( Business Wire )-

Affymetrix, Inc. (NASDAQ: AFFX) today announced the release of a complete data set of 5 million variants on its website. The genotyping data set, part of the Axiom™ Genomic Database, is based on extensive validation of genomic variants from the Single Nucleotide Polymorphism Database (dbSNP), 1000 Genomes Project, NHGRI Database of Published Associations, and collaborations that have led to the discovery of novel SNPs and insertion/deletions (indels). The data set includes genotyping data for more than 2 million validated rare and common genomic variants that Affymetrix recently contributed to the 1000 Genomes Project, many of which were not previously available from any source. The data will be incorporated into the 1000 Genomes Project’s public data repository.
krobison replied

02-03-2011, 07:40 AM
Complete Genomics releases 40 genomes; 20 more in March. Multiple ethnicities. Presumably about 1/2 male.
Joann replied

01-13-2011, 10:45 AM
Newborn screening and next gen sequencing

In her recent talk at the Genomics and Public Health conference, Sharon Terry suggested that our current genetic testing platform for newborns (check your state department of public health for its panel of genetic tests administered at birth) might soon give way to a genomic profile returned to the individual at no charge. That data (in sequence form) could certainly serve to provide material for extended Y chromosome analysis for those who were interested.

https://www.cmpinc.net/2010PHGConference/presentations/Terry.pdf

See slide 5

Last edited by Joann; 01-13-2011, 10:48 AM.
KerryOdair replied

01-13-2011, 08:43 AM
Originally posted by krobison

It's not that I think nobody is interested in the Y; it's just that truly developing a technology for isolating Y chromosomes is a serious undertaking for a niche area. There may be groups working on isolation, but it's doing a lot of work for a modest gain.

I have read articles that deal with targeting specific areas on specific chromosomes for medical tests at low costs. Maybe this avenue of research will allow a technique that will be viable for the Y chromosome.

Originally posted by krobison

Again, the cost of data generation for one human genome really is around $10K, perhaps lower with Complete Genomics. Data generation for Y chromosome via SureSelect or RainDance or similar would be on order of $2K per genome -- of which $1K is the sequencing and $1K is the selection. So another method is either going to need to do something more interesting (such as haplotyping a diploid chromosome) or cost much less than $1K.

You could, for example, use Quake's method to separate chromosomes, use qPCR to identify which well had the Y & then just sequence that well. So in a sense, the technology exists -- but it isn't currently commercially available. Perhaps Quake could be interested in undertaking a wide survey of Y

I wish Quake was interested in doing this. But I do not see the academic environment having the resources to do this on a wide enough scale. It will take a commercial effort in my mind.

Originally posted by krobison

So I'm making it badly, but the argument I'm making is two fold: (1) technology exists now to sequence the Y for $1K-$2K, so any group wishing to target Y-chromosome information could have data in about 2 months (if they have the DNA)

We are currently spending $700 to sequence approximately 120K BP on the hope of finding new SNPs. Then we spend $30 per SNP testing for placement in the tree. So we are currently spending almost $1000 just on the hope of finding new SNPs. Your cost for sequencing the whole Y chromosome is therefore not far from what we are spending today. This whole debate is laid out very well in an article in today's GenomeWeb Daily, below. From my posts you must realize I am in the camp of us not having a $1000 genome and its interpretation via software. I am in the camp of the Genetic Future article linked below.
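The comparison above can be worked through roughly as follows. This is only a sketch of the arithmetic, not anyone's actual pricing model. The $700, 120K BP and $30 figures come from this post, the $2K figure is the upper end of krobison's $1K-$2K estimate, and the 60 M BP size for the Y is the round figure quoted elsewhere in this thread. A targeted capture would likely cover less than the full 60 M BP, so treating it as the full length is an assumption, and the helper names are hypothetical.

```python
# Figures quoted in this thread; names and structure are illustrative only.
WTY_COST_USD = 700         # current sequencing spend per person
WTY_BASES = 120_000        # approximate base pairs covered for that spend
SNP_TEST_COST_USD = 30     # follow-up test per SNP for tree placement
FULL_Y_COST_USD = 2_000    # upper end of the $1K-$2K estimate for the Y
Y_LENGTH_BP = 60_000_000   # assumed: full Y length, an optimistic target


def cost_per_base(cost_usd: float, bases: int) -> float:
    """Dollars spent per base pair sequenced."""
    return cost_usd / bases


def snp_tests_affordable(budget_usd: int, sequencing_usd: int, per_test_usd: int) -> int:
    """Whole SNP tests that fit in the budget after paying for sequencing."""
    return (budget_usd - sequencing_usd) // per_test_usd


wty_rate = cost_per_base(WTY_COST_USD, WTY_BASES)
full_y_rate = cost_per_base(FULL_Y_COST_USD, Y_LENGTH_BP)

print(f"Current approach: ${wty_rate:.4f} per BP")
print(f"Whole-Y estimate: ${full_y_rate:.6f} per BP")
print(f"Ratio: about {wty_rate / full_y_rate:.0f}x")
print(f"SNP tests inside a $1000 spend: {snp_tests_affordable(1000, WTY_COST_USD, SNP_TEST_COST_USD)}")
```

Under these assumptions a $1000 spend today covers the sequencing plus about ten follow-up SNP tests, while the per-base cost of the whole-Y estimate is lower by a factor of roughly 175.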

The $1,000 Genome Debate is 'Already ... Irrelevant'
January 12, 2011

Matthew Herper and Daniel MacArthur are at odds over the $1,000 genome. Forbes' Herper argues that even though sequencing is becoming cheaper, analyzing a genome still costs much more than $1,000. Over at Genetic Future, MacArthur responds that as sequencing costs continue to fall, "a substantial niche will develop for innovators providing affordable, intuitive, accurate interpretation tools."
On his blog, John Hawks says that "the inevitability of the $1,000 genome has already made it irrelevant." He expects a $1,000 genome will be announced sometime this year and whole-genome sequencing at 4x coverage for less than $100 by the end of 2014. "I think there's a good chance they will be less than $50 at that time," Hawks says of human genomes. As sequencing gets cheaper, he adds, there will be "an expensive, professional class of genome interpretation" for everything from medical applications to personalized genealogical consultation services. "Genomes may not be literally too cheap to meter, but they'll certainly be, as George Church has suggested, free with additional purchase," Hawks says.

Why You Can't Have Your $1,000 Genome

http://blogs.forbes.com/matthewherper/2011/01/06/why-you-cant-have-your-1000-genome/

The cost per genome may not really drop below $1,000 for medical use, no matter how cheap the research tools get.

http://scienceblogs.com/geneticfuture/2011/01/why_you_can_have_your_1000_gen.php

http://johnhawks.net/weblog/topics/biotech/whole-genome/sequencing-1000-dollar-genomes-2011.html

Originally posted by krobison

(2) huge numbers of public Y data is looming. So perhaps this thread should be renamed "Yes, we can sequence the Y-chromosome & it's being done".

Your statement here reminds me of an old TV commercial about hamburger chains. Its tag line was "Where's the Beef?" I am still looking for all of this public Y data that you say is looming. I would possibly rename the thread this way if you want.

We can sequence the Y but who is doing it and where is the data?

PS. From what I understand, the 1000 Genomes Project seems to be slowing down due to funding.

Last edited by KerryOdair; 01-13-2011, 09:10 AM.
krobison replied

01-08-2011, 04:59 PM
Note that from ancient remains none of the whole chromosome isolation techniques are likely to work; the chromatin is probably already quite degraded. Another reason for developing hybridization & PCR-based methods; they don't care.

It's not that I think nobody is interested in the Y; it's just that truly developing a technology for isolating Y chromosomes is a serious undertaking for a niche area. There may be groups working on isolation, but it's doing a lot of work for a modest gain.

Again, the cost of data generation for one human genome really is around $10K, perhaps lower with Complete Genomics. Data generation for Y chromosome via SureSelect or RainDance or similar would be on order of $2K per genome -- of which $1K is the sequencing and $1K is the selection. So another method is either going to need to do something more interesting (such as haplotyping a diploid chromosome) or cost much less than $1K.

You could, for example, use Quake's method to separate chromosomes, use qPCR to identify which well had the Y & then just sequence that well. So in a sense, the technology exists -- but it isn't currently commercially available. Perhaps Quake could be interested in undertaking a wide survey of Y
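The "which well had the Y" step could be sketched like this in Python. It is a minimal illustration only: it assumes each well gets a qPCR run against some Y-specific marker and reports a Ct value (or nothing if there was no amplification). The well names, Ct numbers and the cutoff of 35 are invented for the example and are not taken from Quake's method or any published protocol.

```python
from typing import Dict, List, Optional


def wells_with_y(ct_by_well: Dict[str, Optional[float]], ct_cutoff: float = 35.0) -> List[str]:
    """Return wells whose Y-marker Ct is at or below the cutoff, strongest signal first.

    A Ct of None means the marker never amplified in that well.
    The default cutoff is an arbitrary illustrative value.
    """
    hits = [
        (ct, well)
        for well, ct in ct_by_well.items()
        if ct is not None and ct <= ct_cutoff
    ]
    # A lower Ct means earlier amplification, i.e. more template in the well.
    return [well for ct, well in sorted(hits)]


# Hypothetical run: one well amplifies clearly, two give late weak signals.
example_run = {"A1": None, "A2": 36.8, "A3": 27.4, "B1": None, "B2": 39.1}
print(wells_with_y(example_run))
```

Only the wells returned here would go on to sequencing, which is what would keep the sequencing cost proportional to the Y rather than to the whole genome.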

So I'm making it badly, but the argument I'm making is two fold: (1) technology exists now to sequence the Y for $1K-$2K, so any group wishing to target Y-chromosome information could have data in about 2 months (if they have the DNA) and (2) huge numbers of public Y data is looming. So perhaps this thread should be renamed "Yes, we can sequence the Y-chromosome & it's being done".