Can we sequence the Y Chromosome

KerryOdair replied

10-11-2010, 09:16 AM
This looks like a good start in the right direction for personal genomes.

Our genomes, unzipped
11/10/2010
Categories: Uncategorized
Written by Daniel MacArthur
When we launched this website back in June, I welcomed readers with a promise that Genomes Unzipped would “ultimately be much more than just a group blog”. Indeed, the last four months of blogging have really just been a prelude of sorts to what comes next: the real Genomes Unzipped.

Today we’re launching an exciting new phase of the project. Although we’re not entirely sure where this journey will take us, we’re looking forward to finding out – and to bringing you along with us.

What are we doing?
Over the last year, all the members of Genomes Unzipped have had genome scans performed by personal genomics company 23andMe; several of us have also had additional tests done by other genetic testing companies (Counsyl, deCODEme). From today, we’ll be making all of our raw genetic data and the reports generated from these tests freely available online. As the project proceeds, we aim to obtain data from an ever larger array of tests – ultimately extending to whole-genome sequencing – and release it openly. Right now you can freely download the 23andMe data from everyone in the project from this website.

Over the next few weeks, each of the members will be writing about their own experiences with genetic testing, and what they’ve learnt from their own genetic data. We’ll be discussing analyses we’ve performed on our own raw data, using software written both by group members and other collaborators; and we’ll be releasing the code for that software in our new code repository. We’ll also be talking about the process of deciding to release our genetic data publicly, and how we discussed this decision with our families.

To make it easier for us (and you) to explore our genomes, we have assembled a custom genome browser using JBrowse – this provides a visual interface that allows our 23andMe (and later, complete sequence) data to be viewed in the context of genes and other features. It’s still in prototype form, but we’ll be refining it and adding more data as the project proceeds.

Link:

Home - Genomes Unzipped

http://www.genomesunzipped.org/

Leave a comment:
KerryOdair replied

10-08-2010, 10:23 AM
Some companies or institutions are getting considerable data somewhere.

Complete Genomics Sequences Over 300 Human Genomes in Q3, Order Backlog Grows to 800
October 05, 2010

According to an amended S-1 form filed with the Securities and Exchange Commission this week, the company had an order backlog of more than 800 human genomes, an increase of 300 genomes compared to mid-July.
Leave a comment:
KerryOdair replied

10-05-2010, 11:13 AM
Originally posted by krobison

A little googling around revealed that you can access many of the published personal genomes on Penn State's modification of the UCSC Genome Browser; perhaps you could snag some more Y chromosome SNPs there.

For example, try entering the coordinates chrY:2,700,000-9,000,000

Make sure you have the "Individual Genotypes" track on under "Personal Genomes"

The Y Chromosome Consortium is our strongest organization for research on the Y chromosome. Our gurus in this area are listed at this link: http://ycc.biosci.arizona.edu/contributors.html

Certainly Hammer, Karafet, and Underhill are major players in this endeavor. Cruciani has been the biggest contributor to the study of the M35 haplogroup, which our website studies.

I think our dataset in the M35 project could be matched up against any data set in the world for this group. Using our group as a base for sequencing the Y chromosome subset, before moving on to whole-genome testing, seems to make sense to me. Let's take a small bite of the apple rather than trying to swallow the whole apple in one bite. If we can deal with these 60 to 80 million base pairs and develop the tools to compare the data, this might help in the larger picture. This would seem to be a good proving ground for sequencing and tool development. But first things first: we need to sequence the Y chromosome. I believe we have the numbers, the resources to finance it, and the necessary releases of private information from our members to make a go of this.

I appreciate your insights and information. Maybe I am ahead of the power curve on this issue. But this M35 group and project has been a leader in breaking new ground in this area of study. Since the V13 SNP discovery and paper by Cruciani, we have discovered 11 new subclades for V13 alone, in this one subclade of M35. We have broken new ground and wish to continue to do so and think big.

This is the browser link they use at this site.


http://genome.ucsc.edu/cgi-bin/hgTracks?hgsid=1004184&hgt.left3=%3C%3C%3C&position=chrY&pix=840&dinkL=2.0&dinkR=2.0&guidelines=on&boolshad.guidelines=1&leftLabels=on&boolshad.leftLabels=1&centerLabels=on&boolshad.centerLabels=1&ruler=on&cytoBand=dense&fishClones=hide&stsMap=dense&gcPercent=hide&ctgPos=hide&gold=hide&gap=dense&clonePos=dense&bacEndPairs=hide&genomicDups=hide&refGene=full&acembly=dense&ensGene=dense&softberryGene=dense&genscan=dense&mrna=full&intronEst=dense&est=hide&tigrGeneIndex=dense&cpgIsland=hide&xenoMrna=dense&xenoEst=hide&blatMouse=dense&exoFish=dense&snpNih=dense&snpTsc=dense&rmsk=dense&simpleRepeat=hide&nci60=dense

Link to Consortium website:

http://ycc.biosci.arizona.edu/

Last edited by KerryOdair; 10-05-2010, 11:59 AM.
Leave a comment:
krobison replied

10-02-2010, 06:26 PM
A little googling around revealed that you can access many of the published personal genomes on Penn State's modification of the UCSC Genome Browser; perhaps you could snag some more Y chromosome SNPs there.

For example, try entering the coordinates chrY:2,700,000-9,000,000

Make sure you have the "Individual Genotypes" track on under "Personal Genomes"
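
For anyone scripting against the same region rather than typing it into the browser, here is a minimal sketch of turning a position string like the one above into numbers. The function names `parse_region` and `region_length` are made up for illustration, and the length assumes the coordinates are 1-based and inclusive, as they are when typed into the UCSC position box.

```python
import re

# Matches position strings such as "chrY:2,700,000-9,000,000".
REGION_RE = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")

def parse_region(region):
    """Return (chrom, start, end) with the thousands separators removed."""
    match = REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"not a region string: {region!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end

def region_length(region):
    """Length in bp, counting both endpoints (1-based inclusive assumption)."""
    _, start, end = parse_region(region)
    return end - start + 1

chrom, start, end = parse_region("chrY:2,700,000-9,000,000")
print(chrom, start, end, region_length("chrY:2,700,000-9,000,000"))
```

So the example window covers roughly 6.3 million bp of the Y.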
Leave a comment:
krobison replied

10-02-2010, 06:07 PM
The buzz from her talk in the spring is that she gives quite an informative & entertaining talk.
Leave a comment:
ECO replied

10-01-2010, 05:31 PM
Originally posted by KerryOdair

Here is a teenager with more ambitious aspirations than my small look at 80 million bp on the Y chromosome. Great article and a must-read.

CUPERTINO, Calif.—In many ways, Anne West is a typical 17-year-old California teenager. She wears her hair long. She likes to hang out with her friends. She went to the prom.

She is also analyzing her family's genome.

Having been diagnosed with a pulmonary embolism in 2003, Anne's father John decided last year to get the family's genes sequenced. The process involves an advanced technology that spews out the six billion letters that represent the makeup of a person's genetic code. But after putting up $160,000 to get the four-member family tested, the Wests realized something: sifting through the reams of data was tougher than they ever imagined.

http://online.wsj.com/article/SB1000...l?mod=ITP_TEST

This is very cool and inspiring. I wonder if the collective we could volunteer to help with this dataset. We have enough experts in assembly, annotation, SNP discovery, etc, that 4 human genomes (better yet a family) could be a very interesting data set.

I'll have to think on it a little more, but I would love to collaborate with her and her family to open up the dataset to SEQanswers users. Perhaps I could secure or fund a donation of some compute resources for the project.

Thanks for posting it Kerry!

edit: Looks like we'd be a little late to the party...

She is now at work on a paper based in part on her family's data, with researchers from a Seattle institute. Last month, she was a speaker on a panel at a personal-genomics conference held at Cold Spring Harbor, New York, a scientific mecca.
Leave a comment:
KerryOdair replied

10-01-2010, 03:54 PM
Here is a teenager with more ambitious aspirations than my small look at 80 million bp on the Y chromosome. Great article and a must-read.

CUPERTINO, Calif.—In many ways, Anne West is a typical 17-year-old California teenager. She wears her hair long. She likes to hang out with her friends. She went to the prom.

She is also analyzing her family's genome.

Having been diagnosed with a pulmonary embolism in 2003, Anne's father John decided last year to get the family's genes sequenced. The process involves an advanced technology that spews out the six billion letters that represent the makeup of a person's genetic code. But after putting up $160,000 to get the four-member family tested, the Wests realized something: sifting through the reams of data was tougher than they ever imagined.


http://online.wsj.com/article/SB10001424052748704814204575508064149859510.html?mod=ITP_TEST
Leave a comment:
Joann replied

09-27-2010, 11:24 AM
Future Repositories and Standards Bodies: Guidance

In addition to the International Society of Genetic Genealogy, I would like to suggest NARA (the US National Archives and Records Administration) as a potential source of guidance and standards development for a decentralized or centralized genealogical repository of DNA sequences. This is because the emerging majority of digital archive users at US NARA and other traditional archival records facilities are genealogists and family historians. (Millions of users yearly!)

Margaret O’Neill Adams. Analyzing archives and finding facts: use and users of digital data records. Archival Science, 2007, Vol. 7, No. 1, 21-36.

Angela M. Labrador and Elizabeth S. Chilton. Relocating Meaning in Heritage Archives: A Call for Participatory Heritage Databases. Computer Applications in Archaeology, Annual Meeting Proceedings, 2009.


http://www.caa2009.org/articles/Labrador_Contribution386_c%20(1).pdf
Leave a comment:
krobison replied

09-24-2010, 09:56 AM
Complete Genomics has stated very clearly that they are in the game only to do complete human genome sequencing. The Y has limited medical relevance (the obvious example is azoospermia due to deletions on Y) and in any case they've decided to stay out of the retail genomics game. Right now, the assumption is that you would need to assume regulatory issues unless you could really prove otherwise -- a headache they wish to stay away from.

Companies such as Ion Torrent (now part of Life Tech) are in the game to make machines, not to do much in the way of specific sequencing. PacBio would be the same issue. PacBio+RainDance or Fluidigm may make a very interesting combo for your application (probably the other targeted sequencing schemes as well), but I still think you'll have trouble getting the cost to where you need it.

I think the quick answer is that at the moment there isn't a good solution at the cost you are looking for. If you could batch a large number of samples, then RainDance might be a reasonable option to do the targeting. In any case, I think you will find it challenging to go below $500 total cost per sample with a service provider, and that may even be a bit low-ball.

If I were in your shoes, particularly if not with a lot of funds, I'd focus on mining the available genomes & 1000 Genomes data to get a much richer set of SNPs. If you look around here, there is another thread on how to access consolidated SNP info from 1000 Genomes. These could be converted into one of the typical cheap SNP-typing formats, and then you could type a richer set of SNPs on lots of genomes using cheap array/PCR platforms.

In any case, you probably should find an academic somewhere to collaborate with, as all of these companies will probably be more comfortable doing that than with a private citizen.
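
As a rough illustration of the mining step suggested above, here is a minimal sketch that pulls single-base Y chromosome variants out of variant calls. It assumes the calls are available as VCF-style tab-separated text (CHROM, POS, ID, REF, ALT as the first five columns); the function name `chr_y_snps` is hypothetical and the records in `example` are invented for the demo, not real 1000 Genomes calls.

```python
def chr_y_snps(vcf_lines):
    """Yield (pos, ref, alt) for single-base substitutions on the Y chromosome."""
    for line in vcf_lines:
        # Skip blank lines and VCF header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        # Accept both common spellings of the chromosome name.
        if chrom not in ("Y", "chrY"):
            continue
        # ALT may list several alleles separated by commas.
        for allele in alt.split(","):
            is_base = len(allele) == 1 and allele in "ACGT"
            if len(ref) == 1 and ref in "ACGT" and is_base:
                yield int(pos), ref, allele

# Invented records purely for illustration.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "Y\t2700100\t.\tA\tG\t50\tPASS\t.",
    "Y\t2700200\t.\tAT\tA\t50\tPASS\t.",   # deletion, not a SNP
    "1\t12345\t.\tC\tT\t50\tPASS\t.",      # wrong chromosome
    "chrY\t2700300\t.\tC\tT,G\t50\tPASS\t.",
]
snps = list(chr_y_snps(example))
print(snps)
```

The resulting position/allele list is the kind of input an array or PCR assay design would start from; any quality filtering would need to be added on top.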
Leave a comment:
KerryOdair replied

09-24-2010, 08:28 AM
Originally posted by nilshomer

Disclaimer:
The World Personal Genome Registry is a website created by Illumina for the individual genome sequencing space that allows the community to keep track of the current status of personal whole-genome sequencing. We plan to transfer this registry to an appropriate standards body...

I appreciate the information. I was aware that Illumina was the caretaker of this information at the moment. I would suggest that an appropriate standards body might be the International Society of Genetic Genealogy.


http://www.isogg.org/
Leave a comment:
nilshomer replied

09-23-2010, 06:36 PM
Originally posted by KerryOdair

Interesting link for Registry of sequenced genomes.

http://www.worldpgr.com/

Disclaimer:
The World Personal Genome Registry is a website created by Illumina for the individual genome sequencing space that allows the community to keep track of the current status of personal whole-genome sequencing. We plan to transfer this registry to an appropriate standards body...
Leave a comment:
KerryOdair replied

09-23-2010, 11:21 AM
Interesting link for Registry of sequenced genomes.

http://www.worldpgr.com/

Last edited by KerryOdair; 09-23-2010, 11:37 AM.
Leave a comment:
KerryOdair replied

09-20-2010, 08:34 AM
I had hoped Complete Genomics would be an excellent candidate for this type of application. Below is the reply I received, which shows the frustration of looking into this kind of commercial test; it was sent to me in June of 2009. The Y chromosome has no known medical implications that I am aware of, so regulatory compliance seems to be a non-issue in my mind, not to mention that we should have rights over our own DNA. As far as regulators are concerned, there is no need for the consulting services that you run into in the 23andMe world. In terms of privacy, companies have already handled this issue with an exclusion option if you so desire. However, most people seeking this kind of testing are more than willing to share their personal information for matches.

I also contacted Ion Torrent, but since they have been bought out by Life Tech, it is unclear to me whether this will speed up or hinder development of this platform. Life Tech may also raise the price of the instrument after the buyout. PacBio still remains an unknown at this point in time. Existing second-generation machines could possibly do the job right now, but I lack the expertise on these machines to know if that is the case.

There is much to be learned from a detailed Y tree. The flow of information from the academic world is slow, and the peer review process is a lumbering elephant trying to keep up with dramatic change.

Dear Mr. O'Dair,

Thanks for your interest in Complete Genomics. While your project sounds very interesting, Complete Genomics does not plan on sequencing partial portions of the genome. Our current plans are to focus exclusively on providing sequencing of the complete human genome - all 6B bases.

Complete Genomics is also providing our sequencing service for research purposes only to bio-pharma companies and genome centers/research organizations. The main reason we aren't sequencing genomes for individuals is because that type of service requires legal consents, consulting services, and privacy and regulatory compliance processes such as CLIA, etc. that Complete Genomics does not have. We have no plans in 2009 to obtain such certification and hence will not be able to sell directly to individuals.

Regards,

Jennifer Turcotte

Complete Genomics, Inc.

VP of Marketing
Leave a comment:
KerryOdair replied

09-17-2010, 11:59 AM
Originally posted by krobison

Have you looked in the 1000 Genomes data at chrY? Perhaps they have some useful variants there; quite a lot of data is online. How many new SNPs for Y showed up in the Bushman or Asian whole genome sequences?

Here are some comments on the 1000 Genomes Project Y chromosome data. These comments are from people with greater skill sets than my own. I do not think we are going to find the detail we are looking for. There does not appear to be enough variety in the Y tree among the samples they used.

1000 Genomes Project: Y Chromosome SNPs

Luke Jostins, Qasim Ayub, Yali Xue, Chris Tyler-Smith

Abstract

• Y chromosome SNPs were called from the 1000 Genomes data, and numerous filters were applied
• A total of 2870 sites were called as variable in the 77 samples, of which around 75% are novel
• 30 sites that passed all filters were re-sequenced using capillary sequencing, giving an estimated false positive rate of 3.3%
• Known HapMap variants and variants from the Y haplogroup tree were used to estimate the sensitivity. This gave 22% for singletons and doubletons and 63% for variants with a non-reference allele count of three or greater
• We use the sensitivity to estimate a polymorphism rate of 1 variant per 2350 bp
• HapMap genotypes and Pilot 1/Pilot 2 concordance was used to estimate a per-genotype error rate for Q10 non-reference bases of under 1%


http://bit.ly/djsOaP

Vince Tilroe

Many established SNPs may have been screened out by their quality "filters" (proximity to other SNPs, for example).

They had 77 men and a total of 188 million bp attributed to the Y, which means that on average only 2,400,000 bp per Y was sequenced: that's just 10%, roughly.

And these genomes have VERY low sequencing coverage (just 1.94x on average), which is one reason the Y-SNP outcomes are so weak. While the HapMap samples covered a decent array of haplogroups, just 2870 usable Y-SNPs found is not a lot given the total amount of sequencing done here.

It's interesting that even with this sparse data they point out the youthfulness of R1b1b2 ("New insights into recent human evolution can also be gained from the branch lengths; for example, the short internal branch lengths within the haplogroup R1b relative to the other haplogroups suggest a recent expansion of this European haplogroup.")

Yet it is also apparent from Figure 4 that if we are going to get deep insights into intra-haplogroup structure (not to mention a truly precise SNP-based molecular clock) we are going to need much better Y-chromosome sequencing than was done here.

Vicent Vizachero

Many established SNPs may have been screened out by their quality "filters" (proximity to other SNPs, for example).

But I think the more likely explanation is the low quality of Y-chromosome sequencing. They only got 188 million bp out of 77 guys, which amounts to just 2.4 million bp on average. That's only 10% of the full Y, and only maybe 20% of the "sequenceable" Y. And some of the men were only sequenced at low depth (<3x), so sequencing errors could also have an impact.

Vicent Vizachero
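
The per-sample arithmetic in the comments above is easy to check. The sketch below uses only the two figures quoted there (188 million bp, 77 men); the "implied denominator" values are back-calculated from the rough percentages in the comments and are not reference lengths for the Y chromosome.

```python
total_y_bp = 188_000_000  # bp attributed to the Y across all samples (quoted above)
n_samples = 77            # men in the data set (quoted above)

per_sample_bp = total_y_bp / n_samples
print(f"average Y bp per sample: {per_sample_bp:,.0f}")

# Back-calculate what total length each quoted percentage would imply.
quoted = [("'full Y' at about 10%", 0.10), ("'sequenceable Y' at about 20%", 0.20)]
implied_mbp = {}
for label, fraction in quoted:
    implied_mbp[label] = per_sample_bp / fraction / 1e6
    print(f"{label}: implied denominator {implied_mbp[label]:.1f} Mbp")
```

The average comes out at about 2.44 million bp per man, which matches the "2.4 million" in both comments; the percentages are clearly round figures.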

Originally posted by krobison

For the sort of high-throughput approach you are describing, one of the targeted sequencing technologies could make sense. For example, you might try designing a RainDance or OLink primer library to amplify the Y. Or, perhaps a SureSelect/Nimblegen approach.

Thanks for the leads here. I will pass them along.

Originally posted by krobison

A big question would be what cost per sample are you really willing to take on and how many samples? That would really affect the choice of technology.

Yes, that is the big question. Where are we today, cost-wise, in doing this kind of sequencing with potential third-generation machines? There are over 200,000 STR tests today, and I have heard numbers as great as 35k total for 23andMe testing, so there is a marketplace. As a price point, I am thinking people would pay up to $600 for that kind of Y sequence test. The real question is whether that is realistic in today's market.

Last edited by KerryOdair; 09-18-2010, 08:18 AM.
Leave a comment:
KerryOdair replied

09-17-2010, 11:17 AM
Originally posted by krobison

First off, what's wrong with the sequence we already have? Y is not a total wasteland.

I do not consider the currently known sequences a wasteland. Today I use a very good Y-mapper supplied by a vendor, which I think uses all known sequences.


http://ymap.ftdna.com/


Originally posted by krobison

One group has reported specifically capturing a specific mammalian chromosome by flow cytometry & getting sequences highly enriched for the targeted chromosome.

Thank you for this lead, I will check into it further.
Leave a comment:
