Amazon 1000 genomes?
From the Amazon blog: "Researchers pay only for the additional AWS resources they need for further processing or analysis of the data."
I'm guessing that's the "gotcha": you can view chunks for free (which you can anyway ... from other sources) but you get to pay for analyzing it.
I am wary of this "we'll keep the data and you can pay us" concept of "the cloud".
I think a better model would be: here's a shell login to your own VM and you can write or use your own python/java/c/bash programs to quickly access the 200TB.
I wish TCGA would do something like this but the data is locked down pretty hard. Maybe we'll get some open access disease samples as more Asian countries provide less encumbered data.
Amazon puts it in the cloud
s3.amazonaws.com/1000genomes
Our browser has been updated to version 63 of the Ensembl code and we have a new Variation Pattern Finder tool to go alongside it.
The Data Slicer now also allows you to subsample vcf files by sample and population.
We have also now added a public mysql instance for the Ensembl databases which back our browser.
You can find more details of this at http://www.1000genomes.org/public-en...mysql-instance
New Resources for 1000 Genomes
General Info
As well as posting new announcements on the front page of http://www.1000genomes.org, we have both an RSS feed (http://www.1000genomes.org/announcements/rss.xml) and a Twitter account (http://twitter.com/1000genomes).
You can also subscribe to an announcements list we have set up: http://listserver.1000genomes.org/ma...o/1000announce [email protected]
We have started an FAQ (http://www.1000genomes.org/faq) to help with finding certain data sets associated with the 1000 Genomes Project and to answer other similar questions.
Data Search
You can now search both our website and our ftp site.
To search the main website you can use the search box which appears in the top right hand corner of each page on http://www.1000genomes.org.
Our ftp search (http://www.1000genomes.org/ftpsearch) is linked from the menu bar at the top of each page. It relies on an index of the ftp site, ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree, which is updated every night to reflect the contents of the site.
The search itself will look for strings in the names of files and directories on the ftp site. This means the search can be used to find all vcf files or files associated with a particular release date or particular individual.
The search options allow you to include md5s in the output and to have the ftp paths point to either the NCBI or the EBI ftp site. Because of the volume of results they would produce, fastq and bam files are excluded by default, but you can choose to include them. Currently the search returns only the first 1000 results because of the large number of files on the ftp site.
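If you would rather search a downloaded copy of current.tree yourself, the behaviour described above can be roughly sketched as follows. This is only an illustration, not the code behind the website: it assumes each line of the index is tab-separated with the path first and the md5 in the fifth column (check this against the real file), and the function name search_tree and its parameters are made up here.

```python
from typing import Iterable, List, Tuple

# Suffixes skipped by default, mirroring the default exclusion of fastq and bam files.
BULKY_SUFFIXES = (".fastq", ".fastq.gz", ".bam")


def search_tree(
    lines: Iterable[str],
    term: str,
    include_bulky: bool = False,
    limit: int = 1000,
) -> List[Tuple[str, str]]:
    """Return (path, md5) pairs for index entries whose path contains `term`.

    Assumed layout per line (an assumption, not confirmed by the post):
    path <TAB> type <TAB> size <TAB> timestamp <TAB> md5
    """
    hits: List[Tuple[str, str]] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        path = fields[0]
        if not path or term not in path:
            continue
        if not include_bulky and path.endswith(BULKY_SUFFIXES):
            continue
        # Directories and short lines may have no md5 column.
        md5 = fields[4] if len(fields) > 4 else ""
        hits.append((path, md5))
        if len(hits) >= limit:
            break  # same cap as the web search: first 1000 results only
    return hits
```

With a local copy of the index you would call something like `search_tree(open("current.tree"), "HG00096")`, where the sample name stands in for whichever individual, release date or file type you are after.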
Accessibility
Many of our releases contain very large files which can be challenging to download in their entirety. Both bam and vcf files have indexes which allow subsections to be downloaded using samtools or tabix respectively. There are descriptions of how to do this in our FAQ. We also now have a web-based tool within our Ensembl browser which allows you to request a 10KB subsection of these files.
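For the command-line route, the general shape of a region request is `samtools view -b <url> <region>` for an indexed bam and `tabix -h <url> <region>` for a tabix-indexed vcf. The small helper below is a sketch that just builds those command lines; the function names and the example URL are placeholders invented for this illustration, and the FAQ remains the reference for the exact steps.

```python
from typing import List


def region_string(chrom: str, start: int, end: int) -> str:
    """Format a 1-based, inclusive region such as '20:1000-2000'."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return f"{chrom}:{start}-{end}"


def slice_command(url: str, chrom: str, start: int, end: int) -> List[str]:
    """Build the argument list for fetching one region of a remote indexed file.

    Both tools need the index (.bai or .tbi) to sit next to the remote file.
    """
    region = region_string(chrom, start, end)
    if url.endswith(".bam"):
        # -b keeps the output in bam format
        return ["samtools", "view", "-b", url, region]
    if url.endswith(".vcf.gz"):
        # -h includes the vcf header lines in the output
        return ["tabix", "-h", url, region]
    raise ValueError("expected a .bam or .vcf.gz URL")


# Placeholder URL; substitute a real path found via the ftp search.
print(" ".join(slice_command("ftp://example.org/path/to/file.vcf.gz", "20", 1000, 2000)))
```

The returned list can be handed to `subprocess.run` with stdout redirected to a file if you want to run the command from Python rather than copy it into a shell.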
The Data Slicer (http://browser.1000genomes.org/tools.html) needs the URL of an indexed bam or vcf file and will then present a view of this file and a bam or vcf file to download. The Data Slicer can be accessed from the tools link in the top right hand corner of all browser pages. It should work for any remotely accessible tabix indexed vcf file. It will work for any indexed bam over http but may only work for ftp bams within the EBI.
You can also upload data from bam or vcf files from our ftp site. To do this you need to click on the "Manage your data" link on the left hand menu of a page. This is best done from Location view. The section of the menu you need to click on is labeled "Attach remote file". Only bam files from the EBI ftp site will be visible, but any remotely accessible vcf which is accompanied by a tabix index can be attached. Once your file is loaded you should be able to see the snps or aligned reads displayed and also share these links with others. This is described with screenshots in our Ensembl tutorial http://www.1000genomes.org/sites/100...l_20110506.doc
The browser also has a Variant Effect Predictor tool which will take in up to 750 snps and indels in VCF format or an Ensembl-specific format. This tool provides functional consequences with respect to the current gene and regulatory annotation, including SIFT and PolyPhen predictions for any non-synonymous snps: http://browser.1000genomes.org/tools.html. You can also download
If you have any questions about these new features or any other aspects of the project please email [email protected]