  • nhntran
    replied
    This is a really old topic, but I came across this post while searching here for threads about AWS EC2 AMIs.
    In case there are other newbies like me, you can learn more by exploring this page:
    Informatics for RNA-seq: A web resource for analysis on the cloud. Educational tutorials and working pipelines for RNA-seq analysis including an introduction to: cloud computing, critical file form...

    It is a really good resource that introduces AWS, and I found it easier to understand than the tutorials on AWS itself. You can also explore their lectures on AWS and on RNA-seq analysis using AWS.
    Thanks!

  • keo
    replied
    Hello Lluc,
    Perhaps you've resolved your questions by now, but I'll just post my answer anyway, and hope someone adds or corrects me.
    I have had the same problem, and I haven't found a really "for dummies" page. Up to now, what I have found out is:
    AWS is a service that rents out offsite servers. What you rent are virtual servers, which they call "instances", on "EC2". You have complete control of your instance, so it is like having your own server: you get ssh command-line access as well as a web-based control panel. You can rent several instances at a time, and there is a cluster option for renting several instances that work together as a cluster. The instance types differ in RAM, number of processors, number of cores per processor, and instance disk storage.
    You will also need external storage, which they call "S3". When you "initiate an instance" you load an "image" of a server (RAM and disk) so that you don't have to install everything from scratch. These images are called "AMIs". Amazon provides several pre-made images with different pre-installed operating systems (Debian, Red Hat, Windows, etc.). Once you install something new on your instance, you have to save that image to S3 storage so that it is ready the next time you connect to your instance. The space you use on S3 is grouped into containers called "buckets", which can be accessed when an instance is created (or re-created), or even over the web using keys that you can give to third parties.
    There are several applications, both native and third-party, that you can access directly from your instance without installing the whole thing. These are the "APIs". A common API is the storefront, which lets you use Amazon's web store functions on your own domain and products. There are also some APIs for science and sequencing.
    So, to answer your question, the transfer would go between the 1000G bucket and your instance, without passing through your local network. From what I've read, the speed can be anything from 1.5 to 10 Mbps, depending on your luck. Once you configure your instance, you can use it as your own server.
    There is no way of avoiding the credit card step; I've asked. In theory, you can use the "Free Tier" level for one year without any charges to your card, but they will not tell you when you go over the limit, and they will simply start charging.
    I don't know what sequences you're querying at 1000G, but perhaps it would be best to download them first and do the queries locally. It would be a one time huge download that could be done overnight with your IT's approval.

    Hope this helps, and I hope someone else that is more knowledgeable jumps in.
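
The transfer speeds quoted above can be turned into a rough time estimate, which is what a cost estimate would need. This is only a minimal sketch: it assumes "Mbps" means megabits per second, and the 100 GB data set size is a hypothetical figure chosen for illustration, not a number from AWS or the 1000 Genomes Project.

```python
def transfer_hours(size_gb: float, speed_mbps: float) -> float:
    """Hours needed to move size_gb gigabytes at speed_mbps megabits per second."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    # Assumption: decimal units, 1 GB = 1000 MB, and 1 byte = 8 bits.
    megabits = size_gb * 1000 * 8
    seconds = megabits / speed_mbps
    return seconds / 3600


# Hypothetical 100 GB data set at the two ends of the quoted speed range.
for speed in (1.5, 10.0):
    hours = transfer_hours(100, speed)
    print(f"{speed:>4} Mbps: {hours:.1f} hours")
```

Multiplying the resulting hours by the hourly price of the chosen instance type gives a first guess at the cost of the download step; the actual speed would have to be measured on a running instance.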


  • Lluc
    started a topic Seeking advice for Amazon Web Services usage

    I have been searching for some specific sequences in the 1000 Genomes Project data, using samtools view and BreakSeq, until the IT services at my university contacted me because I was using too much bandwidth. The 1000G people then suggested that I use AWS. It looks like a good solution, but I have some doubts, and I would appreciate it if other AWS users could ease my concerns.

    1. I don't understand the language used on the AWS website ("instances", "API", blah, blah, blah). May I assume that if I start an EC2 instance, I will connect to it through ssh as with any remote machine, and be able to install samtools and whatnot?

    2. They claim most of the 1000 Genomes Project data is available in a "bucket", and they mention several ways of accessing it that I don't know about. Will I be able to samtools-view the bam files or read fastq files?

    3. Assuming so, how fast would the data be transferred from that bucket to my EC2 instance? Previously, most of the pipeline's time was spent downloading. I need to know the data transfer speed to estimate the cost.

    4. Almost the first thing AWS asks you for is your credit card number. I don't want to give mine, and there's none available for the lab. Do you know of alternative ways to pay? We have a budget, but it's managed by the University, which requires invoices and so on.

    Thank you.
