
  • Cloud computing server for DNA sequence assembly

    Dear all,

    Do you know of any well-known cloud computing servers for DNA sequence assembly?

    My sequence dataset is too large: 450 million reads for one end, 900 million for the paired-end set.

    I have checked cloud computing on Amazon. However, their high-memory instances provide only 68.4 GB, which is not enough.

    In addition, I would like the server to have assembly software such as Trinity, ABySS, and Velvet already installed.

    Do you know of such a server?

    Thanks!

    Jingjing

  • #2
    I've been looking for more memory as well.
    As far as pre-installed software goes, you could always boot an image of Bio-Linux.

    • #3
      You might want to look here, if you are still searching:

      • #4
        Another option is to use Ray, a de novo assembler that can spread the job across multiple nodes on a cluster.

        For setting up such a cluster on Amazon, STAR::Cluster can be quite useful.
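
As a rough sketch of what a multi-node Ray run looks like, the snippet below only builds and prints an mpiexec command line; it does not launch anything. The file names, the number of MPI ranks, the k-mer length, and the output directory are placeholders assumed for illustration, not values from this thread. `-k`, `-p`, and `-o` are Ray's options for the k-mer length, a pair of paired-end read files, and the output directory.

```python
import shlex


def ray_command(ranks, kmer, left_reads, right_reads, out_dir):
    """Build an mpiexec command line for a paired-end Ray assembly.

    All argument values are supplied by the caller; nothing here is
    taken from the thread.
    """
    return [
        "mpiexec", "-n", str(ranks),      # number of MPI ranks across the cluster
        "Ray",
        "-k", str(kmer),                  # k-mer length
        "-p", left_reads, right_reads,    # paired-end read files
        "-o", out_dir,                    # output directory
    ]


# Hypothetical inputs: 32 ranks, k = 31, placeholder file names.
cmd = ray_command(32, 31, "reads_1.fastq", "reads_2.fastq", "ray_assembly")
print(shlex.join(cmd))
```

The printed line can be pasted into a job script on the cluster head node once the placeholders are replaced; how many ranks to start per node depends on the instance type and on how the MPI installation is configured.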

        • #5
          Again, plugging one's own project, Gossamer may be able to handle that volume of data on your existing hardware. Depending on your existing hardware, of course.
          sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});

          • #6
            @Pseudonym - does Gossamer work on a 2008-vintage Intel Core 2 Quad 8200 with 8 GB of RAM (running Bio-Linux 6) to de novo assemble Ion Torrent Ion-318 shotgun data (480 Mbp)? The data is from an oral bacterium, probably 2.0-2.2 Mbp in size. Your paper is most interesting.

            • #7
              Originally posted by hengnck View Post
              @Pseudonym - does Gossamer work on a 2008-vintage Intel Core 2 Quad 8200 with 8 GB of RAM (running Bio-Linux 6) to de novo assemble Ion Torrent Ion-318 shotgun data (480 Mbp)? The data is from an oral bacterium, probably 2.0-2.2 Mbp in size. Your paper is most interesting.
              Thanks!

              Memory won't be a problem at all; 8 GB is more than enough to assemble anything smaller than 250 Mbp or so. The only catch I can see is that, as far as we know, nobody has ever tried Gossamer on Ion Torrent data, and we don't know what its error characteristics are, so it may need a little fine-tuning to get good results.

              Feel free to PM me if you need any help.
              sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});

              • #8
                Thanks, Pseudonym. I'll install and try Gossamer out when the MIRA assembly finishes... it's only been 6 *days* continuously chugging away - two passes and another 2.36 million reads to go. I'll give you feedback on how Gossamer deals with Ion Torrent data.

                • #9
                  For more memory on a cloud computing server, you can ask and learn more here: erp on cloud

                  • #10
                    Originally posted by ramcob View Post
                    For more memory on a cloud computing server, you can ask and learn more here: erp on cloud
                    You may want to try the SSD instances (hi1.4xlarge); a swap file on their SSD storage may help with software that requires tons of memory.

                    Otherwise, I suggest you use Ray on cc2.8xlarge instances -- these have 16 cores, 32 threads, 60 GiB of RAM, and 10 Gigabit Ethernet.


                    -Sébastien
