  • GB File Storage/Transfer Solutions

    Hi,

    I'd appreciate any advice on moving large sequence/read files (5 GB to 25 GB) among different servers such as Galaxy or the UCSC Genome Browser.

    Our campus Linux servers provide 50 GB of storage and only allow secure FTP file transfers (SFTP). To upload a large file to Galaxy, I first have to download the file from the campus server to my Mac desktop, then upload the file to Galaxy or UCSC. This is a painfully slow process (several hours per file).

    What I am looking for is a cloud storage site that allows "server-to-server" file transfer so that I can eliminate the download/upload process to my desktop. Most of the services I have tried don't allow file transfers larger than 2 GB or don't support SFTP.

    If anyone could help me out with some suggestions, I'd really appreciate it.

    JJW

  • #2
    I'm interested in this as well.


    • #3
      The only service I've found that allows storage and transfer of large files (>10 GB) is called Humyo, but it is a paid service, and doesn't support FTP transfer. For now, one of our lab desktop computers is being used to download the files from our Linux server and then upload them to Galaxy.

      If I use my free DropBox folder, it takes about 10 minutes to send a 1 GB file (gzip compressed) from the DropBox server to Galaxy. I'm happy with that, except that DropBox doesn't support file transfers larger than 2 GB.

      jjw


      • #4
        Depending on how your Linux SSH server is configured, you might consider an sshfs-like approach: you "mount" the SSH file system on your local computer as if it were a local disk. The data are still downloaded to your desktop and then uploaded to Galaxy, but the two steps happen at the same time, so you do not need any extra disk space. (I hope I understood your question correctly.)
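
For readers who have not used sshfs, the mount step described above might look roughly like the sketch below. It assumes sshfs is installed on the desktop (Linux, or a Mac with a FUSE layer); the `REMOTE` and `MOUNT_POINT` values are hypothetical placeholders, not details from this thread. By default it only prints the command so it can be checked first.

```shell
#!/bin/sh
# Sketch: mount a remote directory over SSH with sshfs.
# REMOTE and MOUNT_POINT are hypothetical placeholders; override them
# through the environment to match your own server and local folder.
REMOTE="${REMOTE:-user@campus.example.edu:/home/user/reads}"
MOUNT_POINT="${MOUNT_POINT:-$HOME/campus_reads}"

# Print the sshfs command that would be run (no quoting of spaces here,
# so keep the paths simple).
mount_cmd() {
    printf 'sshfs %s %s -o reconnect\n' "$REMOTE" "$MOUNT_POINT"
}

# DRY_RUN=1 (the default) only prints the command; DRY_RUN=0 mounts.
if [ "${DRY_RUN:-1}" = "1" ]; then
    mount_cmd
else
    mkdir -p "$MOUNT_POINT"
    sshfs "$REMOTE" "$MOUNT_POINT" -o reconnect
fi
```

Once mounted, the files under the mount point can be picked in the browser's upload dialog like any local file, which is what makes the download and upload happen in one pass.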


        • #5
          Solved

          Bingo! You did understand my question correctly, and you answered it perfectly.

          I installed an application called ExpanDrive on the desktop PC that mounts the Linux ssh file system as a local drive (sshfs). To test it out, I uploaded a small 1 MB FASTQ file to Galaxy directly with the "Get Data" Tool, and it uploaded in less than a minute.

          Then I uploaded a 1.6 GB gzipped file to Galaxy. It took just over an hour to get the file onto Galaxy. As you said, it was a one-step process.

          Your advice is much appreciated. This is very smooth compared to some of the clunky methods I was using without success.

          jjw


          • #6
            Good to know it works. For Linux, there is sshfs. For Mac, there is MacFUSE + Macfusion, which I used to use. I have heard that MacFUSE does not work on Snow Leopard. If that happens, you may be able to find a solution by googling; I have not tried, though.


            • #7
              You're right (again). I read on the Google Code page that MacFuse doesn't support Snow Leopard's 64-bit kernel, but you can run it if OS X is started in 32-bit mode. ExpanDrive uses the MacFuse library, so the Mac version doesn't have 64-bit support. The Windows version of ExpanDrive seems to work fine.

              I use a Mac as my desktop computer for data analysis and an older lab PC running Windows XP to transfer the files via sshfs. Essentially the XP machine is just a file server, so I don't have to use additional resources on my Mac.

              Thanks!
              jjw


              • #8
                This seems to be a very interesting and useful discussion, but it is too technical for my background. I will do some reading, but if someone could put it in layman's terms, that would be great!
                --
                bioinfosm


                • #9
                  If it seems "too technical", it's only because I didn't explain it clearly.

                  The poster "lh3" gave me the advice that worked best for transferring large files from our Linux server to online tools such as Galaxy when a secure connection (SSH) is required. I can post the steps I used to do this. However, if you're looking for the specifics of SSH and SSHFS, other Linux or Unix gurus could probably explain those.

                  I'm one of those people with just enough knowledge to be dangerous, but I'd be glad to help where I can.


                  • #10
                    I don't know if this helps, but back in the day, everyone used FXP for this. It's still supported by a bunch of FTP servers, though I'm not sure how well it is supported with FTPS. It essentially lets you initiate a file transfer directly between two FTP servers, without the data passing through your own connection.

                    On Windows the go-to FXP app was FlashFXP. Googling around a bit, I also found CrossFTP, which is described as a powerful FTP and Amazon S3 client.

                    Might be useful?

                    Misko


                    • #11
                      SRA to Galaxy?

                      Hello All,

                      Does anyone know how to transfer files directly from NCBI's SRA to Galaxy? It seems a direct transfer would save a fair amount of bandwidth. I've tried pasting the SRA dataset's download FTP URL into the Galaxy Get Data box, but it generates an error. The other issue is the .bz2 compression that SRA uses.

                      Any ideas out there? Thanks.


                      • #12
                        I thought the only way to upload data to Galaxy (and others) was to use the web interface. I did not know SSH was allowed.

                        BTW, I am using sshfs (on top of MacFUSE) without problems (on a 32-bit system, though). However, if you are on a laptop or a machine that goes online and offline, your sshfs mount point may be left in an unusable state. I recently discovered a way to unmount it:

                        Code:
                        $ sudo diskutil unmount force /Users/drio/sshfs/ardmore
                        -drd
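
The command above is specific to OS X. As a rough sketch of the same idea for other systems, the helper below picks a forced or lazy unmount command from the OS name reported by `uname -s`. The `unmount_cmd` name and the `/path/to/mount` placeholder are made up for illustration, and it only prints the command rather than running it. On Linux, `fusermount -u -z` is assumed to be available, as it normally is wherever sshfs is installed.

```shell
#!/bin/sh
# Sketch: choose an unmount command for a stale sshfs mount point.
# Usage of the helper: unmount_cmd <os-name> <mount-point>
unmount_cmd() {
    case "$1" in
        # OS X: the forced unmount quoted in the post above.
        Darwin) printf 'sudo diskutil unmount force %s\n' "$2" ;;
        # Linux: lazy unmount through the FUSE helper.
        Linux)  printf 'fusermount -u -z %s\n' "$2" ;;
        # Anything else: fall back to a plain umount.
        *)      printf 'umount %s\n' "$2" ;;
    esac
}

# Print the command for this machine; the default path is a placeholder.
unmount_cmd "$(uname -s)" "${1:-/path/to/mount}"
```

Printing first is deliberate: a forced unmount can interrupt a transfer in progress, so it seems safer to review the command before running it.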


                        • #13
                          Upload fastqs quickly to Galaxy

                          I just came across this, courtesy of subio support on YouTube. It saves me so much time and local disk space.

                          The NGS generates huge data and you need a high spec computer with a very large storage and RAM. But, of course, only limited biologists are taking such an ...


                          Sharing it in case other people don't know this already.

                          If you want to upload FASTQs directly from the SRA server to Galaxy, use the DRA site at DDBJ: http://trace.ddbj.nig.ac.jp/

                          Look for your SRx experiment/run and copy the FASTQ link from the right-hand side.

                          Paste this link into the URL box under Get Data in Galaxy.

                          The upload usually takes less than 1 minute.

                          Have fun

                          Simon


                          • #14
                            Originally posted by jjw14
                            ExpanDrive uses the MacFuse library, so the Mac version doesn't have 64-bit support.
                            There's OSXFUSE, which replaces MacFuse and supports 64-bit kernels.
