I'm attempting to download a ton of protected GTEx data using sratoolkit.
Using fastq-dump alone with accession numbers is painfully slow. My Mac is taking 2-4 hours to get each ~3 GB file.
e.g. this 3.4 GB file took 2:23:19 to download:
Code:
~/sra-toolkit/bin/fastq-dump --bzip2 --split-spot --clip --skip-technical --dumpbase --readids SRR1397673
Prefetch has worked better; running the HTTP download with prefetch (~30 min) and then fastq-dump on the prefetched .sra (~1 hr) is much faster than fastq-dump alone. Unfortunately, I can't get prefetch to use Aspera.
(GTEx SRR#s have been changed to a publicly available SRR# for the examples below.)
I think the Aspera problem is prefetch-specific; a test using the following command resulted in a nice, speedy download.
Code:
~/Applications/Aspera\ Connect.app/Contents/Resources/ascp -T -l150M -i ~/Applications/Aspera\ Connect.app/Contents/Resources/asperaweb_id_dsa.putty [email protected]:/sra/sra-instant/reads/ByRun/sra/SRR/SRR292/SRR292241/SRR292241.sra .
However, things went much less smoothly using this:
Code:
~/sra-toolkit/bin/prefetch -t ascp -a "~/Applications/Aspera\ Connect.app/Contents/Resources/ascp | ~/Applications/Aspera\ Connect.app/Contents/Resources/asperaweb_id_dsa.openssh" SRR292241
I ended up with this error:
2017-08-09T22:54:52 prefetch.2.8.2: 1) Downloading 'SRR292241'...
2017-08-09T22:54:52 prefetch.2.8.2: Downloading via fasp...
2017-08-09T22:54:53 prefetch.2.8.2 err: process failed while waiting process - ascp failed with 1
2017-08-09T22:54:54 prefetch.2.8.2 err: process failed while waiting process - ascp failed with 1
2017-08-09T22:54:54 prefetch.2.8.2: fasp download failed
2017-08-09T22:54:54 prefetch.2.8.2: 1) failed to download SRR292241
I've tried adding the -X 200G flag to circumvent the error, without success.
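One thing that may be worth ruling out (a guess, not a confirmed cause): in the prefetch command above, the -a value sits inside double quotes, where the shell does not expand ~ and leaves the backslashes in place, so prefetch would receive those characters literally, along with the spaces around the | separator. The sketch below only prints the two strings for comparison. ASPERA_DIR, as_posted and expanded are illustrative names, and the install path is taken from the commands in this post.

```shell
#!/usr/bin/env bash
# Compare the -a value as quoted in the post with one built from expanded paths.
# ASSUMPTION: ASPERA_DIR and the variable names are illustrative only; the
# install location is copied from the commands in the post.

ASPERA_DIR="$HOME/Applications/Aspera Connect.app/Contents/Resources"

# As written in the post: inside double quotes, ~ is not expanded and the
# backslashes before the spaces remain part of the string.
as_posted="~/Applications/Aspera\ Connect.app/Contents/Resources/ascp | ~/Applications/Aspera\ Connect.app/Contents/Resources/asperaweb_id_dsa.openssh"

# Built from $HOME-expanded paths, with no spaces around the | separator.
expanded="$ASPERA_DIR/ascp|$ASPERA_DIR/asperaweb_id_dsa.openssh"

printf 'as posted: %s\n' "$as_posted"
printf 'expanded:  %s\n' "$expanded"

# Untested idea: pass the expanded value to prefetch instead.
# ~/sra-toolkit/bin/prefetch -t ascp -a "$expanded" SRR292241
```

If the "as posted" line still shows ~ and backslashes, ascp is probably being looked up at a path that does not exist, which would be consistent with (though not proof of) the "ascp failed with 1" message.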
Any suggestions for speeding up my download from SRA/dbGaP?
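For reference, the two-step workaround described above (prefetch first, then fastq-dump on the local .sra) could be scripted roughly as follows. This is a minimal sketch under assumptions: SRA_BIN, SRA_DIR, DRY_RUN, run and fetch_one are illustrative names, and SRA_DIR has to point at wherever prefetch actually stores files on your machine (for protected dbGaP data that is likely the project workspace rather than the public default used here). By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: prefetch each accession, then run fastq-dump on the local .sra file.
# ASSUMPTIONS: SRA_BIN, SRA_DIR, DRY_RUN, run and fetch_one are illustrative
# names; adjust SRA_DIR to the directory prefetch writes to on your system.

SRA_BIN="${SRA_BIN:-$HOME/sra-toolkit/bin}"
SRA_DIR="${SRA_DIR:-$HOME/ncbi/public/sra}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Download one accession, then convert it with the flags used in the post.
fetch_one() {
  local acc="$1"
  run "$SRA_BIN/prefetch" "$acc" || return 1
  run "$SRA_BIN/fastq-dump" --bzip2 --split-spot --clip --skip-technical \
      --dumpbase --readids "$SRA_DIR/$acc.sra"
}

# Process every accession given on the command line.
for acc in "$@"; do
  fetch_one "$acc" || echo "failed: $acc" >&2
done
```

Saved under a name of your choosing, it would be called with a list of accessions as arguments, and with DRY_RUN=0 once the printed commands look right.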