
  • gsgs
    replied
    You won't believe me, but I really need all these old programs
    that no longer work with these big files
    and often give other problems with Win64.
    A utility to split the files looks so much easier.
    Of course, I wrote such a utility decades ago, really simple,
    but compiled with 16bit --> doesn't work.
    A 64-bit compiler might help, at least to run a few of my programs
    on 64-bit. But remembering how much trouble it was to install and
    understand my 16-bit compiler (GCC3.2,DJGPP) in 2002 ...

  • dpryan
    replied
    Your life would be much easier if you just used a computer that could download and convert the whole file from SRA using fastq-dump. Just borrow someone's laptop instead of trying to reinvent the wheel (particularly since this particular wheel has terrible documentation).

  • gsgs
    replied
    I found, downloaded, unzipped, and copied
    sra decrypt for Win 32

    but I get the error:

    C:\1918>vdb-decr tu1
    2013-08-21T19:04:59 vdb-decr.2.3.2 err: encryption key not found while opening manager within virtual file system module - unable to obtain a password
    2013-08-21T19:04:59 vdb-decr.2.3.2: exiting: RC(rcVFS,rcMgr,rcOpening,rcEncryptionKey,rcNotFound) (2615479768)


    they didn't say in the paper or at the NCBI download page
    that I need a password

  • GenoMax
    replied
    Originally posted by gsgs
    I cannot just switch to Win64, since I need all my old programs
    that were written on 16bit or 32bit
    It should be possible to use those on 64-bit Windows (you could also run a VM with 32-bit Windows, if you find an absolute need for it).

  • gsgs
    replied
    I cannot just switch to Win64, since I need all my old programs
    that were written on 16bit or 32bit

  • dpryan
    replied
    Ah, yeah, I expect that the SRA format is pretty non-trivial from the various discussions of it. Honestly, if your computer is having issues with files of ~4GB then you might just be better off using someone else's (though check if the drive is NTFS formatted), particularly if you're stuck on Windows. Got a labmate with a Mac?

  • GenoMax
    replied
    If you are using 32-bit Windows XP (which you likely are) this may not be possible. What file system is your external drive formatted with? You may need NTFS for files > 4GB.

  • gsgs
    replied
    OK, I tried to download the file to my external drive. It took 5.5 h
    until an error message was displayed that the file couldn't be copied.

    Then I searched my main HD and found that it was put into a temporary file
    which had 4631463048 Bytes, so apparently >4GB is possible on my
    main drive but not on the external one.
    (Windows XP, computer bought in 2010 or 2011)
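One possible explanation, assuming the external drive is FAT32-formatted (the thread does not say so): FAT32 cannot hold a single file larger than 2^32 - 1 bytes, while NTFS has no such limit at these sizes. A quick check of the reported size against that limit, as a sketch in Python:

```python
# Maximum size of a single file on FAT32: 2**32 - 1 bytes (just under 4 GiB).
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes: int) -> bool:
    """Return True if a file of this size can be stored as one file on FAT32."""
    return size_bytes <= FAT32_MAX_FILE_BYTES


temp_file_bytes = 4631463048  # size of the temporary file reported above
print(fits_on_fat32(temp_file_bytes))          # False
print(temp_file_bytes - FAT32_MAX_FILE_BYTES)  # 336495753 bytes over the limit
```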


    I made a copy of that temporary file to another file on the main drive,
    then I closed the error window, and indeed, the temporary file was
    deleted, but luckily I had the copy.
    As expected I can't copy that file to the external drive nor can I access it
    with any of my programs.
    But DOS-commands copy,type,find do work.

    So, I need a program that splits the big file into 2 smaller files that can be accessed.
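A byte-level splitter of the kind asked for here could look roughly like the sketch below. It is Python rather than a DOS-era program, and the function name `split_file`, the `.000`/`.001` suffixes, and the default part size are all made up for illustration. It only cuts the file at fixed byte offsets and knows nothing about the SRA format, so the parts are not valid .sra files on their own.

```python
from pathlib import Path


def split_file(src, part_bytes=2_000_000_000, buf_bytes=1 << 20):
    """Split src into src.000, src.001, ... each at most part_bytes long.

    Data is copied in blocks of at most buf_bytes, so memory use stays small
    even for multi-gigabyte inputs. Returns the list of part paths in order.
    """
    src = Path(src)
    parts = []
    with src.open("rb") as fin:
        while True:
            block = fin.read(min(buf_bytes, part_bytes))
            if not block:
                break  # end of input; no empty trailing part is created
            part = src.with_name(f"{src.name}.{len(parts):03d}")
            parts.append(part)
            written = 0
            with part.open("wb") as fout:
                while block:
                    fout.write(block)
                    written += len(block)
                    if written >= part_bytes:
                        break  # this part is full, start the next one
                    block = fin.read(min(buf_bytes, part_bytes - written))
    return parts
```

With the default part size, a 4631463048-byte file would come out as three parts, each small enough for a 32-bit file offset. On Windows the parts can be joined again with `copy /b name.000+name.001+name.002 name`.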

  • gsgs
    replied
    I found that other thread, saying that the format is complicated,
    so there is no such table.

    I'm having problems with files >4GB and wanted to test it
    on a partially downloaded file first

    I have to split the large files, so they work with my programs.
    It's also faster, better for testing, dealing with 4GB files is tedious.

    I doubt that sra-tools will work with such split files.
    Last edited by gsgs; 08-21-2013, 03:47 AM.

  • dpryan
    replied
    Why don't you either use fastq-dump or just download the gzipped fastq files from ENA (such as this one)?
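For the fastq-dump route, a call could be assembled as in the sketch below. The helper name `fastq_dump_command` is made up, and `SRR033552` is only an example accession taken from the directory listing elsewhere in this thread. The flags `--outdir`, `--split-files` and `--gzip` are the ones I know from the SRA Toolkit; older or newer releases may differ, so check `fastq-dump --help` on the installed version.

```python
import shlex


def fastq_dump_command(run, outdir=".", paired=True, gzip=True):
    """Build the argument list for one fastq-dump call (hypothetical helper)."""
    cmd = ["fastq-dump", "--outdir", outdir]
    if paired:
        cmd.append("--split-files")  # write mates to <run>_1.fastq and <run>_2.fastq
    if gzip:
        cmd.append("--gzip")         # compress the output to .fastq.gz
    cmd.append(run)                  # run accession or path to a .sra file
    return cmd


cmd = fastq_dump_command("SRR033552")
print(shlex.join(cmd))
# To actually run it (requires the SRA Toolkit on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```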
    Last edited by dpryan; 08-21-2013, 03:40 AM. Reason: forgot a word

  • gsgs
    replied
    I want the table that converts a byte from the sra file
    into a sequence of nucleotides



    SRA toolkit sourcecode has "4na" and "2na"
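The name "2na" suggests 2 bits per nucleotide, i.e. four bases per byte. The sketch below builds that kind of byte-to-bases table under stated assumptions: codes 0=A, 1=C, 2=G, 3=T with the first base in the high-order bits, which is the usual NCBI2na convention. It is not a decoder for .sra files; as noted elsewhere in this thread, the file format itself is more complicated than a flat stream of packed bases.

```python
BASES_2NA = "ACGT"  # assumed code order: 0=A, 1=C, 2=G, 3=T


def unpack_2na(packed: bytes, n_bases: int) -> str:
    """Decode n_bases nucleotides from 2-bit packed bytes.

    Assumes four bases per byte with the first base in the two highest bits.
    """
    out = []
    for i in range(n_bases):
        byte = packed[i // 4]
        shift = 6 - 2 * (i % 4)  # 6, 4, 2, 0 for the four bases in a byte
        out.append(BASES_2NA[(byte >> shift) & 0b11])
    return "".join(out)


# The lookup table asked for: one 4-base string per possible byte value.
BYTE_TO_BASES = [unpack_2na(bytes([b]), 4) for b in range(256)]

print(BYTE_TO_BASES[0x1B])  # bits 00 01 10 11 -> ACGT
```

A "4na" table would work the same way with 4 bits per base (two bases per byte), where each bit flags one of A, C, G, T so that ambiguous calls can be represented.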

  • babaref
    replied
    How do I convert fastq format to sra files? Is there any Perl script for this conversion?

  • tbusch0000
    replied
    Thanks for the tips.

    I got fastq-dump working on an x-large Amazon cloud instance running a CentOS AMI.

  • seb567
    replied
    About 1-2 hours for a 2 GB sra file, though that is very approximate.

    I downloaded all sra files for SRA010766, converted them from sra to fastq, then to fastq.gz. The script started yesterday 6 PM (EST).

    So yours is slower, way slower.

    [boiseb01@ls30 Illumina-SRX015621]$ ls
    batch-3 SRR033559_1.fastq.gz SRR033570_1.fastq.gz SRR033581_1.fastq.gz SRR033592_1.fastq.gz SRR033603_1.fastq.gz SRR033614_1.fastq.gz SRR033625_1.fastq.gz
    download.log SRR033559_2.fastq.gz SRR033570_2.fastq.gz SRR033581_2.fastq.gz SRR033592_2.fastq.gz SRR033603_2.fastq.gz SRR033614_2.fastq.gz SRR033625_2.fastq.gz
    files.txt SRR033560_1.fastq.gz SRR033571_1.fastq.gz SRR033582_1.fastq.gz SRR033593_1.fastq.gz SRR033604_1.fastq.gz SRR033615_1.fastq.gz SRR033626_1.fastq.gz
    list-sra.sh SRR033560_2.fastq.gz SRR033571_2.fastq.gz SRR033582_2.fastq.gz SRR033593_2.fastq.gz SRR033604_2.fastq.gz SRR033615_2.fastq.gz SRR033626_2.fastq.gz
    newFiles SRR033561_1.fastq.gz SRR033572_1.fastq.gz SRR033583_1.fastq.gz SRR033594_1.fastq.gz SRR033605_1.fastq.gz SRR033616_1.fastq.gz SRR033627_1.fastq.gz
    nohup.out SRR033561_2.fastq.gz SRR033572_2.fastq.gz SRR033583_2.fastq.gz SRR033594_2.fastq.gz SRR033605_2.fastq.gz SRR033616_2.fastq.gz SRR033627_2.fastq.gz
    README SRR033562_1.fastq.gz SRR033573_1.fastq.gz SRR033584_1.fastq.gz SRR033595_1.fastq.gz SRR033606_1.fastq.gz SRR033617_1.fastq.gz SRR033628_1.fastq
    SRA010766 SRR033562_2.fastq.gz SRR033573_2.fastq.gz SRR033584_2.fastq.gz SRR033595_2.fastq.gz SRR033606_2.fastq.gz SRR033617_2.fastq.gz SRR033628_2.fastq
    SRR033552_1.fastq.gz SRR033563_1.fastq.gz SRR033574_1.fastq.gz SRR033585_1.fastq.gz SRR033596_1.fastq.gz SRR033607_1.fastq.gz SRR033618_1.fastq.gz SRR033629_1.fastq
    SRR033552_2.fastq.gz SRR033563_2.fastq.gz SRR033574_2.fastq.gz SRR033585_2.fastq.gz SRR033596_2.fastq.gz SRR033607_2.fastq.gz SRR033618_2.fastq.gz SRR033629_2.fastq
    SRR033553_1.fastq.gz SRR033564_1.fastq.gz SRR033575_1.fastq.gz SRR033586_1.fastq.gz SRR033597_1.fastq.gz SRR033608_1.fastq.gz SRR033619_1.fastq.gz SRR033630_1.fastq
    SRR033553_2.fastq.gz SRR033564_2.fastq.gz SRR033575_2.fastq.gz SRR033586_2.fastq.gz SRR033597_2.fastq.gz SRR033608_2.fastq.gz SRR033619_2.fastq.gz SRR033630_2.fastq
    SRR033554_1.fastq.gz SRR033565_1.fastq.gz SRR033576_1.fastq.gz SRR033587_1.fastq.gz SRR033598_1.fastq.gz SRR033609_1.fastq.gz SRR033620_1.fastq.gz SRR033631_1.fastq
    SRR033554_2.fastq.gz SRR033565_2.fastq.gz SRR033576_2.fastq.gz SRR033587_2.fastq.gz SRR033598_2.fastq.gz SRR033609_2.fastq.gz SRR033620_2.fastq.gz SRR033631_2.fastq
    SRR033555_1.fastq.gz SRR033566_1.fastq.gz SRR033577_1.fastq.gz SRR033588_1.fastq.gz SRR033599_1.fastq.gz SRR033610_1.fastq.gz SRR033621_1.fastq.gz SRR033632_1.fastq
    SRR033555_2.fastq.gz SRR033566_2.fastq.gz SRR033577_2.fastq.gz SRR033588_2.fastq.gz SRR033599_2.fastq.gz SRR033610_2.fastq.gz SRR033621_2.fastq.gz SRR033632_2.fastq
    SRR033556_1.fastq.gz SRR033567_1.fastq.gz SRR033578_1.fastq.gz SRR033589_1.fastq.gz SRR033600_1.fastq.gz SRR033611_1.fastq.gz SRR033622_1.fastq.gz SRR033633_1.fastq
    SRR033556_2.fastq.gz SRR033567_2.fastq.gz SRR033578_2.fastq.gz SRR033589_2.fastq.gz SRR033600_2.fastq.gz SRR033611_2.fastq.gz SRR033622_2.fastq.gz SRR033633_2.fastq
    SRR033557_1.fastq.gz SRR033568_1.fastq.gz SRR033579_1.fastq.gz SRR033590_1.fastq.gz SRR033601_1.fastq.gz SRR033612_1.fastq.gz SRR033623_1.fastq.gz
    SRR033557_2.fastq.gz SRR033568_2.fastq.gz SRR033579_2.fastq.gz SRR033590_2.fastq.gz SRR033601_2.fastq.gz SRR033612_2.fastq.gz SRR033623_2.fastq.gz
    SRR033558_1.fastq.gz SRR033569_1.fastq.gz SRR033580_1.fastq.gz SRR033591_1.fastq.gz SRR033602_1.fastq.gz SRR033613_1.fastq.gz SRR033624_1.fastq.gz
    SRR033558_2.fastq.gz SRR033569_2.fastq.gz SRR033580_2.fastq.gz SRR033591_2.fastq.gz SRR033602_2.fastq.gz SRR033613_2.fastq.gz SRR033624_2.fastq.gz
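The fastq to fastq.gz step described above can be done with plain `gzip` on the command line. For anyone scripting it, a minimal Python equivalent is sketched below; the helper name `gzip_file` is made up, and this is not the script that produced the listing.

```python
import gzip
import shutil


def gzip_file(src, dst=None):
    """Compress src to dst (default: src + '.gz'), streaming block by block."""
    dst = dst or f"{src}.gz"
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # never loads the whole file into memory
    return dst
```

Unlike command-line `gzip`, this leaves the uncompressed file in place, so delete it separately once the .gz has been verified.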

  • SongLi
    replied
    Hi seb567,

    How slow is fastq-dump for you?

    My experience is this: my computer is a Xeon 2.4 GHz 4-core with 12 GB RAM, and fastq-dump takes 600 minutes to finish one sra file.

    I have tried the newest release and also different sra files. fastq-dump is always very slow.

    Thanks,

    Originally posted by seb567
    I have to download and convert files to test Ray, the assembler I am working on (see a thread elsewhere on this forum).

    My take on sratoolkit (I use /software/sratoolkit.2.0b4-2-centos_linux64/):

    It is slow, but it works. My guess is that data are compressed, using something like LIBBZ2 (it is just a guess). That explains the compression ratio as well as the slowness.



    Binaries are linked against libz and libbz2, but the slowness indicates that they probably rely on libbz2.
