  • Decompressing a file - very basic question

    Hi everyone,

    I have been running my sequencing analysis on a remote server. The sequencing data that I receive is compressed (.gz), so what I upload onto the server is a compressed file. I have been using gunzip to decompress these.

    My concern is that often the process seems to take so long that I get a "Write failed" "broken pipe" error. The file no longer has the .gz extension, so it seems to have decompressed, but I have no decompressed file with which to compare size. So, a few questions:

    1. Is there a way to make sure a file decompressed completely, without having a size to compare beforehand?

    2. If a file does fail to decompress, does it remain in some kind of partially-decompressed state or does it revert?

    3. Is there a faster way to decompress large files?

    Thanks in advance.

  • #2
    What may be happening is your connection times out due to "inactivity" while waiting for the file to uncompress.

    One general-purpose solution (depending on which operating system and client program you are using) is to set a "keepalive" interval, for example once every 60 seconds, so the connection does not die. Google "keepalive" and the name of the program you are using. E.g. for Ubuntu, here's the first Google hit, from 2006, which has the instructions: http://embraceubuntu.com/2006/02/03/...essions-alive/
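
    As a rough illustration, and assuming the client is OpenSSH (the original post does not say which program is in use), a keepalive every 60 seconds could be set in the client's `~/.ssh/config` like this. The `Host *` pattern is only an example and applies the setting to every server you connect to:

```
# ~/.ssh/config on the local machine (assumes an OpenSSH client)
Host *
    # Send a keepalive message after 60 seconds without traffic
    ServerAliveInterval 60
```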

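    For #1 and #2: in my experience, if the decompression fails, everything reverts to the original condition. gunzip deletes the .gz file only after it has finished writing the decompressed output, so a file that has lost its .gz extension should be complete.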
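
    For a check that needs no reference size, gzip has a test mode (`gzip -t`) that reads the whole archive and verifies its checksum. It only works while the .gz file still exists, so the sketch below tests first and then decompresses to a separate file, leaving the original in place. `check_and_unpack` and the `.part` suffix are made-up names for illustration, not part of gzip:

```shell
#!/bin/sh
# Sketch: verify a .gz archive, then decompress it to a separate file so the
# compressed original is kept. check_and_unpack is a hypothetical helper name.
check_and_unpack() {
    src=$1                # compressed input, e.g. reads.fq.gz
    dest=${src%.gz}       # output name: the input name without .gz

    # gzip -t reads the whole archive and checks its CRC and length.
    if ! gzip -t "$src" 2>/dev/null; then
        echo "FAILED integrity test: $src" >&2
        return 1
    fi

    # Write to a temporary name; rename only after gzip exits successfully,
    # so a file called $dest is never a partial result.
    if gzip -dc "$src" > "$dest.part"; then
        mv "$dest.part" "$dest"
    else
        rm -f "$dest.part"
        return 1
    fi
}
```

    Called as `check_and_unpack whoppinggreatfastqfile.fq.gz`, this returns non-zero if the archive is truncated or corrupt, at the cost of reading the archive twice.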

    For #3: probably not, but you can decompress multiple files at the same time (assuming you have more than one processor core) by running the jobs in the background, e.g.
    Code:
    gunzip whoppinggreatfastqfile.fq.gz &
    and if you also start the job with nohup, it is detached from the terminal and will run to completion even if you disconnect, i.e.
    Code:
    nohup gunzip whoppinggreatfastqfile.fq.gz &
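
    To extend the background-job idea to several files, something like the following sketch starts one gunzip per file and then waits for all of them. `gunzip_all` is a made-up helper name, and the sketch assumes you pass no more files than you have cores to spare, since it does not limit the number of simultaneous jobs:

```shell
#!/bin/sh
# Sketch: decompress several archives at once, then wait for all of them.
# gunzip_all is a hypothetical helper name.
gunzip_all() {
    pids=""
    for f in "$@"; do
        gunzip "$f" &             # one background job per file
        pids="$pids $!"           # remember the job's process ID
    done

    status=0
    for pid in $pids; do
        wait "$pid" || status=1   # note any job that exited with an error
    done
    return "$status"
}
```

    For example, `gunzip_all sample1.fq.gz sample2.fq.gz` (placeholder file names) returns 0 only if every file decompressed successfully.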

    • #3
      You might also look into GNU screen, which lets you disconnect from a window, then reconnect later. It's like nohup, but better. People had been telling me for years to use it, and when I finally did, I wanted to slap myself backwards through time for not using it sooner.

      • #4
        Great! Thanks for the advice.
