I am Moritz from Heidelberg University in Germany.
For my bachelor thesis I have 20 large (25-30 GB) genome files (.txt.gz) from patients with hepatocellular carcinoma. I have Bpipe installed on my Ubuntu server, which I am using to try out several approaches.
Steps included are:
Alignment against hg19.fasta (BWA, including the sai-to-SAM conversion)
Transform (samtools)
Dedupe
My problem is that to test my Bpipe workflow I have to run a complete 30 GB file from the beginning, which takes a long time. So my questions are:
How can I shorten one file?
Where can I find a short sequence that I can use to test my pipeline?
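To make the first question concrete, here is a minimal sketch of the kind of shortening I have in mind. It assumes the .txt.gz files are gzipped FASTQ with exactly four lines per read, which I have not verified for every file. The function name head_fastq and the file names are hypothetical.

```python
import gzip
from itertools import islice


def head_fastq(src_path, dst_path, n_reads=100_000):
    """Copy the first n_reads records of a gzipped FASTQ file to dst_path.

    Assumption: every read occupies exactly four lines
    (header, sequence, '+', qualities).
    Returns the number of complete reads written.
    """
    lines_written = 0
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        # islice stops after 4 * n_reads lines, so the rest of the
        # 30 GB input is never decompressed.
        for line in islice(src, 4 * n_reads):
            dst.write(line)
            lines_written += 1
    return lines_written // 4


# Hypothetical usage (file names are placeholders):
# head_fastq("patient01.txt.gz", "patient01.small.txt.gz", n_reads=100_000)
```

If the data are paired-end, both files of a pair would need to be cut with the same n_reads so that the mates stay in sync.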