I am following the steps in the DEXSeq vignette here to construct the exon count files (.txt) for each of my 16 biological samples. As shown on page 3, each sample needs to go through these steps:
(1) samtools index treated2.bam
(2) samtools view treated2.bam > treated2.sam
(3) sort -k1,1 -k2,2n treated2.sam > treated2_sorted.sam
(4) python dexseq_count.py -p yes Dmel.BDGP5.25.62.DEXSeq.chr.gff treated2_sorted.sam treated2fb.txt
In (1), the indexing finishes in a reasonable time (<3 min). In (2), I see that the SAM files can be huge (>15 GB per sample). In (3), will the sorted SAM file be the same size as the unsorted one, and how long will the sorting take? For (4), am I right that once step (3) has finished, the unsorted SAM file from step (2) is no longer needed and only the sorted SAM file is used?
I ask because I have 16 samples, so the total file size becomes very large and my disk space is limited. I just want to make sure these sizes are reasonable, and to ask whether I can delete the unsorted SAM files (if they are no longer needed) as part of the sorting in step (3). If so, what would the Linux command be?
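One idea I am considering is to combine steps (2) and (3) with a pipe, so that the unsorted SAM file is never written to disk. This is only a minimal sketch, assuming samtools and a standard sort are on the PATH. The helper name sort_sam_stream is made up for illustration, and the real samtools line is left commented out because it needs the actual BAM file:

```shell
# Hypothetical helper: reads SAM records on stdin and writes them sorted
# by column 1, then by column 2 numerically (same keys as step (3)).
sort_sam_stream() {
    LC_ALL=C sort -k1,1 -k2,2n
}

# Intended use on real data (assumes treated2.bam exists; not run here):
# samtools view treated2.bam | sort_sam_stream > treated2_sorted.sam

# Tiny made-up stand-in for SAM records, just to show the resulting order
printf 'readB\t16\nreadA\t83\nreadA\t9\n' | sort_sam_stream
```

If the unsorted file from step (2) already exists, something like `sort -k1,1 -k2,2n treated2.sam > treated2_sorted.sam && rm treated2.sam` would delete it only after the sort succeeds. I believe sort also writes temporary files while working, so some free space (for example in a directory given with `-T`) would still be needed during the sort.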
Thank you so much!