I hope someone can help with this issue.
I have 8 BAM files from an mm9 alignment, each ~4-5 GB in size. When I run summarizeOverlaps over 3 files, it takes 2-3 hours to finish, and it works, although my computer almost freezes up. But when I run all 8 files together and leave it overnight (as it takes too long to wait), the computer freezes (although it is an i7 Mac with 16 GB of RAM, so it should be powerful enough) and the command never returns a result.
I am making my own TxDb file from the GTF that I used for the alignment, so that the chromosome names match (script is below).
Do you have any tips on how I can get summarizeOverlaps to work on all 8 files and create one SummarizedExperiment (se) object without freezing up the computer? I have been trying to do this for the past 2 weeks, always with the same result.
Any input is appreciated.
Here’s the script:
library("DESeq2")
library("GenomicFeatures")
library("Rsamtools")
library("GenomicAlignments")
library("GenomicRanges")
mm9_from_cluster_gtf_txdb <- makeTranscriptDbFromGFF(file="~/Desktop/genes.gtf", format="gtf")
head(seqlevels(mm9_from_cluster_gtf_txdb))
saveDb(mm9_from_cluster_gtf_txdb, file="/Path/To/Libraries/TxDB/mm9_from_cluster_Ensembl_txdb.sqlite")
exonsByGene<-exonsBy(mm9_from_cluster_gtf_txdb,by="gene")
seqinfo(exonsByGene)
fls <- list.files("/Path/To/BamFiles", pattern="paired.accepted_hits.bam", full.names=TRUE)
fls
Experiment <- c(fls[2:8], fls[1])
Experiment
bamLst_experiment <- BamFileList(Experiment, yieldSize=100000)
seqinfo(bamLst_experiment)
se_test_experiment <- summarizeOverlaps(exonsByGene, bamLst_experiment, mode="Union", singleEnd=FALSE, ignore.strand=TRUE, fragments=TRUE)  # <<< This is the step that freezes the computer when I run all 8 files together.