large number of contigs
As @NGSfan pointed out, this is indeed a RAM problem. STAR bins the genome sequence so that each chromosome (contig) starts at a new bin, which creates an overhead of Nchromosomes*BinSize, where BinSize=2^genomeChrBinNbits. By default, --genomeChrBinNbits = 18,
so BinSize=2^18~256 kb, and with 300,000 contigs you would need ~75 GB of RAM - that's what likely killed your job.
I suggest that you try a much smaller value, --genomeChrBinNbits 12. This would require just a few GB of RAM and should allow you to generate the genome files. I have not tried STAR with more than 50,000 contigs, and I suspect there might be a significant slowdown in mapping speed when the number of contigs is very large.
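To make the arithmetic concrete, here is a rough sketch of the overhead estimate for both settings. The function name bin_padding_bytes is made up for illustration, and the estimate assumes one byte per base and a full bin of padding per contig, so treat the numbers as a ballpark rather than what STAR will actually allocate.

```python
def bin_padding_bytes(n_contigs: int, chr_bin_nbits: int) -> int:
    """Rough overhead estimate: Nchromosomes * BinSize, with BinSize = 2^genomeChrBinNbits.

    Assumes one byte per base and one full bin of padding per contig
    (an illustrative simplification, not STAR's exact accounting).
    """
    bin_size = 1 << chr_bin_nbits  # 2^genomeChrBinNbits
    return n_contigs * bin_size


# Compare the default (18) with the suggested value (12) for 300,000 contigs.
for nbits in (18, 12):
    overhead = bin_padding_bytes(300_000, nbits)
    print(f"--genomeChrBinNbits {nbits}: ~{overhead / 1e9:.1f} GB")
```

With 18 bits this comes out in the high tens of GB, in line with the ~75 GB figure above; with 12 bits it drops to roughly a GB.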