
**shunyip** · 01-06-2014, 11:33 PM

Hello Arcolombo,

I believe the problem is your genome file, as you suspect. The genome file should be called "genome.fa".

I hope this post will help you find what you need: http://seqanswers.com/forums/showthread.php?t=5996

**ffinkernagel** · 01-07-2014, 12:32 AM

The genomes you can download from the STAR website have already been prepared - no need to run genomeGenerate on them again. Just skip ahead to the alignment stage.
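As a rough illustration of skipping ahead to alignment, the sketch below only prints a STAR alignment command for a pre-built genome directory. The paths, file names, thread count, and the helper name `star_align_cmd` are placeholders invented for this example, not values from the thread; the options shown (`--runMode alignReads`, `--genomeDir`, `--readFilesIn`, `--runThreadN`, `--outFileNamePrefix`) are standard STAR options.

```shell
# Print (do not run) a STAR alignment command for a pre-built genome directory.
# $1 = genome directory, $2 = FASTQ file with reads, $3 = output file prefix.
# The thread count of 4 is an arbitrary placeholder.
star_align_cmd() {
  printf 'STAR --runMode alignReads --genomeDir %s --readFilesIn %s --runThreadN 4 --outFileNamePrefix %s\n' "$1" "$2" "$3"
}

# Placeholder paths; substitute your own before running the printed command.
star_align_cmd "/path/to/prebuilt_star_genome" "sample_R1.fastq" "sample_"
```

Printing the command first makes it easy to check the paths before committing to a long run.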

**arcolombo698** · 01-07-2014, 11:20 AM

How should I proceed if I want to recreate the genome directory from a directory I used previously, adding the junctions BED file from the STAR website?

I ran a STAR command that uses the genome.fa from the UCSC site that I already use for TopHat. The parameters point to the genome directory (the UCSC hg19 directory) and to genome.fa (from the previously used hg19 files). For genome generation I also added the junctions file (according to the manual, this is more accurate).

I still get an error:

[acolombo@hpc-login2 STAR]$ /auto/rcf-proj/sa1/software/STAR_2.3.0e/STAR --runMode genomeGenerate --genomeDir /auto/rcf-proj/sa1/data/Homo_sapiens1/UCSC/hg19/Sequence/WholeGenomeFasta --genomeFastaFiles /auto/rcf-proj/sa1/data/Homo_sapiens1/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa --runThreadN 1 --sjdbFileChrStartEnd /auto/rcf-proj/sa1/data/Junctions_Annotations --sjdbOverhang 1 genomeChrBinNbits 12
Jan 07 11:10:23 ..... Started STAR run
Jan 07 11:10:24 ... Starting to generate Genome files
Jan 07 11:15:40 ... finished processing splice junctions database ...
Jan 07 11:16:55 ... starting to sort Suffix Array. This may take a long time...
Jan 07 11:17:26 ... sorting Suffix Array chunks and saving them to disk...
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Abort

**arcolombo698** · 01-07-2014, 01:53 PM

This issue was reported in the previous STAR release announcement, and the workaround was to use these parameters:

/auto/rcf-proj/sa1/software/STAR_2.3.0e/STAR --runMode genomeGenerate --genomeDir /auto/rcf-proj/sa1/data/hg19/Sequence --genomeFastaFiles /auto/rcf-proj/sa1/data/Homo_sapiens1/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa --runThreadN 1 --sjdbFileChrStartEnd /auto/rcf-proj/sa1/data/Junctions_Annotations --sjdbOverhang 1 genomeChrBinNbits 6 --genomeSAindexNbases 4

Yet this has been running for somewhere between 45 minutes and 1.5 hours, which is very slow.
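For reference, the STAR manual gives scaling rules for the two tuning parameters used above: `--genomeSAindexNbases` as min(14, log2(GenomeLength)/2 - 1) and `--genomeChrBinNbits` as min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))). The sketch below computes both; rounding down to an integer is an assumption of this example, and the function names are made up for illustration. Note also that in the commands as posted, `genomeChrBinNbits` appears without the leading `--`, so STAR may not have read it as that option.

```shell
# Suggested --genomeSAindexNbases: min(14, log2(GenomeLength)/2 - 1), rounded down.
# $1 = genome length in bases
sa_index_nbases() {
  awk -v len="$1" 'BEGIN {
    v = int(log(len) / log(2) / 2 - 1)   # awk log() is the natural log
    if (v > 14) v = 14
    print v
  }'
}

# Suggested --genomeChrBinNbits:
# min(18, log2(max(GenomeLength / NumberOfReferences, ReadLength))), rounded down.
# $1 = genome length, $2 = number of reference sequences, $3 = read length
chr_bin_nbits() {
  awk -v len="$1" -v nref="$2" -v rl="$3" 'BEGIN {
    x = len / nref
    if (rl > x) x = rl
    v = int(log(x) / log(2))
    if (v > 18) v = 18
    print v
  }'
}
```

For a genome of roughly 3.1 billion bases in a few dozen sequences, both functions return the caps (14 and 18), which are also STAR's defaults. Under these rules, values as low as 4 or 6 would only be expected for very small or highly fragmented references.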

**arcolombo698** · 01-07-2014, 03:05 PM

Here are the results.

It gives an error about not having enough SA indices. I am currently re-running it.

Attached Files

Log.txt (13.3 KB, 42 views)

**ffinkernagel** · 01-08-2014, 12:41 AM

How much memory do you have?

**shunyip** · 01-08-2014, 01:10 AM

Originally posted by ffinkernagel:

How much memory do you have?

Agreed, you may have run out of disk space. How much free space do you have on your hard drive?

**ffinkernagel** · 01-09-2014, 07:57 AM

Not disk space, RAM. STAR uses quite a lot of RAM to generate its index (last time I checked, 16 GB was not enough for a human genome).
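A quick way to check installed RAM on Linux before starting `genomeGenerate` is to read `MemTotal` from `/proc/meminfo`. The sketch below is a minimal example; the function names are hypothetical, and the 32 GB threshold in the usage comment is an assumed ballpark for a human genome, not a figure from this thread (which only says that 16 GB was not enough).

```shell
# Print total RAM in whole GB (rounded down), read from a meminfo-style file.
# $1 = optional path to the file; defaults to /proc/meminfo (Linux only).
mem_total_gb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed (exit status 0) if total RAM is at least the required number of GB.
# $1 = required GB, $2 = optional meminfo-style file
enough_ram_for_star() {
  [ "$(mem_total_gb "${2:-/proc/meminfo}")" -ge "$1" ]
}

# Example usage (32 is an assumed threshold, adjust for your genome):
#   enough_ram_for_star 32 || echo "Probably not enough RAM for genomeGenerate"
```

A failed allocation such as the `std::bad_alloc` shown earlier in the thread is consistent with running out of RAM rather than disk space.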

**anagd** · 10-25-2016, 01:30 AM

Hello, I am having problems generating the index for mouse GRCm38 from Ensembl.
STAR stops at "... sorting Suffix Array chunks and saving them to disk..." without reporting any error, so the genome files for the next step are not generated.

I am running STAR under Cygwin on Windows, and I have 64 GB of RAM.
I heard that the problem may lie with STAR's pre-compiled build. I am not an expert in informatics, and RNA-seq analysis is also new to me, so I don't really understand how to compile the STAR executable. What I did was change the working directory to STAR/source and run STAR from there. I also added the path to the STAR executable to the PATH environment variable in the Windows system settings. Has anyone had similar problems?

I have been stuck on this step for several days and I don't know what to do. Any help is welcome. Could I use an index already generated by STAR if I cannot build my own?

I have an Intel Xeon CPU at 3.5 GHz with 4 cores and 8 logical processors. I downloaded the mouse genome and genes.gtf files from the iGenomes website, and I am using the WholeGenome.fa file from Ensembl. Is this genome too big, so that I am hitting a RAM limitation? Should I generate my index chromosome by chromosome? How long should index generation take?

This is my command:

./STAR --runMode genomeGenerate --genomeDir /cygdrive/c/Ana_Gómez_Secuenciación/CM1_FACS/20160818_Carpeta_de_trabajo_H3YJLBGXY/index --genomeFastaFiles /cygdrive/c/Ana_Gómez_Secuenciación/Genome/reference/Mus_musculus/Ensembl/GRCm38/Sequence/WholeGenomeFasta/genome.fa --runThreadN 6 --sjdbGTFfile /cygdrive/c/Ana_Gómez_Secuenciación/Genome/GTF_files/referenceGTF/genes.gtf --sjdbOverhang 75 --genomeSAsparseD parameter 1

**cmbetts** · 10-25-2016, 09:31 AM

Generally, you can't just drop a Linux binary into a Cygwin environment and expect it to run. As you alluded to, you almost certainly have to use MinGW to compile your own binaries. As someone who's slammed their head into a wall repeatedly trying to compile NGS analysis tools in Cygwin (I really wish I'd documented how I got samtools to compile properly that one time!), I'd highly recommend running Linux, either in a VM (I run Ubuntu Server under VirtualBox on my work-mandated Windows PC) or natively as a dual boot. You'll find nothing but pain trying to get a usable NGS environment going on Windows, while almost everything you'd want to use was designed for Linux and probably has a precompiled binary available for it (not to mention a competent command line, which is how nearly all of the tools are run).


STAR rna seq Aligner installation
