
  • 2-pass speed about 7 M/hr

    hello,

    Is it possible that the mapping speed for the 2nd pass drops to 7 M/hr when 900,000 new splice sites and a comprehensive gene model (GENCODE v19) are used for index generation?
    The first pass was ~100-fold faster (700 M/hr).

    My exact command:
    /home/ws/SW_install/STAR/STAR/source/STAR --runThreadN 31 --outSAMstrandField intronMotif --outSAMtype BAM SortedByCoordinate --genomeDir $indices --readFilesCommand zcat --readFilesIn /home/ws/data/PatientData/$b/m*/$s/t*/date*/$f1 /home/ws/data/PatientData/$b/m*/$s/t*/date*/$f2 --outFileNamePrefix $name

    This ran with 31 cores on a 192 GByte Scientific Linux 7 workstation.

    Can I do something to improve speed?

    dietmar

    Comment


    • Originally posted by neokao View Post
      Thanks, Alex.

      I don't know what's causing the problem for multi-threading on my Mac mini.
      I got the same error even after I rebooted the computer and started the mapping freshly.

      Anyways, I went ahead and started the mapping one by one with --runThreadN 1 already.
      It is weird that even with --runThreadN 1 in code like this:

      > STAR --genomeDir ./GenomeDir/ --readFilesIn ./BGI_RNAseq_data_2015/01.fq --runThreadN 1

      , it sometimes worked but sometimes did not.

      For the same .fq file, it could give the Killed: 9 error, and when I reran the exact same command, it went through successfully. Very strange.

      My .fq files do have distinct prefixes and are numbered with two digits as described before: 01.fq, 02.fq, etc. Could you shed more light on --readFilesIn XX.fq --outFileNamePrefix XX? Thanks.

      You can map each of the FASTQ files in a separate directory, e.g. 01/, 02/ ... The output files in all directories will have the same names, such as 01/Aligned.out.sam 02/Aligned.out.sam ...
      Alternatively, you can run all STAR jobs in one directory but with different prefixes corresponding to your FASTQ files, i.e.
      STAR --readFilesIn 01.fastq --outFileNamePrefix 01_
      STAR --readFilesIn 02.fastq --outFileNamePrefix 02_
      In this case the output files will have the specified prefix for each of the runs, i.e.
      01_Aligned.out.sam, 02_Aligned.out.sam ...
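
The second option (one directory, one prefix per FASTQ file) could be scripted roughly as follows. This is a dry-run sketch that only prints the STAR commands. The helper name `star_cmd_for`, the `GENOME_DIR` variable, and the file list are illustrative assumptions; the flags are the ones already used in this thread.

```shell
#!/bin/sh
# Dry-run sketch: print one STAR command per FASTQ file, each with its own
# output prefix. Nothing is executed; pipe the output to `sh` to run it.
# GENOME_DIR and star_cmd_for are hypothetical names used for illustration.
GENOME_DIR="./GenomeDir/"

star_cmd_for() {
    fq="$1"
    sample="$(basename "$fq")"   # ./data/01.fq -> 01.fq
    sample="${sample%.*}"        # 01.fq -> 01
    echo "STAR --genomeDir $GENOME_DIR --readFilesIn $fq --runThreadN 1 --outFileNamePrefix ${sample}_"
}

for fq in 01.fq 02.fq; do
    star_cmd_for "$fq"
done
```

Deriving the prefix from the file name keeps the outputs of different runs from overwriting each other.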

      I suspect that there is some problem with RAM management as STAR takes almost all of the available RAM.
      Can you try to reboot your machine - I heard that this helps some Mac systems to "declutter" RAM?
      Also, please run "top" command while running STAR to see how much memory is being used.

      Cheers
      Alex

      Comment


      • Originally posted by dietmar13 View Post
        hello,

        Is it possible that the mapping speed for the 2nd pass drops to 7 M/hr when 900,000 new splice sites and a comprehensive gene model (GENCODE v19) are used for index generation?
        The first pass was ~100-fold faster (700 M/hr).

        My exact command:
        /home/ws/SW_install/STAR/STAR/source/STAR --runThreadN 31 --outSAMstrandField intronMotif --outSAMtype BAM SortedByCoordinate --genomeDir $indices --readFilesCommand zcat --readFilesIn /home/ws/data/PatientData/$b/m*/$s/t*/date*/$f1 /home/ws/data/PatientData/$b/m*/$s/t*/date*/$f2 --outFileNamePrefix $name

        This ran with 31 cores on a 192 GByte Scientific Linux 7 workstation.

        Can I do something to improve speed?

        dietmar

        Hi Dietmar,

        There have been some reports of a slowdown in the 2nd pass. In one case it was caused by splice junctions (likely false positives) in the mitochondrial genome: https://groups.google.com/d/msg/rna-...Y/0jSn0vy0ccgJ.
        If filtering out chrM junctions does not help, please send me the list of junctions from the 1st pass and a few million reads for testing.
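
Filtering the mitochondrial junctions out of the 1st-pass junction list could look roughly like this. The sketch assumes the junctions are in STAR's tab-separated SJ.out.tab format with the chromosome name in the first column, and that the mitochondrial sequence is named chrM in the reference. The function name and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: drop junctions on the mitochondrial chromosome from a junction list.
# Assumes a tab-separated file whose first column is the chromosome name
# (as in STAR's SJ.out.tab) and that the mitochondrion is called "chrM".
filter_chrM() {
    awk -F '\t' '$1 != "chrM"'
}

# Hypothetical usage (file names are placeholders):
#   filter_chrM < SJ.out.tab > SJ.filtered.tab

# Small self-contained demonstration with two made-up junction lines;
# only the chr1 line is printed.
printf 'chr1\t14830\t14969\t2\t2\t1\t0\t5\t33\nchrM\t1000\t2000\t1\t1\t0\t7\t0\t40\n' | filter_chrM
```

The filtered list would then replace the original junction list when the index for the 2nd pass is regenerated.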

        Cheers
        Alex

        Comment


        • Originally posted by alexdobin View Post
          I suspect that there is some problem with RAM management as STAR takes almost all of the available RAM.
          Can you try to reboot your machine - I heard that this helps some Mac systems to "declutter" RAM?
          Also, please run "top" command while running STAR to see how much memory is being used.

          Cheers
          Alex

          I guess so too. I manually ran these 20 .fq files, with occasional Killed: 9 errors. I found that a run would usually go through if I ran the EXACT same command again (even without rebooting OS X). However, now I am really stuck on the biggest .fq file (~6.6G). For that particular .fq file, I get an Abort trap: 6 error at the "..... Started sorting BAM" step. It happens every time (I have tried 6-7 times so far, even after a fresh reboot). I did not see anything unusual with the top command.
          I also tried --limitIObufferSize 100000000 but still got the Abort trap: 6 error.
          It is frustrating since this is the last file to map. The particular log.out file is attached here. Thanks for the advice.
          Attached Files
          Last edited by neokao; 03-31-2015, 07:05 AM.

          Comment


          • Originally posted by neokao View Post
            I guess so too. I manually ran these 20 .fq files, with occasional Killed: 9 errors. I found that a run would usually go through if I ran the EXACT same command again (even without rebooting OS X). However, now I am really stuck on the biggest .fq file (~6.6G). For that particular .fq file, I get an Abort trap: 6 error at the "..... Started sorting BAM" step. It happens every time (I have tried 6-7 times so far, even after a fresh reboot). I did not see anything unusual with the top command.
            I also tried --limitIObufferSize 100000000 but still got the Abort trap: 6 error.
            It is frustrating since this is the last file to map. The particular log.out file is attached here. Thanks for the advice.

            Please try the latest STAR release https://github.com/alexdobin/STAR/re...ag/STAR_2.4.0k - I have improved the BAM sorting and it should now require less RAM. Also, it may be safer to set a separate RAM limit for BAM sorting, say --limitBAMsortRAM 10000000000 .
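
A rough sketch of a run with a separate BAM-sorting RAM limit, as suggested above. It only prints the command. The helper names, the genome directory, and the input and prefix names are illustrative assumptions. --limitBAMsortRAM takes the limit in bytes, so 10 decimal gigabytes is written as 10000000000, matching the value quoted above.

```shell
#!/bin/sh
# Dry-run sketch: build a STAR command with an explicit BAM-sorting RAM limit.
# gb_to_bytes and star_sorted_bam_cmd are hypothetical helper names.
gb_to_bytes() {
    # Decimal gigabytes -> bytes (10 -> 10000000000).
    echo $(( $1 * 1000 * 1000 * 1000 ))
}

star_sorted_bam_cmd() {
    fq="$1"; prefix="$2"; ram_gb="$3"
    echo "STAR --genomeDir ./GenomeDir/ --readFilesIn $fq --outSAMtype BAM SortedByCoordinate --limitBAMsortRAM $(gb_to_bytes "$ram_gb") --outFileNamePrefix $prefix"
}

# Placeholder input file and prefix; prints the command without running it.
star_sorted_bam_cmd big.fq big_ 10
```

Picking a limit comfortably below the machine's physical RAM leaves room for the genome index that STAR keeps in memory.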

            Cheers
            Alex

            Comment


            • Originally posted by alexdobin View Post
              Please try the latest STAR release https://github.com/alexdobin/STAR/re...ag/STAR_2.4.0k - I have improved the BAM sorting and it now should require less RAM. Also, it may be safer to use a separate BAM sorting limit for RAM, say --limitBAMsortRAM 10000000000 .

              Cheers
              Alex

              I tried that particular .fq file using the old STAR with SAM output, and then it went through.
              I still want to test your new version.
              (Thanks for your new code.) However, I got an error when I tried to compile it:
              clang: error: no such file or directory: 'htslib/libhts.a'
              make: *** [STARforMac] Error 1

              (I did install gcc on OS X Yosemite.)

              Comment


              • @neokao: You need to install the new htslib library that is part of the samtools package: http://sourceforge.net/projects/samt...iles/samtools/

                Comment


                • I do have SAMtools installed. Under my NGS folder, I have a samtools-1.2 folder and a STAR-STAR_2.4.0k folder. I ran make STARforMac in the source directory (in STAR-STAR_2.4.0k).
                  Any advice? Thanks.

                  Comment


                  • You could make an "htslib" directory in your STAR source directory and copy libhts.a into it.
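
That suggestion might be scripted roughly as below. This is a sketch under assumptions: `copy_libhts` is a hypothetical helper, the exact location of libhts.a inside the samtools tree is not given in the thread (so the script searches for it), and the library must already have been built.

```shell
#!/bin/sh
# Sketch: copy a built libhts.a into <STAR source>/htslib/, where the STAR
# build looked for it in the error above ('htslib/libhts.a').
# copy_libhts is a hypothetical helper; both directory arguments are placeholders.
copy_libhts() {
    samtools_dir="$1"
    star_src="$2"
    # Locate the static library anywhere under the samtools tree.
    lib="$(find "$samtools_dir" -type f -name libhts.a | head -n 1)"
    if [ -z "$lib" ]; then
        echo "libhts.a not found under $samtools_dir (has it been built?)" >&2
        return 1
    fi
    mkdir -p "$star_src/htslib"
    cp "$lib" "$star_src/htslib/"
}

# Hypothetical usage, assuming both folders sit under ~/NGS:
#   copy_libhts ~/NGS/samtools-1.2 ~/NGS/STAR-STAR_2.4.0k/source
```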

                    Comment


                    • Compilation on Mac is tricky because the default compiler, clang, does not support OpenMP, which STAR uses. Please try to compile with
                      make STARforMacStatic CXX=/path/to/gcc

                      Comment


                      • Originally posted by alexdobin View Post
                        Compilation on Mac is tricky because the default compiler, clang, does not support OpenMP, which STAR uses. Please try to compile with
                        make STARforMacStatic CXX=/path/to/gcc

                        Following your suggestion, I got a different error:
                        /bin/sh: /path/to/gcc: No such file or directory
                        make: *** [Depend.list] Error 127

                        I did install gcc.
                        gcc --version
                        Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
                        Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
                        Target: x86_64-apple-darwin14.1.0
                        Thread model: posix

                        Thanks.

                        Comment


                        • Your gcc should be in /usr/bin. Confirm that by

                          Code:
                          $ which gcc
                          When you compile STAR do

                          Code:
                          $ make STARforMacStatic CXX=/usr/bin/gcc

                          Comment


                          • Yes. My gcc is there.
                            So I ran make STARforMacStatic CXX=/usr/bin/gcc but still got an error:

                            clang: error: unsupported option '-static-libgcc'
                            make: *** [STARforMacStatic] Error 1

                            Thanks folks.

                            Originally posted by GenoMax View Post
                            Your gcc should be in /usr/bin. Confirm that by

                            Code:
                            $ which gcc
                            When you compile STAR do

                            Code:
                            $ make STARforMacStatic CXX=/usr/bin/gcc

                            Comment


                            • See if the second answer in this thread helps: http://stackoverflow.com/questions/1...-osx-mavericks

                              Otherwise you will have to wait for Alex to respond.

                              Comment


                              • The Mac's /usr/bin/gcc (which is on the PATH so you can invoke it with simply gcc) symlinks to clang, so you are still trying to compile with clang.

                                When you installed (configured) the true gcc, did you use the --prefix option? You need to find the installation path of the true gcc.
                                You need to be able to check the version:
                                /path/to/gcc/g++ -v

                                and it should say something like (not Apple LLVM clang etc):

                                Using built-in specs.
                                Target: x86_64-redhat-linux
                                Configured with: ../configure --prefix=/opt/hpc
                                Thread model: posix
                                gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)


                                If it works, compile STAR with (it has to be g++, not gcc - I made a mistake in the previous post):
                                make STARforMacStatic CXX=/path/to/g++
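
The version check described above could be automated roughly like this. It is a sketch: `is_real_gcc` is a hypothetical helper that only inspects the banner printed by `-v`, and /path/to/g++ remains a placeholder for wherever the true GCC was installed.

```shell
#!/bin/sh
# Sketch: decide whether a compiler binary is real GCC or clang in disguise,
# based on the banner printed by `-v` (which goes to stderr).
# is_real_gcc is a hypothetical helper name.
is_real_gcc() {
    banner="$("$1" -v 2>&1)"
    case "$banner" in
        *clang*) return 1 ;;          # e.g. "Apple LLVM version 6.0 (clang-600.0.57)"
        *"gcc version"*) return 0 ;;  # e.g. "gcc version 4.4.6 ... (GCC)"
        *) return 1 ;;                # unknown banner or missing binary
    esac
}

# Hypothetical usage (the path is a placeholder, as in the post above):
#   if is_real_gcc /path/to/g++; then
#       make STARforMacStatic CXX=/path/to/g++
#   fi
```

Checking for clang first matters because the Mac's /usr/bin/gcc is clang under a different name.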

                                Cheers
                                Alex

                                Comment
