-
Originally posted by sheepyuan
The same to u!

sheepyuan: what do you mean by "The same to u!"?
If you are referring to talioto's post, I don't think it is a good idea to disable shared memory, as it is the fastest way to do message passing between processes on the same machine.
Also, Open-MPI 1.4.2 is very old; the current stable release of Open-MPI is 1.6.1, and a lot of improvements have been added since 1.4.2!
gcc 4.1.2 is very old too, although I don't think that will change much.
Last edited by seb567; 09-25-2012, 03:48 AM.
Comment
-
You need to install the openmpi package. For example, if you are using Fedora, run 'yum install openmpi openmpi-devel'. If the packages are already installed, make sure they are on your PATH (you can set that in your .bash_profile). If you are running Ray from a remote 'screen' session, make sure you source your .bash_profile there too.
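Something like this in ~/.bash_profile usually does it (the exact directories vary by distribution and Open-MPI version, so treat the paths below as guesses and check where your packages actually installed):
[code]
# Load the Open-MPI environment module, if environment-modules is installed
module load openmpi-x86_64 2>/dev/null

# Otherwise, add the wrapper and library directories by hand
# (paths are typical for Fedora/RHEL x86_64; verify with 'rpm -ql openmpi-devel')
export PATH=/usr/lib64/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib:$LD_LIBRARY_PATH
[/code]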
Comment
-
Guys,
I examined this thread from the very beginning but could not find an answer to my problem. Sorry for the silly question. I tried to install Ray 2.0.0 and failed on two machines, one running SciLinux 5.5 and the other RHEL 5.5, which are essentially the same. Here is the output:
[code]
[yaximik@SciLinux55 Ray-v2.0.0]$ make PREFIX=ray-build
make[1]: Entering directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/RayPlatform'
mpic++ -Wall -ansi -O3 -D MAXKMERLENGTH=32 -D RAY_VERSION=\"2.0.0\" -D RAYPLATFORM_VERSION=\"1.0.3\" -I. -c -o memory/ReusableMemoryStore.o memory/ReusableMemoryStore.cpp
make[1]: mpic++: Command not found
make[1]: *** [memory/ReusableMemoryStore.o] Error 127
make[1]: Leaving directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/RayPlatform'
make[1]: Entering directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/code'
mpic++ -Wall -ansi -O3 -D MAXKMERLENGTH=32 -D RAY_VERSION=\"2.0.0\" -I ../RayPlatform -I. -c -o application_core/ray_main.o application_core/ray_main.cpp
make[1]: mpic++: Command not found
make[1]: *** [application_core/ray_main.o] Error 127
make[1]: Leaving directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/code'
mpic++ code/TheRayGenomeAssembler.a RayPlatform/libRayPlatform.a -o Ray
make: mpic++: Command not found
make: *** [Ray] Error 127
[yaximik@SciLinux55 Ray-v2.0.0]$
[/code]
Here is the output from RHEL 5.5:
[code]
[yaximik@G5NNJN1 Ray-v2.0.0]$ make PREFIX=ray-build
make[1]: Entering directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/RayPlatform'
mpicxx -Wall -ansi -O3 -D MAXKMERLENGTH=32 -D RAY_VERSION=\"2.0.0\" -D RAYPLATFORM_VERSION=\"1.0.3\" -I. -c -o memory/ReusableMemoryStore.o memory/ReusableMemoryStore.cpp
make[1]: mpicxx: Command not found
make[1]: *** [memory/ReusableMemoryStore.o] Error 127
make[1]: Leaving directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/RayPlatform'
make[1]: Entering directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/code'
mpicxx -Wall -ansi -O3 -D MAXKMERLENGTH=32 -D RAY_VERSION=\"2.0.0\" -I ../RayPlatform -I. -c -o application_core/ray_main.o application_core/ray_main.cpp
make[1]: mpicxx: Command not found
make[1]: *** [application_core/ray_main.o] Error 127
make[1]: Leaving directory `/home/yaximik/Bioinformatics/Ray-v2.0.0/code'
mpicxx code/TheRayGenomeAssembler.a RayPlatform/libRayPlatform.a -o Ray
make: mpicxx: Command not found
make: *** [Ray] Error 127
[yaximik@G5NNJN1 Ray-v2.0.0]$
[/code]
Essentially the same. I have
openmpiwrappers-openmpi-1-4.el5.x86_64
openmpi-1.4-4.el5.x86_64
openmpi-devel-1.4-4.el5.x86_64
openmpi-libs-1.4-4.el5.x86_64
installed. Both machines are 64-bit; one has 2 processors and 8 GB RAM, the other has 16 processors and 96 GB RAM. Please help, as I'd like to try Ray 2.0.0 on my project.
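In case it helps, this is how I am checking where the wrappers ended up (the path in the last line is a guess based on the package names; I will use whatever rpm actually reports):
[code]
# Look for the MPI compiler wrappers
which mpic++ mpicxx
rpm -ql openmpi-devel | grep bin/

# If the wrappers are outside the default PATH, add their directory,
# e.g. (guessed; substitute the directory that rpm -ql shows):
export PATH=/usr/lib64/openmpi/1.4-gcc/bin:$PATH
[/code]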
Comment
-
Ray runs well when I use a single node, but when utilizing more than one node I get an MPI exit code, like this:
Ray:25109 terminated with signal 11 at PC=5718e0 SP=7fff9eb8a838. Backtrace:
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZNK14ReadAnnotation7getRankEv+0x0)[0x5718e0]
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN40Adapter_RAY_MPI_TAG_REQUEST_VERTEX_READS4$
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN18MessageTagExecutor11callHandlerEiP7Messag$
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN11ComputeCore3runEv+0x3cc)[0x5985ec]
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN7Machine5startEv+0x1d8d)[0x46906d]
/home/bstamps/Ray/Ray-v2.0.0/Ray(main+0x73)[0x464d73]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2b3fbc934cdd]
/home/bstamps/Ray/Ray-v2.0.0/Ray[0x464c39]
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 25094 on
node c310 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
Thoughts?
Comment
-
Originally posted by bstamps
Ray runs well when I use a single node, but when utilizing more than one node I get an MPI exit code, like this:
...
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 25094 on
node c310 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
I am not a big Ray user, but I will sometimes get the above problem, and then when I do a re-run the problem goes away. I think it has to do with my cluster's setup. I suggest trying a small run with one job per node, just to make sure that everything works.
Not much help, I know, but the general idea is that the problem may be with your hardware setup and not with Ray.
Comment
-
It appears setting my ptile below the maximum per node (16) has solved the problem...I'll have to go bug my computing center as to why 15 is kosher and 16 causes MPI to die. Either way I'm very happy with Ray's performance- being able to span my job across 4500 cores has sped assembly up quite a bit...
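For reference, the submission now looks roughly like this (ptile here is the LSF span resource; the core count and Ray arguments below are just placeholders):
[code]
# Request 15 slots per node instead of the full 16
bsub -n 450 -R "span[ptile=15]" \
    mpirun Ray -k 31 -p left.fastq right.fastq -o RayAssembly
[/code]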
Comment
-
I spoke a little too soon: Ray appears to be throwing segmentation faults randomly through the assembly process on random nodes. Adding "-route-messages" seems to have helped, but my jobs still fail every so often. The computing center seems to think it's an issue with Ray, but I'm curious what the community thinks.
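For reference, my invocation now looks roughly like this (input files and k-mer length are placeholders):
[code]
mpirun Ray -route-messages -k 31 \
    -p reads_left.fastq reads_right.fastq \
    -o RayOutput
[/code]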
Comment
-
Originally posted by bstamps
Ray runs well when I use a single node, but when utilizing more than one node I get an MPI exit code, like this:
Ray:25109 terminated with signal 11 at PC=5718e0 SP=7fff9eb8a838. Backtrace:
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZNK14ReadAnnotation7getRankEv+0x0)[0x5718e0]
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN40Adapter_RAY_MPI_TAG_REQUEST_VERTEX_READS4$
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN18MessageTagExecutor11callHandlerEiP7Messag$
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN11ComputeCore3runEv+0x3cc)[0x5985ec]
/home/bstamps/Ray/Ray-v2.0.0/Ray(_ZN7Machine5startEv+0x1d8d)[0x46906d]
/home/bstamps/Ray/Ray-v2.0.0/Ray(main+0x73)[0x464d73]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2b3fbc934cdd]
/home/bstamps/Ray/Ray-v2.0.0/Ray[0x464c39]
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 25094 on
node c310 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
Thoughts?
Ray v2.1.0 was released today. It contains a lot of bug fixes, including fixes for two bugs that could lead to segmentation faults.
Comment
-
Originally posted by westerman
I am not a big Ray user, but I will sometimes get the above problem, and then when I do a re-run the problem goes away. I think it has to do with my cluster's setup. I suggest trying a small run with one job per node, just to make sure that everything works.
Not much help, I know, but the general idea is that the problem may be with your hardware setup and not with ray.
Comment
-
Originally posted by bstamps
It appears setting my ptile below the maximum per node (16) has solved the problem...I'll have to go bug my computing center as to why 15 is kosher and 16 causes MPI to die. Either way I'm very happy with Ray's performance- being able to span my job across 4500 cores has sped assembly up quite a bit...

Originally posted by bstamps
across 4500 cores
Comment
-
Originally posted by bstamps
I spoke a little too soon: Ray appears to be throwing segmentation faults randomly through the assembly process on random nodes. Adding "-route-messages" seems to have helped, but my jobs still fail every so often. The computing center seems to think it's an issue with Ray, but I'm curious what the community thinks.
Can you send an email to the mailing list with your hardware details and your Ray command?
Pure MPI applications may not be the answer for very large clusters; hybrid programming models are likely better.
We have work in progress on a new hybrid programming model. At the moment (v2.1.0, for instance), Ray only uses MPI. So when you run on 8 nodes * 24 cores/node = 192 cores, Ray is launched as 192 processes, with 24 processes per node.
We have devised a new programming model called "mini-ranks". If you Google "mini-ranks", you will mostly find hits about Lego blocks, because "mini-ranks" in parallel programming is new; I believe we invented the term ourselves!
Our implementation of the mini-ranks model uses 1 MPI process per node, with 23 POSIX threads per process plus an additional communication thread on each node. The mini-ranks run inside the POSIX threads, and the MPI rank itself does not do much.
Ray is already ported to that model (mini-ranks implemented with MPI+POSIX threads) in the git source tree.
Instead of launching like this:
mpiexec -n 192 Ray ...
You launch it like this:
mpiexec -n 8 -bynode Ray -mini-ranks-per-rank 23 ...
Note that our "mini-ranks" implementation needs 1 communication thread on each node.
Although this is experimental, you may be interested in testing it on your hardware.
The branch is called minirank-model, should you want to check it out.
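Roughly, the pattern looks like this (a simplified sketch of the general MPI + POSIX threads shape, not the actual RayPlatform code; the message queueing between mini-ranks and the communication thread is elided):
[code]
// Simplified sketch of the mini-ranks pattern: 1 MPI process per node,
// 23 worker threads ("mini-ranks") doing the computation, and the main
// thread acting as the per-node communication thread.
#include <mpi.h>
#include <pthread.h>
#include <cstdio>

const int MINI_RANKS_PER_RANK = 23;

void* miniRank(void* argument) {
    long identifier = (long)argument;
    // A real mini-rank would process its messages here and hand outgoing
    // messages to the communication thread through a queue, never calling
    // MPI directly.
    printf("mini-rank %ld is working\n", identifier);
    return NULL;
}

int main(int argc, char** argv) {
    int provided = 0;
    // FUNNELED: only the thread that called MPI_Init_thread makes MPI calls.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    pthread_t workers[MINI_RANKS_PER_RANK];
    for (long i = 0; i < MINI_RANKS_PER_RANK; i++)
        pthread_create(&workers[i], NULL, miniRank, (void*)i);

    // The communication thread would loop here, moving messages between
    // the mini-rank queues and MPI_Isend/MPI_Iprobe/MPI_Recv.

    for (int i = 0; i < MINI_RANKS_PER_RANK; i++)
        pthread_join(workers[i], NULL);

    MPI_Finalize();
    return 0;
}
[/code]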
Sébastien Boisvert
Ray maintainer
Comment