Monday 2 March 2009

[NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182

OpenMPI failure (fixed)

$ mpiexec -n 4 fe2.py

[host-desktop1:09127] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

orte_rml_base_select failed
--> Returned value -13 instead of ORTE_SUCCESS

--------------------------------------------------------------------------
[host-desktop1:09127] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_system_init.c at line 42
[host-desktop1:09127] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 52
--------------------------------------------------------------------------
Open RTE was unable to initialize properly. The error occured while
attempting to orte_init(). Returned value -13 instead of ORTE_SUCCESS.
--------------------------------------------------------------------------

The first step was to reinstall from a fresh build.

cd downloads
wget http://www.open-mpi.org/software/ompi/v1.3/downloads/openmpi-1.3.tar.gz
tar xzf openmpi-1.3.tar.gz

Reading the README showed that PATH and LD_LIBRARY_PATH each need a specific entry. ompi_info, a program installed in the bin directory, reports the MPI status, but its location is of more interest here.

whereis ompi_info
ompi_info: /usr/bin/ompi_info /usr/share/man/man1/ompi_info.1.gz

So /usr/bin needs to be in PATH, and /usr/lib needs to be in LD_LIBRARY_PATH.
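A quick way to confirm both entries before rerunning mpiexec is a small shell check. This is only a sketch: path_contains is a hypothetical helper written for this post, not something shipped with Open MPI, and the directories are the ones reported by whereis above.

```shell
# path_contains is a hypothetical helper, not part of Open MPI.
# It succeeds if directory $2 is an entry in the colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the directory that holds ompi_info.
if path_contains "$PATH" /usr/bin; then
    echo "PATH has /usr/bin"
else
    echo "PATH is missing /usr/bin"
fi

# Check the matching library directory; treat an unset variable as empty.
if path_contains "${LD_LIBRARY_PATH:-}" /usr/lib; then
    echo "LD_LIBRARY_PATH has /usr/lib"
else
    echo "LD_LIBRARY_PATH is missing /usr/lib"
fi
```

Wrapping the list in colons before matching means /usr/bin is only found as a whole entry, so /usr/bin/extra or /opt/usr/bin will not count as a match.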

Checking .bashrc showed that LD_LIBRARY_PATH was overwritten where it was first set, discarding any value it already had. The first line was changed from

LD_LIBRARY_PATH=blahbah

to

LD_LIBRARY_PATH=blahbah:$LD_LIBRARY_PATH

and things started working again.
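One small refinement, as far as I can tell: if LD_LIBRARY_PATH is empty when .bashrc runs, the line above leaves a trailing colon, and the dynamic linker may treat that empty entry as the current directory. A sketch that avoids this, where prepend_entry is a hypothetical helper and /opt/example/lib is a placeholder standing in for the real entry:

```shell
# prepend_entry is a hypothetical helper: it prints $1 followed by ":$2",
# but only adds the colon and $2 when $2 is non-empty.
prepend_entry() {
    printf '%s\n' "$1${2:+:$2}"
}

# /opt/example/lib is a placeholder for the directory set in .bashrc.
LD_LIBRARY_PATH=$(prepend_entry /opt/example/lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

The ${2:+:$2} expansion is standard POSIX shell, so this should work in bash as well as plain sh.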

1 comment:

  1. Glad you got it working.

    We're working to improve our error messages to make such things a bit more clear in future versions.
