Installing chainermn on osc
In one of our assignments, we wanted to analyze the performance of chainermn in a multi-node environment, so we tried to install chainermn on osc. It turns out to be a bit complicated. For the impatient:
- Get the base modules:
module load python/3.6-conda5.2 gnu/6.1.0 cuda/9.1.85 mvapich2/2.3rc2-gpu
- Create a virtualenv:
conda create -n myenv python=3.6
. /usr/local/python/3.6-conda5.2/etc/profile.d/conda.sh
source activate myenv
- We need to install mpi4py from source code (not using pip):
wget https://bitbucket.org/mpi4py/mpi4py/downloads/mpi4py-3.0.0.tar.gz
tar xf mpi4py-3.0.0.tar.gz
cd mpi4py-3.0.0
- Replace mpi.cfg with:
[mpi]
mpicc = /opt/mvapich2/gnu/6.1/2.3rc2-gpu/bin/mpicc
mpicxx = /opt/mvapich2/gnu/6.1/2.3rc2-gpu/bin/mpicxx
include_dirs = /opt/mvapich2/gnu/6.1/2.3rc2-gpu/include
libraries = cudart
library_dirs = /opt/mvapich2/gnu/6.1/2.3rc2-gpu/lib:/usr/local/cuda/9.1.85/lib64
Observations:
  - We need cudart with the gpu-enabled mpi.
  - A : separates multiple entries, as in library_dirs.
- Build and install mpi4py:
python setup.py build
python setup.py install
- Install chainer:
pip install chainer
This will also install cupy; if it doesn't (check the output), you will need to install it yourself.
- Now the last command would install cupy-92, but remember that our mpi version needed cuda 9.1. So once you log in to the gpu node, we cheat as follows:
# we need to load modules again on the compute node
module load python/3.6-conda5.2 gnu/6.1.0 cuda/9.1.85 mvapich2/2.3rc2-gpu
# init conda defaults
. /usr/local/python/3.6-conda5.2/etc/profile.d/conda.sh
# activate the environment which has everything installed
source activate myenv
# do the cheating so that cupy-92 can load the cuda 9.2 cublas
# while mpi can still be happy with cuda 9.1
export LD_LIBRARY_PATH=/usr/local/cuda/9.2.88/lib64:$LD_LIBRARY_PATH
# make sure mvapich2 *actually* uses gpu!
export MV2_USE_CUDA=1
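Since the two export lines in the last step are easy to forget on a fresh compute node, a small check can help. The following is only a sketch: the helper name check_env is made up for illustration, and the cuda 9.2 path is simply copied from the export line in the last step.

```python
import os

# Path copied from the LD_LIBRARY_PATH export in the last step (adjust if your site differs).
CUDA_92_LIB = "/usr/local/cuda/9.2.88/lib64"


def check_env(env):
    """Return a list of problems found in env; an empty list means the settings look right."""
    problems = []
    # mvapich2 only uses the gpu path when this variable is set to 1.
    if env.get("MV2_USE_CUDA") != "1":
        problems.append("MV2_USE_CUDA is not set to 1")
    # LD_LIBRARY_PATH is a colon-separated list; ignore empty entries.
    entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if CUDA_92_LIB not in entries:
        problems.append(CUDA_92_LIB + " is missing from LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    for line in check_env(os.environ) or ["environment looks OK"]:
        print(line)
```

Running it on the compute node after the exports should print "environment looks OK"; anything else names the setting that is missing.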
With all this set up, you should be able to run chainermn on osc.
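Before launching a real job, it may be worth confirming that every piece actually imports inside the activated environment. Below is a minimal sketch of such a check; the function name probe is hypothetical, and the module list assumes a chainer release that ships chainermn as an importable module.

```python
import importlib


def probe(names):
    """Try to import each module; map its name to a version string or an error message."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:
            # Usually an ImportError, e.g. when a shared library such as cublas cannot be loaded.
            report[name] = "FAILED: {}: {}".format(type(exc).__name__, exc)
        else:
            report[name] = str(getattr(module, "__version__", "unknown version"))
    return report


if __name__ == "__main__":
    # Assumed module list for this setup; edit it to match what you installed.
    for name, status in probe(["mpi4py", "cupy", "chainer", "chainermn"]).items():
        print("{:10s} {}".format(name, status))
```

A FAILED line for cupy that mentions a missing library is a hint that the LD_LIBRARY_PATH export was skipped.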
Happy coding!