Linux (NVIDIA GPU)¶
For machines with NVIDIA GPUs. Adds GPU and MPI+GPU variants on top of the CPU baseline.
GPU installs are advanced
For CPU-only work, use the Quick Installation Superbuild first. NVIDIA GPU builds require CUDA, a CUDA-enabled Kokkos build, architecture flags, and often CUDA-aware MPI.
Read GPU Installation Overview before following this page.
System packages¶
CUDA toolkit (12.x or later): install via the NVIDIA CUDA repository.
After install, nvcc --version should report your toolkit version.
CUDA-aware OpenMPI¶
Required for MPI+GPU. The system OpenMPI is almost never built with CUDA awareness; build from source:
EXASIM=$PWD
MPI_PREFIX=$EXASIM/openmpi
mkdir -p $EXASIM/build_openmpi && cd $EXASIM/build_openmpi
wget https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.6.tar.gz
tar xf openmpi-4.1.6.tar.gz && cd openmpi-4.1.6
PATH=/usr/local/cuda/bin:$PATH ./configure \
--prefix=$MPI_PREFIX \
--with-cuda=/usr/local/cuda \
--disable-mpi-fortran --disable-oshmem
make -j16 install
export PATH=$MPI_PREFIX/bin:$PATH
ompi_info | grep -i cuda
ompi_info should list cuda under MPI extensions. For CPU-only
MPI, omit the CUDA bits.
CPU Baseline¶
Build or install a CPU baseline first. For most users this should be the CPU Superbuild. Use manual dependency steps only if your GPU build requires explicit site dependency control.
Manual Dependency Fallback¶
Follow Manual Dependency Installation and Common Dependencies for GKlib, METIS, ParMETIS, Kokkos serial, SymEngine, and Text2Code when the Superbuild is not sufficient.
Build Kokkos CUDA¶
cd $EXASIM/kokkos
mkdir buildcuda && cd buildcuda
cmake .. -DCMAKE_INSTALL_PREFIX=$EXASIM/kokkos/buildcuda \
-DKokkos_ENABLE_SERIAL=ON \
-DKokkos_ENABLE_CUDA=ON \
-DKokkos_ARCH_VOLTA70=ON \
-DCMAKE_CXX_COMPILER=$EXASIM/kokkos/buildcuda/bin/nvcc_wrapper
make -j16 install
Kokkos_ARCH_* selects the GPU architecture. Common values:
| Flag | Hardware |
|---|---|
Kokkos_ARCH_VOLTA70 |
V100 |
Kokkos_ARCH_AMPERE80 |
A100 |
Kokkos_ARCH_HOPPER90 |
H100 |
Kokkos_ARCH_TURING75 |
T4, RTX 20-series |
Full list in kokkos/cmake/kokkos_arch.cmake.
Build Exasim¶
GPU (single-rank):
cd $EXASIM
cmake -S install -B build_gpu \
-DEXASIM_CUDA=ON -DEXASIM_NOMPI=ON \
-DKokkos_DIR=$EXASIM/kokkos/buildcuda/lib/cmake/Kokkos \
-DEXASIM_BUILD_LIBRARY_EXAMPLES=ON
PATH=/usr/local/cuda/bin:$PATH cmake --build build_gpu -j8
CPU + MPI:
PATH=$MPI_PREFIX/bin:$PATH \
cmake -S install -B build_mpi \
-DEXASIM_CUDA=OFF -DEXASIM_HIP=OFF \
-DEXASIM_MPI=ON -DEXASIM_NOMPI=OFF \
-DWITH_PARMETIS=ON \
-DKokkos_DIR=$EXASIM/kokkos/buildserial/lib/cmake/Kokkos \
-DEXASIM_BUILD_LIBRARY_EXAMPLES=ON \
-DMPI_CXX_COMPILER=$MPI_PREFIX/bin/mpicxx \
-DMPI_C_COMPILER=$MPI_PREFIX/bin/mpicc
PATH=$MPI_PREFIX/bin:$PATH cmake --build build_mpi -j16
GPU + MPI:
PATH=/usr/local/cuda/bin:$MPI_PREFIX/bin:$PATH \
cmake -S install -B build_mpi_gpu \
-DEXASIM_CUDA=ON -DEXASIM_MPI=ON -DEXASIM_NOMPI=OFF \
-DWITH_PARMETIS=ON \
-DKokkos_DIR=$EXASIM/kokkos/buildcuda/lib/cmake/Kokkos \
-DEXASIM_BUILD_LIBRARY_EXAMPLES=ON
PATH=/usr/local/cuda/bin:$MPI_PREFIX/bin:$PATH \
cmake --build build_mpi_gpu -j8