
Commit a186743

Madeeks, RMeli, and bcumming authored
CE: Added pages with guidelines for images on Alps (#272)
As agreed in VCUE, I have added a set of images with foundational resources (CUDA, MPI, NCCL, NVSHMEM), derived from material I use myself, and demonstrated how to run them through the CE on Alps. The intent of this material is to offer guidelines and suggestions on versions, building, and running of foundational components on Alps, without committing to officially supported resources.

---------

Co-authored-by: Rocco Meli <r.meli@bluemail.ch>
Co-authored-by: Ben Cumming <bcumming@cscs.ch>
1 parent 5a33290 commit a186743

File tree: 18 files changed, +1087 −71 lines changed


.github/actions/spelling/allow.txt

Lines changed: 5 additions & 0 deletions
```diff
@@ -17,9 +17,11 @@ CWP
 CXI
 Ceph
 Containerfile
+Containerfiles
 DNS
 Dockerfiles
 Dufourspitze
+EFA
 EMPA
 ETHZ
 Ehrenfest
@@ -76,6 +78,8 @@ MeteoSwiss
 NAMD
 NICs
 NVMe
+NVSHMEM
+NVLINK
 Nordend
 OpenFabrics
 OAuth
@@ -102,6 +106,7 @@ ROCm
 RPA
 Roboto
 Roothaan
+SHMEM
 SSHService
 STMV
 Scopi
```

docs/software/communication/cray-mpich.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -28,7 +28,7 @@ This means that Cray MPICH will automatically be linked to the GTL library, whic
 $ ldd myexecutable | grep gtl
 libmpi_gtl_cuda.so => /user-environment/linux-sles15-neoverse_v2/gcc-13.2.0/cray-gtl-8.1.30-fptqzc5u6t4nals5mivl75nws2fb5vcq/lib/libmpi_gtl_cuda.so (0x0000ffff82aa0000)
 ```
-
+
 The path may be different, but the `libmpi_gtl_cuda.so` library should be printed when using CUDA.
 In ROCm environments the `libmpi_gtl_hsa.so` library should be linked.
 If the GTL library is not linked, nothing will be printed.
@@ -40,7 +40,7 @@ See [this page][ref-slurm-gh200] for more information on configuring Slurm to us
 !!! warning "Segmentation faults when trying to communicate GPU buffers without `MPICH_GPU_SUPPORT_ENABLED=1`"
     If you attempt to communicate GPU buffers through MPI without setting `MPICH_GPU_SUPPORT_ENABLED=1`, it will lead to segmentation faults, usually without any specific indication that it is the communication that fails.
     Make sure that the option is set if you are communicating GPU buffers through MPI.
-
+
 !!! warning "Error: "`GPU_SUPPORT_ENABLED` is requested, but GTL library is not linked""
     If `MPICH_GPU_SUPPORT_ENABLED` is set to `1` and your application does not link against one of the GTL libraries you will get an error similar to the following during MPI initialization:
     ```bash
````
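For reference, a minimal sketch of how this flag is typically set when launching a job step; the executable name and node/task counts are placeholders, not part of this commit:

```bash
# Hypothetical job step: enable GPU-aware MPI in Cray MPICH before launching.
export MPICH_GPU_SUPPORT_ENABLED=1
srun -N 2 -n 8 ./myexecutable
```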
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
ARG ubuntu_version=24.04
2+
ARG cuda_version=12.8.1
3+
FROM docker.io/nvidia/cuda:${cuda_version}-cudnn-devel-ubuntu${ubuntu_version}
4+
5+
RUN apt-get update \
6+
&& DEBIAN_FRONTEND=noninteractive \
7+
apt-get install -y \
8+
build-essential \
9+
ca-certificates \
10+
pkg-config \
11+
automake \
12+
autoconf \
13+
libtool \
14+
cmake \
15+
gdb \
16+
strace \
17+
wget \
18+
git \
19+
bzip2 \
20+
python3 \
21+
gfortran \
22+
rdma-core \
23+
numactl \
24+
libconfig-dev \
25+
libuv1-dev \
26+
libfuse-dev \
27+
libfuse3-dev \
28+
libyaml-dev \
29+
libnl-3-dev \
30+
libnuma-dev \
31+
libsensors-dev \
32+
libcurl4-openssl-dev \
33+
libjson-c-dev \
34+
libibverbs-dev \
35+
--no-install-recommends \
36+
&& rm -rf /var/lib/apt/lists/*
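As a sketch of how an image built from this Containerfile is typically brought onto Alps for use with the Container Engine; the image name, tag, and output path are placeholders and should be adapted to your setup:

```bash
# Hypothetical build-and-import workflow: build the image with podman,
# then convert it to a squashfs file that an EDF can reference.
podman build -f Containerfile -t comm-base:cuda12.8 .
enroot import -o $SCRATCH/comm-base-cuda12.8.sqsh podman://comm-base:cuda12.8
```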
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
1+
ARG gdrcopy_version=2.5.1
2+
RUN git clone --depth 1 --branch v${gdrcopy_version} https://github.com/NVIDIA/gdrcopy.git \
3+
&& cd gdrcopy \
4+
&& export CUDA_PATH=/usr/local/cuda \
5+
&& make CC=gcc CUDA=$CUDA_PATH lib \
6+
&& make lib_install \
7+
&& cd ../ && rm -rf gdrcopy
8+
9+
# Install libfabric
10+
ARG libfabric_version=1.22.0
11+
RUN git clone --branch v${libfabric_version} --depth 1 https://github.com/ofiwg/libfabric.git \
12+
&& cd libfabric \
13+
&& ./autogen.sh \
14+
&& ./configure --prefix=/usr --with-cuda=/usr/local/cuda --enable-cuda-dlopen \
15+
--enable-gdrcopy-dlopen --enable-efa \
16+
&& make -j$(nproc) \
17+
&& make install \
18+
&& ldconfig \
19+
&& cd .. \
20+
&& rm -rf libfabric
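A hedged runtime check of the resulting libfabric installation, assuming the container is run through the CE so that the site's Slingshot provider can be injected:

```bash
# Hypothetical check: list the libfabric providers visible inside the
# container; on Alps the Slingshot (cxi) provider should be listed if the
# site-provided libfabric is injected by the Container Engine hooks.
fi_info -l
```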
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
ARG nccl_tests_version=2.17.1
2+
RUN wget -O nccl-tests-${nccl_tests_version}.tar.gz https://github.com/NVIDIA/nccl-tests/archive/refs/tags/v${nccl_tests_version}.tar.gz \
3+
&& tar xf nccl-tests-${nccl_tests_version}.tar.gz \
4+
&& cd nccl-tests-${nccl_tests_version} \
5+
&& MPI=1 make -j$(nproc) \
6+
&& cd .. \
7+
&& rm -rf nccl-tests-${nccl_tests_version}.tar.gz
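A possible way to exercise the resulting binaries from a job step; the binary path assumes the default nccl-tests build directory and the source tree kept by the snippet above, and the node/GPU counts are placeholders:

```bash
# Hypothetical run: one task per GPU, exercising NCCL allreduce across nodes.
srun -N 2 --ntasks-per-node=4 --gpus-per-task=1 \
    /nccl-tests-2.17.1/build/all_reduce_perf -b 8 -e 4G -f 2 -g 1
```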
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
RUN apt-get update \
2+
&& DEBIAN_FRONTEND=noninteractive \
3+
apt-get install -y \
4+
python3-venv \
5+
python3-dev \
6+
--no-install-recommends \
7+
&& rm -rf /var/lib/apt/lists/* \
8+
&& rm /usr/lib/python3.12/EXTERNALLY-MANAGED
9+
10+
# Build NVSHMEM from source
11+
ARG nvshmem_version=3.4.5
12+
RUN wget -q https://developer.download.nvidia.com/compute/redist/nvshmem/${nvshmem_version}/source/nvshmem_src_cuda12-all-all-${nvshmem_version}.tar.gz \
13+
&& tar -xvf nvshmem_src_cuda12-all-all-${nvshmem_version}.tar.gz \
14+
&& cd nvshmem_src \
15+
&& NVSHMEM_BUILD_EXAMPLES=0 \
16+
NVSHMEM_BUILD_TESTS=1 \
17+
NVSHMEM_DEBUG=0 \
18+
NVSHMEM_DEVEL=0 \
19+
NVSHMEM_DEFAULT_PMI2=0 \
20+
NVSHMEM_DEFAULT_PMIX=1 \
21+
NVSHMEM_DISABLE_COLL_POLL=1 \
22+
NVSHMEM_ENABLE_ALL_DEVICE_INLINING=0 \
23+
NVSHMEM_GPU_COLL_USE_LDST=0 \
24+
NVSHMEM_LIBFABRIC_SUPPORT=1 \
25+
NVSHMEM_MPI_SUPPORT=1 \
26+
NVSHMEM_MPI_IS_OMPI=1 \
27+
NVSHMEM_NVTX=1 \
28+
NVSHMEM_PMIX_SUPPORT=1 \
29+
NVSHMEM_SHMEM_SUPPORT=1 \
30+
NVSHMEM_TEST_STATIC_LIB=0 \
31+
NVSHMEM_TIMEOUT_DEVICE_POLLING=0 \
32+
NVSHMEM_TRACE=0 \
33+
NVSHMEM_USE_DLMALLOC=0 \
34+
NVSHMEM_USE_NCCL=1 \
35+
NVSHMEM_USE_GDRCOPY=1 \
36+
NVSHMEM_VERBOSE=0 \
37+
NVSHMEM_DEFAULT_UCX=0 \
38+
NVSHMEM_UCX_SUPPORT=0 \
39+
NVSHMEM_IBGDA_SUPPORT=0 \
40+
NVSHMEM_IBGDA_SUPPORT_GPUMEM_ONLY=0 \
41+
NVSHMEM_IBDEVX_SUPPORT=0 \
42+
NVSHMEM_IBRC_SUPPORT=0 \
43+
LIBFABRIC_HOME=/usr \
44+
NCCL_HOME=/usr \
45+
GDRCOPY_HOME=/usr/local \
46+
MPI_HOME=/usr \
47+
SHMEM_HOME=/usr \
48+
NVSHMEM_HOME=/usr \
49+
cmake . \
50+
&& make -j$(nproc) \
51+
&& make install \
52+
&& ldconfig \
53+
&& cd .. \
54+
&& rm -r nvshmem_src nvshmem_src_cuda12-all-all-${nvshmem_version}.tar.gz
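Since the build enables libfabric support, a launch typically selects the libfabric transport at run time. A hypothetical sketch follows; the environment variable names are what recent NVSHMEM releases document, not settings taken from this commit, and should be checked against the installed version:

```bash
# Hypothetical launch settings for NVSHMEM over Slingshot (libfabric/cxi);
# application name and task layout are placeholders.
export NVSHMEM_REMOTE_TRANSPORT=libfabric
export NVSHMEM_LIBFABRIC_PROVIDER=cxi
srun -N 2 --ntasks-per-node=4 --gpus-per-task=1 ./my_nvshmem_app
```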
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
ARG OMPI_VER=5.0.8
2+
RUN wget -q https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-${OMPI_VER}.tar.gz \
3+
&& tar xf openmpi-${OMPI_VER}.tar.gz \
4+
&& cd openmpi-${OMPI_VER} \
5+
&& ./configure --prefix=/usr --with-ofi=/usr --with-ucx=/usr \
6+
--enable-oshmem --with-cuda=/usr/local/cuda \
7+
--with-cuda-libdir=/usr/local/cuda/lib64/stubs \
8+
&& make -j$(nproc) \
9+
&& make install \
10+
&& ldconfig \
11+
&& cd .. \
12+
&& rm -rf openmpi-${OMPI_VER}.tar.gz openmpi-${OMPI_VER}
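A quick sanity check that the OFI and CUDA support requested at configure time actually ended up in the build; the grep pattern is only a suggestion:

```bash
# Hypothetical check: confirm OFI (libfabric) components and CUDA support
# are reported by the installed Open MPI.
ompi_info | grep -i -e ofi -e cuda
```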
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
1+
ARG omb_version=7.5.1
2+
RUN wget -q http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-${omb_version}.tar.gz \
3+
&& tar xf osu-micro-benchmarks-${omb_version}.tar.gz \
4+
&& cd osu-micro-benchmarks-${omb_version} \
5+
&& ldconfig /usr/local/cuda/targets/sbsa-linux/lib/stubs \
6+
&& ./configure --prefix=/usr/local CC=$(which mpicc) CFLAGS="-O3 -lcuda -lnvidia-ml" \
7+
--enable-cuda --with-cuda-include=/usr/local/cuda/include \
8+
--with-cuda-libpath=/usr/local/cuda/lib64 \
9+
CXXFLAGS="-lmpi -lcuda" \
10+
&& make -j$(nproc) \
11+
&& make install \
12+
&& ldconfig \
13+
&& cd .. \
14+
&& rm -rf osu-micro-benchmarks-${omb_version} osu-micro-benchmarks-${omb_version}.tar.gz
15+
16+
WORKDIR /usr/local/libexec/osu-micro-benchmarks/mpi
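A possible two-node bandwidth measurement with GPU (device) buffers, using the install prefix set above; node counts and benchmark parameters are placeholders:

```bash
# Hypothetical run: point-to-point bandwidth between two nodes with
# CUDA device buffers on both sender and receiver.
srun -N 2 --ntasks-per-node=1 --gpus-per-task=1 \
    /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw -d cuda D D
```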
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
# Install UCX
2+
ARG UCX_VERSION=1.19.0
3+
RUN wget https://github.com/openucx/ucx/releases/download/v${UCX_VERSION}/ucx-${UCX_VERSION}.tar.gz \
4+
&& tar xzf ucx-${UCX_VERSION}.tar.gz \
5+
&& cd ucx-${UCX_VERSION} \
6+
&& mkdir build \
7+
&& cd build \
8+
&& ../configure --prefix=/usr --with-cuda=/usr/local/cuda --with-gdrcopy=/usr/local \
9+
--enable-mt --enable-devel-headers \
10+
&& make -j$(nproc) \
11+
&& make install \
12+
&& cd ../.. \
13+
&& rm -rf ucx-${UCX_VERSION}.tar.gz ucx-${UCX_VERSION}
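A short check that the CUDA and GDRCopy transports requested at configure time are detected by the installed UCX; the grep is only illustrative:

```bash
# Hypothetical check: list UCX transports/devices and look for the
# cuda_copy / gdr_copy memory transports.
ucx_info -d | grep -i -e cuda -e gdr
```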
Lines changed: 56 additions & 9 deletions
```diff
@@ -1,20 +1,67 @@
 [](){#ref-software-communication}
 # Communication Libraries
 
-CSCS provides common communication libraries optimized for the [Slingshot 11 network on Alps][ref-alps-hsn].
+Communication libraries, like MPI and NCCL, are among the building blocks for high-performance scientific and ML workloads.
+Broadly speaking, there are two levels of communication:
+
+* **Intra-node** communication between two processes on the same node.
+* **Inter-node** communication between different nodes, over the [Slingshot 11 network][ref-alps-hsn] that connects nodes on Alps.
+
+To get the best inter-node performance, communication libraries need to be configured to use the [libfabric][ref-communication-libfabric] library, which has an optimised back end for the Slingshot 11 network on Alps.
+
+As such, communication libraries are part of the "base layer" of libraries and tools used by all workloads to fully utilise the hardware on Alps.
+They comprise the *network* layer in the following stack:
+
+* **CPU**: compilers with support for building applications optimised for the CPU architecture on the node.
+* **GPU**: CUDA and ROCm provide compilers and runtime libraries for NVIDIA and AMD GPUs respectively.
+* **Network**: libfabric, MPI, NCCL, and NVSHMEM need to be configured for the Slingshot network.
+
+CSCS provides communication libraries optimised for libfabric and Slingshot in uenv, along with guidance on how to create container images that use them.
+This section of the documentation provides advice on how to build and install software that uses these libraries, and how to deploy it.
 
 For most scientific applications relying on MPI, [Cray MPICH][ref-communication-cray-mpich] is recommended.
 [MPICH][ref-communication-mpich] and [OpenMPI][ref-communication-openmpi] may also be used, with limitations.
 Cray MPICH, MPICH, and OpenMPI make use of [libfabric][ref-communication-libfabric] to interact with the underlying network.
 
-Most machine learning applications rely on [NCCL][ref-communication-nccl] or [RCCL][ref-communication-rccl] for high-performance implementations of collectives.
-NCCL and RCCL have to be configured with a plugin using [libfabric][ref-communication-libfabric] to make full use of the Slingshot network.
+Most machine learning applications rely on [NCCL][ref-communication-nccl] for high-performance implementations of collectives.
+NCCL has to be configured with a plugin that uses [libfabric][ref-communication-libfabric] to make full use of the Slingshot network.
 
 See the individual pages for each library for information on how to use and best configure the libraries.
 
-* [Cray MPICH][ref-communication-cray-mpich]
-* [MPICH][ref-communication-mpich]
-* [OpenMPI][ref-communication-openmpi]
-* [NCCL][ref-communication-nccl]
-* [RCCL][ref-communication-rccl]
-* [libfabric][ref-communication-libfabric]
+<div class="grid cards" markdown>
+
+-   __Low Level__
+
+    Learn about the low-level networking library libfabric, and how to use it in uenv and containers.
+
+    [:octicons-arrow-right-24: libfabric][ref-communication-libfabric]
+
+</div>
+<div class="grid cards" markdown>
+
+-   __MPI__
+
+    Cray MPICH is the most optimised and best-tested MPI implementation on Alps, and is used by uenv.
+
+    [:octicons-arrow-right-24: Cray MPICH][ref-communication-cray-mpich]
+
+    For compatibility in containers:
+
+    [:octicons-arrow-right-24: MPICH][ref-communication-mpich]
+
+    OpenMPI can also be built in containers or in uenv:
+
+    [:octicons-arrow-right-24: OpenMPI][ref-communication-openmpi]
+
+</div>
+<div class="grid cards" markdown>
+
+-   __Machine Learning__
+
+    Communication libraries used by ML tools like Torch, and by some simulation codes.
+
+    [:octicons-arrow-right-24: NCCL][ref-communication-nccl]
+
+    [:octicons-arrow-right-24: NVSHMEM][ref-communication-nvshmem]
+
+</div>
```
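To tie the pieces together, a hypothetical end-to-end sketch of using one of these images through the Container Engine: the EDF location, image path, and hook annotation are assumptions to be checked against the Container Engine documentation, not content of this commit.

```bash
# Hypothetical: write an EDF that points at a locally imported image and
# enables the Slingshot-aware NCCL plugin hook, then launch a job with it.
cat > $HOME/.edf/comm-demo.toml <<'EOF'
image = "/capstor/scratch/cscs/<user>/comm-base-cuda12.8.sqsh"

[annotations]
com.hooks.aws_ofi_nccl.enabled = "true"
EOF

srun -N 2 --ntasks-per-node=4 --environment=comm-demo ./my_app
```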
