Segmentation fault error for CT_example in Docker container

Hi!
I am trying to run the CT example in an Ubuntu Docker container on an Ubuntu host. The host machine is equipped with a Titan X GPU and CUDA version 11.2.

The Dockerfile is based on the example from this site (How to set up OpenCL for GPUs on Linux and Docker) and extended with the installation steps from the GGEMS documentation:

# Dockerfile based on the description at https://linuxhandbook.com/setup-opencl-linux-docker/
# Slightly modified to also build the GGEMS framework.

#FROM ubuntu:20.04
# we need the devel image because we need the cuda-toolkit-dir for compiling GGEMS
FROM nvidia/cuda:11.2.1-devel-ubuntu20.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get -y upgrade \
  && apt-get install -y \
    apt-utils \
    unzip \
    tar \
    curl \
    xz-utils \
    ocl-icd-libopencl1 \
    opencl-headers \
    clinfo \
    git \
    cmake-curses-gui \
    g++ \
    python3.8 \
    python3-pip \
    nano \
    ;

RUN mkdir -p /etc/OpenCL/vendors && \
    echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd

# compile ggems according to https://doc.ggems.fr/v1.2/building_and_installing.html
RUN mkdir /home/ggems_build && mkdir /home/ggems_install
RUN cd /home && git clone https://github.com/GGEMS/ggems.git
# each RUN starts a new shell, so a bare "RUN cd ..." has no lasting effect; WORKDIR persists
WORKDIR /home/ggems_build
RUN cmake -DCMAKE_INSTALL_PREFIX=/home/ggems_install /home/ggems
RUN make
RUN make install

# set environment variables according to https://doc.ggems.fr/v1.2/building_and_installing.html
ENV LD_LIBRARY_PATH=/home/ggems_install/ggems/lib
ENV PYTHONPATH=/home/ggems_install/ggems/python_module:/home/ggems_install/ggems/lib

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
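
The Dockerfile above registers the NVIDIA OpenCL driver by writing a library name into /etc/OpenCL/vendors/nvidia.icd. A minimal sketch for checking, from inside the running container, that this registration is present and that the named library can actually be opened might look as follows. The function names are hypothetical helpers, not part of GGEMS or of any OpenCL package, and a library that loads is no guarantee that a device is usable.

```python
import ctypes
from pathlib import Path


def read_icd_entries(vendor_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the library name it registers."""
    entries = {}
    directory = Path(vendor_dir)
    if not directory.is_dir():
        return entries
    for icd_file in sorted(directory.glob("*.icd")):
        entries[icd_file.name] = icd_file.read_text().strip()
    return entries


def can_load(library_name):
    """Return True if the dynamic loader can open the given library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


def report(vendor_dir="/etc/OpenCL/vendors"):
    """Print one line per registered ICD together with its load status."""
    for icd_name, library in read_icd_entries(vendor_dir).items():
        status = "loadable" if can_load(library) else "NOT loadable"
        print(f"{icd_name}: {library} ({status})")
```

Calling report() in a container started with --gpus all should list nvidia.icd. If the library is reported as not loadable, the NVIDIA runtime has probably not mounted the driver libraries into the container, which would point to the Docker setup rather than to GGEMS.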

After building the Docker image, I ran the container with:
docker run -it --gpus all docker_image

and ran the following commands inside the container:
cd /home/ggems_install/ggems/examples/2_CT_Scanner/
python3.8 ct_scanner.py -v 3

The output is a “Segmentation fault (core dumped)” error.

When I print the device information with the OpenCLManager, I get the following output:

I’d appreciate any help with this. Do you by any chance plan to include a Dockerfile in upcoming releases?

Thanks in advance!
Anneke

Hi Anneke,
Unfortunately, I don’t have much experience with Docker and GGEMS together, and I have not validated that setup. I will try your Docker configuration on one of my machines and see if I can find a solution.

Did you run the simulation several times? Does the ‘segmentation fault’ always occur in the same place? Sometimes on Linux the first simulations end with a ‘seg fault’. I don’t know why; it may be due to the initialization state of the graphics card.
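
To answer the two questions above systematically, one option is a small wrapper that launches the same command several times and records how each run ended and what the last output line was. This is only a sketch: describe_exit and run_repeatedly are hypothetical helper names, and it relies on the POSIX convention that a negative return code -N means the child process was killed by signal N.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a short label."""
    if returncode < 0:
        # On POSIX, a negative code -N means the child was killed by signal N.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return "ok" if returncode == 0 else f"exit {returncode}"


def run_repeatedly(command, runs=5):
    """Run the command several times; return (label, last output line) per run."""
    results = []
    for _ in range(runs):
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        lines = proc.stdout.strip().splitlines()
        results.append((describe_exit(proc.returncode), lines[-1] if lines else ""))
    return results
```

For the case in this thread, the call would be run_repeatedly(["python3.8", "ct_scanner.py", "-v", "3"]) from the example directory. A run that ends in a segmentation fault is labelled SIGSEGV. Note that output written through a pipe may be buffered by the child process, so the last captured line is not necessarily the last message produced before the crash.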

Kind regards,
Didier

Hi Didier,

Thanks for your reply! It would be really great if you could check whether you run into the same error.

The error always occurs in the same place, before the simulation starts. It seems to occur in or right after the CheckKernel function of the GGEMSOpenCLManager, since the output “|GGEMS GGEMSOpenCLManager::CheckKernel|(3) Checking if kernel has already been compiled…” is printed just before the error.
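
A complementary way to locate such a crash from the Python side is the standard-library faulthandler module, which prints the Python traceback when the process receives a fatal signal such as SIGSEGV. The sketch below assumes the lines are placed at the very top of the script being run; it shows only the Python call that was active, not the C++ frame inside the native library.

```python
import faulthandler

# Dump the Python traceback of all threads to stderr if the process
# receives a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL).
faulthandler.enable()

# ... the rest of the script (for example the body of ct_scanner.py) follows ...
```

The same effect is available without editing the script by starting the interpreter with python3.8 -X faulthandler ct_scanner.py -v 3.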

I don’t know if it is relevant, but I got a lot of warnings during the build process of the GGEMS framework inside Docker, as you can see in the image below.

Hi Anneke,
I managed to reproduce the bug with Docker and fixed it. I committed my corrections to the master branch of the Git repository. It should work now.
The warnings come from the gcc compiler, but they have no impact on the compilation. You can compile with clang instead of gcc if you want to get rid of them.

Kind regards
Didier

Hi Didier,

I just tested the committed code and now I can run simulations within the Docker container, too. Thank you so much for looking into this and fixing it!

Best regards,
Anneke