From Alex Griffing…
The GPUs are accessible through compute nodes in the gpu SLURM partition. You can submit a script to this queue with a command like sbatch -p gpu myscript.sh.
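As a minimal sketch (the job name and executable are hypothetical, and the #SBATCH directives shown are just a starting point), a batch script for this queue might look like:

```shell
#!/bin/bash
#SBATCH -p gpu           # run on the gpu partition
#SBATCH -J mygpujob      # hypothetical job name
#SBATCH -o mygpujob.out  # write stdout/stderr to this file

./myprogram              # hypothetical GPU executable
```

Saving this as myscript.sh, you would submit it with sbatch myscript.sh; the -p gpu inside the script makes the -p flag on the command line unnecessary.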
The gpu compute node has two K20 GPUs:

$ srun -p gpu lspci | grep -i nvidia
05:00.0 3D controller: NVIDIA Corporation Device 1028 (rev a1)
42:00.0 3D controller: NVIDIA Corporation Device 1028 (rev a1)
The development tools for GPU computing, including libraries and compilers, are installed in /usr/local/cuda-5.0/. Of particular interest are nvcc in /usr/local/cuda-5.0/bin/ and the shared object files such as libcudart.so and libcurand.so in /usr/local/cuda-5.0/lib64/.
Because we are using K20 GPUs, you will want to tell the nvcc compiler to target their modern features, such as double-precision floating-point arithmetic. To do this, use -arch=sm_35 to specify a 'compute capability' of 3.5. Other options that you will probably want for scientific computing are --prec-div=true and --prec-sqrt=true, which select IEEE-compliant division and square root instead of the faster approximate versions. For more information see the nvcc options page.
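Putting those flags together, a compile line might look like the following sketch (the source and output file names are hypothetical):

```shell
# Compile for the K20's compute capability 3.5 with precise division and sqrt
/usr/local/cuda-5.0/bin/nvcc -arch=sm_35 \
    --prec-div=true --prec-sqrt=true \
    -o myprogram myprogram.cu
```

If /usr/local/cuda-5.0/bin/ is not on your PATH, you will need the full path to nvcc as shown; at run time the loader must also be able to find the shared libraries in /usr/local/cuda-5.0/lib64/, for example via LD_LIBRARY_PATH.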
The NVIDIA Visual Profiler nvvp (packaged as nvidia-visual-profiler for Ubuntu) is a cross-platform way to profile GPGPU applications, but it is less useful when the applications can be run only on the command line through the SLURM resource manager. In that case we can use nvprof (packaged as nvidia-profiler) to produce a log file which can be downloaded and viewed on a desktop using nvvp.
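A sketch of that workflow, assuming the executable name is hypothetical and that the installed nvprof supports writing an output profile file (check nvprof --help on the node to confirm the exact option):

```shell
# Profile the run on the gpu partition and save the profile data to a file
srun -p gpu nvprof --output-profile myprofile.nvvp ./myprogram

# Then copy myprofile.nvvp to your desktop (e.g. with scp) and open it in nvvp
```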