Using an NVIDIA GPU in Docker Containers

Environment

Linux *** 4.15.0-142-generic #146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

NVIDIA Corporation GT218 [GeForce 210]

Docker version 18.09.7, build 2d0083d

Problem

Running docker run --runtime=nvidia … fails with: docker: Error response from daemon: Unknown runtime specified nvidia.
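This error generally means the Docker daemon has no runtime named nvidia registered. The sketch below is one way to confirm that before changing anything. It assumes your Docker version supports `docker info --format '{{json .Runtimes}}'`; the `has_runtime` helper is a hypothetical name introduced here, and it does a plain text match rather than real JSON parsing.

```shell
# has_runtime JSON NAME: succeed if NAME appears as a key in the runtimes JSON.
# Hypothetical helper; a simple text match, not a JSON parser.
has_runtime() {
    printf '%s' "$1" | grep -q "\"$2\":"
}

# Ask the daemon which runtimes it knows about (only if docker is on PATH).
if command -v docker >/dev/null 2>&1; then
    runtimes=$(docker info --format '{{json .Runtimes}}' 2>/dev/null) || runtimes=""
    if has_runtime "$runtimes" nvidia; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime is NOT registered"
    fi
fi
```

If the second message is printed, the daemon only knows its default runtimes, which matches the error above.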

Solution

Check the current environment

This assumes Docker is already installed.

  • Check that the GPU is present
lspci | grep -i vga
  • Check that the GPU driver is installed
nvidia-smi
  • Check that nvidia-docker2 is installed
nvidia-docker image ls
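The three checks above can be preceded by a quick look at which tools are on PATH at all. This is a minimal sketch; `report_tool` is a hypothetical helper, and a tool being found only shows that it is installed, not that the driver is loaded or the runtime is registered.

```shell
# report_tool NAME: print whether NAME is on PATH (hypothetical helper).
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# The tools used by the checks above, plus docker itself.
for tool in docker lspci nvidia-smi nvidia-docker; do
    report_tool "$tool"
done
```

Any tool reported as missing points at the section to work through next: the driver for nvidia-smi, nvidia-docker2 for nvidia-docker.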

Install the GPU driver

This example uses Ubuntu.

# List the available GPU drivers (prefer the one marked "recommended"; on servers, prefer packages with the -server suffix)
ubuntu-drivers devices
---------------------------------------------------------------------------------------------
== /sys/devices/pci0000:00/0000:00:03.1/0000:07:00.0 ==
modalias : pci:v000010DEd00000A65sv00000000sd00000000bc03sc00i00
vendor   : NVIDIA Corporation
model    : GT218 [GeForce 210]
driver   : xserver-xorg-video-nouveau - distro free builtin
driver   : nvidia-340 - distro non-free recommended
driver   : nvidia-304 - distro non-free

# Install the driver
sudo apt install nvidia-340

# If nvidia-smi still reports an error, reboot the server
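Picking the recommended driver out of that listing can also be scripted. The sketch below assumes the output keeps the layout shown above (a line starting with "driver", then a colon, then the package name, with the word "recommended" at the end); `recommended_driver` is a hypothetical helper, not part of ubuntu-drivers.

```shell
# recommended_driver: read `ubuntu-drivers devices` output on stdin and print
# the package name from the first driver line tagged "recommended".
# Hypothetical helper; it relies on the column layout shown in the listing above.
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Usage on a real machine (commented out; it needs ubuntu-drivers and sudo):
# sudo apt install "$(ubuntu-drivers devices | recommended_driver)"
```

Fed the listing above, the helper picks nvidia-340, the same package installed by hand in this example.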

Install nvidia-docker2

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update

sudo apt-get install -y nvidia-docker2

sudo pkill -SIGHUP dockerd
# Installation is complete at this point
---------------------------------------------------------------------------------------------
# If the problem persists, restart Docker
sudo pkill -SIGHUP dockerd
sudo systemctl restart docker
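The distribution variable in the commands above is built from the ID and VERSION_ID fields of /etc/os-release, and it decides which repository list gets downloaded. The sketch below wraps the same one-liner in a function so the value can be inspected before it is used in the URL; `distribution_id` is a hypothetical name introduced for illustration.

```shell
# distribution_id FILE: source an os-release style FILE in a subshell and
# print ID followed by VERSION_ID (hypothetical helper mirroring the one-liner above).
# The subshell keeps the sourced variables out of the calling shell.
distribution_id() {
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# On a real system, pass /etc/os-release:
if [ -r /etc/os-release ]; then
    distribution_id /etc/os-release
fi
```

On the Ubuntu 16.04 host described in the Environment section this should print ubuntu16.04. If the value looks wrong or empty, the curl command for nvidia-docker.list will fetch the wrong path, so it is worth checking before running apt-get update.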

Verify

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Licensed under CC BY-NC-SA 4.0
Gear (夕照)'s blog. Notes on development, life, and a few thoughts hardly worth mentioning……