# About this article

- I want to use the GPU (CUDA) from inside a Docker container.
- I want to do this on Windows.
- I got stuck in a few places, so I wrote these notes.

# TL;DR

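- Uninstall Docker Desktop for Windows before anything else.
- After that, trust the Qiita article "ついにWSL2+docker+GPUを動かせるようになったらしいので試してみる" (roughly: "WSL2 + docker + GPU apparently works at last, so I tried it").
- Do not trust the Docker install script when it suggests that you stop (this is true as of 2020/12/19).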

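# Why GPU containers on Windows?

- I want to use the GPU from Python (a GPU-backed numpy).
- But I do not want to run Python directly on Windows (Intel MKL version mismatches have already eaten tens of hours of my life).
- Python's package management is also a mess, so I would rather not touch virtual environments and the like.
- So I want to take care of all of it inside a Docker container.

What follows is how the environment setup actually went.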

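# Making Docker Desktop for Windows use WSL2

Note: I did this a long time ago and am writing from memory, so details may be inaccurate.

- Prerequisite
  - Docker Desktop for Windows had already been installed, even longer ago.
- After running Windows Update...
  - After the PC rebooted, Docker suggested switching to WSL2.
  - Probably this: WSL 2 対応 Docker Desktop for Windowsを使うための手順 - Qiita
  - I configured it as told, and it seemed to be running on WSL2.
  - I cannot say for sure that it was, but the warnings stopped and docker worked, so I left it alone.

Note: end of the old story.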

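# Joining the Windows Insider Program

Note: written while recalling what happened over the past one to two hours.

- "Settings --> Update & Security --> Windows Insider Program"
  - Register as required and select the Dev channel.
- "Settings --> Update & Security --> Windows Update --> Advanced options"
  - Turn on "Receive updates for other Microsoft products when you update Windows".
- Install every update that shows up.
  - In other words, reboot over and over.
  - The first reboot after enabling Insider Preview is fairly heavy, so be warned.
  - In particular, it matters that the latest "Windows Subsystem for Linux Update" is installed.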

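# Installing the NVIDIA driver on Windows

Note: written while recalling what happened over the past one to two hours.

- CUDA on WSL :: CUDA Toolkit Documentation
  - This is the "Installing NVIDIA Drivers" section of that procedure.
- In practice you just download the latest CUDA driver from CUDA ZONE and install it.
  - The download requires an NVIDIA Developer registration, though, which was a hassle.
  - I hesitated over the Organization field in the registration form and settled for "Individual Developer".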

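# Running Ubuntu on WSL2

Note: written while recalling what happened over the past one to two hours.

## Installing Ubuntu from the Microsoft Store

- When the installation finishes, an Ubuntu console opens right away.
- It asks for a username and password, so enter them (this creates the user account inside Ubuntu).

## Checking that it runs on WSL2 (from Windows)

Run the following command in a cmd window with administrator privileges.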

wsl.exe --list -v

The result should look like this:

  NAME                   STATE           VERSION
* docker-desktop-data    Running         2
  Ubuntu                 Running         2
  docker-desktop         Running         2

If the VERSION column says 2, it is apparently fine.
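
If you would rather not eyeball the table, here is a minimal sketch of the same check as a shell function. `wsl_version_of` is a made-up helper name, not something from the original steps, and it assumes the listing has the same column layout as the output above. wsl.exe output tends to arrive as UTF-16 when piped inside WSL, so the sketch strips NUL bytes and carriage returns first.

```shell
# Hypothetical helper (not part of the original procedure): print the VERSION
# column for one distro, reading `wsl.exe --list -v` style text from stdin.
wsl_version_of() {
    # $1 = distro name, e.g. Ubuntu
    # tr drops NUL bytes and CRs, since wsl.exe output is usually UTF-16.
    tr -d '\000\r' | awk -v name="$1" '
        { sub(/^[ *]+/, "") }                 # strip indentation and the default-distro "*"
        $1 == name { print $NF; found = 1 }   # last column is VERSION
        END { exit found ? 0 : 1 }            # non-zero status if the distro is not listed
    '
}

# Assumed usage from inside WSL:
#   wsl.exe --list -v | wsl_version_of Ubuntu
```

With the listing shown above, this would print 2 for Ubuntu.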

## Checking that it runs on WSL2 (from Ubuntu)

Run the following command on Ubuntu.

uname -r

If the result has WSL2 in it, as below, it is probably fine.

5.4.72-microsoft-standard-WSL2

When I tried it, WSL2 was missing from the string at first, but it showed up after I rebooted the PC.
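
The "does it contain WSL2" check is easy to script. This is only a sketch of that heuristic, and `is_wsl2_kernel` is a hypothetical name. It says nothing more than what the kernel release string says.

```shell
# Hypothetical helper: succeed when a kernel release string looks like WSL2.
# This only encodes the "contains WSL2" heuristic described above.
is_wsl2_kernel() {
    case "$1" in
        *WSL2*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Assumed usage on the Ubuntu side:
#   is_wsl2_kernel "$(uname -r)" && echo "looks like WSL2"
```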

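## If it is not on WSL2

- wsl.exe --set-default-version 2 apparently makes WSL2 the default for newly installed distros.
- wsl.exe --set-version (distro name) 2 apparently converts an existing distro to WSL2.
- I may well have run one of these at some step along the way...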

# Installing docker

Note: from here on I am taking notes as I work.

## Trying to install docker on Ubuntu

Run the following command.

curl https://get.docker.com | sh

It answered with this:

Warning: the "docker" command appears to already exist on this system.

If you already have Docker installed, this script can cause trouble, which is
why we're displaying this warning and provide the opportunity to cancel the
installation.

If you installed the current Docker package using this script and are using it
again to update Docker, you can safely ignore this message.

You may press Ctrl+C now to abort this script.

That can't be right, I thought, and typed the docker command:

The command 'docker' could not be found in this WSL 2 distro.
We recommend to activate the WSL integration in Docker Desktop settings.

See https://docs.docker.com/docker-for-windows/wsl/ for details.

It actually tells you what to do now. How kind can a daemon get?

## Configuring Docker Desktop for Windows

- Open the Docker Desktop for Windows settings.
- General
  - "Use the WSL 2 based engine" was already checked.
- Resources --> WSL Integration
  - At some point Ubuntu had appeared under "Enable integration with additional distros:".
  - It was disabled, so I switched it on.
- Apply & Restart

## Calling docker from Ubuntu

This time the usual help text came up.

# Installing the NVIDIA Container Toolkit

## Quick install with apt-get

I gratefully borrow the steps from ついにWSL2+docker+GPUを動かせるようになったらしいので試してみる - Qiita.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list

sudo apt-get update
sudo apt-get install -y nvidia-docker2
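
The first line of that snippet is the only non-obvious one: it sources /etc/os-release and glues ID and VERSION_ID together, and the result is then spliced into the repository URLs. A small sketch to see what it evaluates to, where `distribution_id` is a hypothetical helper of mine and the optional path argument exists only to make it easy to try on a sample file:

```shell
# Hypothetical helper: print ID and VERSION_ID joined together, the same
# value the `distribution=...` line computes, from a given os-release file.
distribution_id() {
    # $1 = path to an os-release style file (defaults to /etc/os-release)
    # The subshell keeps the sourced variables out of the caller's shell.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Assumed usage:
#   distribution_id
```

On Ubuntu 20.04 this should come out as ubuntu20.04, so the list file is fetched from a path containing that string.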

# Testing docker

## nginx first

The usual command:

docker run -p 8080:80 nginx

From Ubuntu:

Unable to find image 'nginx:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.65.1:53: read udp 192.168.65.3:52374->192.168.65.1:53: i/o timeout.
See 'docker run --help'.

From Windows:

docker: Error response from daemon: open \\.\pipe\docker_engine_linux: The system cannot find the file specified.
See 'docker run --help'.

???

For the record, rebooting the PC did not help.

# docker troubleshooting

## Error response from daemon: open \.\pipe\docker_engine_linux: The system cannot find the file specified. · Issue #4495 · docker/for-win · GitHub

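- Many people in that issue say restarting docker fixed it.
- I restarted it from the Docker Desktop for Windows debug menu, retried docker run -p 8080:80 nginx from cmd, and it went through...
- Confirmed that http://localhost:8080 is reachable from a browser.
- Confirmed exactly the same normal behavior from Ubuntu.
- It leaves me uneasy, but it works, so fine!
- Maybe this is one of those cases where you have to wait a little after the PC boots.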
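
If the "wait a little after boot" guess is right, a retry loop would paper over it. The following is a sketch under that assumption, not a confirmed fix. `retry` is a hypothetical helper, and using docker ps as the readiness probe is my own choice.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
retry() {
    # $1 = max attempts, $2 = seconds between attempts, rest = command to run
    attempts=$1; delay=$2; shift 2
    i=1
    while ! "$@"; do
        [ "$i" -ge "$attempts" ] && return 1   # give up after the last attempt
        i=$((i + 1))
        sleep "$delay"
    done
    return 0
}

# Assumed usage: wait up to about a minute for the daemon to answer.
#   retry 30 2 docker ps >/dev/null 2>&1 && docker run -p 8080:80 nginx
```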

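## By the way

- When I ran it from cmd and exited with Ctrl+C, the container kept running instead of stopping.
- When I ran it from Ubuntu and exited with Ctrl+C, the container stopped properly.
- Terrifying.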

# Testing docker (again)

## nvidia-smi first

The following command:

docker run --gpus all --rm nvidia/cuda:9.0-base nvidia-smi

Run on Ubuntu:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Run from cmd:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Right.

# Why doesn't it work?

## Probably because this docker is Docker Desktop for Windows

- docker has to be installed inside Ubuntu itself.
- The articles where this works do not have Docker Desktop for Windows installed.
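
One way to test this hypothesis would be to ask the daemon what it thinks it is running on. This is a sketch with assumptions: `is_docker_desktop` is a hypothetical helper, and I am assuming the Docker Desktop engine reports "Docker Desktop" as its operating system while an engine installed inside Ubuntu reports the distro name.

```shell
# Hypothetical helper: succeed when an "Operating System" string reported by
# the daemon looks like Docker Desktop rather than a distro-local engine.
is_docker_desktop() {
    case "$1" in
        *"Docker Desktop"*) return 0 ;;
        *)                  return 1 ;;
    esac
}

# Assumed usage from the Ubuntu shell:
#   is_docker_desktop "$(docker info --format '{{.OperatingSystem}}')" \
#       && echo "this docker CLI is talking to Docker Desktop"
```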

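## So, no way around it

- Remove Docker Desktop for Windows and redo the steps from partway through.
- Another reason: I have not the slightest desire to start containers from the Windows side.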

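# While at it, disabling Hyper-V

## How

Uncheck "Hyper-V" in "Turn Windows features on or off".

## Why?

- I used to run docker on Hyper-V, so I had needed to enable "Hyper-V".
- But WSL2-based Docker needs "Virtual Machine Platform" instead, and "Hyper-V" is not required.
- So I want to clean up all the related settings first and then go through the install steps.

## Caution

- What WSL2 does not need is the feature listed as "Hyper-V" in "Turn Windows features on or off".
- Hyper-V as a technology is still used by WSL2.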

# Removing Docker Desktop for Windows

- Just uninstall it from "Apps & features" as usual.
- Reboot the PC afterwards, to be safe.

# Redoing the Ubuntu setup

## Checking whether docker is there

- Typing docker printed the full help text, to my surprise.
- But docker run nginx says it cannot connect to the daemon.
- Checking the packages:

$ apt list --installed | grep docker

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

docker.io/focal-updates,focal-security,now 19.03.8-0ubuntu1.20.04.1 amd64 [installed,automatic]
nvidia-docker2/bionic,now 2.5.0-1 all [installed]

It looks like nvidia-docker2 pulled docker.io in as a dependency, so I removed it together with its dependencies for now.

sudo apt remove nvidia-docker2
sudo apt autoremove

With that, the docker command was gone.

## Installing docker on Ubuntu

I gratefully use the following command.

curl https://get.docker.com | sh

The script suggested I stop, but I don't care! I am doing this precisely because Docker Desktop for Windows cannot use the GPU.

WSL DETECTED: We recommend using Docker Desktop for Windows.
Please get Docker Desktop from https://www.docker.com/products/docker-desktop

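(Maybe GPU containers will eventually run on Docker Desktop for Windows too, but I figure I can rebuild the environment when that happens.)

After a 20-second wait the installation goes ahead and completes.

While at it, I make docker run usable without sudo.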

sudo usermod -aG docker <USERNAME HERE>
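
Group changes made by usermod only apply to new login sessions, so docker without sudo may still fail in the shell you are currently in. A small sketch to check, where `has_docker_group` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed when a space-separated group list contains "docker".
has_docker_group() {
    # Padding with spaces avoids matching names that merely contain "docker".
    case " $1 " in
        *" docker "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Assumed usage after opening a new session:
#   has_docker_group "$(id -nG)" || echo "docker group is not active in this shell yet"
```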

## Installing nvidia-docker2

The repository setup was already done earlier, so I skip it.
Only these commands are needed:

sudo apt update
sudo apt install nvidia-docker2

## nginx first

This one:

docker run -p 8080:80 nginx

It went through without problems.

## Next, nvidia-smi

The following command:

docker run --gpus all --rm nvidia/cuda:9.0-base nvidia-smi

It failed.

$ docker run --gpus all --rm nvidia/cuda:9.0-base nvidia-smi
Unable to find image 'nvidia/cuda:9.0-base' locally
9.0-base: Pulling from nvidia/cuda
be8ec4e48d7f: Pull complete
33b8b485aff0: Pull complete
d887158cc58c: Pull complete
05895bb28c18: Pull complete
84ba571a9830: Pull complete
dcfa08e04229: Pull complete
0ae7d70a879e: Pull complete
Digest: sha256:0ae476107c47e56c258414c46300ae3fdca03a3bb955665d40a4dde1e5a60ad4
Status: Downloaded newer image for nvidia/cuda:9.0-base
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "nvidia-smi": executable file not found in $PATH: unknown.

Is nvidia-smi simply not there?

## Regrouping with a benchmark

The following command:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

It worked (I am grinning from ear to ear).

$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Unable to find image 'nvcr.io/nvidia/k8s/cuda-sample:nbody' locally
nbody: Pulling from nvidia/k8s/cuda-sample
22dc81ace0ea: Pull complete
1a8b3c87dba3: Pull complete
91390a1c435a: Pull complete
07844b14977e: Pull complete
b78396653dae: Pull complete
95e837069dfa: Pull complete
fef4aadda783: Pull complete
343234bd5cf3: Pull complete
d1e57bfda6f0: Pull complete
c67b413dfc79: Pull complete
529d6d22ae9f: Pull complete
d3a7632db2b3: Pull complete
4a28a573fcc2: Pull complete
71a88f11fc6a: Pull complete
11019d591d86: Pull complete
10f906646436: Pull complete
9b617b771963: Pull complete
6515364916d7: Pull complete
Digest: sha256:aaca690913e7c35073df08519f437fa32d4df59a89ef1e012360fbec46524ec8
Status: Downloaded newer image for nvcr.io/nvidia/k8s/cuda-sample:nbody
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 7.5 is undefined.  Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2070" with compute capability 7.5

> Compute 7.5 CUDA device: [GeForce RTX 2070]
36864 bodies, total time for 10 iterations: 58.368 ms
= 232.824 billion interactions per second
= 4656.487 single-precision GFLOP/s at 20 flops per interaction
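
As a sanity check, the last three lines of that output can be recomputed by hand. This sketch assumes the sample counts every body interacting with every body (N * N) in each iteration, which is my reading of the numbers rather than something the log states. `nbody_rates` is a hypothetical helper.

```shell
# Hypothetical helper: recompute the nbody throughput figures.
# Assumes every body interacts with every body (N * N) in each iteration.
nbody_rates() {
    # $1 = bodies, $2 = iterations, $3 = elapsed milliseconds
    awk -v n="$1" -v it="$2" -v ms="$3" 'BEGIN {
        per_sec = n * n * it / (ms / 1000)      # interactions per second
        printf "%.1f billion interactions/s\n", per_sec / 1e9
        printf "%.1f GFLOP/s at 20 flops per interaction\n", per_sec * 20 / 1e9
    }'
}

nbody_rates 36864 10 58.368
```

This gives about 232.8 billion interactions per second and about 4656.5 GFLOP/s, which agrees with the reported figures once the rounding of the printed time is taken into account.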

# Thoughts

- It really worked.
- The only remaining problem is integration with VS Code.
