GPU available on the VM but not on the Kubernetes node
On the VM itself, the NVIDIA driver sees a Tesla T4 GPU:
$ cat /proc/driver/nvidia/gpus/0000\:00\:07.0/information
Model: Tesla T4
IRQ: 10
GPU UUID: GPU-01fd306e-ac2f-6c63-329a-c0f60196601e
Video BIOS: 90.04.38.00.03
Bus Type: PCIe
DMA Size: 47 bits
DMA Mask: 0x7fffffffffff
Bus Location: 0000:00:07.0
Device Minor: 0
GPU Excluded: No
However, the Kubernetes node reports a capacity of 0 GPUs:
Capacity:
  cpu: 4
  ephemeral-storage: 66573292Ki
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 15999492Ki
  nvidia.com/gpu: 0
  pods: 110
This happens on two nodes: ml-prod-2022-07-01-t4-kzd3ozfhuqbz-node-2 and ml-prod-2022-07-01-t4-kzd3ozfhuqbz-node-3.
SELinux is disabled, and nvidia-gpu-device-plugin is running without reporting any errors.
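To compare the affected nodes quickly, the GPU line can be pulled out of the node description. This is only a minimal sketch: it assumes the Capacity block above came from `kubectl describe node` (with the usual indentation under each section header) and that `awk` is available. `gpu_capacity` is a hypothetical helper name, not something kubectl provides.

```shell
# Hypothetical helper: read `kubectl describe node` output on stdin and print
# the nvidia.com/gpu value from the Capacity section only (not Allocatable).
# Prints "absent" if the Capacity section has no nvidia.com/gpu entry.
gpu_capacity() {
  awk '
    /^Capacity:/              { in_cap = 1; next }   # start of the Capacity section
    in_cap && /^[^[:space:]]/ { in_cap = 0 }         # next unindented header ends it
    in_cap && $1 == "nvidia.com/gpu:" { print $2; found = 1; exit }
    END { if (!found) print "absent" }
  '
}

# Usage (node names taken from the question):
#   kubectl describe node ml-prod-2022-07-01-t4-kzd3ozfhuqbz-node-2 | gpu_capacity
#   kubectl describe node ml-prod-2022-07-01-t4-kzd3ozfhuqbz-node-3 | gpu_capacity
```

Reporting "absent" separately from "0" is a deliberate choice in this sketch, since a missing key and a key advertised with zero devices may point at different causes.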