title | updated | created |
---|---|---|
nvidia | 2022-04-03 11:44:16Z | 2021-05-04 14:58:11Z |
NVIDIA
show installed video drivers
nvidia-smi
list installed hw
lspci | grep -i nvidia
sudo lshw -numeric -C display
find NVIDIA modules
find /usr/lib/modules -name nvidia.ko
Settings
nvidia-settings
run
nvidia-smi
nvidia-smi -L
nvidia-smi -l n # run every n seconds
monitoring nvidia
https://github.com/fbcotter/py3nvml
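For quick monitoring without extra packages, nvidia-smi can also print selected fields as CSV. The sketch below turns the "used, total" memory columns into a percentage per GPU. The helper name gpu_mem_percent is made up for this note, and the input is assumed to be in the shape produced by --format=csv,noheader,nounits.

```shell
# Print memory use per GPU as a whole-number percentage.
# Reads lines of "used, total" (MiB) on stdin, one line per GPU.
# gpu_mem_percent is a hypothetical helper, not part of nvidia-smi.
gpu_mem_percent() {
    awk -F',' '($2 + 0) > 0 { printf "%d\n", $1 * 100 / $2 }'
}

# On a machine with the driver installed:
# nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | gpu_mem_percent
```

Adding -l n to the nvidia-smi call repeats the query every n seconds, same as above.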
Error: "successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero". Fix: on the host, change the -1 in the device's numa_node file to 0.
/sys/bus/pci/devices/0000:2b:00.0/numa_node
for a in /sys/bus/pci/devices/*; do echo 0 | sudo tee -a "$a/numa_node"; done
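The loop above writes 0 to every PCI device. A narrower variant, sketched here, only touches NVIDIA devices (PCI vendor ID 0x10de) whose numa_node is currently -1. The function name fix_numa_nodes and its optional directory argument are assumptions for illustration; the argument is there so the function can be tried on a copy of the directory layout first.

```shell
# Set numa_node to 0 for NVIDIA PCI devices that report -1.
# fix_numa_nodes is a hypothetical helper; $1 defaults to the real sysfs path.
fix_numa_nodes() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # skip entries without the two files we need
        [ -r "$dev/vendor" ] && [ -f "$dev/numa_node" ] || continue
        # 0x10de is NVIDIA's PCI vendor ID
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        if [ "$(cat "$dev/numa_node")" = "-1" ]; then
            echo 0 > "$dev/numa_node"
            echo "set $dev/numa_node to 0"
        fi
    done
}
```

Writing to the real /sys path needs root, so the function has to run in a root shell or from a script started with sudo.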
set the NUMA value at boot
sudo crontab -e
sudo VISUAL=vi crontab -e
# Add the following line
@reboot (echo 0 | tee -a "/sys/bus/pci/devices/<PCI_ID>/numa_node")
start docker with --gpus=all every time, otherwise these errors appear
failed call to cuInit: UNKNOWN ERROR (-1)
no NVIDIA GPU device is present: /dev/nvidia0 does not exist
docker run -it -p 8888:8888 --gpus=all tensorflow/tensorflow:latest-gpu-jupyter
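To check that a container actually sees the GPU, a one-off run can ask TensorFlow to list its GPU devices. This is a sketch: gpu_check is a made-up helper name, and the default image is simply the one used above. An empty list ([]) in the output would suggest the container was started without GPU access.

```shell
# Run a throwaway container and print the GPUs TensorFlow can see.
# gpu_check is a hypothetical helper; $1 optionally overrides the image.
gpu_check() {
    image="${1:-tensorflow/tensorflow:latest-gpu-jupyter}"
    docker run --rm --gpus=all "$image" \
        python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
}

# Usage:
# gpu_check
# gpu_check tensorflow/tensorflow:latest-gpu
```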
update NVIDIA drivers
sudo ubuntu-drivers autoinstall