---
title: nvidea
updated: 2022-04-03 11:44:16Z
created: 2021-05-04 14:58:11Z
---
# NVIDIA
## show installed video drivers
nvidia-smi
[Latest drivers](https://www.nvidia.com/Download/index.aspx?lang=en-us)
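
To see just the GPU name and driver version without the full table, `nvidia-smi` supports query flags; a quick sketch:

```bash
# print GPU name and driver version for each GPU as CSV
nvidia-smi --query-gpu=name,driver_version --format=csv
```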
---
## list installed hw
lspci | grep -i nvidia
sudo lshw -numeric -C display
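
To confirm which kernel driver is actually bound to the card (nvidia vs. nouveau), `lspci` can show the driver in use; the bus ID `2b:00.0` below is only an example, take yours from the `lspci` output above:

```bash
# show the kernel driver currently in use for the GPU
lspci -k -s 2b:00.0
```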
## find NVIDIA modules
find /usr/lib/modules -name nvidia.ko
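
Whether the module is actually loaded (not just present on disk) can be checked with `lsmod`:

```bash
# list loaded NVIDIA kernel modules (nvidia, nvidia_drm, nvidia_uvm, ...)
lsmod | grep nvidia
```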
## Settings
nvidia-settings
## run
```bash
nvidia-smi      # show current GPU status
nvidia-smi -L   # list the installed GPUs
nvidia-smi -l n # run every n seconds
```
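
As an alternative to `nvidia-smi -l`, `watch` refreshes the output in place instead of appending it (plain coreutils, nothing NVIDIA-specific):

```bash
# refresh the nvidia-smi output every 2 seconds
watch -n 2 nvidia-smi
```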
## monitoring nvidia
https://github.com/fbcotter/py3nvml
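
py3nvml is a Python package; a minimal setup sketch, assuming pip is available in the active environment:

```bash
# install the Python NVML bindings
pip install py3nvml
```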
---
## "successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero" error
Fix: on the host, change the -1 to 0 in
/sys/bus/pci/devices/0000:2b:00.0/numa_node
for a in /sys/bus/pci/devices/*; do echo 0 | sudo tee -a $a/numa_node; done
[Source](https://stackoverflow.com/questions/44232898/memoryerror-in-tensorflow-and-successful-numa-node-read-from-sysfs-had-negativ)
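
To verify the change took effect (the PCI address 0000:2b:00.0 is the example from above; adjust to your GPU):

```bash
# should print 0 after the fix, -1 before
cat /sys/bus/pci/devices/0000:2b:00.0/numa_node
```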
---
## set numa value at start computer
```bash
sudo crontab -e
# Add the following line
@reboot (echo 0 | tee -a "/sys/bus/pci/devices/<PCI_ID>/numa_node")
```
[Source](https://askubuntu.com/questions/1379119/how-to-set-the-numa-node-for-an-nvidia-gpu-persistently)
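
The `<PCI_ID>` in the crontab line is the GPU's full PCI address; it can be looked up from `lspci`, whose bus ID is normally prefixed with the `0000:` domain in sysfs:

```bash
# find the GPU's bus ID, e.g. "2b:00.0"
lspci | grep -i nvidia
# the sysfs path then uses the domain-qualified form, e.g.
ls /sys/bus/pci/devices/0000:2b:00.0/
```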
---
## start Docker with --gpus=all every time, otherwise errors such as:
### failed call to cuInit: UNKNOWN ERROR (-1)
### no NVIDIA GPU device is present: /dev/nvidia0 does not exist
docker run -it -p 8888:8888 --gpus=all tensorflow/tensorflow:latest-gpu-jupyter
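
A quick way to check that the container actually sees the GPU (assuming the NVIDIA Container Toolkit is installed on the host):

```bash
# should print the same GPU table as on the host
docker run --rm --gpus=all tensorflow/tensorflow:latest-gpu-jupyter nvidia-smi
```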
---
## update NVIDIA drivers
sudo ubuntu-drivers autoinstall
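
To see which driver packages are detected and which one is recommended before installing:

```bash
# list detected devices and the recommended driver package
ubuntu-drivers devices
```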