I am running Ollama on a dedicated server. It works, but I think it is using the CPU rather than the GPU (see the log messages below). How do I get it to use the GPU?
Following https://hub.docker.com/r/ollama/ollama, I did the following:
Install the NVIDIA Container Toolkit packages
sudo apt-get install -y nvidia-container-toolkit
Set up the NVIDIA runtime in Docker
sudo nvidia-ctk runtime configure --runtime=docker
Start Ollama container
docker run -d --network=host --restart always -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Run a model
docker exec ollama ollama run llama3.2
In the logs I found
level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
level=INFO source=gpu.go:386 msg="no compatible GPUs were discovered"
Running nvidia-smi on the host shows that the server has an NVIDIA RTX 4000 SFF Ada GPU.
You are missing the --gpus=all flag, which gives the container access to the GPU.
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
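To confirm the change took effect, you can scan the container logs for the message from the question. The helper below is only a rough sketch: the function name gpu_discovery_ok is made up for illustration, and it assumes the log text still contains the exact string "no compatible GPUs were discovered" when detection fails. It only checks that this message is absent, so an empty log would also pass.

```shell
# Hypothetical helper (not part of Ollama or Docker).
# Reads log text on stdin and succeeds (exit status 0) only when the
# "no compatible GPUs were discovered" message from the question is absent.
gpu_discovery_ok() {
    ! grep -q 'no compatible GPUs were discovered'
}
```

Because the container name ollama is already taken by the first container, remove it before re-running with the new flag (the named volume, and the models in it, are kept):
docker rm -f ollama
After starting the new container, the check could look like this, assuming the container is named ollama as above:
docker logs ollama 2>&1 | gpu_discovery_ok && echo "no GPU discovery failure in the logs"
If the toolkit is set up correctly, docker exec ollama nvidia-smi should also list the GPU from inside the container.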