# GPU

## Nvidia
Ollama supports Nvidia GPUs with compute capability 5.0+.
Check your card's compute capability to see if it is supported: https://developer.nvidia.com/cuda-gpus
| Compute Capability | Family | Cards |
|---|---|---|
| 9.0 | NVIDIA | `H100` |
| 8.9 | GeForce RTX 40xx | `RTX 4090` `RTX 4080 SUPER` `RTX 4080` `RTX 4070 Ti SUPER` `RTX 4070 Ti` `RTX 4070 SUPER` `RTX 4070` `RTX 4060 Ti` `RTX 4060` |
| | NVIDIA Professional | `L4` `L40` `RTX 6000` |
| 8.6 | GeForce RTX 30xx | `RTX 3090 Ti` `RTX 3090` `RTX 3080 Ti` `RTX 3080` `RTX 3070 Ti` `RTX 3070` `RTX 3060 Ti` `RTX 3060` `RTX 3050 Ti` `RTX 3050` |
| | NVIDIA Professional | `A40` `RTX A6000` `RTX A5000` `RTX A4000` `RTX A3000` `RTX A2000` `A10` `A16` `A2` |
| 8.0 | NVIDIA | `A100` `A30` |
| 7.5 | GeForce GTX/RTX | `GTX 1650 Ti` `TITAN RTX` `RTX 2080 Ti` `RTX 2080` `RTX 2070` `RTX 2060` |
| | NVIDIA Professional | `T4` `RTX 5000` `RTX 4000` `RTX 3000` `T2000` `T1200` `T1000` `T600` `T500` |
| | Quadro | `RTX 8000` `RTX 6000` `RTX 5000` `RTX 4000` |
| 7.0 | NVIDIA | `TITAN V` `V100` `Quadro GV100` |
| 6.1 | NVIDIA TITAN | `TITAN Xp` `TITAN X` |
| | GeForce GTX | `GTX 1080 Ti` `GTX 1080` `GTX 1070 Ti` `GTX 1070` `GTX 1060` `GTX 1050 Ti` `GTX 1050` |
| | Quadro | `P6000` `P5200` `P4200` `P3200` `P5000` `P4000` `P3000` `P2200` `P2000` `P1000` `P620` `P600` `P500` `P520` |
| | Tesla | `P40` `P4` |
| 6.0 | NVIDIA | `Tesla P100` `Quadro GP100` |
| 5.2 | GeForce GTX | `GTX TITAN X` `GTX 980 Ti` `GTX 980` `GTX 970` `GTX 960` `GTX 950` |
| | Quadro | `M6000 24GB` `M6000` `M5000` `M5500M` `M4000` `M2200` `M2000` `M620` |
| | Tesla | `M60` `M40` |
| 5.0 | GeForce GTX | `GTX 750 Ti` `GTX 750` `NVS 810` |
| | Quadro | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
### GPU Selection
If you have multiple NVIDIA GPUs in your system and want to limit Ollama to a
subset, you can set `CUDA_VISIBLE_DEVICES` to a comma-separated list of GPUs.
Numeric IDs may be used; however, ordering may vary, so UUIDs are more reliable.
You can discover the UUIDs of your GPUs by running `nvidia-smi -L`. If you want to
ignore the GPUs and force CPU usage, use an invalid GPU ID (e.g., "-1").
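For example, a minimal sketch of pinning Ollama to one GPU by UUID (the UUID below is a placeholder; substitute one reported by `nvidia-smi -L` on your system):

```shell
# List GPUs and their UUIDs
nvidia-smi -L

# Restrict Ollama to a single GPU by UUID (placeholder value), then start the server
export CUDA_VISIBLE_DEVICES=GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
ollama serve

# Or ignore all GPUs and force CPU usage with an invalid ID
CUDA_VISIBLE_DEVICES=-1 ollama serve
```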
### Laptop Suspend Resume
On Linux, after a suspend/resume cycle, Ollama can sometimes fail to discover
your NVIDIA GPU and fall back to running on the CPU. You can work around this
driver bug by reloading the NVIDIA UVM driver with `sudo rmmod nvidia_uvm && sudo modprobe nvidia_uvm`.
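A sketch of the reload sequence; note that `rmmod` will fail if a process is still using the GPU, so stop the Ollama server first if it is running:

```shell
# Reload the NVIDIA UVM kernel module after a failed resume
sudo rmmod nvidia_uvm && sudo modprobe nvidia_uvm
```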
## AMD Radeon
Ollama supports the following AMD GPUs:
### Linux Support
| Family | Cards and accelerators |
|---|---|
| AMD Radeon RX | `7900 XTX` `7900 XT` `7900 GRE` `7800 XT` `7700 XT` `7600 XT` `7600` `6950 XT` `6900 XTX` `6900XT` `6800 XT` `6800` `Vega 64` `Vega 56` |
| AMD Radeon PRO | `W7900` `W7800` `W7700` `W7600` `W7500` `W6900X` `W6800X Duo` `W6800X` `W6800` `V620` `V420` `V340` `V320` `Vega II Duo` `Vega II` `VII` `SSG` |
| AMD Instinct | `MI300X` `MI300A` `MI300` `MI250X` `MI250` `MI210` `MI200` `MI100` `MI60` `MI50` |
### Windows Support
With ROCm v6.1, the following GPUs are supported on Windows.
| Family | Cards and accelerators |
|---|---|
| AMD Radeon RX | `7900 XTX` `7900 XT` `7900 GRE` `7800 XT` `7700 XT` `7600 XT` `7600` `6950 XT` `6900 XTX` `6900XT` `6800 XT` `6800` |
| AMD Radeon PRO | `W7900` `W7800` `W7700` `W7600` `W7500` `W6900X` `W6800X Duo` `W6800X` `W6800` `V620` |
### Overrides on Linux
Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
some cases you can force the system to try a similar LLVM target that is
close. For example, the Radeon RX 5400 is `gfx1034` (also known as 10.3.4);
however, ROCm does not currently support this target. The closest supported
target is `gfx1030`. You can use the environment variable `HSA_OVERRIDE_GFX_VERSION`
with `x.y.z` syntax. So, for example, to force the system to run on the RX 5400, you
would set `HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable for the
server. If you have an unsupported AMD GPU, you can experiment using the list of
supported types below.

If you have multiple GPUs with different GFX versions, append the numeric device
number to the environment variable to set them individually, for example
`HSA_OVERRIDE_GFX_VERSION_0=10.3.0` and `HSA_OVERRIDE_GFX_VERSION_1=11.0.0`, as
shown in the sketch below.
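A minimal sketch of applying the override when starting the server by hand (if you run Ollama as a systemd service, set the variable in the service's environment instead):

```shell
# Treat the RX 5400 (gfx1034) as the supported gfx1030 target
export HSA_OVERRIDE_GFX_VERSION="10.3.0"
ollama serve
```

With multiple GPUs of different GFX versions, the indexed form would look like:

```shell
# Per-device overrides: device 0 mapped to gfx1030, device 1 to gfx1100
export HSA_OVERRIDE_GFX_VERSION_0="10.3.0"
export HSA_OVERRIDE_GFX_VERSION_1="11.0.0"
ollama serve
```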
At this time, the known supported GPU types on Linux are the following LLVM targets. This table shows some example GPUs that map to these LLVM targets:
| LLVM Target | An Example GPU | 
|---|---|
| gfx900 | Radeon RX Vega 56 | 
| gfx906 | Radeon Instinct MI50 | 
| gfx908 | Radeon Instinct MI100 | 
| gfx90a | Radeon Instinct MI210 | 
| gfx940 | Radeon Instinct MI300 | 
| gfx941 | |
| gfx942 | |
| gfx1030 | Radeon PRO V620 | 
| gfx1100 | Radeon PRO W7900 | 
| gfx1101 | Radeon PRO W7700 | 
| gfx1102 | Radeon RX 7600 | 
AMD is working on enhancing ROCm v6 to broaden support for more families of GPUs in a future release.
Reach out on Discord or file an issue for additional help.
### GPU Selection
If you have multiple AMD GPUs in your system and want to limit Ollama to a
subset, you can set `ROCR_VISIBLE_DEVICES` to a comma-separated list of GPUs.
You can see the list of devices with `rocminfo`. If you want to ignore the GPUs
and force CPU usage, use an invalid GPU ID (e.g., "-1"). When available, use the
UUID to uniquely identify the device instead of the numeric value.
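A minimal sketch; device indices and the CPU-forcing trick mirror the NVIDIA case above:

```shell
# List AMD devices; each GPU appears as an agent, with its UUID when available
rocminfo

# Restrict Ollama to the first GPU, then start the server
export ROCR_VISIBLE_DEVICES=0
ollama serve

# Or ignore the GPUs and force CPU usage with an invalid ID
ROCR_VISIBLE_DEVICES=-1 ollama serve
```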
### Container Permission
In some Linux distributions, SELinux can prevent containers from
accessing the AMD GPU devices. On the host system, you can run
`sudo setsebool container_use_devices=1` to allow containers to use devices.
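A sketch of the host-side fix together with a typical container launch; the `ollama/ollama:rocm` image and the `/dev/kfd`/`/dev/dri` device paths are the ones commonly used for ROCm containers, so adjust them to your setup:

```shell
# Allow containers to access host devices (add -P to persist across reboots)
sudo setsebool container_use_devices=1

# Example: run the Ollama ROCm container with the AMD GPU devices mapped in
docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
```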
## Metal (Apple GPUs)
Ollama supports GPU acceleration on Apple devices via the Metal API.