ollama/llm
Jesse Gross 8bf38552de llm: Prefer dedicated GPUs over iGPUs when allocating memory
We currently assign model layers to GPUs according to free VRAM,
which assumes that GPU performance is roughly equal. This does not
work well for mixed dGPU and iGPU systems because iGPUs typically
use system memory, which is large, while their performance is slow.
This change instead assigns layers to dGPUs first and then to iGPUs.

In the future, this could be generalized to a more fine-grained
notion of GPU performance, but the dGPU vs. iGPU difference is the
most extreme case.
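
The ordering described in the commit message can be illustrated with a minimal sketch. This is not the code in server.go: the `gpu` type, its fields, and both function names are hypothetical, and the sketch assumes every layer has the same size, which a real scheduler would not. It only shows the idea of sorting dedicated GPUs ahead of integrated ones and then filling devices in that order.

```go
package main

import "sort"

// gpu is a hypothetical, simplified device description used only in this sketch.
type gpu struct {
	ID         string
	Integrated bool   // true for an iGPU that shares system memory
	FreeBytes  uint64 // free memory reported for the device
}

// orderForAllocation returns a sorted copy of gpus: dedicated GPUs first,
// then integrated GPUs, with more free memory first inside each group.
// The input slice is left unmodified.
func orderForAllocation(gpus []gpu) []gpu {
	out := append([]gpu(nil), gpus...)
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].Integrated != out[j].Integrated {
			return !out[i].Integrated // the dedicated GPU sorts first
		}
		return out[i].FreeBytes > out[j].FreeBytes
	})
	return out
}

// assignLayers greedily places layerCount layers of layerBytes each
// (an assumed uniform size) onto the GPUs in allocation order. It returns
// the number of layers per GPU ID and the number of layers that did not fit.
func assignLayers(gpus []gpu, layerCount int, layerBytes uint64) (map[string]int, int) {
	placed := make(map[string]int)
	remaining := layerCount
	if layerBytes == 0 || layerCount <= 0 {
		return placed, remaining
	}
	for _, g := range orderForAllocation(gpus) {
		if remaining == 0 {
			break
		}
		fit := g.FreeBytes / layerBytes // whole layers that fit on this device
		if fit > uint64(remaining) {
			fit = uint64(remaining)
		}
		if fit > 0 {
			placed[g.ID] = int(fit)
			remaining -= int(fit)
		}
	}
	return placed, remaining
}
```

With an 8 GiB dGPU and an iGPU reporting 32 GiB free, ten 1 GiB layers would put eight on the dGPU and only the remaining two on the iGPU, whereas ordering purely by free memory would have placed all ten on the iGPU.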
2025-11-11 13:11:08 -08:00
llm_darwin.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_linux.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_windows.go win: lint fix (#10571) 2025-05-05 11:08:12 -07:00
server.go llm: Prefer dedicated GPUs over iGPUs when allocating memory 2025-11-11 13:11:08 -08:00
server_test.go llm: Prefer dedicated GPUs over iGPUs when allocating memory 2025-11-11 13:11:08 -08:00
status.go logs: catch rocm errors (#12888) 2025-10-31 09:54:25 -07:00