ollama/discover
Jesse Gross d5a0d8d904 llm: New memory management
This changes the memory allocation strategy from upfront estimation to
tracking actual allocations done by the engine and reacting to that. The
goal is to avoid issues caused by both under-estimation (crashing) and
over-estimation (low performance due to under-utilized GPUs).

It is currently opt-in and can be enabled for models running on the
Ollama engine by setting OLLAMA_NEW_ESTIMATES=1. Behavior in other
cases is unchanged and will continue to use the existing estimates.
2025-08-14 15:24:01 -07:00
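
The opt-in described in the commit message above can be pictured with a small Go sketch. This is an illustration only, not the code from the commit: the helper name newEstimatesEnabled, the injected lookup function, and the choice to treat unparseable values as "off" are all assumptions. Only the variable name OLLAMA_NEW_ESTIMATES and the value 1 come from the message.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// newEstimatesEnabled is a hypothetical helper (not Ollama's API). It reports
// whether OLLAMA_NEW_ESTIMATES is set to a value strconv.ParseBool reads as
// true, such as "1" or "true". Unset or unparseable values count as disabled;
// that fallback is an assumption of this sketch.
func newEstimatesEnabled(lookup func(string) (string, bool)) bool {
	v, ok := lookup("OLLAMA_NEW_ESTIMATES")
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(v)
	return err == nil && enabled
}

func main() {
	if newEstimatesEnabled(os.LookupEnv) {
		// Opt-in path: track what the engine actually allocates and react.
		fmt.Println("memory strategy: track actual allocations")
	} else {
		// Default path: behavior is unchanged, upfront estimates are used.
		fmt.Println("memory strategy: upfront estimates")
	}
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the check easy to exercise without touching the real process environment.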
amd_common.go next build (#8539) 2025-01-29 15:03:38 -08:00
amd_hip_windows.go Better support for AMD multi-GPU on linux (#7212) 2024-10-26 14:04:14 -07:00
amd_linux.go llm: New memory management 2025-08-14 15:24:01 -07:00
amd_windows.go next build (#8539) 2025-01-29 15:03:38 -08:00
cpu_common.go chore(all): replace instances of interface with any (#10067) 2025-04-02 09:44:27 -07:00
cuda_common.go Re-remove cuda v11 (#10694) 2025-06-23 14:07:00 -07:00
gpu.go discovery: fix cudart driver version (#11614) 2025-08-13 15:43:33 -07:00
gpu_darwin.go Revert "cgo: use O3" 2025-01-31 10:25:39 -08:00
gpu_info.h discover: fix compiler warnings (#10572) 2025-05-06 10:49:22 -07:00
gpu_info_cudart.c discovery: fix cudart driver version (#11614) 2025-08-13 15:43:33 -07:00
gpu_info_cudart.h discovery: fix cudart driver version (#11614) 2025-08-13 15:43:33 -07:00
gpu_info_darwin.h Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_info_darwin.m Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_info_nvcuda.c discover: fix compiler warnings (#10572) 2025-05-06 10:49:22 -07:00
gpu_info_nvcuda.h Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_info_nvml.c nvidia libs have inconsistent ordering (#7473) 2024-11-02 16:35:41 -07:00
gpu_info_nvml.h nvidia libs have inconsistent ordering (#7473) 2024-11-02 16:35:41 -07:00
gpu_info_oneapi.c Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_info_oneapi.h Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_linux.go discover: /proc/cpuinfo file open and close. (#9950) 2025-03-31 17:07:42 -07:00
gpu_linux_test.go Refine default thread selection for NUMA systems (#7322) 2024-10-30 15:05:45 -07:00
gpu_oneapi.go Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_test.go Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_windows.go all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
gpu_windows_test.go Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
path.go Re-remove cuda v11 (#10694) 2025-06-23 14:07:00 -07:00
types.go discover: CPU supports flash attention 2025-08-11 15:00:34 -07:00