ollama/discover
Jesse Gross 06b7ee7781 discover: Disable flash attention for Jetson Xavier (CC 7.2)
GGML picks the wrong kernel and these systems fail with:
Sep 28 22:25:39 xavier ollama[48999]: //ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu:437:
ERROR: CUDA kernel flash_attn_ext_f16 has no device code compatible with CUDA arch 720. ggml-cuda.cu
was compiled for: __CUDA_ARCH_LIST__

Fixes #12442
2025-10-07 14:09:01 -07:00
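The check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in gpu.go or types.go: the `gpuInfo` struct, its fields, and `flashAttentionSupported` are hypothetical names. The only fact taken from the commit message is that CUDA devices with compute capability 7.2 (Jetson Xavier) should not use flash attention.

```go
package main

import "fmt"

// gpuInfo is a hypothetical stand-in for the device description used
// during discovery; the real type in this package may look different.
type gpuInfo struct {
	Library      string // assumed backend label, e.g. "CUDA"
	ComputeMajor int    // compute capability major version
	ComputeMinor int    // compute capability minor version
}

// flashAttentionSupported reports whether flash attention should be
// enabled for the given device. Per the commit message, CUDA devices
// with compute capability 7.2 are excluded because the flash attention
// kernel has no device code compatible with arch 720.
func flashAttentionSupported(g gpuInfo) bool {
	if g.Library == "CUDA" && g.ComputeMajor == 7 && g.ComputeMinor == 2 {
		return false
	}
	// Assumption for this sketch: every other device is left enabled.
	return true
}

func main() {
	xavier := gpuInfo{Library: "CUDA", ComputeMajor: 7, ComputeMinor: 2}
	other := gpuInfo{Library: "CUDA", ComputeMajor: 8, ComputeMinor: 6}
	fmt.Println(flashAttentionSupported(xavier))
	fmt.Println(flashAttentionSupported(other))
}
```

Matching on the exact major and minor pair keeps the exclusion narrow, so other 7.x devices are unaffected in this sketch.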
cpu_linux.go Use runners for GPU discovery (#12090) 2025-10-01 15:12:32 -07:00
cpu_linux_test.go Use runners for GPU discovery (#12090) 2025-10-01 15:12:32 -07:00
cpu_windows.go Use runners for GPU discovery (#12090) 2025-10-01 15:12:32 -07:00
cpu_windows_test.go Use runners for GPU discovery (#12090) 2025-10-01 15:12:32 -07:00
gpu.go discover: Disable flash attention for Jetson Xavier (CC 7.2) 2025-10-07 14:09:01 -07:00
gpu_darwin.go Use runners for GPU discovery (#12090) 2025-10-01 15:12:32 -07:00
gpu_info_darwin.h Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
gpu_info_darwin.m Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
path.go Re-remove cuda v11 (#10694) 2025-06-23 14:07:00 -07:00
runner.go discovery: prevent dup OLLAMA_LIBRARY_PATH (#12514) 2025-10-06 14:36:44 -07:00
runner_test.go Use runners for GPU discovery (#12090) 2025-10-01 15:12:32 -07:00
types.go discover: Disable flash attention for Jetson Xavier (CC 7.2) 2025-10-07 14:09:01 -07:00