ollama

Bruce MacDonald	fbe6ae285a	server: improve tensor quantization fallback logic (#10806)	2025-05-22 10:48:08 -07:00
Fall back to alternative quantization types when a tensor's dimensions aren't divisible by the block size required by the originally desired quantization type. If the retried quantization types also fail, the system ultimately falls back to F16 (half-precision floating point), which has a block size of 1 and can handle any tensor dimension.
internal	lint: enable usetesting, disable tenv (#10594)	2025-05-08 11:42:14 -07:00
testdata/tools	all: fix typos in documentation, code, and comments (#7021)	2024-12-10 12:58:06 -08:00
auth.go	…
create.go	remove support for multiple ggufs in a single file (#10722)	2025-05-21 13:55:31 -07:00
create_test.go	server: validate local path on safetensor create (#9379)	2025-02-28 16:10:43 -08:00
download.go	server: organize error types (#9465)	2025-03-28 11:50:22 -07:00
fixblobs.go	…
fixblobs_test.go	…
images.go	ggml: Seperate tensor load from backend creation	2025-05-19 09:54:22 -07:00
images_test.go	lint: enable usetesting, disable tenv (#10594)	2025-05-08 11:42:14 -07:00
layer.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
manifest.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
manifest_test.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
model.go	ggml: Seperate tensor load from backend creation	2025-05-19 09:54:22 -07:00
model_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
modelpath.go	server: organize error types (#9465)	2025-03-28 11:50:22 -07:00
modelpath_test.go	lint: enable usetesting, disable tenv (#10594)	2025-05-08 11:42:14 -07:00
prompt.go	chore: update mllama to use ollama engine (#10637)	2025-05-13 17:36:02 -07:00
prompt_test.go	chore: update mllama to use ollama engine (#10637)	2025-05-13 17:36:02 -07:00
quantization.go	server: improve tensor quantization fallback logic (#10806)	2025-05-22 10:48:08 -07:00
quantization_test.go	ggml: Seperate tensor load from backend creation	2025-05-19 09:54:22 -07:00
routes.go	chore: update mllama to use ollama engine (#10637)	2025-05-13 17:36:02 -07:00
routes_create_test.go	Move quantization to new backend (#10363)	2025-05-06 11:20:48 -07:00
routes_delete_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
routes_generate_test.go	lint: enable usetesting, disable tenv (#10594)	2025-05-08 11:42:14 -07:00
routes_list_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
routes_test.go	fix: stream accumulator exits early (#10593)	2025-05-08 13:17:30 -07:00
sched.go	chore: update mllama to use ollama engine (#10637)	2025-05-13 17:36:02 -07:00
sched_test.go	lint: enable usetesting, disable tenv (#10594)	2025-05-08 11:42:14 -07:00
sparse_common.go	…
sparse_windows.go	…
upload.go	server: always print upload/download part info (#8832)	2025-02-04 19:30:49 -08:00
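
The fallback behaviour described in the commit message for #10806 can be illustrated with a small Go sketch. This is not the code in quantization.go: the function name `pickQuantType`, the `fallbacks` retry order, and the choice to check a single row length are all assumptions made for illustration. Only the block sizes (256 for the K-quants, 32 for Q5_0 and Q8_0, 1 for F16) are standard GGML values.

```go
package main

import "fmt"

// blockSize lists the number of elements per quantization block for a
// few GGML tensor types. F16 has a block size of 1, so it fits any shape.
var blockSize = map[string]uint64{
	"Q4_K": 256,
	"Q6_K": 256,
	"Q5_0": 32,
	"Q8_0": 32,
	"F16":  1,
}

// fallbacks is a HYPOTHETICAL retry order used only for this sketch;
// the real order in quantization.go may differ.
var fallbacks = map[string][]string{
	"Q4_K": {"Q5_0"},
	"Q6_K": {"Q8_0"},
}

// pickQuantType (hypothetical helper) returns the first type, starting
// with the desired one, whose block size evenly divides rowLen. If none
// of the candidates fit, it falls back to F16.
func pickQuantType(desired string, rowLen uint64) (string, error) {
	if rowLen == 0 {
		return "", fmt.Errorf("row length must be positive")
	}
	candidates := append([]string{desired}, fallbacks[desired]...)
	for _, name := range candidates {
		bs, ok := blockSize[name]
		if !ok {
			return "", fmt.Errorf("unknown quantization type %q", name)
		}
		if rowLen%bs == 0 {
			return name, nil
		}
	}
	// Last resort: block size 1 divides every dimension.
	return "F16", nil
}
```

Under these assumptions, a row length of 4096 keeps the requested `Q4_K`, a row length of 96 is retried as `Q5_0` (96 is divisible by 32 but not by 256), and a row length of 100 ends up as `F16`.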