ollama

Michael Yang d0b32def60 skip quantizing per_layer_token_embd (#11207): this tensor isn't compatible with cuda when quantized to q4_K so skip it		2025-06-26 21:49:35 -07:00
internal	cache: fix comment function name in cache.go (#11110)	2025-06-18 05:21:45 -07:00
auth.go	…
create.go	…
create_test.go	…
download.go	server: abort download on empty digest	2025-05-27 11:28:48 -07:00
fixblobs.go	…
fixblobs_test.go	…
images.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
images_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
layer.go	…
manifest.go	…
manifest_test.go	…
model.go	…
modelpath.go	…
modelpath_test.go	…
prompt.go	add thinking support to the api and cli (#10584)	2025-05-28 19:38:52 -07:00
prompt_test.go	add thinking support to the api and cli (#10584)	2025-05-28 19:38:52 -07:00
quantization.go	skip quantizing per_layer_token_embd (#11207)	2025-06-26 21:49:35 -07:00
quantization_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
routes.go	tools: loosen tool parsing to allow for more formats (#11030)	2025-06-12 14:18:54 -07:00
routes_create_test.go	…
routes_delete_test.go	…
routes_generate_test.go	add thinking support to the api and cli (#10584)	2025-05-28 19:38:52 -07:00
routes_list_test.go	…
routes_test.go	…
sched.go	Merge branch 'main' into drifkin/array-head-count-simple	2025-06-23 10:37:31 -07:00
sched_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
sparse_common.go	…
sparse_windows.go	…
upload.go	…