ollama/server
Daniel Hiltgen 20c3266e94
Reduce default parallelism to 1 (#11330)
The current scheduler algorithm of picking the paralellism based on available
VRAM complicates the upcoming dynamic layer memory allocation algorithm.  This
changes the default to 1, with the intent going forward that parallelism is
explicit and will no longer be dynamically determined.  Removal of the dynamic
logic will come in a follow up.
2025-07-08 12:08:37 -07:00
..
internal cache: fix comment function name in cache.go (#11110) 2025-06-18 05:21:45 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create.go remove support for multiple ggufs in a single file (#10722) 2025-05-21 13:55:31 -07:00
create_test.go server: validate local path on safetensor create (#9379) 2025-02-28 16:10:43 -08:00
download.go server: abort download on empty digest 2025-05-27 11:28:48 -07:00
fixblobs.go
fixblobs_test.go
images.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
images_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go tools: refactor tool call parsing and enable streaming (#10415) 2025-05-23 14:19:31 -07:00
modelpath.go server: add hint to the error message when model path access fails (#10843) 2025-05-24 13:17:04 -07:00
modelpath_test.go lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
prompt.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
prompt_test.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
quantization.go skip quantizing per_layer_token_embd (#11207) 2025-06-26 21:49:35 -07:00
quantization_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
routes.go API/CLI context enhancements (#11331) 2025-07-08 11:59:06 -07:00
routes_create_test.go Move quantization to new backend (#10363) 2025-05-06 11:20:48 -07:00
routes_delete_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_generate_test.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go fix: stream accumulator exits early (#10593) 2025-05-08 13:17:30 -07:00
sched.go Reduce default parallelism to 1 (#11330) 2025-07-08 12:08:37 -07:00
sched_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00