ollama/server
Michael Yang 3f6642f6fc
model: implement bert in ollama engine (#9080)
* fix truncate

* s/SentencePieceModel/SentencePiece/

* bert

* wordpiece

* refactor pooling

* more tokenizers

* normalize embeddings
2025-09-15 15:35:59 -07:00
internal chore: fix some inconsistent function name in comment 2025-08-13 09:50:27 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create.go remove support for multiple ggufs in a single file (#10722) 2025-05-21 13:55:31 -07:00
create_test.go server: validate local path on safetensor create (#9379) 2025-02-28 16:10:43 -08:00
download.go server: abort download on empty digest 2025-05-27 11:28:48 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go update vendored llama.cpp and ggml (#11823) 2025-08-14 14:42:58 -07:00
images_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go tools: refactor tool call parsing and enable streaming (#10415) 2025-05-23 14:19:31 -07:00
modelpath.go server: add hint to the error message when model path access fails (#10843) 2025-05-24 13:17:04 -07:00
modelpath_test.go lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
prompt.go fix(openai): handle reasoning_effort (#11868) 2025-08-12 11:02:01 -07:00
prompt_test.go gpt-oss (#11672) 2025-08-05 12:21:16 -07:00
quantization.go skip quantizing per_layer_token_embd (#11207) 2025-06-26 21:49:35 -07:00
quantization_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
routes.go model: implement bert in ollama engine (#9080) 2025-09-15 15:35:59 -07:00
routes_create_test.go Move quantization to new backend (#10363) 2025-05-06 11:20:48 -07:00
routes_debug_test.go server: add debug option for printing out prompt instead of calling model 2025-08-15 13:52:50 -07:00
routes_delete_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_generate_test.go server: skip parsing initial <think> if provided in the prompt (#12024) 2025-08-22 12:00:16 -07:00
routes_harmony_streaming_test.go Revert "runner: move harmony to runner (#12052)" 2025-09-12 20:40:14 -03:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go server: use slices.Equal to simplify code (#11502) 2025-07-23 14:25:39 -07:00
sched.go llm: New memory management 2025-08-14 15:24:01 -07:00
sched_test.go llm: New memory management 2025-08-14 15:24:01 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00