ollama/runner/ollamarunner
Latest commit: 6f7117145f by Michael Yang, 2025-09-15 14:33:06 -07:00
batch: use tensors for outputs (#12185)

This cleans up the model interface slightly without too much impact in
other areas.
File             Last commit message                                               Date
cache.go         llm: Clamp batch size to context size                             2025-09-08 20:40:11 -07:00
cache_test.go    embedding gemma model (#12181)                                    2025-09-04 09:09:07 -07:00
multimodal.go    ml: Panic rather than return error on tensor allocation failure   2025-05-22 14:38:09 -07:00
runner.go        batch: use tensors for outputs (#12185)                           2025-09-15 14:33:06 -07:00
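
The commit summary above describes moving the batch's output selection into a tensor rather than handling it as a separate index slice. The following is a rough, self-contained sketch of that general idea only; the Tensor, Batch, and Rows names are invented for illustration and do not reflect ollama's actual runner.go or ml package APIs.

    // Illustrative sketch only: toy types, not ollama's real model interface.
    package main

    import "fmt"

    // Tensor is a stand-in for a backend tensor holding int32 row indices.
    type Tensor struct {
    	data []int32
    }

    // Batch carries the tokens to evaluate plus the output selection,
    // expressed here as a tensor instead of a raw index slice (hypothetical layout).
    type Batch struct {
    	Inputs  []int32
    	Outputs Tensor // rows of the logits to keep
    }

    // Rows mimics a gather: it keeps only the selected rows of a logits matrix,
    // so the caller never has to translate indices itself.
    func Rows(logits [][]float32, outputs Tensor) [][]float32 {
    	kept := make([][]float32, 0, len(outputs.data))
    	for _, row := range outputs.data {
    		kept = append(kept, logits[row])
    	}
    	return kept
    }

    func main() {
    	// Pretend the model produced one logit row per input token.
    	logits := [][]float32{{0.1, 0.9}, {0.7, 0.3}, {0.2, 0.8}}

    	// Only the last token's logits are needed for sampling.
    	batch := Batch{
    		Inputs:  []int32{11, 42, 7},
    		Outputs: Tensor{data: []int32{2}},
    	}

    	fmt.Println(Rows(logits, batch.Outputs)) // [[0.2 0.8]]
    }

In this toy version the selection travels with the batch as a single tensor-like value, which keeps the model-facing surface smaller, roughly the kind of cleanup the commit message hints at.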