ollama

Jesse Gross c116a7523d kvcache: Don't shift empty batches	2025-07-29 12:32:22 -07:00

When we context shift, we delete half the context and apply RoPE with an offset to the other half. We used to RoPE across the entire context in a single pass with a zero offset for the deleted section. With the change to shifting in batches, we can skip any batches where all of the offsets would be zero. This typically reduces the number of operations by half.

cache.go	ollamarunner: Preallocate worst case graph at startup	2025-04-08 10:01:28 -07:00
causal.go	kvcache: Don't shift empty batches	2025-07-29 12:32:22 -07:00
causal_test.go	ml: Panic rather than return error on tensor allocation failure	2025-05-22 14:38:09 -07:00
encoder.go	ollamarunner: Preallocate worst case graph at startup	2025-04-08 10:01:28 -07:00
wrapper.go	ollamarunner: Preallocate worst case graph at startup	2025-04-08 10:01:28 -07:00
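
The batch-skipping idea described in the latest commit message can be illustrated with a small, self-contained sketch. This is not the code in causal.go: the names `buildOffsets` and `shiftBatches`, the flat slice of per-cell positions, and the `apply` callback standing in for the actual RoPE operation are all assumptions made for illustration. It only shows the control flow, namely computing a per-cell offset (zero for cells that are deleted or do not move) and skipping any batch whose offsets are all zero.

```go
package main

// buildOffsets is a hypothetical helper. Given the sequence position stored in
// each cache cell and a discarded range [begin, end), it returns the position
// offset to apply to each cell. Cells at or after end move back by end-begin;
// deleted cells and cells before begin get a zero offset.
func buildOffsets(positions []int32, begin, end int32) []int32 {
	offsets := make([]int32, len(positions))
	for i, pos := range positions {
		if pos >= end {
			offsets[i] = begin - end
		}
	}
	return offsets
}

// shiftBatches is a hypothetical helper. It walks the offsets in chunks of
// batchSize and calls apply (a stand-in for the RoPE shift) only for chunks
// that contain at least one non-zero offset. It returns how many chunks were
// actually applied.
func shiftBatches(offsets []int32, batchSize int, apply func(start int, batch []int32)) int {
	if batchSize <= 0 {
		return 0
	}
	applied := 0
	for start := 0; start < len(offsets); start += batchSize {
		end := start + batchSize
		if end > len(offsets) {
			end = len(offsets)
		}
		batch := offsets[start:end]

		// Skip the batch if every offset in it is zero: there is nothing to shift.
		empty := true
		for _, off := range batch {
			if off != 0 {
				empty = false
				break
			}
		}
		if empty {
			continue
		}

		apply(start, batch)
		applied++
	}
	return applied
}
```

With eight cells at positions 0 through 7 and the first four discarded, only the second half has non-zero offsets, so a batch size of 2 applies two of the four batches. This matches the commit message's note that skipping empty batches typically halves the number of operations.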