ollama

Jesse Gross 29ddfc2cab	2025-09-10 16:40:45 -07:00
ggml: Disable flash attention for gemma2

Our new engine implementation of gemma2 doesn't support flash attention, which means that it also doesn't support KV cache quantization. Currently, it is possible to turn these two on, which will result in a crash.

ggml	ggml: Disable flash attention for gemma2	2025-09-10 16:40:45 -07:00
gguf	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
util/bufioutil	next ollama runner (#7913)	2025-02-13 16:31:21 -08:00
config.go	add new gemma model (#11204)	2025-06-25 21:47:09 -07:00