ollama/envconfig
Jesse Gross fdb109469f llm: Allow overriding flash attention setting
As we automatically enable flash attention for more models, there are likely some cases where we get it wrong. This allows setting OLLAMA_FLASH_ATTENTION=0 to disable it, even for models where it would normally be enabled.
2025-10-02 12:07:20 -07:00
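The commit message describes an environment-variable override for a per-model default. A minimal Go sketch of that pattern follows; the helper name flashAttentionOverride and its fallback behavior are illustrative assumptions, not the actual contents of config.go:

	package main

	import (
		"fmt"
		"os"
		"strconv"
	)

	// flashAttentionOverride returns the model's default unless
	// OLLAMA_FLASH_ATTENTION is set, in which case the env var wins
	// (e.g. OLLAMA_FLASH_ATTENTION=0 forces flash attention off).
	// Hypothetical helper for illustration only.
	func flashAttentionOverride(modelDefault bool) bool {
		raw, ok := os.LookupEnv("OLLAMA_FLASH_ATTENTION")
		if !ok {
			return modelDefault
		}
		enabled, err := strconv.ParseBool(raw)
		if err != nil {
			// Unparseable values fall back to the model default.
			return modelDefault
		}
		return enabled
	}

	func main() {
		// With OLLAMA_FLASH_ATTENTION=0 in the environment, this
		// prints false even though the model default is true.
		fmt.Println(flashAttentionOverride(true))
	}

Reading the variable with strconv.ParseBool accepts the usual boolean spellings (0/1, true/false), so the "=0 to disable" behavior in the commit message falls out naturally.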
config.go       llm: Allow overriding flash attention setting   2025-10-02 12:07:20 -07:00
config_test.go  feat: add trace log level (#10650)               2025-05-12 11:43:00 -07:00