ollama/template
Devon Rifkin 5f57b0ef42
add thinking support to the api and cli (#10584)
- Both `/api/generate` and `/api/chat` now accept a `"think"`
  option that allows specifying whether thinking mode should be on or
  not
- Templates get passed this new option so, e.g., qwen3's template can
  put `/think` or `/no_think` in the system prompt depending on the
  value of the setting
- Models' thinking support is inferred by inspecting model templates.
  The prefix and suffix the parser uses to identify thinking support are
  also automatically inferred from templates
- Thinking control & parsing is opt-in via the API to prevent breaking
  existing API consumers. If the `"think"` option is not specified, the
  behavior is unchanged from previous versions of ollama
- Add parsing for thinking blocks in both streaming/non-streaming mode
  in both `/generate` and `/chat`
- Update the CLI to make use of these changes. Users can pass `--think`
  or `--think=false` to control thinking, or during an interactive
  session they can use the commands `/set think` or `/set nothink`
- A `--hidethinking` option has also been added to the CLI. This makes
  it easy to use thinking in scripting scenarios like
  `ollama run qwen3 --think --hidethinking "my question here"` where you
  just want to see the answer but still want the benefits of thinking
  models
2025-05-28 19:38:52 -07:00
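
To make the opt-in behavior described in the commit message concrete, here is a minimal Go sketch, not the code from this commit. It assumes a request body with `model`, `messages`, and an optional `think` field, and it assumes the thinking block is delimited by `<think>` and `</think>` (the commit says the real prefix and suffix are inferred from each model's template). The helper names `buildChatBody` and `splitThinking` are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// message and chatRequest mirror only the fields this sketch needs.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
	// A nil pointer omits "think" from the JSON entirely, which models the
	// "not specified, behavior unchanged" case. A pointer to false is still sent.
	Think *bool `json:"think,omitempty"`
}

// buildChatBody (hypothetical helper) encodes a request body for /api/chat.
func buildChatBody(model, prompt string, think *bool) (string, error) {
	b, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: prompt}},
		Think:    think,
	})
	return string(b), err
}

// splitThinking (hypothetical helper) separates a leading thinking block from
// the answer in a complete, non-streamed response. The tags are passed in
// because they differ per model; "<think>" and "</think>" are assumptions.
func splitThinking(text, openTag, closeTag string) (thinking, answer string) {
	rest := strings.TrimLeft(text, " \n\t")
	if !strings.HasPrefix(rest, openTag) {
		return "", text // no thinking block: return the text untouched
	}
	rest = rest[len(openTag):]
	end := strings.Index(rest, closeTag)
	if end < 0 {
		// Unterminated block: treat everything after the open tag as thinking.
		return strings.TrimSpace(rest), ""
	}
	return strings.TrimSpace(rest[:end]), strings.TrimSpace(rest[end+len(closeTag):])
}

func main() {
	on := true
	body, err := buildChatBody("qwen3", "my question here", &on)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)

	thinking, answer := splitThinking("<think>check units</think>\n42", "<think>", "</think>")
	fmt.Printf("thinking=%q answer=%q\n", thinking, answer)
}
```

Using a `*bool` rather than a plain `bool` lets a client distinguish "thinking off" from "option not sent", which is the distinction the opt-in design relies on. A streaming parser would need to buffer partial tags across chunks, which this sketch does not attempt.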
testdata templates: add autotemplate for gemma3 (#9880) 2025-03-20 00:15:30 -07:00
alfred.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
alfred.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
alpaca.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
alpaca.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
chatml.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
chatml.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
chatqa.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
chatqa.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
codellama-70b-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
codellama-70b-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
command-r.gotmpl convert: import support for command-r models from safetensors (#6063) 2025-01-15 16:31:22 -08:00
command-r.json convert: import support for command-r models from safetensors (#6063) 2025-01-15 16:31:22 -08:00
falcon-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
falcon-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
gemma-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
gemma-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
gemma3-instruct.gotmpl templates: add autotemplate for gemma3 (#9880) 2025-03-20 00:15:30 -07:00
gemma3-instruct.json templates: add autotemplate for gemma3 (#9880) 2025-03-20 00:15:30 -07:00
granite-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
granite-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
index.json templates: add autotemplate for gemma3 (#9880) 2025-03-20 00:15:30 -07:00
llama2-chat.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
llama2-chat.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
llama3-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
llama3-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
magicoder.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
magicoder.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
mistral-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
mistral-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
openchat.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
openchat.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
phi-3.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
phi-3.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
solar-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
solar-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
starcoder2-instruct.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
starcoder2-instruct.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
template.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
template_test.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
vicuna.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
vicuna.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
zephyr.gotmpl update templates to use messages 2024-08-27 15:44:04 -07:00
zephyr.json autodetect stop parameters from template 2024-07-12 16:01:23 -07:00