Commit Graph

19 Commits

Author SHA1 Message Date
Devon Rifkin 5f57b0ef42
add thinking support to the api and cli (#10584)
- Both `/api/generate` and `/api/chat` now accept a `"think"`
  option that allows specifying whether thinking mode should be on or
  not
- Templates get passed this new option so, e.g., qwen3's template can
  put `/think` or `/no_think` in the system prompt depending on the
  value of the setting
- Models' thinking support is inferred by inspecting model templates.
  The prefix and suffix the parser uses to identify thinking support is
  also automatically inferred from templates
- Thinking control & parsing is opt-in via the API to prevent breaking
  existing API consumers. If the `"think"` option is not specified, the
  behavior is unchanged from previous versions of ollama
- Add parsing for thinking blocks in both streaming/non-streaming mode
  in both `/generate` and `/chat`
- Update the CLI to make use of these changes. Users can pass `--think`
  or `--think=false` to control thinking, or during an interactive
  session they can use the commands `/set think` or `/set nothink`
- A `--hidethinking` option has also been added to the CLI. This makes
  it easy to use thinking in scripting scenarios like
  `ollama run qwen3 --think --hidethinking "my question here"` where you
  just want to see the answer but still want the benefits of thinking
  models
2025-05-28 19:38:52 -07:00
Michael Yang 23125648b8
chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
Michael Yang 6e9a7a2568
lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
Jesse Gross 900f64e6be prompt: Don't trim whitespace from prompts
New lines can be an important part of a user's prompt and trimming
it can alter the results. We previously only trimmed prompts with
images but refactoring brought this behavior to all prompts, where
it became more noticable.

The /generate endpoint adds less whitespace and therefore doesn't
need to trim it out - this brings the same behavior to /chat.

Thanks to @gabe-l-hart for spotting the issue!

Fixes #7795
2024-12-09 11:02:55 -08:00
Jeffrey Morgan 8b4b243f5f
server: fix warnings in prompt_test.go (#7710) 2024-11-17 13:01:04 -08:00
Jesse Gross c826e57475 runner.go: Better abstract vision model integration
-Update mllama to take the cross attention state as embeddings in
a batch, more similar to how Llava handles it. This improves
integration with the input cache.
-Pass locations in a prompt for embeddings using tags similar to Llava.
-Abstract interface to vision models so the main runner accesses Clip
and Mllama similarly

Co-authored-by: Michael Yang <mxyng@pm.me>
2024-10-30 14:53:43 -07:00
Patrick Devine c7cb0f0602
image processing for llama3.2 (#6963)
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
2024-10-18 16:12:35 -07:00
Michael Yang b732beba6a lint 2024-08-01 17:06:06 -07:00
Michael Yang 4a565cbf94 add chat and generate tests with mock runner 2024-07-16 09:39:31 -07:00
Michael Yang d02bbebb11 tools 2024-07-15 15:26:16 -07:00
Michael Yang 22c5451fc2
fix system prompt (#5662)
* fix system prompt

* execute template when hitting previous roles

* fix tests

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2024-07-12 21:04:44 -07:00
Michael Yang 41be28096a add system prompt to first legacy template 2024-07-10 17:03:08 -07:00
Michael Yang 2c3fe1fd97 comments 2024-07-05 13:17:24 -07:00
Michael Yang 269ed6e6a2 update message processing 2024-07-05 13:16:58 -07:00
Michael Yang 58e3fff311 rename templates to template 2024-07-01 10:40:54 -07:00
Patrick Devine 1b272d5bcd
change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) 2024-03-26 13:04:17 -07:00
Michael Yang 0e19476b56
prepend image tags (#2789)
instead of appending image tags, prepend them - this generally produces better results
2024-02-29 11:30:14 -08:00
Bruce MacDonald 88622847c6
fix: chat system prompting overrides (#2542) 2024-02-16 14:42:43 -05:00
Jeffrey Morgan 48a273f80b
Fix issues with templating prompt in chat mode (#2460) 2024-02-12 15:06:57 -08:00