Commit Graph

53 Commits

Author SHA1 Message Date
Devon Rifkin 47991940d4 add qwen3-coder tool support
The format qwen3-coder uses is relatively unique, both in rendering and
in parsing. To implement parsing, I wrote a custom parser in similar
style to harmony. For the rendering, I found that the logic would be
much more difficult to follow in a template, so I introduced the concept
of a built-in renderer that uses Go code, rather than a template, to
generate prompts.

I set us up for future built-in parsers and renderers by making it so
they can be specified in a Modelfile like so:

```
RENDERER "qwen3-coder"
PARSER "qwen3-coder"
```

These need to be provided explicitly because the architecture alone is
not enough to understand what format the model expects to receive, and
what format we expect it to output (e.g., qwen3-coder is `qwen3moe`,
which includes other qwen3-family models as well).

I haven't converted harmony to be one of these "built-ins" yet, since
some of it is in flux with the changes @ParthSareen has been making to
move harmony to the runner. It is likely that many other built-ins will
need to move to the runner as well, but I'm able to slightly defer that
decision since qwen3-coder doesn't have thinking (and therefore doesn't
need to be in the runner to make structured outputs work). I expect to
unify harmony with this approach very soon.

Whether a particular model supports tools or thinking was previously
inferred from templates, but without a template we now also use the
parser itself to declare what it supports. If we have future models that
re-use the same parsing format, but have different capabilities, we'll
want to parameterize them and give them different names to be specified
as a `PARSER`.
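
As a rough illustration of a parser declaring its own capabilities, here is
a minimal sketch. The `Parser` interface, its method names, and the
`parserForName` and `capabilities` helpers are all hypothetical names chosen
for this example and are not taken from the actual change; only the
`qwen3-coder` name and the fact that it supports tools but not thinking come
from the description above.

```go
package main

import "fmt"

// Parser is a hypothetical interface for a built-in parser that
// declares what it supports. The real interface may look different.
type Parser interface {
	HasToolSupport() bool
	HasThinkingSupport() bool
}

// qwen3CoderParser stands in for the built-in selected by
// PARSER "qwen3-coder". Parsing logic is omitted in this sketch.
type qwen3CoderParser struct{}

func (qwen3CoderParser) HasToolSupport() bool     { return true }
func (qwen3CoderParser) HasThinkingSupport() bool { return false }

// parserForName maps a PARSER value from a Modelfile to a built-in
// parser. The boolean reports whether the name is known.
func parserForName(name string) (Parser, bool) {
	switch name {
	case "qwen3-coder":
		return qwen3CoderParser{}, true
	default:
		return nil, false
	}
}

// capabilities collects the capabilities a parser declares, instead
// of inferring them from a template.
func capabilities(p Parser) []string {
	var caps []string
	if p.HasToolSupport() {
		caps = append(caps, "tools")
	}
	if p.HasThinkingSupport() {
		caps = append(caps, "thinking")
	}
	return caps
}

func main() {
	if p, ok := parserForName("qwen3-coder"); ok {
		fmt.Println(capabilities(p))
	}
}
```

Under this shape, a future model that reuses the same parsing format with
different capabilities would get its own name in `parserForName` and its own
declared capabilities.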

Misc changes:

- I worked on the renderer by diffing outputs from the reference
  implementation and ours. To make it easier to do this, I extended
  <https://github.com/ollama/ollama/pull/11875> to also support
  returning the prompt via the openai compat layer
2025-09-15 11:33:47 -07:00
frob 4378ae4ffa
parser: don't check the file type of safetensors to prevent false negatives. (#12176)
* Don't check the file type of safetensors to prevent false negatives.

---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2025-09-05 16:27:40 -07:00
Michael Yang 0dabb4ef6a
skip tokenizer.model if possible (#11050)
if tokenizer.json is already copied, skip tokenizer.model
2025-06-11 12:10:35 -07:00
Jeffrey Morgan fa9973cd7f
api: remove unused sampling parameters (#10581) 2025-05-08 08:31:08 -07:00
Jeffrey Morgan 3b2d2c8326
api: remove unused or unsupported api options (#10574)
Some options listed in api/types.go are not supported in
newer models, or have been deprecated in the past. This is
the first of a series of PRs to clean up the API options
2025-05-05 14:54:40 -07:00
Michael Yang d931ee8f22
create blobs in parallel (#10135)
* default max term height
* error on out of tree files
2025-05-05 11:59:26 -07:00
Michael Yang 16fca86c4a digest files in parallel 2025-04-07 09:46:31 -07:00
Bruce MacDonald 6bd0a983cd model: support for mistral-small in the ollama runner
Mistral is a popular research lab making open source models. This updates
the forward pass of llama architecture models to support both llama models
and mistral models by accounting for additional metadata present in mistral
models, and finding the correct dimensions for the output projection.
2025-04-03 16:57:36 -07:00
Parth Sareen 00ebda8cc4
Revert "parser: remove role validation from Modelfile parser" (#9917)
This reverts commit ffbfe833da.
2025-03-21 12:38:09 -07:00
rylativity ffbfe833da
parser: remove role validation from Modelfile parser (#9874)
* updates parser/parser.go to allow arbitrary roles in Modelfile MESSAGE blocks
2025-03-20 13:11:17 -07:00
Jeffrey Morgan 42cf4db601
parser: fix parsing Modelfiles with multiple FROM commands (#8449) 2025-01-16 00:14:04 -08:00
Patrick Devine 2539f2dbf9
Fix absolute path names + gguf detection (#8428) 2025-01-14 19:01:24 -08:00
Patrick Devine 32bd37adf8
make the modelfile path relative for `ollama create` (#8380) 2025-01-10 16:14:08 -08:00
Jeffrey Morgan 1deafd8254
llama: update vendored code to commit 46e3556 (#8308) 2025-01-08 11:22:01 -08:00
Patrick Devine 86a622cbdc
Update the /api/create endpoint to use JSON (#7935)
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
2024-12-31 18:02:30 -08:00
Patrick Devine 4efb98cb4f
add line numbers for parser errors (#7326) 2024-11-14 13:59:44 -08:00
Josh Yan 9bd00041fa trim all params 2024-06-27 11:18:38 -07:00
Josh Yan 4e986a823c unquote, trim space 2024-06-27 10:59:15 -07:00
Michael Yang d528e1af75 fix utf16 for multibyte runes 2024-06-13 13:07:42 -07:00
Michael Yang 20b9f8e6f4 Revert "proper utf16 support"
This reverts commit 66ab48772f.

this change broke utf-8 scanning of multi-byte runes
2024-06-13 10:22:16 -07:00
Michael Yang 66ab48772f proper utf16 support 2024-06-05 13:11:50 -07:00
Patrick Devine ccdf0b2a44
Move the parser back + handle utf16 files (#4533) 2024-05-20 11:26:45 -07:00
Michael Yang 119589fcb3 rename parser to model/file 2024-05-01 09:53:50 -07:00
Michael Yang bd8eed57fc fix parser name 2024-05-01 09:52:54 -07:00
Michael Yang 9cf0f2e973 use parser.Format instead of templating modelfile 2024-05-01 09:52:54 -07:00
Michael Yang 176ad3aa6e parser: add commands format 2024-05-01 09:52:54 -07:00
Michael Yang 4d08363580 comments 2024-05-01 09:52:54 -07:00
Michael Yang 8907bf51d2 fix multiline 2024-05-01 09:52:54 -07:00
Michael Yang abe614c705 tests 2024-05-01 09:52:54 -07:00
Michael Yang 238715037d linting 2024-05-01 09:52:54 -07:00
Michael Yang c0a00f68ae refactor modelfile parser 2024-05-01 09:52:54 -07:00
Patrick Devine 7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Daniel Hiltgen fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Michael Yang 38fe1a368b fix: trim space in modelfile fields 2023-12-05 11:57:29 -08:00
Bruce MacDonald a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang 6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang 21e6197c0b
Merge pull request #322 from jmorganca/no-comment-warning
no warning on comments
2023-08-10 16:24:41 -07:00
Michael Yang 20bf000e55 no warning on comments 2023-08-10 16:22:38 -07:00
Michael Yang 40d0c4a1dc length check for parameters 2023-08-10 16:09:02 -07:00
Michael Yang 6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Bruce MacDonald a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Michael Yang 9c7f30d31c use max scan token size to hold large objects 2023-07-28 11:43:31 -07:00
Michael Yang f5ac8ddfb4 refactor scan multiline for reuse 2023-07-27 11:30:51 -07:00
Michael Yang 24c2c77057 fix multiline string
the data needs to remove the multiline quotes but include the command:

e.g.

TEMPLATE """
my template values
"""

should be

TEMPLATE
my template values

after scanning
2023-07-25 11:51:43 -07:00
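
The multiline handling described in commit 24c2c77057 above can be sketched
roughly as follows. This is not the parser's actual code: `splitMultiline` is
a hypothetical helper written for this example, and it assumes the whole
command is already available as one string, whereas the real parser scans
its input incrementally.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitMultiline is a hypothetical helper. It separates a Modelfile
// command from its value and removes the surrounding triple quotes
// from a multiline value, keeping the command name.
func splitMultiline(s string) (string, string, error) {
	const quote = `"""`

	command, rest, ok := strings.Cut(s, " ")
	if !ok {
		return "", "", errors.New("missing value")
	}

	rest = strings.TrimSpace(rest)
	if !strings.HasPrefix(rest, quote) {
		// Single-line value: nothing to unquote.
		return command, rest, nil
	}

	// Take everything between the opening and closing triple quotes.
	body, _, ok := strings.Cut(rest[len(quote):], quote)
	if !ok {
		return "", "", errors.New("unterminated multiline string")
	}
	return command, strings.TrimSpace(body), nil
}

func main() {
	command, value, err := splitMultiline("TEMPLATE \"\"\"\nmy template values\n\"\"\"")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(command)
	fmt.Println(value)
}
```

Returning an error for a missing closing quote mirrors the concern in the
earlier parser commits about reporting unterminated multiline strings.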
Mohit Gaur f5f79049c2 Incorporate code review improvements 2023-07-25 22:52:23 +05:30
Mohit Gaur ed89da92b4 Improve command parsing and multiline string handling 2023-07-24 18:11:13 +05:30
Jeffrey Morgan d59b164fa2 add prompt back to parser 2023-07-20 01:13:30 -07:00
Jeffrey Morgan 3b135ac963 parser: fix case where multi line string termination error wouldn't show 2023-07-20 00:43:22 -07:00
Jeffrey Morgan e6bae8d916 parser: keep seeking until eof 2023-07-20 00:37:52 -07:00
Michael Yang df146c41e2 separate prompt into template and system 2023-07-19 23:24:31 -07:00