Commit Graph

66 Commits

Author SHA1 Message Date
Michael Yang 6c733bf0a6
s#x/exp/maps#maps# (#11506) 2025-07-23 13:23:32 -07:00
Michael Yang 73b642e6f3
add new gemma model (#11204)
* update patches

* cherry pick metal mean kernel

* cherry pick cuda mean kernel

* gemma3n
2025-06-25 21:47:09 -07:00
Michael Yang 0a066cfd91
Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)
* Reapply "feat: incremental gguf parser (#10822)" (#11114)

This reverts commit a6e64fbdf2.

* fix older ggufs
2025-06-20 11:11:40 -07:00
Jeffrey Morgan a6e64fbdf2
Revert "feat: incremental gguf parser (#10822)" (#11114)
This reverts commit 6b04cad7e8.
2025-06-18 05:42:44 -07:00
Michael Yang 6b04cad7e8
feat: incremental gguf parser (#10822)
* incremental gguf parser
* gguf: update test to not rely on gguf on disc
* re-use existing create gguf
* read capabilities from gguf kv
* kv exists
* update tests
* s/doneFunc/successFunc/g
* new buffered reader

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-06-12 11:04:11 -07:00
batuhankadioglu 7b68e254c2
all: update several golang.org/x packages (#10436) 2025-04-29 16:51:09 -07:00
Parth Sareen 0682dae027
sample: improve ollama engine sampler performance (#9374)
This change brings in various interface cleanups along with greatly improving the performance of the sampler.

Tested with llama3.2 on local machine.
Improves performance from ~ 70 tokens/s -> 135 tokens/s with topK(40) enabled.
Without topK performance is ~ 110 tokens/s
2025-03-07 12:37:48 -08:00
Blake Mizerany e2252d0fc6
server/internal/registry: take over pulls from server package (#9485)
This commit replaces the old pull implementation in the server package
with the new, faster, more robust pull implementation in the registry
package.

The new endpoint, and now the remove endpoint too, are behind the
feature gate "client2", enabled only by setting the OLLAMA_EXPERIMENT
environment variable to include "client2".

Currently, the progress indication is wired to perform the same as the
previous implementation to avoid making changes to the CLI, and because
the status reports happen at the start of the download, and the end of
the write to disk, the progress indication is not as smooth as it could
be. This is a known issue and will be addressed in a future change.

This implementation may be ~0.5-1.0% slower in rare cases, depending on
network and disk speed, but is generally MUCH faster and more robust
than its predecessor in all other cases.
2025-03-05 14:48:18 -08:00
Jesse Gross e185c08ad9 go.mod: Use full version for go 1.24.0
Otherwise on Linux I get:
go: download go1.24 for linux/amd64: toolchain not available
2025-02-27 13:01:32 -08:00
Blake Mizerany 2412adf42b
server/internal: replace model delete API with new registry handler. (#9347)
This commit introduces a new API implementation for handling
interactions with the registry and the local model cache. The new API is
located in server/internal/registry. The package name is "registry" and
should be considered temporary; it is hidden and not bleeding outside of
the server package. As the commits roll in, we'll start consuming more
of the API and then let reverse osmosis take effect, at which point it
will surface closer to the root level packages as much as needed.
2025-02-27 12:04:53 -08:00
Blake Mizerany 4604b10306
go.mod: bump to go1.24 (#9242) 2025-02-24 13:11:46 -08:00
Michael Yang dcfb7a105c
next build (#8539)
* add build to .dockerignore

* test: only build one arch

* add build to .gitignore

* fix ccache path

* filter amdgpu targets

* only filter if autodetecting

* Don't clobber gpu list for default runner

This ensures the GPU specific environment variables are set properly

* explicitly set CXX compiler for HIP

* Update build_windows.ps1

This isn't complete, but is close.  Dependencies are missing, and it only builds the "default" preset.

* build: add ollama subdir

* add .git to .dockerignore

* docs: update development.md

* update build_darwin.sh

* remove unused scripts

* llm: add cwd and build/lib/ollama to library paths

* default DYLD_LIBRARY_PATH to LD_LIBRARY_PATH in runner on macOS

* add additional cmake output vars for msvc

* interim edits to make server detection logic work with dll directories like lib/ollama/cuda_v12

* remove unnecessary filepath.Dir, cleanup

* add hardware-specific directory to path

* use absolute server path

* build: linux arm

* cmake install targets

* remove unused files

* ml: visit each library path once

* build: skip cpu variants on arm

* build: install cpu targets

* build: fix workflow

* shorter names

* fix rocblas install

* docs: clean up development.md

* consistent build dir removal in development.md

* silence -Wimplicit-function-declaration build warnings in ggml-cpu

* update readme

* update development readme

* llm: update library lookup logic now that there is one runner (#8587)

* tweak development.md

* update docs

* add windows cuda/rocm tests

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-01-29 15:03:38 -08:00
Michael Yang cb40d60469 chore: upgrade to gods v2
gods v2 uses go generics rather than interfaces which simplifies the
code considerably
2024-12-21 00:05:16 -08:00
Squishedmac 9ab62eb96f
update golang.org/x dependencies (#8172) 2024-12-20 09:29:30 -08:00
Blake Mizerany a37f4a86a7
go.mod: go 1.22.8 -> 1.23.4 (#8036) 2024-12-10 18:16:16 -08:00
Meng Zhuo 2ebdb54fb3
all: update math32 go mod to v1.11.0 (#6627) 2024-11-23 15:21:54 -08:00
Mikel Olasagasti Uranga 597072ef1b
readme: update google/uuid module (#7310)
update uuid.New().String() to uuid.NewString()
2024-11-21 19:37:04 -08:00
Bruce MacDonald 0679d491fe
chore(deps): bump golang.org/x dependencies (#7655)
- golang.org/x/sync v0.3.0 -> v0.9.0
- golang.org/x/image v0.14.0 -> v0.22.0
- golang.org/x/text v0.15.0 -> v0.20.0
2024-11-14 13:58:25 -08:00
Daniel Hiltgen abd5dfd06a
Bump to latest Go 1.22 patch (#7379) 2024-10-26 17:03:37 -07:00
Patrick Devine c7cb0f0602
image processing for llama3.2 (#6963)
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
2024-10-18 16:12:35 -07:00
Daniel Hiltgen feedf49c71 Go back to a pinned Go version
Go version 1.22.6 is triggering AV false positives, so go back to 1.22.5
2024-08-13 11:45:44 -07:00
Michael Yang fb6cbc02fb update named templates 2024-07-05 16:29:32 -07:00
Michael Yang 9b6c2e6eb6 detect chat template from KV 2024-06-06 16:03:47 -07:00
Michael Yang 171eb040fc simplify safetensors reading 2024-05-21 11:28:22 -07:00
Michael Yang 34d5ef29b3 fix conversion for f16 or f32 inputs 2024-05-21 11:28:22 -07:00
jmorganca 63a453554d `go mod tidy` 2024-05-19 23:03:57 -07:00
Patrick Devine 1e1634daca
update go deps (#4324) 2024-05-10 21:39:27 -07:00
Patrick Devine 9f8691c6c8
Add llama2 / torch models for `ollama create` (#3607) 2024-04-15 11:26:42 -07:00
Patrick Devine 5a5efee46b
Add gemma safetensors conversion (#3250)
Co-authored-by: Michael Yang <mxyng@pm.me>
2024-03-28 18:54:01 -07:00
Patrick Devine 1b272d5bcd
change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) 2024-03-26 13:04:17 -07:00
Patrick Devine 2c017ca441
Convert Safetensors to an Ollama model (#2824) 2024-03-06 21:01:51 -08:00
Michael Yang fc483274ad clean up go.mod 2024-02-23 16:53:36 -08:00
vinjn 66ef308abd Import "containerd/console" lib to support colorful output in Windows terminal 2024-02-15 05:56:45 +00:00
Daniel Hiltgen 29e90cc13b Implement new Go based Desktop app
This focuses on Windows first, but could be used for Mac
and possibly Linux in the future.
2024-02-15 05:56:45 +00:00
Daniel Hiltgen ecbfc0182f Go bump to v1.21 to pick up slog 2024-01-18 14:12:57 -08:00
Daniel Hiltgen 39928a42e8 Always dynamically load the llm server library
This switches darwin to dynamic loading, and refactors the code now that no
static linking of the library is used on any platform
2024-01-11 08:42:47 -08:00
Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Patrick Devine 630518f0d9
Add unit test of API routes (#1528) 2023-12-14 16:47:40 -08:00
Michael Yang 7232f1fa41 go mod tidy 2023-12-04 16:59:23 -08:00
Michael Yang 01ea6002c4 replace go-humanize with format.HumanBytes 2023-11-14 14:57:41 -08:00
Michael Yang 341fb7e35f go mod tidy 2023-11-01 11:54:25 -07:00
Patrick Devine deeac961bb
new readline library (#847) 2023-10-25 16:41:18 -07:00
Ajay Kemparaj bb8464c0d2
update golang.org/x/net fixes CVE-2023-3978,CVE-2023-39325,CVE-2023-44487 (#855) 2023-10-25 16:17:24 -07:00
Bruce MacDonald a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang 8544edca21 parallel chunked downloads 2023-10-06 12:56:43 -07:00
Patrick Devine 87d9efb364
switch to forked readline lib which doesn't wreck the repl prompt (#578) 2023-09-22 12:17:45 -07:00
Michael Yang e9f6df7dca use slices.DeleteFunc 2023-09-05 09:56:59 -07:00
Bruce MacDonald 42998d797d
subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Michael Yang d791df75dd check memory requirements before loading 2023-08-10 09:23:11 -07:00
Bruce MacDonald a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00