Commit Graph

51 Commits

Author SHA1 Message Date
Michael Yang 0a066cfd91
Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)
* Reapply "feat: incremental gguf parser (#10822)" (#11114)

This reverts commit a6e64fbdf2.

* fix older ggufs
2025-06-20 11:11:40 -07:00
Jeffrey Morgan a6e64fbdf2
Revert "feat: incremental gguf parser (#10822)" (#11114)
This reverts commit 6b04cad7e8.
2025-06-18 05:42:44 -07:00
Michael Yang 6b04cad7e8
feat: incremental gguf parser (#10822)
* incremental gguf parser
* gguf: update test to not rely on gguf on disc
* re-use existing create gguf
* read capabilities from gguf kv
* kv exists
* update tests
* s/doneFunc/successFunc/g
* new buffered reader

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-06-12 11:04:11 -07:00
batuhankadioglu 7b68e254c2
all: update several golang.org/x packages (#10436) 2025-04-29 16:51:09 -07:00
Blake Mizerany e2252d0fc6
server/internal/registry: take over pulls from server package (#9485)
This commit replaces the old pull implementation in the server package
with the new, faster, more robust pull implementation in the registry
package.

The new endpoint, and now the remove endpoint too, are behind the
feature gate "client2" enabled only by setting the OLLAMA_EXPERIMENT
environment variable include "client2".

Currently, the progress indication is wired to perform the same as the
previous implementation to avoid making changes to the CLI, and because
the status reports happen at the start of the download, and the end of
the write to disk, the progress indication is not as smooth as it could
be. This is a known issue and will be addressed in a future change.

This implementation may be ~0.5-1.0% slower in rare cases, depending on
network and disk speed, but is generally MUCH faster and more robust
than the its predecessor in all other cases.
2025-03-05 14:48:18 -08:00
Blake Mizerany 2412adf42b
server/internal: replace model delete API with new registry handler. (#9347)
This commit introduces a new API implementation for handling
interactions with the registry and the local model cache. The new API is
located in server/internal/registry. The package name is "registry" and
should be considered temporary; it is hidden and not bleeding outside of
the server package. As the commits roll in, we'll start consuming more
of the API and then let reverse osmosis take effect, at which point it
will surface closer to the root level packages as much as needed.
2025-02-27 12:04:53 -08:00
Michael Yang dcfb7a105c
next build (#8539)
* add build to .dockerignore

* test: only build one arch

* add build to .gitignore

* fix ccache path

* filter amdgpu targets

* only filter if autodetecting

* Don't clobber gpu list for default runner

This ensures the GPU specific environment variables are set properly

* explicitly set CXX compiler for HIP

* Update build_windows.ps1

This isn't complete, but is close.  Dependencies are missing, and it only builds the "default" preset.

* build: add ollama subdir

* add .git to .dockerignore

* docs: update development.md

* update build_darwin.sh

* remove unused scripts

* llm: add cwd and build/lib/ollama to library paths

* default DYLD_LIBRARY_PATH to LD_LIBRARY_PATH in runner on macOS

* add additional cmake output vars for msvc

* interim edits to make server detection logic work with dll directories like lib/ollama/cuda_v12

* remove unncessary filepath.Dir, cleanup

* add hardware-specific directory to path

* use absolute server path

* build: linux arm

* cmake install targets

* remove unused files

* ml: visit each library path once

* build: skip cpu variants on arm

* build: install cpu targets

* build: fix workflow

* shorter names

* fix rocblas install

* docs: clean up development.md

* consistent build dir removal in development.md

* silence -Wimplicit-function-declaration build warnings in ggml-cpu

* update readme

* update development readme

* llm: update library lookup logic now that there is one runner (#8587)

* tweak development.md

* update docs

* add windows cuda/rocm tests

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-01-29 15:03:38 -08:00
Michael Yang cb40d60469 chore: upgrade to gods v2
gods v2 uses go generics rather than interfaces which simplifies the
code considerably
2024-12-21 00:05:16 -08:00
Squishedmac 9ab62eb96f
update golang.org/x dependencies (#8172) 2024-12-20 09:29:30 -08:00
Meng Zhuo 2ebdb54fb3
all: update math32 go mod to v1.11.0 (#6627) 2024-11-23 15:21:54 -08:00
Mikel Olasagasti Uranga 597072ef1b
readme: update google/uuid module (#7310)
update uuid.New().String() to uuid.NewString()
2024-11-21 19:37:04 -08:00
Bruce MacDonald 0679d491fe
chore(deps): bump golang.org/x dependencies (#7655)
- golang.org/x/sync v0.3.0 -> v0.9.0
- golang.org/x/image v0.14.0 -> v0.22.0
- golang.org/x/text v0.15.0 -> v0.20.0
2024-11-14 13:58:25 -08:00
Patrick Devine c7cb0f0602
image processing for llama3.2 (#6963)
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
2024-10-18 16:12:35 -07:00
Michael Yang 9b6c2e6eb6 detect chat template from KV 2024-06-06 16:03:47 -07:00
Michael Yang 171eb040fc simplify safetensors reading 2024-05-21 11:28:22 -07:00
Patrick Devine 1e1634daca
update go deps (#4324) 2024-05-10 21:39:27 -07:00
Patrick Devine 9f8691c6c8
Add llama2 / torch models for `ollama create` (#3607) 2024-04-15 11:26:42 -07:00
Patrick Devine 2c017ca441
Convert Safetensors to an Ollama model (#2824) 2024-03-06 21:01:51 -08:00
Michael Yang fc483274ad clean up go.mod 2024-02-23 16:53:36 -08:00
vinjn 66ef308abd Import "containerd/console" lib to support colorful output in Windows terminal 2024-02-15 05:56:45 +00:00
Daniel Hiltgen 29e90cc13b Implement new Go based Desktop app
This focuses on Windows first, but coudl be used for Mac
and possibly linux in the future.
2024-02-15 05:56:45 +00:00
Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Michael Yang 7232f1fa41 go mod tidy 2023-12-04 16:59:23 -08:00
Michael Yang 01ea6002c4 replace go-humanize with format.HumanBytes 2023-11-14 14:57:41 -08:00
Michael Yang 341fb7e35f go mod tidy 2023-11-01 11:54:25 -07:00
Patrick Devine deeac961bb
new readline library (#847) 2023-10-25 16:41:18 -07:00
Ajay Kemparaj bb8464c0d2
update golang.org/x/net fixes CVE-2023-3978,CVE-2023-39325,CVE-2023-44487 (#855) 2023-10-25 16:17:24 -07:00
Bruce MacDonald a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang 8544edca21 parallel chunked downloads 2023-10-06 12:56:43 -07:00
Patrick Devine c928ceb927
add word wrapping for lines which are longer than the terminal width (#553) 2023-09-22 13:36:08 -07:00
Patrick Devine 87d9efb364
switch to forked readline lib which doesn't wreck the repl prompt (#578) 2023-09-22 12:17:45 -07:00
Michael Yang e9f6df7dca use slices.DeleteFunc 2023-09-05 09:56:59 -07:00
Bruce MacDonald 42998d797d
subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Michael Yang d791df75dd check memory requirements before loading 2023-08-10 09:23:11 -07:00
Bruce MacDonald a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Bruce MacDonald 1c5a8770ee read runner parameter options from map
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
Bruce MacDonald daa0d1de7a allow specifying zero values in modelfile 2023-08-01 13:37:50 -04:00
Michael Yang 8609db77ea use gin-contrib/cors middleware 2023-07-22 09:39:08 -07:00
Patrick Devine e4d7f3e287
vendor in progress bar and change to bytes instead of bibytes (#130) 2023-07-19 17:24:03 -07:00
Michael Yang 84200dcde6 use readline 2023-07-19 13:34:56 -07:00
Patrick Devine 5bea29f610
add new list command (#97) 2023-07-18 09:09:45 -07:00
Michael Yang 28a136e9a3 modelfile params 2023-07-17 12:35:03 -07:00
Michael Yang a806b03f62 no errgroup 2023-07-11 14:58:10 -07:00
Michael Yang fd4792ec56 call llama.cpp directly from go 2023-07-11 11:59:18 -07:00
Michael Yang c4b9e84945 progress 2023-07-06 17:07:40 -07:00
Michael Yang 3d6009aae3 run prompts 2023-07-06 17:07:40 -07:00
Bruce MacDonald 7cf5905063 display pull progress 2023-07-06 16:34:44 -04:00
Michael Yang 68e6b4550c use prompt templates 2023-07-06 16:34:44 -04:00
Jeffrey Morgan fd962a36e5 client updates 2023-07-06 16:34:44 -04:00
Jeffrey Morgan 6093a88c1a add llama.cpp go bindings 2023-07-06 16:34:44 -04:00