# Ollama
> Note: Ollama is in early preview. Please report any issues you find.
Run, create, and share large language models (LLMs).
## Download
- Download for macOS on Apple Silicon (Intel coming soon)
- Download for Windows and Linux (coming soon)
- Build from source
## Quickstart

To run and chat with Llama 2, the new model by Meta:

```
ollama run llama2
```
## Model library

`ollama` includes a library of open-source models:
| Model | Parameters | Size | Download |
|---|---|---|---|
| Llama2 | 7B | 3.8GB | `ollama pull llama2` |
| Llama2 13B | 13B | 7.3GB | `ollama pull llama2:13b` |
| Orca Mini | 3B | 1.9GB | `ollama pull orca` |
| Vicuna | 7B | 3.8GB | `ollama pull vicuna` |
| Nous-Hermes | 13B | 7.3GB | `ollama pull nous-hermes` |
| Wizard Vicuna Uncensored | 13B | 7.3GB | `ollama pull wizard-vicuna` |
> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
## Examples

### Run a model

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```
### Create a custom model

Pull a base model:

```
ollama pull llama2
```

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```
For more examples, see the examples directory.
### Pull a model from the registry

```
ollama pull orca
```
### Listing local models

```
ollama list
```
## Model packages

### Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a `Modelfile`.
## Building

```
go build .
```

To run it, start the server:

```
./ollama serve &
```

Finally, run a model!

```
./ollama run llama2
```
## REST API

### `POST /api/generate`

Generate text from a model.

```
curl -X POST http://localhost:11434/api/generate -d '{"model": "llama2", "prompt":"Why is the sky blue?"}'
```
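The reply streams back incrementally rather than arriving as a single document. Below is a minimal Go sketch of a client for this endpoint; the newline-delimited JSON framing and the `response` and `done` field names are assumptions for illustration, so verify them against the server source before relying on them.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// generateResponse mirrors the assumed shape of each streamed JSON object.
// Field names here are illustrative; check the server code for the real schema.
type generateResponse struct {
	Response string `json:"response"`
	Done     bool   `json:"done"`
}

func main() {
	// Build the same request body as the curl example above.
	body, err := json.Marshal(map[string]string{
		"model":  "llama2",
		"prompt": "Why is the sky blue?",
	})
	if err != nil {
		panic(err)
	}

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the assumed newline-delimited JSON stream, printing each
	// fragment of the reply as it arrives.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var gr generateResponse
		if err := json.Unmarshal(scanner.Bytes(), &gr); err != nil {
			panic(err)
		}
		fmt.Print(gr.Response)
		if gr.Done {
			break
		}
	}
	fmt.Println()
}
```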
### `POST /api/create`

Create a model from a `Modelfile`.

```
curl -X POST http://localhost:11434/api/create -d '{"name": "my-model", "path": "/path/to/modelfile"}'
```
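The same request can be issued from Go. This is a sketch under the assumption that the server streams progress as newline-delimited JSON objects with a `status` field; that field name is not documented above, so check the server implementation for the actual response shape.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Mirror the curl example: name the new model and point at a Modelfile.
	body, err := json.Marshal(map[string]string{
		"name": "my-model",
		"path": "/path/to/modelfile",
	})
	if err != nil {
		panic(err)
	}

	resp, err := http.Post("http://localhost:11434/api/create", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Assumed: progress arrives as newline-delimited JSON objects with a
	// "status" field; print each update as it streams in.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var update struct {
			Status string `json:"status"`
		}
		if err := json.Unmarshal(scanner.Bytes(), &update); err != nil {
			panic(err)
		}
		fmt.Println(update.Status)
	}
}
```

Once created, the model can be run with `ollama run my-model`, or used through `/api/generate` by passing `"model": "my-model"` in the request body.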