mirror of https://github.com/ollama/ollama.git
When computing the graph size estimate, the context size is already multiplied by numParallel, so the estimate reflects parallel requests. However, sliding window models use a smaller, fixed context size, so they need to take numParallel into account manually.

| Name |
|---|
| .. |
| ggml |
| gguf |
| util/bufioutil |
| config.go |