ollama

Devon Rifkin 05ba4ca1f4 parsers: fix unicode handling for qwen3-coder	2025-09-25 15:47:46 -07:00

When trimming whitespace at the end of every chunk, we were iterating backwards over the string byte-by-byte instead of rune-by-rune.

As an example of how this can cause corruption, suppose we have the multi-byte character ✅ (`"\u2705"`), which is represented in utf-8 as the three bytes `0xE2 0x9C 0x85`. It happens that `0x85` is NEL, which passes `unicode.IsSpace()`. Because we were iterating byte-by-byte, this caused us to mistakenly slice in the middle of the rune, removing `0x85` and leaving `0xE2 0x9C`, which beyond being the incorrect place to slice, is not even a valid utf-8 character.

`trailingWhitespaceLen()` was modified to count from the end in a rune-aware way. Tests with various multibyte unicode characters were also added.

Fixes: #12414
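The failure mode described in the commit message can be reproduced with a small standalone program. This is a minimal sketch, not the code from `qwen3coder.go`: the helper names `byteWiseTrailingLen` and `runeWiseTrailingLen` are hypothetical, and the rune-aware version uses `utf8.DecodeLastRuneInString` as one plausible way to count from the end, which may differ from how `trailingWhitespaceLen()` was actually changed.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// byteWiseTrailingLen (hypothetical name) mimics the buggy approach:
// it walks backwards one byte at a time and treats each byte as a rune.
func byteWiseTrailingLen(s string) int {
	n := 0
	for i := len(s) - 1; i >= 0; i-- {
		if !unicode.IsSpace(rune(s[i])) {
			break
		}
		n++
	}
	return n
}

// runeWiseTrailingLen (hypothetical name) walks backwards one full rune
// at a time, so a continuation byte such as 0x85 is never examined alone.
func runeWiseTrailingLen(s string) int {
	n := 0
	for rest := s; len(rest) > 0; {
		r, size := utf8.DecodeLastRuneInString(rest)
		if !unicode.IsSpace(r) {
			break
		}
		n += size
		rest = rest[:len(rest)-size]
	}
	return n
}

func main() {
	s := "done \u2705" // ends with the bytes 0xE2 0x9C 0x85

	bad := s[:len(s)-byteWiseTrailingLen(s)]
	good := s[:len(s)-runeWiseTrailingLen(s)]

	// The byte-wise count is 1 because the final byte 0x85 looks like NEL.
	fmt.Println(byteWiseTrailingLen(s), utf8.ValidString(bad))
	// The rune-wise count is 0 because U+2705 is not whitespace.
	fmt.Println(runeWiseTrailingLen(s), utf8.ValidString(good))
}
```

Both helpers return a byte length, so the caller can slice with `s[:len(s)-n]` either way; only the rune-aware count is guaranteed to land on a rune boundary.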
..
parsers.go	harmony: remove special casing in routes.go	2025-09-18 14:55:59 -07:00
qwen3coder.go	parsers: fix unicode handling for qwen3-coder	2025-09-25 15:47:46 -07:00
qwen3coder_test.go	parsers: fix unicode handling for qwen3-coder	2025-09-25 15:47:46 -07:00