Discovered by @dumbbell

Ensure externally read strings are saved as utf-8 encoded binaries. This is necessary since `cmd.exe` on Windows uses ISO-8859-1 encoding and directories can have latin1 characters, like `RabbitMQ Sérvér`.

The `é` is represented by decimal `233` in the ISO-8859-1 encoding. The unicode code point is the same decimal value, `233`, so you will see this in the charlist data. However, when encoded using utf-8, this becomes the two-byte sequence `C3 A9` (hexadecimal).

Strings read from env variables and configuration are unicode charlists, with each list item representing a unicode code point. All of Erlang's string functions can handle strings in this form. Once these strings are written to ETS or Mnesia, they are converted to utf-8 encoded binaries. Prior to these changes, only `list_to_binary/1` was used, which writes each code point below 256 as a single byte instead of encoding it as utf-8.

- Fix xref error
- re:replace requires an iodata, which is not a list of unicode code points
- Correctly parse unicode vhost tags
- Fix many format strings to account for utf8 input
- Try again to fix unicode vhost tags
- More format string fixes, try to get the CONFIG_FILE var correct
- Be sure to use the `unicode` option for re:replace when necessary
- More unicode format strings, add unicode option to re:split
- More format strings updated
- Change ~s to ~ts for vhost format strings
- Change ~s to ~ts for more vhost format strings
- Change ~s to ~ts for more vhost format strings
- Add unicode format chars to disk monitor
- Quote the directory on unix
- Finally figure out the correct way to pass unicode to the port
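
The difference between the two conversions described in the commit message can be sketched as follows. This is a minimal illustration, not code from the commit: the module name `unicode_sketch` and the function `demo/0` are hypothetical, and only standard OTP calls (`list_to_binary/1`, `unicode:characters_to_binary/1`, `io_lib:format/2`, `unicode:characters_to_list/1`) are used.

```erlang
-module(unicode_sketch).
-export([demo/0]).

%% Hypothetical helper for illustration only; not part of the commit.
demo() ->
    %% "RabbitMQ Sérvér" as a unicode charlist.
    %% \x{E9} is code point 233 (é), written as an escape so the
    %% example does not depend on the source file encoding.
    Dir = "RabbitMQ S\x{E9}rv\x{E9}r",

    %% list_to_binary/1 writes each code point below 256 as one byte,
    %% so é becomes the single byte 233 (ISO-8859-1, not valid utf-8).
    Latin1 = list_to_binary(Dir),

    %% unicode:characters_to_binary/1 encodes the charlist as utf-8,
    %% so é becomes the two bytes 195,169 (hex C3 A9).
    Utf8 = unicode:characters_to_binary(Dir),

    %% ~ts treats a binary argument as utf-8; plain ~s would treat
    %% each byte as a separate latin1 character.
    Line = unicode:characters_to_list(io_lib:format("dir: ~ts", [Utf8])),

    {Latin1, Utf8, Line}.
```

Under these assumptions, the latin1 binary is 15 bytes long and the utf-8 binary is 17 bytes long, because each of the two `é` characters takes one extra byte in utf-8.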
||
|---|---|---|
| src | ||
| test | ||
| .gitignore | ||
| BUILD.bazel | ||
| Makefile | ||