This check is expected to succeed and the status is expected to be
printed to stdout rather than stderr. This change silences the status
output. The status text was printed mistakenly previously because we
captured stderr rather than stdout.
This previously emitted a warning because Elixir will rebind `this_node`
by default, so the `this_node` binding in the line above was unused.
(As opposed to Erlang which would treat this as a match - rejecting
the binding if `this_node` was not equal to the value being matched.)
The node needed to be adjusted as well - `node()` returned the ExUnit
runner's node while the command returned the remote node, which is
stored in the context under `opts.node`.
`rabbit_binary_generator:map_exception/3` will crash when there are
unicode characters in the `explaination` field of `Reason#amqp_error`
parameter. The explaination string (list) is assumed to be ascii, with
each character/member in the range of a byte. Any unicode characters
in the string will trigger `badarg` crash of `list_to_binary/1` in
`rabbit_binary_generator:amqp_exception_explanation/2`.
Amqp091 shovel crash due to this is reported,
https://github.com/rabbitmq/rabbitmq-server/discussions/12874
When a queue as shovel source/destination does not exist, and its
name contains non-ascii characters, the explaination of amqp_error
will be like `no queue non_ascii_name_😍 in vhost /`. It will
subsequently crash and even affect management console.
To fix this, `unicode:characters_to_binary/1` is used instead of
`list_to_binary/1`, and unicode-safe truncation of long explaination
with `io_lib:format/3` chars_limit replaces direct bytes truncation.
The `:io.format/2` call was originally passed a single-quote string
(i.e. a charlist in Elixir terminology) which emits a warning in more
recent Elixir versions:
warning: single-quoted strings represent charlists. Use ~c"" if you indeed want a charlist or use "" instead
└─ nofile:1:12
This warning would pop up a few times when using `make dialyze` within
a deps directory. To resolve it we can switch the quoting so that the
eval string is wrapped in single quotes (equivalent for shell since this
line doesn't use variables) and the format argument is wrapped in double
quotes. This uses a binary in Elixir instead, but that's ok because
`io:format/3`'s `io:format()` parameter may either be an atom, string,
or binary.
This trick was copied from Makefile:49 which uses the same quoting.
[Why]
The test configuration was querying a network interface IP address based
on its name. However, the name, "eth0", is very specific to Linux. This
broke the test on other systems.
[How]
We still have to set an explicit `bind_addr` because Consul refuses to
start if the host has multiple private IPv4 addresses, as it is the case
in CI.
Therefore, we hard-code 127.0.0.1 as the IPv4 address to use because it has a
great chance to exist about anywhere.
[Why]
Two reasons:
1. We need to set the correct feature flags on the test node we have to
start.
2. We can skip Mnesia- or Khepri-specific tests if they are marked.
[Why]
The `run-background-broker` does not wait for the node to be ready,
leading to some transient errors in the testsuite.
[How]
The `start-background-broker` does wait.
While here, export the value of `$(MAKE)`. Otherwise, nested uses of
make(1) may use the wrong make command.
[Why]
The code assumed that the transaction would always succeed. It was kind
of the case with Mnesia because it would throw an exception if it
failed.
Khepri returns an error instead. The code has to handle it. In
particular, we see timeouts in CI and before this patch, they caused a
crash because the list comprehension was asked to work on a tuple.
[How]
We now retry a few times for 10 seconds.
[Why]
We pin a version of Horus even if we don't use it directly (it is a
dependency of Khepri). But currently, we can't update Khepri while still
needing the fix in Horus 0.3.1.
Horus 0.3.1 works around a crash in `cover` that mostly affects CI for
now.
This pinning will have to go away with the next update of Khepri.
[Why]
The `ra:member_add/3` call returns before the change is committed. This
is ok for that addition but any follow-up changes to the cluster might
be rejected with the `cluster_change_not_permitted` error.
[How]
Instead of changing other places to wait or retry their cluster
membership change, this patch waits for the current add to be applied
before proceeding and returning.
This fixes some transient failures in CI where such follow-up changes
are rejected and not retried, leaving the cluster in an unexpected state
for the testcase.
An example is with
`quorum_queue_SUITE:force_shrink_member_to_current_member/1`
This check fails on a virin node, because the metadata store
is not yet ready to handle the query. However, a virin
node by definition can't have any queues, so let's just return
false without asking.