This release includes a new machine API `snapshot_installed/2`. This new
API will only be used indirectly through Khepri.
This release also includes a performance improvement that reduces the chances
of building a large WAL mailbox backlog when a node is low on scheduling
resources and commands are committed by followers completing writes to disk
before the leader.
There is also a fix for a potential election deadlock.
When there is nothing to do we don't need this variable
so we don't want to calculate it unnecessarily.
Because this variable is only used once, when
producing the .app file, we don't have to worry
about the calculation being done multiple times.
If we ever do then it will need to be lazily
evaluated[1] instead.
[1] Managing Projects with GNU Make, 3rd Edition Chapter 10
Execution speed differences:
make -C deps/rabbit nope 0,02s user 0,03s system 101% cpu 0,051 total
make -C deps/rabbit nope 0,02s user 0,01s system 97% cpu 0,031 total
Compressed ETS tables may introduce a small throughput penalty (low single
digit %) but can reduce peak Ra memory use by 30-50%.
Also set a default wal_max_entries value to avoid mem tables growing
too large when using very small message sizes (as more than 1M tiny
messages can easily fit into one WAL file).
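A rough back-of-the-envelope sketch of why an entry limit helps with tiny messages. The WAL file size, the per-entry size and the entry limit used below are all assumptions chosen for illustration, not values taken from Ra or RabbitMQ.

```python
# All numbers are illustrative assumptions, not Ra defaults.
WAL_FILE_BYTES = 512 * 1024 * 1024  # assumed WAL file size limit
ENTRY_BYTES = 200                   # assumed on-disk size of one tiny entry


def entries_before_rollover(wal_bytes, entry_bytes, max_entries=None):
    """Return how many entries fit before the WAL file rolls over.

    Without an entry limit only the byte size bounds the file; with a
    limit, whichever bound is reached first applies.
    """
    by_size = wal_bytes // entry_bytes
    if max_entries is None:
        return by_size
    return min(by_size, max_entries)
```

Under these assumptions a single file would hold about 2.7 million entries when bounded by size alone, while a hypothetical limit of 500,000 entries would cap it much earlier.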
Ra 2.10.1 contains a needed type spec fix.
This Ra release contains a number of fixes and improvements including:
* Much improved resiliency when Ra infrastructure such as the WAL or
segment writer encounters unexpected errors during disk operations.
It also includes the following features that RabbitMQ does not
yet make use of (but will in the near future).
* Checkpoints: allow non-truncating snapshots to be written
to allow faster recovery of quorum queues with long backlogs for example.
* Server recovery strategy configuration: allow dynamically started
ra servers to be optionally restarted.
* New handle_aux/5 callback with a better and safer API
Khepri v0.13.0 contains a fix for how projections are handled during
registration and recovery. The error returned from
`khepri:register_projection/1,2,3` has also been updated to use the
`?khepri_error(..)` helper macro.
Co-authored-by: Jean-Sébastien Pédron <jean-sebastien.pedron@dumbbell.fr>
This Ra release contains fixes for leaderboard updates as well
as a fix for a long-standing bug that meant the latest cluster may not
be recovered correctly after an unclean shutdown.
Khepri 0.10.0 replaces `khepri:wait_for_async_ret/2,3` with
`khepri:handle_async_ret/1,2`. This will be used by the child commit:
the child commit will use Khepri's async interface and handle async
write events from Ra.
Changes to the bazel build files were done automatically with gazelle:
bazel run gazelle -- update-repos --verbose \
--build_files_dir=bazel github.com/rabbitmq/khepri@v0.10.1
This includes a new ra:key_metrics/1 API that is more available
than parsing the output of sys:get_status/1.
The rabbit_quorum_queue:status/1 function has been ported to use
this API instead and now includes a few new fields.
Includes minor fixes and improvements such as:
* Don't overwrite Ra member config file in place to avoid potential
corruption scenario
* Make logging unicode compatible
* Optimisation to avoid spawning node connector process on ra member init
when nodes are already connected.
* Catch recovery failures in the Ra WAL rather than crashing hard.
We were already using Cowlib 2.12.1 and therefore were
compatible with OTP-26. This simply updates Cowboy to
the version that depends on Cowlib 2.12.1.
Returns reaching a Ra member that used to be leader but now has stepped
down would cause that follower to crash and restart.
This commit avoids this scenario as well as giving the return commands
a good chance of being resent to the new leader in a timely manner
(see the Ra release for this).
This Ra release includes improvements to Ra server GC behaviour when receiving a lot
of low priority commands with large binary payloads (e.g. quorum queue messages).
In practice, this allows quorum queues to accept large amounts of messages in a more predictable and performant manner.
This change also removes the ra_file_handle cache that was used as a bridge between Ra file operations and RabbitMQ I/O metrics. Many components in RabbitMQ, such as streams and CQv2s, do not record I/O metrics in the previous manner due to the overhead incurred for every file I/O operation. These metrics are better inspected at the OS level anyway.
This Ra release
* Improves election availability in certain mixed-version failure
scenarios
* Optimises segment reference compaction, which may become expensive
in quorum queues with very long backlogs
* Various log message improvements and level tweaks
* Better cleans up machine monitor records after quorum queue rebalancing
Since 4.10.0 was released specifically to address an issue we
encountered in RabbitMQ integration with prometheus.erl, a new test was
added to validate this functionality in the future.
This is the build error prior to these changes:
```
* rabbit_common (/home/bakkenl/development/rabbitmq/rabbitmq-server/deps/rabbit_common)
could not find an app file at "_build/dev/lib/rabbit_common/ebin/rabbit_common.app". This may happen if the dependency was not yet compiled or the dependency indeed has no app file (then you can pass app: false as option)
** (Mix) Can't continue due to errors on dependencies
```
Telling `mix` to compile `rabbit_common` ensures that the following
links are created:
```
$ ll deps/rabbitmq_cli/_build/dev/lib/rabbit_common/
total 8
drwxr-xr-x 2 bakkenl bakkenl 4096 Jan 20 09:46 .
drwxr-xr-x 10 bakkenl bakkenl 4096 Jan 20 09:46 ..
lrwxrwxrwx 1 bakkenl bakkenl 33 Jan 20 09:46 ebin -> ../../../../../rabbit_common/ebin
lrwxrwxrwx 1 bakkenl bakkenl 36 Jan 20 09:46 include -> ../../../../../rabbit_common/include
```
All these metrics, except publishers & consumers, are handled by
rabbitmq_global_metrics, so we currently have duplicates. As I started
removing these, I realised that tests were written in Java - why not
Erlang? - and they seemed way too complicated for what was needed. After
the new rabbitmq_global_metrics, we are left with 2 metrics, and all the
extra code simply doesn't justify them. I am proposing that we add them to
rabbit_global_counters as gauges. Let's discuss @dcorbacho @acogoluegnes
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
when the rabbitmq-components.mk file is used in non-monorepo
plugins. Note that with this change all deps of type git_rmq-subfolder
must target the same branches for it to behave properly.
This allows including additional applications or third party
plugins when creating a release, running the broker locally,
or just building from the top-level Makefile.
To include Looking Glass in a release, for example:
$ make package-generic-unix ADDITIONAL_PLUGINS="looking_glass"
A Docker image can then be built using this release and will
contain Looking Glass:
$ make docker-image
Beware macOS users! Applications such as Looking Glass include
NIFs. NIFs must be compiled in the right environment. If you
are building a Docker image then make sure to build the NIF
on Linux! In the two steps above, this corresponds to Step 1.
To run the broker with Looking Glass available:
$ make run-broker ADDITIONAL_PLUGINS="looking_glass"
This commit also moves Looking Glass dependency information
into rabbitmq-components.mk so it is available at all times.
The configuration remains the same for the end-user. The only exception
is the log root directory: it is now set through the `log_root`
application env. variable in `rabbit`. People using the Cuttlefish-based
configuration file are not affected by this exception.
The main change is how the logging facility is configured. It now
happens in `rabbit_prelaunch_logging`. The `rabbit_lager` module is
removed.
The supported outputs remain the same: the console, text files, the
`amq.rabbitmq.log` exchange and syslog.
The message text format slightly changed: the timestamp is more precise
(now to the microsecond) and the level can be abbreviated to always be
4 characters long to align all messages and improve readability. Here is
an example:
2021-03-03 10:22:30.377392+01:00 [dbug] <0.229.0> == Prelaunch DONE ==
2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>
2021-03-03 10:22:30.377860+01:00 [info] <0.229.0> Starting RabbitMQ 3.8.10+115.g071f3fb on Erlang 23.2.5
2021-03-03 10:22:30.377860+01:00 [info] <0.229.0> Licensed under the MPL 2.0. Website: https://rabbitmq.com
The example above also shows that multiline messages are supported and
each line is prepended with the same prefix (the timestamp, the level
and the Erlang process PID).
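The text format described above can be sketched as follows. This is a Python illustration only, not the formatter used by RabbitMQ. The abbreviation table contains just the two levels visible in the example, and the fallback of truncating other level names to four characters is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Only "dbug" and "info" appear in the example above; other levels
# fall back to their first four characters (an assumption).
LEVEL_ABBREV = {"debug": "dbug", "info": "info"}


def format_message(ts, level, pid, msg):
    """Prefix every line of msg with the timestamp, level and PID."""
    stamp = ts.isoformat(sep=" ", timespec="microseconds")
    abbrev = LEVEL_ABBREV.get(level, level[:4])
    prefix = f"{stamp} [{abbrev}] {pid}"
    # Empty lines keep the prefix but get no trailing space.
    return "\n".join(f"{prefix} {line}".rstrip() for line in msg.split("\n"))
```

Calling `format_message` with a message that contains newlines yields one prefixed line per input line, which is what keeps multiline messages aligned.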
JSON is also supported as a message format, and now for any output.
Indeed, it is possible to use it with e.g. syslog or the exchange. Here
is an example of a JSON-formatted message sent to syslog:
Mar 3 11:23:06 localhost rabbitmq-server[27908] <0.229.0> - {"time":"2021-03-03T11:23:06.998466+01:00","level":"notice","msg":"Logging: configured log handlers are now ACTIVE","meta":{"domain":"rabbitmq.prelaunch","file":"src/rabbit_prelaunch_logging.erl","gl":"<0.228.0>","line":311,"mfa":["rabbit_prelaunch_logging","configure_logger",1],"pid":"<0.229.0>"}}
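A minimal Python sketch of how a JSON payload with the same shape as the one in the syslog example could be assembled. The field names are taken from that example. The function itself is hypothetical and is not the formatter shipped with RabbitMQ.

```python
import json


def format_json(time, level, msg, meta):
    """Serialize one log event as a compact, single-line JSON object."""
    event = {"time": time, "level": level, "msg": msg, "meta": meta}
    # Compact separators keep the event on one line without extra spaces.
    return json.dumps(event, separators=(",", ":"))
```

Because the event is a single line, it can be handed to any output (a file, syslog or an exchange) without further framing.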
For quick testing, the values accepted by the `$RABBITMQ_LOGS`
environment variable were extended:
* `-` still means stdout
* `-stderr` means stderr
* `syslog:` means syslog on localhost
* `exchange:` means logging to `amq.rabbitmq.log`
`$RABBITMQ_LOG` was also extended. It now accepts a `+json` modifier (in
addition to the existing `+color` one). With that modifier, messages are
formatted as JSON instead of plain text.
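The special values listed above can be summarized with a small Python sketch. It only models the mapping described in this message. Treating any other value as a file path, and treating modifiers as comma-separated tokens, are assumptions made for illustration.

```python
def describe_logs_target(value):
    """Map a $RABBITMQ_LOGS value to the output it selects."""
    special = {
        "-": "stdout",
        "-stderr": "stderr",
        "syslog:": "syslog on localhost",
        "exchange:": "amq.rabbitmq.log exchange",
    }
    # Assumption: any other value is treated as a log file path.
    return special.get(value, f"file: {value}")


def wants_json(log_value):
    """Return True if a $RABBITMQ_LOG value carries the +json modifier."""
    # Assumption: modifiers are comma-separated tokens such as "+json".
    return "+json" in [token.strip() for token in log_value.split(",")]
```

For example, `describe_logs_target("-stderr")` returns `"stderr"`, and `wants_json("debug,+json")` returns `True` under the comma-separated assumption.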
The `rabbitmqctl rotate_logs` command is deprecated. The reason is that
Logger does not expose a function to force log rotation. However, it
will detect when a file was rotated by an external tool.
From a developer point of view, the old `rabbit_log*` API remains
supported, though it is now deprecated. It is implemented as regular
modules: there is no `parse_transform` involved anymore.
In the code, it is recommended to use the new Logger macros. For
instance, `?LOG_INFO(Format, Args)`. If possible, messages should be
augmented with some metadata. For instance (note the map after the
message):
?LOG_NOTICE("Logging: switching to configured handler(s); following "
"messages may not be visible in this log output",
#{domain => ?RMQLOG_DOMAIN_PRELAUNCH}),
Domains in Erlang Logger parlance are the way to categorize messages.
Some predefined domains, matching previous categories, are currently
defined in `rabbit_common/include/logging.hrl` or headers in the
relevant plugins for plugin-specific categories.
At this point, very few messages have been converted from the old
`rabbit_log*` API to the new macros. It can be done gradually when
working on a particular module or logging.
The Erlang builtin console/file handler, `logger_std_h`, has been forked
because it lacks date-based file rotation. The configuration of
date-based rotation is identical to Lager. Once the dust has settled for
this feature, the goal is to submit it upstream for inclusion in Erlang.
The forked module is called `rabbit_logger_std_h` and is based on
`logger_std_h` in Erlang 23.0.
Once the monorepo is built, from within it one can run `make
fetch-topic-branch-${TOPIC_BRANCH}` then `make
topic-branch-${TOPIC_BRANCH}` to rebase the commits from all the
sources back onto the monorepo.
Add GitHub Actions workflows for Erlang/OTP 22.3 & 23.0.
The workflows run tests for each component that is now part of this
repo, with test suite parallelization specifically for the rabbit
Erlang application.