as it nicer categorises if there will be a future
"message_interceptors.outgoing.*" key.
We leave the advanced config file key because simple single value
settings should not require using the advanced config file.
As reported in https://groups.google.com/g/rabbitmq-users/c/x8ACs4dBlkI/
plugins that implement rabbit_channel_interceptor break with
Native MQTT in 3.12 because Native MQTT does not use rabbit_channel anymore.
Specifically, these plugins don't work anymore in 3.12 when sending a message
from an MQTT publisher to an AMQP 0.9.1 consumer.
Two of these plugins are
https://github.com/rabbitmq/rabbitmq-message-timestamp
and
https://github.com/rabbitmq/rabbitmq-routing-node-stamp
This commit moves both plugins into rabbitmq-server.
Therefore, these plugins are deprecated starting in 3.12.
Instead of using these plugins, the user gets the same behaviour by
configuring rabbitmq.conf as follows:
```
incoming_message_interceptors.set_header_timestamp.overwrite = false
incoming_message_interceptors.set_header_routing_node.overwrite = false
```
While both plugins were incompatible to be used together, this commit
allows setting both headers.
We name the top level configuration key `incoming_message_interceptors`
because only incoming messages are intercepted.
Currently, only `set_header_timestamp` and `set_header_routing_node` are
supported. (We might support more in the future.)
Both can set `overwrite` to `false` or `true`.
The meaning of `overwrite` is the same as documented in
https://github.com/rabbitmq/rabbitmq-message-timestamp#always-overwrite-timestamps
i.e. whether headers should be overwritten if they are already present
in the message.
Both `set_header_timestamp` and `set_header_routing_node` behave exactly
to plugins `rabbitmq-message-timestamp` and `rabbitmq-routing-node-stamp`,
respectively.
Upon node boot, the configuration is put into persistent_term to not
cause any performance penalty in the default case where these settings
are disabled.
The channel and MQTT connection process will intercept incoming messages
and - if configured - add the desired AMQP 0.9.1 headers.
For now, this allows using Native MQTT in 3.12 with the old plugins
behaviour.
In the future, once "message containers" are implemented,
we can think about more generic message interceptors where plugins can be
written to modify arbitrary headers or message contents for various protocols.
Likewise, in the future, once MQTT 5.0 is implemented, we can think
about an MQTT connection interceptor which could function similar to a
`rabbit_channel_interceptor` allowing to modify any MQTT packet.
PR #7390 introduces deprecated features and their lifecycle
management. One of the first applications should be kick-starting the
process of Classic Mirrored Queues deprecation. But that would be to
big of a change to be backported to any of the current releases, so
this commit introduces a simplified version of that deprecation.
To disable CMQs one need to add the following line to the config:
```
permit_deprecated_features.classic_mirrored_queues = false
```
What it does when CMQs are disabled via configuration:
- Doesn't allow to create user/operator policies that enable mirroring ("ha-mode")
- Prevent RabbitMQ startup if such a policy was previously configured
Differences from the final implementation that will be using deprecated features system:
- No warnings are issued when CMQs are not disabled, but being used
- It's not possible to set `permit_deprecated_features` option to the `true` value
This differences ensure that one only enables this feature when they
are absolutely sure what they are doing, but in a way that won't
interfere with a subsequent phased deprecation process.
CQv2 is significantly more efficient (x2-4 on some workloads),
has lower and more predictable memory footprint, and eliminates
the need to make classic queues lazy to achieve that predictability.
Per several discussions with the team.
as `raft.adaptive_failure_detector.poll_interval`.
On systems under peak load, inter-node communication link congestion
can result in false positives and trigger QQ leader re-elections that
are unnecessary and could make the situation worse.
Using a higher poll interval would at least reduce the probability of
false positives.
Per discussion with @kjnilsson @mkuratczyk.
Limits are defined in the instance config:
default_limits.vhosts.1.pattern = ^device
default_limits.vhosts.1.max_connections = 10
default_limits.vhosts.1.max_queues = 10
default_limits.vhosts.2.pattern = ^system
default_limits.vhosts.2.max_connections = 100
default_limits.vhosts.3.pattern = .*
default_limits.vhosts.3.max_connections = 20
default_limits.vhosts.3.max_queues = 20
Where pattern is a regular expression used to match limits to a newly
created vhost, and the limits are non-negative integers. First matching
set of limits is applied, only once, during vhost creation.
This is meant to be used by deployment tools,
core features and plugins
that expect a certain minimum
number of cluster nodes
to be present.
For example, certain setup steps
in distributed plugins might require
at least three nodes to be available.
This is just a hint, not an enforced
requirement. The default value is 1
so that for single node clusters,
there would be no behavior changes.
* max_message_size had an off-by-one error and unfortunate naming
* classic mirrored queue batch size was not validating the size in messages.
The limit of over 2B messages did not make much sense. 1M is a still very
high but a more reasonable upper bound
Fixes#3390
The classic local filesystem source is still supported
using the same traditional configuration key, load_definitions.
Configuration schema follows peer discovery in spirit:
* definitions.import_backend configures the mechanism to use,
which can be a module provided by a plugin
* definitions.* keys can be defined by plugins and contain any
keys a specific mechanism needs
For example, the classic local filesystem source can now be
configured like this:
``` ini
definitions.import_backend = local_filesystem
definitions.local.path = /path/to/definitions.d/definition.json
```
``` ini
definitions.import_backend = https
definitions.https.url = https://hostname/path/to/definitions.json
```
HTTPS may require additional configuration keys related to TLS/x.509
peer verification. Such extra keys will be added as the need for them
becomes evident.
References #3249
On initial cluster formation, only one node in a multi node cluster
should initialize the Mnesia database schema (i.e. form the cluster).
To ensure that for nodes starting up in parallel,
RabbitMQ peer discovery backends have used
either locks or randomized startup delays.
Locks work great: When a node holds the lock, it either starts a new
blank node (if there is no other node in the cluster), or it joins
an existing node. This makes it impossible to have two nodes forming
the cluster at the same time.
Consul and etcd peer discovery backends use locks. The lock is acquired
in the consul and etcd infrastructure, respectively.
For other peer discovery backends (classic, DNS, AWS), randomized
startup delays were used. They work good enough in most cases.
However, in https://github.com/rabbitmq/cluster-operator/issues/662 we
observed that in 1% - 10% of the cases (the more nodes or the
smaller the randomized startup delay range, the higher the chances), two
nodes decide to form the cluster. That's bad since it will end up in a
single Erlang cluster, but in two RabbitMQ clusters. Even worse, no
obvious alert got triggered or error message logged.
To solve this issue, one could increase the randomized startup delay
range from e.g. 0m - 1m to 0m - 3m. However, this makes initial cluster
formation very slow since it will take up to 3 minutes until
every node is ready. In rare cases, we still end up with two nodes
forming the cluster.
Another way to solve the problem is to name a dedicated node to be the
seed node (forming the cluster). This was explored in
https://github.com/rabbitmq/cluster-operator/pull/689 and works well.
Two minor downsides to this approach are: 1. If the seed node never
becomes available, the whole cluster won't be formed (which is okay),
and 2. it doesn't integrate with existing dynamic peer discovery backends
(e.g. K8s, AWS) since nodes are not yet known at deploy time.
In this commit, we take a better approach: We remove randomized startup
delays altogether. We replace them with locks. However, instead of
implementing our own lock implementation in an external system (e.g. in K8s),
we re-use Erlang's locking mechanism global:set_lock/3.
global:set_lock/3 has some convenient properties:
1. It accepts a list of nodes to set the lock on.
2. The nodes in that list connect to each other (i.e. create an Erlang
cluster).
3. The method is synchronous with a timeout (number of retries). It
blocks until the lock becomes available.
4. If a process that holds a lock dies, or the node goes down, the lock
held by the process is deleted.
The list of nodes passed to global:set_lock/3 corresponds to the nodes
the peer discovery backend discovers (lists).
Two special cases worth mentioning:
1. That list can be all desired nodes in the cluster
(e.g. in classic peer discovery where nodes are known at
deploy time) while only a subset of nodes is available.
In that case, global:set_lock/3 still sets the lock not
blocking until all nodes can be connected to. This is good since
nodes might start sequentially (non-parallel).
2. In dynamic peer discovery backends (e.g. K8s, AWS), this
list can be just a subset of desired nodes since nodes might not startup
in parallel. That's also not a problem as long as the following
requirement is met: "The peer disovery backend does not list two disjoint
sets of nodes (on different nodes) at the same time."
For example, in a 2-node cluster, the peer discovery backend must not
list only node 1 on node 1 and only node 2 on node 2.
Existing peer discovery backends fullfil that requirement because the
resource the nodes are discovered from is global.
For example, in K8s, once node 1 is part of the Endpoints object, it
will be returned on both node 1 and node 2.
Likewise, in AWS, once node 1 started, the described list of instances
with a specific tag will include node 1 when the AWS peer discovery backend
runs on node 1 or node 2.
Removing randomized startup delays also makes cluster formation
considerably faster (up to 1 minute faster if that was the
upper bound in the range).
The implementation depends on erlang-systemd [1] which uses Unix socket
support introduced in Erlang 19. Therefore it doesn't rely on a native
library. We also don't need special handling if the host doesn't use
journald.
To enable the journald handler, add the following configuration
variable:
log.journald = true
The log level can also be set the same way it is with other handlers:
log.journald.level = debug
The log messages are communicated to journald using structured data. It
is possible to configure which fields are transmitted and how they are
named:
log.journald.fields = SYSLOG_IDENTIFIER="rabbitmq-server" syslog_timestamp syslog_pid priority ERL_PID=pid
In this example:
* the `SYSLOG_IDENTIFIER` is set to a string literal
* `syslog_timestamp and `syslog_pid` are aliases for
SYSLOG_TIMESTAMP=time and SYSLOG_PID=os_pid
* `priority` is a special field computed from the log level
* `ERL_PID=pid` indicates `pid` should be sent as the `ERL_PID`
field.
The message itself is implicit and always sent. Otherwise, the list of
fields must be exhaustive: fields which are unset in a particular log
event meta are sent as an empty string and non-mentionned fields are not
sent. The order is not important.
Here are some messages printed by `journalctl -f` during RabbitMQ
startup:
Mar 26 11:58:31 ip-172-31-43-179 rabbitmq-server[19286]: Ready to start client connection listeners
Mar 26 11:58:31 ip-172-31-43-179 rabbitmq-server[19286]: started TCP listener on [::]:5672
Mar 26 11:58:31 ip-172-31-43-179 rabbitmq-server[19286]: Server startup complete; 0 plugins started.
[1] https://github.com/rabbitmq/erlang-systemd
In addition to the existing configuration variables to configure
logging, the following variables were added to extend the settings.
log.*.formatter = plaintext | json
Selects between the plain text (default) and JSON formatters.
log.*.formatter.time_format = rfc3339_space | rfc3339_T | epoch_usecs | epoch_secs | lager_default
Configures how the timestamp should be formatted. It has several
values to get RFC3339 date & time, Epoch-based integers and Lager
default format.
log.*.formatter.level_format = lc | uc | lc3 | uc3 | lc4 | uc4
Configures how to format the level. Things like uppercase vs.
lowercase, full vs. truncated.
Examples:
lc: debug
uc: DEBUG
lc3: dbg
uc3: DBG
lw4: dbug
uc4: DBUG
log.*.formatter.single_line = on | off
Indicates if multi-line messages should be reformatted as a
single-line message. A multi-line message is converted to a
single-line message by joining all lines and separating them
with ", ".
log.*.formatter.plaintext.format
Set to a pattern to indicate the format of the entire message. The
format pattern is a string with $-based variables. Each variable
corresponds to a field in the log event. Here is a non-exhaustive list
of common fields:
time
level
msg
pid
file
line
Example:
$time [$level] $pid $msg
log.*.formatter.json.field_map
Indicates if fields should be renamed or removed, and the ordering
which they should appear in the final JSON object. The order is set by
the order of fields in that coniguration variable.
Example:
time:ts level msg *:-
In this example, `time` is renamed to `ts`. `*:-` tells to remove all
fields not mentionned in the list. In the end the JSON object will
contain the fields in the following order: ts, level, msg.
log.*.formatter.json.verbosity_map
Indicates if a verbosity field should be added and how it should be
derived from the level. If the verbosity map is not set, no verbosity
field is added to the JSON object.
Example:
debug:2 info:1 notice:1 *:0
In this example, debug verbosity is 2, info and notice verbosity is 1,
other levels have a verbosity of 0.
All of them work with the console, exchange, file and syslog outputs.
The console output has specific variables too:
log.console.stdio = stdout | stderr
Indicates if stdout or stderr should be used. The default is stdout.
log.console.use_colors = on | off
Indicates if colors should be used in log messages. The default
depends on the environment.
log.console.color_esc_seqs.*
Indicates how each level is mapped to a color. The value can be any
string but the idea is to use an ANSI escape sequence.
Example:
log.console.color_esc_seqs.error = \033[1;31m
V2: A custom time format pattern was introduced, first using variables,
then a reference date & time (e.g. "Mon 2 Jan 2006"), thanks to
@ansd. However, we decided to remove it for now until we have a
better implementation of the reference date & time parser.
V3: The testsuite was extended to cover new settings as well as the
syslog output. To test it, a fake syslogd server was added (Erlang
process, part of the testsuite).
V4: The dependency to cuttlefish is moved to rabbitmq_prelaunch which
actually uses the library. The version is updated to 3.0.1 because
we need Kyorai/cuttlefish#25.
The configuration remains the same for the end-user. The only exception
is the log root directory: it is now set through the `log_root`
application env. variable in `rabbit`. People using the Cuttlefish-based
configuration file are not affected by this exception.
The main change is how the logging facility is configured. It now
happens in `rabbit_prelaunch_logging`. The `rabbit_lager` module is
removed.
The supported outputs remain the same: the console, text files, the
`amq.rabbitmq.log` exchange and syslog.
The message text format slightly changed: the timestamp is more precise
(now to the microsecond) and the level can be abbreviated to always be
4-character long to align all messages and improve readability. Here is
an example:
2021-03-03 10:22:30.377392+01:00 [dbug] <0.229.0> == Prelaunch DONE ==
2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>
2021-03-03 10:22:30.377860+01:00 [info] <0.229.0> Starting RabbitMQ 3.8.10+115.g071f3fb on Erlang 23.2.5
2021-03-03 10:22:30.377860+01:00 [info] <0.229.0> Licensed under the MPL 2.0. Website: https://rabbitmq.com
The example above also shows that multiline messages are supported and
each line is prepended with the same prefix (the timestamp, the level
and the Erlang process PID).
JSON is also supported as a message format and now for any outputs.
Indeed, it is possible to use it with e.g. syslog or the exchange. Here
is an example of a JSON-formatted message sent to syslog:
Mar 3 11:23:06 localhost rabbitmq-server[27908] <0.229.0> - {"time":"2021-03-03T11:23:06.998466+01:00","level":"notice","msg":"Logging: configured log handlers are now ACTIVE","meta":{"domain":"rabbitmq.prelaunch","file":"src/rabbit_prelaunch_logging.erl","gl":"<0.228.0>","line":311,"mfa":["rabbit_prelaunch_logging","configure_logger",1],"pid":"<0.229.0>"}}
For quick testing, the values accepted by the `$RABBITMQ_LOGS`
environment variables were extended:
* `-` still means stdout
* `-stderr` means stderr
* `syslog:` means syslog on localhost
* `exchange:` means logging to `amq.rabbitmq.log`
`$RABBITMQ_LOG` was also extended. It now accepts a `+json` modifier (in
addition to the existing `+color` one). With that modifier, messages are
formatted as JSON intead of plain text.
The `rabbitmqctl rotate_logs` command is deprecated. The reason is
Logger does not expose a function to force log rotation. However, it
will detect when a file was rotated by an external tool.
From a developer point of view, the old `rabbit_log*` API remains
supported, though it is now deprecated. It is implemented as regular
modules: there is no `parse_transform` involved anymore.
In the code, it is recommended to use the new Logger macros. For
instance, `?LOG_INFO(Format, Args)`. If possible, messages should be
augmented with some metadata. For instance (note the map after the
message):
?LOG_NOTICE("Logging: switching to configured handler(s); following "
"messages may not be visible in this log output",
#{domain => ?RMQLOG_DOMAIN_PRELAUNCH}),
Domains in Erlang Logger parlance are the way to categorize messages.
Some predefined domains, matching previous categories, are currently
defined in `rabbit_common/include/logging.hrl` or headers in the
relevant plugins for plugin-specific categories.
At this point, very few messages have been converted from the old
`rabbit_log*` API to the new macros. It can be done gradually when
working on a particular module or logging.
The Erlang builtin console/file handler, `logger_std_h`, has been forked
because it lacks date-based file rotation. The configuration of
date-based rotation is identical to Lager. Once the dust has settled for
this feature, the goal is to submit it upstream for inclusion in Erlang.
The forked module is calld `rabbit_logger_std_h` and is based
`logger_std_h` in Erlang 23.0.
Since the validation fails with "or isn't readable", we should actually
check whether we can read the file. This way, when configuring TLS for
example, you get early feedback if the cert files are not readable.