Commit Graph

43 Commits

Author SHA1 Message Date
Arnaud Cogoluègnes eeb35d2688
Add stream replication port range in ini-style configuration
This is more straightforward than configuring Osiris in the advanced
configuration file.
2024-07-19 16:47:59 +02:00
Michael Klishin b822be02af
Revert "cuttlefish tls schema for amqp_client" 2024-06-22 04:16:50 -04:00
Simon Unge 24fb0334ac Add schema duplicate for amqp 1.0 2024-06-21 21:43:07 -04:00
Simon Unge 145592efe9 Remove server options and move to rabbit schema 2024-06-21 21:43:07 -04:00
Alex Valiushko d7374cb244 Add consumers per channel limit 2024-03-15 19:34:04 -04:00
Simon Unge 66a3cbcc94 Use infinity as config instead of 0 2024-01-17 20:19:38 +00:00
Simon Unge d137edc23a Add channel limit per node 2024-01-17 13:54:41 -05:00
Michael Klishin 48ce1b5ec7 Improve supported information units (Mi, Gi, Ti)
This revisits the information system conversion,
that is, support for suffixes like GiB, GB.

When configuration values like disk_free_limit.absolute,
vm_memory_high_watermark.absolute are set, the value
can contain an information unit (IU) suffix.

We now support several new suffixes and change the meaning
of a few existing ones.

First, the changes:

 * k, K now mean kilobytes and not kibibytes
 * m, M now mean megabytes and not mebibytes
 * g, G now mean gigabytes and not gibibytes

This is to match the system used by Kubernetes.
There is no consensus in the industry about how
"k", "m", "g", and similar single letter suffixes
should be treated. Previously it was a power of 2,
now a power of 10 to align with a very popular OSS
project that explicitly documents what suffixes it supports.

Now, the additions:

Finally, the node will now validate these suffixes
at boot time, so an unsupported value will cause
the node to stop with a rabbitmq.conf validation
error.

The message logged will look like this:

````
2024-01-15 22:11:17.829272-05:00 [error] <0.164.0> disk_free_limit.absolute invalid, supported formats: 500MB, 500MiB, 10GB, 10GiB, 2TB, 2TiB, 10000000000
2024-01-15 22:11:17.829376-05:00 [error] <0.164.0> Error preparing configuration in phase validation:
2024-01-15 22:11:17.829387-05:00 [error] <0.164.0>   - disk_free_limit.absolute invalid, supported formats: 500MB, 500MiB, 10GB, 10GiB, 2TB, 2TiB, 10000000000
````
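
The suffix rules described in this message can be sketched as follows. This is a minimal illustration in Python, not the parser RabbitMQ uses: the function name `parse_information_unit` is hypothetical, and the exact set of accepted suffixes (including `T` and `Ki` forms) is an assumption extrapolated from the formats listed in the log message above.

```python
import re

# Assumption based on the commit message: single letters and "B" suffixes are
# powers of 10, while suffixes containing "i" are powers of 2.
_DECIMAL = {"k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}
_BINARY = {"k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}

# Digits, then an optional suffix: unit letter, optional "i", optional "B".
_PATTERN = re.compile(r"^(\d+)\s*(?:([kKmMgGtT])(i?)(B?))?$")


def parse_information_unit(value: str) -> int:
    """Convert a string such as '500MB' or '10GiB' to a number of bytes."""
    match = _PATTERN.match(value.strip())
    if match is None:
        # Mirrors the idea of failing validation at boot time.
        raise ValueError(f"unsupported information unit value: {value!r}")
    number, letter, binary_marker, _byte_suffix = match.groups()
    if letter is None:
        return int(number)  # plain byte count, e.g. "10000000000"
    table = _BINARY if binary_marker else _DECIMAL
    return int(number) * table[letter.lower()]
```

With these assumptions, `parse_information_unit("10g")` and `parse_information_unit("10GB")` both yield ten billion bytes, whereas `parse_information_unit("10GiB")` yields 10 × 2^30 bytes.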

Closes #10310
2024-01-15 22:11:57 -05:00
Michael Klishin e87a3995c5
Closes #9733 2023-10-19 11:27:14 -04:00
Jean-Sébastien Dominique 8c6ba6daca Add Classic Queue version to operator policies 2023-09-26 20:13:52 -04:00
Simon Unge e8a872ff42 Fix wrong queue-pattern type 2023-09-26 20:57:24 +00:00
Jean-Sébastien Pédron 469afafd86
Mark classic queue mirroring as deprecated
[Why]
Classic queue mirroring will be removed in RabbitMQ 4.0. Quorum queues
provide a better, safer alternative. Non-replicated classic queues remain
supported.

[How]
Classic queue mirroring is marked as deprecated in the code using the
Deprecated features subsystem (based on feature flags). See #7390 for a
description of that subsystem.

To test RabbitMQ behavior as if the feature was removed, the following
configuration setting can be used:
deprecated_features.permit.classic_queue_mirroring = false

To turn off classic queue mirroring, there must be no classic mirrored
queues declared and no HA policy defined. A node with classic mirrored
queues will refuse to start if classic queue mirroring is turned off.

Once classic queue mirroring is turned off, users will not be able to
declare HA policies. Trying to do that from the CLI or the management
API will be rejected with a warning in the logs. This impacts clustering
too: a node with classic queue mirroring turned off will only cluster
with another node which has no HA policy or has classic queue mirroring
turned off.

Note that given the marketing calendar, the deprecated feature will go
directly from "permitted by default" to "removed" in RabbitMQ 4.0. It
won't go through the gradual deprecation process.

V2: Renamed the deprecated feature from `classic_mirrored_queues` to
    `classic_queue_mirroring` to better reflect the intention. Otherwise
    it could be unclear whether only the mirroring property is
    deprecated/removed or classic queues entirely.
2023-07-06 11:02:49 +02:00
Michael Klishin 16f49d336f Add a shorthand for the OAuth 2 authN/authZ backend
References #8512
2023-06-10 00:51:00 +04:00
Jean-Sébastien Pédron ac0565287b
Deprecated features: New module to manage deprecated features (!)
This introduces a way to declare deprecated features in the code, not
only in our communication. The new module makes it possible to disallow
the use of a deprecated feature and/or warn users when they rely on such
a feature.

[Why]
Currently, we only tell people about deprecated features through blog
posts and the mailing-list. This might be insufficient for our users to
learn that a feature they use will be removed in a future version:
* They may not read our blog or mailing-list
* They may not understand that they use such a deprecated feature
* They might wait for the big removal before they plan testing
* They might not take it seriously enough

The idea behind this patch is to increase the chance that users notice
that they are using something which is about to be dropped from
RabbitMQ. Another benefit is that they should be able to test how
RabbitMQ will behave in the future before the actual removal. This
should allow them to test and plan changes.

[How]
When a feature is deprecated in other large projects (such as FreeBSD
where I took the idea from), it goes through a lifecycle:
1. The feature is still available, but users get a warning somehow when
   they use it. They can disable it to test.
2. The feature is still available, but disabled out-of-the-box. Users
   can re-enable it (and get a warning).
3. The feature is disconnected from the build. Therefore, the code
   behind it is still there, but users have to recompile the thing to be
   able to use it.
4. The feature is removed from the source code. Users have to adapt or
   they can't upgrade anymore.

The solution in this patch offers the same lifecycle. A deprecated
feature will be in one of these deprecation phases:
1. `permitted_by_default`: The feature is available. Users get a warning
   if they use it. They can disable it from the configuration.
2. `denied_by_default`: The feature is available but disabled by
   default. Users get an error if they use it and RabbitMQ behaves like
   the feature is removed. They can re-enable it from the configuration
   and get a warning.
3. `disconnected`: The feature is present in the source code, but is
   disabled and can't be re-enabled without recompiling RabbitMQ. Users
   get the same behavior as if the code was removed.
4. `removed`: The feature's code is gone.

The whole thing is based on the feature flags subsystem, but it has the
following differences with other feature flags:
* The semantic is reversed: the feature flag behind a deprecated feature
  is disabled when the deprecated feature is permitted, or enabled when
  the deprecated feature is denied.
* The feature flag behind a deprecated feature is enabled out-of-the-box
  (meaning the deprecated feature is denied):
    * if the deprecation phase is `permitted_by_default` and the
      configuration denies the deprecated feature
    * if the deprecation phase is `denied_by_default` and the
      configuration doesn't permit the deprecated feature
    * if the deprecation phase is `disconnected` or `removed`
* Feature flags behind deprecated features don't appear in feature flags
  listings.

Otherwise, deprecated features' feature flags are managed like other
feature flags, in particular inside clusters.
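
The relationship between deprecation phase, configuration and the underlying feature flag can be modelled in a few lines. The sketch below is written in Python purely for illustration; the function names are hypothetical, and it deliberately ignores clustering and any check for whether the feature is actively used.

```python
from typing import Optional

PHASES = ("permitted_by_default", "denied_by_default", "disconnected", "removed")


def is_permitted(phase: str, configured: Optional[bool] = None) -> bool:
    """Return True if a deprecated feature may still be used.

    `configured` models `deprecated_features.permit.<name>` in rabbitmq.conf:
    True, False, or None when the key is not set.
    """
    if phase not in PHASES:
        raise ValueError(f"unknown deprecation phase: {phase!r}")
    if phase == "permitted_by_default":
        return configured is not False  # permitted unless explicitly denied
    if phase == "denied_by_default":
        return configured is True  # denied unless explicitly permitted
    return False  # disconnected or removed: configuration has no effect


def feature_flag_enabled(phase: str, configured: Optional[bool] = None) -> bool:
    """The semantic is reversed: the flag is enabled when the feature is denied."""
    return not is_permitted(phase, configured)
```

For example, a feature in the `denied_by_default` phase with no configuration is not permitted, so its feature flag is enabled.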

To declare a deprecated feature:

    -rabbit_deprecated_feature(
       {my_deprecated_feature,
        #{deprecation_phase => permitted_by_default,
          msgs => #{when_permitted => "This feature will be removed in RabbitMQ X.0"}
         }}).

Then, to check the state of a deprecated feature in the code:

    case rabbit_deprecated_features:is_permitted(my_deprecated_feature) of
        true ->
            %% The deprecated feature is still permitted.
            ok;
        false ->
            %% The deprecated feature is gone or should be considered
            %% unavailable.
            error
    end.

Warnings and errors are logged automatically. A message is generated
automatically, but it is possible to define a message in the deprecated
feature flag declaration like in the example above.

Here is an example of a logged warning that was generated automatically:

    Feature `my_deprecated_feature` is deprecated.
    By default, this feature can still be used for now.
    Its use will not be permitted by default in a future minor RabbitMQ version and the feature will be removed from a future major RabbitMQ version; actual versions to be determined.
    To continue using this feature when it is not permitted by default, set the following parameter in your configuration:
        "deprecated_features.permit.my_deprecated_feature = true"
    To test RabbitMQ as if the feature was removed, set this in your configuration:
        "deprecated_features.permit.my_deprecated_feature = false"

To override the default state of `permitted_by_default` and
`denied_by_default` deprecation phases, users can set the following
configuration:

    # In rabbitmq.conf:
    deprecated_features.permit.my_deprecated_feature = true # or false

The actual behavior protected by a deprecated feature check is out of
scope for this subsystem. It is the responsibility of each deprecated
feature code to determine what to do when the deprecated feature is
denied.

V1: Deprecated feature states are initially computed during the
    initialization of the registry, based on their deprecation phase and
    possibly the configuration. They don't go through the `enable/1`
    code at all.

V2: Manage deprecated feature states as any other non-required
    feature flags. This makes it possible to execute an
    `is_feature_used()` callback to determine if a deprecated feature
    can be denied. It also makes it possible to prevent the RabbitMQ
    node from starting if it continues to use a deprecated feature.

V3: Manage deprecated feature states from the registry initialization
    again. This is required because we need to know very early if some
    of them are denied, so that an upgrade to a version of RabbitMQ
    where a deprecated feature is disconnected or removed can be
    performed.

    To still prevent the start of a RabbitMQ node when a denied
    deprecated feature is actively used, we run the `is_feature_used()`
    callback of all denied deprecated features as part of the
    `sync_cluster()` task. This task is executed as part of a feature
    flag refresh executed when RabbitMQ starts or when plugins are
    enabled. So even though a deprecated feature is marked as denied in
    the registry early in the boot process, we will still abort the
    start of a RabbitMQ node if the feature is used.

V4: Support context-dependent warnings. It is now possible to set a
    specific message when a deprecated feature is permitted, when it is
    denied and when it is removed. Generic per-context messages are
    still generated.

V5: Improve default warning messages, thanks to @pstack2021.

V6: Rename the configuration variable from `permit_deprecated_features.*`
    to `deprecated_features.permit.*`. As @michaelklishin said, we tend
    to use shorter top-level names.
2023-06-06 13:02:03 +02:00
David Ansari ddabc35191 Change rabbitmq.conf key to message_interceptors.incoming.*
as it categorises more nicely should there ever be a future
"message_interceptors.outgoing.*" key.

We leave the advanced config file key because simple single value
settings should not require using the advanced config file.
2023-05-15 10:06:01 +00:00
David Ansari 044f6e3bac Move plugin rabbitmq-message-timestamp to the core
As reported in https://groups.google.com/g/rabbitmq-users/c/x8ACs4dBlkI/
plugins that implement rabbit_channel_interceptor break with
Native MQTT in 3.12 because Native MQTT does not use rabbit_channel anymore.
Specifically, these plugins don't work anymore in 3.12 when sending a message
from an MQTT publisher to an AMQP 0.9.1 consumer.

Two of these plugins are
https://github.com/rabbitmq/rabbitmq-message-timestamp
and
https://github.com/rabbitmq/rabbitmq-routing-node-stamp

This commit moves both plugins into rabbitmq-server.
Therefore, these plugins are deprecated starting in 3.12.

Instead of using these plugins, the user gets the same behaviour by
configuring rabbitmq.conf as follows:
```
incoming_message_interceptors.set_header_timestamp.overwrite = false
incoming_message_interceptors.set_header_routing_node.overwrite = false
```

While the two plugins could not be used together, this commit
allows setting both headers.

We name the top level configuration key `incoming_message_interceptors`
because only incoming messages are intercepted.
Currently, only `set_header_timestamp` and `set_header_routing_node` are
supported. (We might support more in the future.)
Both can set `overwrite` to `false` or `true`.
The meaning of `overwrite` is the same as documented in
https://github.com/rabbitmq/rabbitmq-message-timestamp#always-overwrite-timestamps
i.e. whether headers should be overwritten if they are already present
in the message.
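
The effect of `overwrite` can be illustrated with a small sketch. It is written in Python for illustration only and is not the server's implementation; the function name and the header key `timestamp_in_ms` are assumptions made for this example.

```python
import time
from typing import Dict, Optional

# Hypothetical header key, used here for illustration only.
TIMESTAMP_HEADER = "timestamp_in_ms"


def set_header_timestamp(
    headers: Dict[str, object],
    overwrite: bool,
    now_ms: Optional[int] = None,
) -> Dict[str, object]:
    """Return a copy of `headers` with the timestamp header applied.

    An existing header is kept unless `overwrite` is True.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    result = dict(headers)  # do not mutate the caller's dict
    if overwrite or TIMESTAMP_HEADER not in result:
        result[TIMESTAMP_HEADER] = now_ms
    return result
```

With `overwrite = false`, a message that already carries the header passes through unchanged; with `overwrite = true`, the header is always replaced.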

Both `set_header_timestamp` and `set_header_routing_node` behave exactly
like plugins `rabbitmq-message-timestamp` and `rabbitmq-routing-node-stamp`,
respectively.

Upon node boot, the configuration is put into persistent_term to not
cause any performance penalty in the default case where these settings
are disabled.

The channel and MQTT connection process will intercept incoming messages
and - if configured - add the desired AMQP 0.9.1 headers.

For now, this allows using Native MQTT in 3.12 with the old plugins'
behaviour.

In the future, once "message containers" are implemented,
we can think about more generic message interceptors where plugins can be
written to modify arbitrary headers or message contents for various protocols.

Likewise, in the future, once MQTT 5.0 is implemented, we can think
about an MQTT connection interceptor which could function similarly to a
`rabbit_channel_interceptor`, making it possible to modify any MQTT packet.
2023-05-15 08:37:52 +00:00
Simon Unge d0fadf9e08 Fix so that default policy ha-mode and ha-sync-mode are converted to binary 2023-05-01 14:46:05 -07:00
Simon Unge 367b1f0a6d Add ha-sync-mode as an operator policy 2023-04-27 15:16:39 -07:00
Alex Valiushko 4c30d9a6b4 address feedback 2023-04-17 17:51:38 -07:00
Alex Valiushko 13a37f512b add config fields 2023-04-17 11:26:43 -07:00
Simon Unge b42e99acfe See #7593. Use connection_max to stop connections in rabbitmq 2023-03-28 17:07:57 -07:00
Michael Klishin 87b65c2142 permit_deprecated_features.* => deprecated_features.permit.* 2023-03-24 19:54:58 +04:00
Alex Valiushko 89582422f5 Add default_users per #7208 2023-02-24 15:41:25 -08:00
Simon Unge d66b38d333 See #7323. Rename default policy for ha-* and add option to massage key/value for aggregate_props 2023-02-22 11:46:03 -08:00
Alex Valiushko e07ed47d83 Parse and apply default_policies.operator
Example:

  default_policies.operator.policy-name.vhost_pattern = ^device
  default_policies.operator.policy-name.queue_pattern = .*
  default_policies.operator.policy-name.max_length_bytes = 1GB
  default_policies.operator.policy-name.max_length = 1000000
2022-12-16 10:25:30 -08:00
Michael Klishin 8326ec3983
Expose aten poll interval in rabbitmq.conf
as `raft.adaptive_failure_detector.poll_interval`.

On systems under peak load, inter-node communication link congestion
can result in false positives and trigger QQ leader re-elections that
are unnecessary and could make the situation worse.

Using a higher poll interval would at least reduce the probability of
false positives.

Per discussion with @kjnilsson @mkuratczyk.
2022-12-12 16:45:45 +04:00
Simon Unge 9af4567342 See #4980. Give *.absolute precedence over *.relative configuration 2022-11-30 12:44:18 -08:00
Michael Klishin 20bc656d14 Rename a couple of snippets 2022-10-20 04:26:16 +04:00
Michael Klishin edcc31ef58 Update default virtual host limit tests 2022-10-20 03:40:36 +04:00
Michael Klishin 919248293b Rename a schema key
References #6172
2022-10-20 03:08:06 +04:00
Alex Valiushko 27ebc04dc9 Add ability to set default vhost limits by pattern
Limits are defined in the instance config:

    default_limits.vhosts.1.pattern = ^device
    default_limits.vhosts.1.max_connections = 10
    default_limits.vhosts.1.max_queues = 10

    default_limits.vhosts.2.pattern = ^system
    default_limits.vhosts.2.max_connections = 100

    default_limits.vhosts.3.pattern = .*
    default_limits.vhosts.3.max_connections = 20
    default_limits.vhosts.3.max_queues = 20

Where pattern is a regular expression used to match limits to a newly
created vhost, and the limits are non-negative integers. First matching
set of limits is applied, only once, during vhost creation.
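
The first-match rule can be sketched as below. This is an illustrative Python model, not the server code: the function name `limits_for_new_vhost` is hypothetical, and the use of an unanchored regular expression search is an assumption.

```python
import re
from typing import Dict, List, Optional, Tuple

LimitSet = Tuple[str, Dict[str, int]]

# Mirrors the example configuration above, in declaration order.
DEFAULT_LIMITS: List[LimitSet] = [
    ("^device", {"max_connections": 10, "max_queues": 10}),
    ("^system", {"max_connections": 100}),
    (".*", {"max_connections": 20, "max_queues": 20}),
]


def limits_for_new_vhost(
    name: str, limit_sets: List[LimitSet]
) -> Optional[Dict[str, int]]:
    """Return the first set of limits whose pattern matches the vhost name."""
    for pattern, limits in limit_sets:
        if re.search(pattern, name):
            return dict(limits)  # later patterns are not considered
    return None  # no pattern matched: no default limits apply
```

Here a vhost named `device-42` would receive the first set of limits even though the catch-all `.*` pattern also matches it.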
2022-10-19 20:00:25 +00:00
David Ansari ceb5c72bbb Do not compute checksums for quorum queues
Make use of https://github.com/rabbitmq/ra/pull/292

The new default will be to NOT compute CRC32 for quorum queue segments
and to NOT compute Adler32 for WAL to achieve better performance.

See https://github.com/rabbitmq/ra/pull/292#pullrequestreview-1013194678
for performance improvements.
2022-07-06 13:37:50 +02:00
Michael Klishin 26f00b40db
rabbit.classic_queue_default_version => classic_queue.default_version
we do not use this prefix for any keys in rabbitmq.conf
2022-02-09 18:21:07 +03:00
Luke Bakken c352525e0c
Rename `variable_queue_default_version` to `classic_queue_default_version` 2022-01-25 11:23:23 +01:00
Luke Bakken 5da7396bf3
Add rabbit.variable_queue_default_version to the cuttlefish schema 2022-01-25 11:23:23 +01:00
Michael Klishin 8a30cf1c86
Distribution listener settings support in rabbitmq.conf
 * distribution.listener.interface
 * distribution.listener.port_range.min
 * distribution.listener.port_range.max

Closes #3739
2021-11-16 16:37:28 +03:00
Michael Klishin 686dccf410 Introduce a target cluster size hint setting
This is meant to be used by deployment tools,
core features and plugins
that expect a certain minimum
number of cluster nodes
to be present.

For example, certain setup steps
in distributed plugins might require
at least three nodes to be available.

This is just a hint, not an enforced
requirement. The default value is 1
so that for single node clusters,
there would be no behavior changes.
2021-11-03 08:42:58 +00:00
Michael Klishin 6a0058fe7c
Introduce TLS-related rabbitmq.conf settings for definition import
currently only used by the HTTPS mechanism but can be used by
any other.
2021-08-17 20:42:53 +03:00
Michael Klishin f3a5235408
Refactor definition import to allow for arbitrary sources
The classic local filesystem source is still supported
using the same traditional configuration key, load_definitions.

Configuration schema follows peer discovery in spirit:

 * definitions.import_backend configures the mechanism to use,
   which can be a module provided by a plugin
 * definitions.* keys can be defined by plugins and contain any
   keys a specific mechanism needs

For example, the classic local filesystem source can now be
configured like this:

``` ini
definitions.import_backend = local_filesystem
definitions.local.path = /path/to/definitions.d/definition.json
```

``` ini
definitions.import_backend = https
definitions.https.url = https://hostname/path/to/definitions.json
```

HTTPS may require additional configuration keys related to TLS/x.509
peer verification. Such extra keys will be added as the need for them
becomes evident.

References #3249
2021-08-14 14:53:45 +03:00
David Ansari 0876746d5f Remove randomized startup delays
On initial cluster formation, only one node in a multi node cluster
should initialize the Mnesia database schema (i.e. form the cluster).
To ensure that for nodes starting up in parallel,
RabbitMQ peer discovery backends have used
either locks or randomized startup delays.

Locks work great: When a node holds the lock, it either starts a new
blank node (if there is no other node in the cluster), or it joins
an existing node. This makes it impossible to have two nodes forming
the cluster at the same time.
Consul and etcd peer discovery backends use locks. The lock is acquired
in the consul and etcd infrastructure, respectively.

For other peer discovery backends (classic, DNS, AWS), randomized
startup delays were used. They work well enough in most cases.
However, in https://github.com/rabbitmq/cluster-operator/issues/662 we
observed that in 1% - 10% of the cases (the more nodes or the
smaller the randomized startup delay range, the higher the chances), two
nodes decide to form the cluster. That's bad since it will end up in a
single Erlang cluster, but in two RabbitMQ clusters. Even worse, no
obvious alert got triggered or error message logged.

To solve this issue, one could increase the randomized startup delay
range from e.g. 0m - 1m to 0m - 3m. However, this makes initial cluster
formation very slow since it will take up to 3 minutes until
every node is ready. In rare cases, we still end up with two nodes
forming the cluster.

Another way to solve the problem is to name a dedicated node to be the
seed node (forming the cluster). This was explored in
https://github.com/rabbitmq/cluster-operator/pull/689 and works well.
Two minor downsides to this approach are: 1. If the seed node never
becomes available, the whole cluster won't be formed (which is okay),
and 2. it doesn't integrate with existing dynamic peer discovery backends
(e.g. K8s, AWS) since nodes are not yet known at deploy time.

In this commit, we take a better approach: We remove randomized startup
delays altogether. We replace them with locks. However, instead of
implementing our own lock implementation in an external system (e.g. in K8s),
we re-use Erlang's locking mechanism global:set_lock/3.

global:set_lock/3 has some convenient properties:
1. It accepts a list of nodes to set the lock on.
2. The nodes in that list connect to each other (i.e. create an Erlang
cluster).
3. The method is synchronous with a timeout (number of retries). It
blocks until the lock becomes available.
4. If a process that holds a lock dies, or the node goes down, the lock
held by the process is deleted.
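
Why a lock rules out two nodes forming the cluster at once can be shown with a toy model. The Python sketch below is an analogy only: a `threading.Lock` stands in for global:set_lock/3, threads stand in for nodes, and all names are hypothetical. It does not model node discovery, retries or lock release on failure.

```python
import threading
from typing import List, Optional


class ClusterState:
    """Shared state that every simulated node can see."""

    def __init__(self) -> None:
        self.lock = threading.Lock()  # stand-in for the cluster-wide lock
        self.members: List[str] = []
        self.seed: Optional[str] = None


def boot_node(name: str, state: ClusterState) -> None:
    """Form the cluster if it is empty, otherwise join the existing one."""
    with state.lock:  # blocks until the lock becomes available
        if not state.members:
            state.seed = name  # only the first lock holder forms the cluster
        state.members.append(name)


def form_cluster(node_names: List[str]) -> ClusterState:
    """Boot all nodes in parallel and return the resulting cluster state."""
    state = ClusterState()
    threads = [
        threading.Thread(target=boot_node, args=(name, state))
        for name in node_names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state
```

Whichever node acquires the lock first becomes the seed; every other node observes a non-empty member list and joins, so exactly one cluster is formed regardless of start order.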

The list of nodes passed to global:set_lock/3 corresponds to the nodes
the peer discovery backend discovers (lists).

Two special cases worth mentioning:

1. That list can be all desired nodes in the cluster
(e.g. in classic peer discovery where nodes are known at
deploy time) while only a subset of nodes is available.
In that case, global:set_lock/3 still sets the lock without
blocking until all nodes can be connected to. This is good since
nodes might start sequentially (non-parallel).

2. In dynamic peer discovery backends (e.g. K8s, AWS), this
list can be just a subset of desired nodes since nodes might not start up
in parallel. That's also not a problem as long as the following
requirement is met: "The peer discovery backend does not list two disjoint
sets of nodes (on different nodes) at the same time."
For example, in a 2-node cluster, the peer discovery backend must not
list only node 1 on node 1 and only node 2 on node 2.

Existing peer discovery backends fulfil that requirement because the
resource the nodes are discovered from is global.
For example, in K8s, once node 1 is part of the Endpoints object, it
will be returned on both node 1 and node 2.
Likewise, in AWS, once node 1 started, the described list of instances
with a specific tag will include node 1 when the AWS peer discovery backend
runs on node 1 or node 2.

Removing randomized startup delays also makes cluster formation
considerably faster (up to 1 minute faster if that was the
upper bound in the range).
2021-06-03 08:01:28 +02:00
Jean-Sébastien Pédron 2f648da118
config_schema_SUITE: Stop testing log configuration
The design of the rabbit_ct_config_schema helper makes it impossible to
do pattern matching and thus handle default values in the schema. As a
consequence, the helper explicitly removes the `{rabbit, {log, _}}`
configuration key to work around this limitation until a proper solution
is implemented and all testsuites rewritten. See
rabbitmq/rabbitmq-ct-helpers@b1f1f1ce68.

Therefore, we can't test log configuration variables anymore using this
helper. That's ok because logging_SUITE already tests many things.
2021-03-30 10:21:26 +02:00
Michal Kuratczyk 6a81589c11 Expose `bypass_pem_cache` through rabbitmq.conf
Bypassing PEM cache may speed up TLS handshakes in some cases as described
here:
https://blog.heroku.com/how-we-sped-up-sni-tls-handshakes-by-5x
2020-12-17 16:53:14 +01:00
Philip Kuryloski a1fe3ab061 Change repo "root" to deps/rabbit
rabbit must not be the monorepo root application, as other applications depend on it
2020-11-13 14:34:42 +01:00