3395 Commits

Author SHA1 Message Date
Thuan Duong Ba 3aeeed5f57
Support rabbit_peer_discovery_aws to work with instance metadata service v2 (IMDSv2).
IMDSv2 uses session-oriented requests. With session-oriented requests, a session token is retrieved first
then used in subsequent GET requests for instance metadata values such as instance-id, credentials, etc.

Details can be found here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html
2021-04-08 12:28:58 +03:00
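
The session-oriented flow described in the commit above can be sketched as follows. This is a minimal illustration in Python, not the plugin's Erlang code: the helper names `http_request` and `fetch_instance_id` are hypothetical, while the token endpoint and the two header names are the ones documented by AWS for IMDSv2.

```python
import urllib.request

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254/latest"


def http_request(method, url, headers):
    """Perform one HTTP request and return the response body as text."""
    req = urllib.request.Request(url, method=method, headers=headers)
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.read().decode("utf-8")


def fetch_instance_id(request=http_request, ttl_seconds=60):
    """Fetch the instance ID using the IMDSv2 session-oriented flow.

    `request` is injectable so the flow can be exercised off EC2.
    """
    # Step 1: obtain a session token with a PUT request.
    token = request(
        "PUT",
        IMDS_BASE + "/api/token",
        {"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    # Step 2: present the token on every subsequent metadata GET.
    return request(
        "GET",
        IMDS_BASE + "/meta-data/instance-id",
        {"X-aws-ec2-metadata-token": token},
    )
```

The same token would be reused for other metadata paths, such as credentials, until its TTL expires.
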
kjnilsson e2fd14b996 Bump timeouts for peer discovery suite 2021-04-07 10:00:07 +01:00
kjnilsson b576242952 Increase rabbit_stream_queue_SUITE timetrap
And set the default of make start-cluster to 3 nodes.
2021-04-06 15:50:22 +01:00
Loïc Hoguin 3cab7d59a6
Add new configuration variable to BUILD.bazel 2021-04-06 12:25:36 +02:00
Loïc Hoguin 9063bcbd5c
Add queue_index_segment_entry_count configuration
The default value of ?SEGMENT_ENTRY_COUNT is 16384. Due to
the index file format the entire segment file has to be loaded
into memory whenever messages from this segment must be accessed.

This can result in hundreds of kilobytes that are read, processed
and converted to Erlang terms. This creates a lot of garbage in
memory, and this garbage unfortunately ends up in the old heap
and so cannot be reclaimed before a full GC. Even when forcing
a full GC every run (fullsweep_after=0) the process ends up
allocating a lot of memory to read segment files and this can
create issues in some scenarios.

While this is not a problem when there are only a small number
of queues, this becomes a showstopper when the number of queues
is larger (hundreds or thousands of queues). This only
applies to classic/lazy queues.

This commit allows configuring the segment file entry count
so that the size of the file can be greatly reduced, as well
as the memory footprint when reading from the file becomes
necessary.

Experiments using a segment entry count of 1024 show no
noticeable downside other than the natural increase in the
number of segment files.

The segment entry count can only be set on nodes that have
no messages in their queue index. This is because the index
uses this value to calculate which segment file specific
messages are stored in.
2021-04-06 12:25:36 +02:00
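
To illustrate why the entry count cannot change while the index still holds messages, here is a small Python sketch. It assumes the index locates a message by integer division and remainder of its sequence ID; the function name `locate` is hypothetical.

```python
DEFAULT_SEGMENT_ENTRY_COUNT = 16384


def locate(seq_id, segment_entry_count=DEFAULT_SEGMENT_ENTRY_COUNT):
    """Return (segment_number, offset_within_segment) for a sequence ID."""
    return divmod(seq_id, segment_entry_count)


print(locate(20000))        # (1, 3616) with the default of 16384
print(locate(20000, 1024))  # (19, 544) with an entry count of 1024
```

The same sequence ID resolves to a different segment file under each setting, so an index written with one value cannot be read back with another.
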
Jean-Sébastien Pédron 95f9e92caa
unit_log_management_SUITE: Use $RABBITMQ_LOGS to configure logging
Now that the Cuttlefish schema sets default values for the application
environment in `{rabbit, [{log, ...}]}`, the values set in the testsuite
using application:setenv() are overwritten.

By using the $RABBITMQ_LOGS environment variable, we can override those
default values.
2021-04-06 11:52:55 +02:00
Michael Klishin 34b03d6728
Merge pull request #2940 from rabbitmq/support-journald-logging
Logging: Add journald support
2021-04-01 06:58:11 +03:00
Philip Kuryloski 64f6c18cb8 Add the rabbitmq_auth_backend_oauth2 suite
requires recent @bazel-erlang updates
2021-03-31 19:11:32 +02:00
Philip Kuryloski e94649a398 Mark the eager_sync_SUITE/eager_sync test case as flaky 2021-03-31 19:09:46 +02:00
Philip Kuryloski 0caeb65d04 Shard the eager_sync_SUITE by case
This suite contains only one group, but is long enough to warrant
sharding. This is probably a bit of a time penalty in absolute terms
because init_per_suite and init_per_group re-run in each shard.
2021-03-31 15:47:36 +02:00
Jean-Sébastien Pédron 91583a0c0e
Logging: Add journald support
The implementation depends on erlang-systemd [1] which uses Unix socket
support introduced in Erlang 19. Therefore it doesn't rely on a native
library. We also don't need special handling if the host doesn't use
journald.

To enable the journald handler, add the following configuration
variable:

    log.journald = true

The log level can also be set the same way it is with other handlers:

    log.journald.level = debug

The log messages are communicated to journald using structured data. It
is possible to configure which fields are transmitted and how they are
named:

    log.journald.fields = SYSLOG_IDENTIFIER="rabbitmq-server" syslog_timestamp syslog_pid priority ERL_PID=pid

In this example:
  * the `SYSLOG_IDENTIFIER` is set to a string literal
  * `syslog_timestamp` and `syslog_pid` are aliases for
    SYSLOG_TIMESTAMP=time and SYSLOG_PID=os_pid
  * `priority` is a special field computed from the log level
  * `ERL_PID=pid` indicates `pid` should be sent as the `ERL_PID`
    field.

The message itself is implicit and always sent. Otherwise, the list of
fields must be exhaustive: fields which are unset in a particular log
event meta are sent as an empty string and fields not mentioned are not
sent. The order is not important.

Here are some messages printed by `journalctl -f` during RabbitMQ
startup:

    Mar 26 11:58:31 ip-172-31-43-179 rabbitmq-server[19286]: Ready to start client connection listeners
    Mar 26 11:58:31 ip-172-31-43-179 rabbitmq-server[19286]: started TCP listener on [::]:5672
    Mar 26 11:58:31 ip-172-31-43-179 rabbitmq-server[19286]: Server startup complete; 0 plugins started.

[1] https://github.com/rabbitmq/erlang-systemd
2021-03-31 14:14:35 +02:00
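
The field rules listed in the commit above can be sketched as a small parser. This is a hedged Python illustration, not the handler's actual code: `parse_fields` and `build_fields` are hypothetical names, the journald field name `PRIORITY` is an assumption, the numeric priority is supplied by the caller, and quoted literals are assumed to contain no spaces.

```python
# Aliases named in the commit message: bare token -> (journald field, meta key).
ALIASES = {
    "syslog_timestamp": ("SYSLOG_TIMESTAMP", "time"),
    "syslog_pid": ("SYSLOG_PID", "os_pid"),
}


def parse_fields(spec):
    """Parse a `log.journald.fields` value into (name, kind, value) triples."""
    parsed = []
    for token in spec.split():
        if token in ALIASES:
            name, key = ALIASES[token]
            parsed.append((name, "meta", key))
        elif token == "priority":
            # Special field computed from the log level.
            parsed.append(("PRIORITY", "priority", None))
        else:
            name, _, value = token.partition("=")
            if value.startswith('"') and value.endswith('"'):
                parsed.append((name, "literal", value[1:-1]))
            else:
                parsed.append((name, "meta", value))
    return parsed


def build_fields(parsed, meta, priority):
    """Build the structured fields for one log event."""
    out = {}
    for name, kind, value in parsed:
        if kind == "literal":
            out[name] = value
        elif kind == "priority":
            out[name] = str(priority)
        else:
            # Fields unset in the event meta are sent as an empty string.
            out[name] = str(meta.get(value, ""))
    return out
```

Meta keys that the specification does not mention never reach `build_fields`' output, which matches the "must be exhaustive" rule.
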
Jean-Sébastien Pédron 571b97513f
Logging: Allow to set timezone in rfc3339- and format-string-based time formats
This is not exposed to the end user (yet) through the Cuttlefish
configuration. But this is required to make logging_SUITE timezone
agnostic (i.e. the timezone of the host running the testsuite should not
affect the formatted times).
2021-03-31 14:13:40 +02:00
Loïc Hoguin e128c6f78a
Merge pull request #2942 from carlhoerberg/proxy-protocol-dest
Get destination address from PROXY protocol
2021-03-31 13:20:54 +02:00
Philip Kuryloski 128785b863 Adopt enhancements from @bazel-erlang
- no need to test version of ra since +debug_info is now default
- adjust the way default erlc_opts are handled
2021-03-31 10:06:56 +02:00
Carl Hörberg 330b820a0f Update proxy protocol test cases 2021-03-30 16:55:36 +02:00
Philip Kuryloski 7b6af92adf Update rabbit PROJECT_ENV bazel equivalent
with the latest value from the Makefile
2021-03-30 10:44:55 +02:00
Jean-Sébastien Pédron 2f648da118
config_schema_SUITE: Stop testing log configuration
The design of the rabbit_ct_config_schema helper makes it impossible to
do pattern matching and thus handle default values in the schema. As a
consequence, the helper explicitly removes the `{rabbit, {log, _}}`
configuration key to work around this limitation until a proper solution
is implemented and all testsuites rewritten. See
rabbitmq/rabbitmq-ct-helpers@b1f1f1ce68.

Therefore, we can't test log configuration variables anymore using this
helper. That's OK because logging_SUITE already tests many things.
2021-03-30 10:21:26 +02:00
Jean-Sébastien Pédron aca638abbb
Logging: Add configuration variables to set various formats
In addition to the existing configuration variables to configure
logging, the following variables were added to extend the settings.

log.*.formatter = plaintext | json
  Selects between the plain text (default) and JSON formatters.

log.*.formatter.time_format = rfc3339_space | rfc3339_T | epoch_usecs | epoch_secs | lager_default
  Configures how the timestamp should be formatted. It has several
  values to get RFC3339 date & time, Epoch-based integers and Lager
  default format.

log.*.formatter.level_format = lc | uc | lc3 | uc3 | lc4 | uc4
  Configures how to format the level. Things like uppercase vs.
  lowercase, full vs. truncated.
  Examples:
    lc: debug
    uc: DEBUG
    lc3: dbg
    uc3: DBG
    lc4: dbug
    uc4: DBUG

log.*.formatter.single_line = on | off
  Indicates if multi-line messages should be reformatted as a
  single-line message. A multi-line message is converted to a
  single-line message by joining all lines and separating them
  with ", ".

log.*.formatter.plaintext.format
  Set to a pattern to indicate the format of the entire message. The
  format pattern is a string with $-based variables. Each variable
  corresponds to a field in the log event. Here is a non-exhaustive list
  of common fields:
    time
    level
    msg
    pid
    file
    line
  Example:
    $time [$level] $pid $msg

log.*.formatter.json.field_map
  Indicates if fields should be renamed or removed, and the order in
  which they should appear in the final JSON object. The order is set by
  the order of fields in that configuration variable.
  Example:
    time:ts level msg *:-
  In this example, `time` is renamed to `ts`. `*:-` removes all
  fields not mentioned in the list. In the end the JSON object will
  contain the fields in the following order: ts, level, msg.

log.*.formatter.json.verbosity_map
  Indicates if a verbosity field should be added and how it should be
  derived from the level. If the verbosity map is not set, no verbosity
  field is added to the JSON object.
  Example:
    debug:2 info:1 notice:1 *:0
  In this example, debug verbosity is 2, info and notice verbosity is 1,
  other levels have a verbosity of 0.

All of them work with the console, exchange, file and syslog outputs.

The console output has specific variables too:

log.console.stdio = stdout | stderr
  Indicates if stdout or stderr should be used. The default is stdout.

log.console.use_colors = on | off
  Indicates if colors should be used in log messages. The default
  depends on the environment.

log.console.color_esc_seqs.*
  Indicates how each level is mapped to a color. The value can be any
  string but the idea is to use an ANSI escape sequence.
  Example:
    log.console.color_esc_seqs.error = \033[1;31m

V2: A custom time format pattern was introduced, first using variables,
    then a reference date & time (e.g. "Mon 2 Jan 2006"), thanks to
    @ansd. However, we decided to remove it for now until we have a
    better implementation of the reference date & time parser.

V3: The testsuite was extended to cover new settings as well as the
    syslog output. To test it, a fake syslogd server was added (Erlang
    process, part of the testsuite).

V4: The dependency to cuttlefish is moved to rabbitmq_prelaunch which
    actually uses the library. The version is updated to 3.0.1 because
    we need Kyorai/cuttlefish#25.
2021-03-29 17:39:50 +02:00
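
The `field_map` and `verbosity_map` rules from the commit above can be sketched in Python as follows. The function names are hypothetical, and one behaviour is an assumption not stated in the commit message: without a `*:-` entry, unlisted fields are kept after the listed ones.

```python
def apply_field_map(event, field_map):
    """Rename, remove and order the fields of a log event (a dict)."""
    renames = {}       # source field -> output name, in declared order
    removed = set()    # fields removed explicitly with `field:-`
    drop_rest = False  # set by the `*:-` wildcard
    for token in field_map.split():
        src, sep, dst = token.partition(":")
        if src == "*" and dst == "-":
            drop_rest = True
        elif dst == "-":
            removed.add(src)
        else:
            # A bare field name keeps its own name.
            renames[src] = dst if sep else src
    out = {}
    for src, dst in renames.items():
        if src in event:
            out[dst] = event[src]
    if not drop_rest:
        # Assumption: unlisted fields follow the listed ones.
        for key, value in event.items():
            if key not in renames and key not in removed:
                out[key] = value
    return out


def verbosity_for(level, verbosity_map):
    """Derive the verbosity of a level; `*` is the fallback entry."""
    mapping = dict(token.split(":") for token in verbosity_map.split())
    value = mapping.get(level, mapping.get("*"))
    return None if value is None else int(value)
```

With the examples from the commit message, `apply_field_map(event, "time:ts level msg *:-")` yields the keys `ts`, `level`, `msg` in that order, and `verbosity_for("notice", "debug:2 info:1 notice:1 *:0")` yields 1.
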
kjnilsson ca1afe5223 type fix in stream coordinator 2021-03-29 15:23:49 +01:00
Philip Kuryloski 388654c542
Add a partial Bazel build (#2938)
Adds WORKSPACE.bazel, BUILD.bazel & *.bzl files for partial build & test with Bazel. Introduces a build-time dependency on https://github.com/rabbitmq/bazel-erlang
2021-03-29 11:01:43 +02:00
dcorbacho e98b343095 Fix variable match 2021-03-26 17:55:07 +01:00
Philip Kuryloski 09e85d2e3d
Merge pull request #2935 from rabbitmq/rabbitmq-queue-int-tests
Fix integration tests to wait until ra cluster is ready
2021-03-26 17:28:06 +01:00
dcorbacho a1caff2a86 Fix integration tests to wait until ra cluster is ready
Publish/confirm before grow/shrink members is enough
2021-03-26 17:04:50 +01:00
Philip Kuryloski 1ead01081a Increase startup delay range in peer_discovery_classic_config_SUITE
I suspect the second ra system for coordination requires a bit more
time in boot, as this seems to flake more often since the merge
2021-03-26 14:11:36 +01:00
Michael Klishin 2eac4debbf
rabbitmq.conf.example: mention pause_minority 2021-03-25 22:10:11 +03:00
Philip Kuryloski 3c0c0901b1 Restore retry in peer_discovery_classic_config_SUITE
It was accidentally left commented out
2021-03-25 20:05:36 +01:00
Philip Kuryloski c313f36b57 Fix Makefile for feature_flags_SUITE_data/my_plugin
It was not updated for the rabbitmq-components.mk consolidation
2021-03-25 19:43:48 +01:00
Jean-Sébastien Pédron f3f5606f22
rabbit_prelaunch_errors: Handle exception stacktraces with args list
... instead of function arity.

I don't know when this was introduced, perhaps Erlang 23. Anyway, each
stacktrace entry now has the form:

    {Mod, Fun, ArgListOrArity, Props}

where `ArgListOrArity` is either the function arity (an integer) or the
actual list of arguments passed to the function.

If we get the latter, we format the stacktrace line as usual and we add
the list of arguments to the line below it.
2021-03-25 14:49:25 +01:00
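
A short Python sketch of the formatting rule described in the commit above. The entry is modelled as a tuple `(mod, fun, args_or_arity, props)` with `props` as a dict; the function name and the exact output layout are hypothetical, chosen only to show the arity-versus-argument-list branch.

```python
def format_stacktrace_entry(entry):
    """Format one stacktrace entry as a list of output lines."""
    mod, fun, args_or_arity, props = entry
    if isinstance(args_or_arity, int):
        # Usual case: the third element is the function arity.
        arity, args = args_or_arity, None
    else:
        # Otherwise it is the actual argument list; derive the arity from it.
        args = list(args_or_arity)
        arity = len(args)
    location = ""
    if "file" in props and "line" in props:
        location = ", {}:{}".format(props["file"], props["line"])
    lines = ["{}:{}/{}{}".format(mod, fun, arity, location)]
    if args is not None:
        # The argument list goes on the line below the usual one.
        lines.append("    args: {!r}".format(args))
    return lines
```
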
Philip Kuryloski 008e47ef3c Fixup the behavior of rabbit_mnesia:is_virgin_node/0
Given the addition of the Coord ra system (and additional files on disk)
2021-03-25 10:49:17 +01:00
Michael Klishin 647b2ad453
Revisit what drain and revive do when their feature flag is not enabled
If the maintenance mode feature flag is not enabled, drain and revive should return an error
2021-03-24 23:11:59 +03:00
kjnilsson 8d8b67bb34 fix rabbit_fifo_int_SUITE 2021-03-24 14:17:34 +00:00
Michael Klishin 8eac876bc8
Use "quorum_queues" for QQ Ra system
"quorum" and "coordination" are not very distinctive
2021-03-22 21:44:19 +03:00
kjnilsson 75cea78415
fixes 2021-03-22 21:44:19 +03:00
kjnilsson f6f02a5d2d
ra systems wip 2021-03-22 21:44:15 +03:00
Michael Klishin 246f50598b
Stacktrace arity can be an argument list in some cases
According to [1]. What even are types.

1. https://erlang.org/doc/reference_manual/errors.html
2021-03-22 21:08:31 +03:00
Philip Kuryloski a63f169fcb Remove duplicate rabbitmq-components.mk and erlang.mk files
Also adjust the references in rabbitmq-components.mk to account for
post monorepo locations
2021-03-22 15:40:19 +01:00
dcorbacho 9c1766df43 Rename policies to unsupported_policies in capabilities/0 2021-03-22 11:06:10 +01:00
Michael Klishin 373285093e
Merge pull request #2899 from rabbitmq/parallel-stream-suite
Run most stream tests in parallel
2021-03-19 22:21:18 +03:00
Michael Klishin 5a6c288395
Merge pull request #2902 from rabbitmq/unsupported-stream-rebalance
Filter out stream queues from rebalance command
2021-03-19 22:00:09 +03:00
Michael Klishin 5e0d7041cd
Merge pull request #2910 from rabbitmq/configure-num-conns-sup
Make ranch parameter `num_conns_sups` configurable
2021-03-19 21:59:30 +03:00
Michael Klishin 68bec4c945
Ranch max connection is per connection supervisor in Ranch 2.0 2021-03-19 21:54:45 +03:00
Jean-Sébastien Pédron 9fd2d68e7a
rabbit_prelaunch_logging: $RABBITMQ_LOGS doesn't override log level
... if it is set in the configuration file.

Here is an example of that use case:
* The official Docker image sets RABBITMQ_LOGS=- in the environment
* A user of that image adds a configuration file with:
      log.console.level = debug

The initial implementation, introduced in rabbitmq/rabbitmq-server#2861,
considered that if the output is overridden in the environment (through
$RABBITMQ_LOGS), any output configuration in the configuration file is
ignored.

The problem is that the output-specific configuration could also set the
log level which is not changed by $RABBITMQ_LOGS. This patch fixes that
by keeping the log level from the configuration (if it is set obviously)
even if the output is overridden in the environment.
2021-03-19 15:43:28 +01:00
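
The precedence rule from the commit above can be sketched as follows. This is a simplified Python model, not the logic in `rabbit_prelaunch_logging`: the function name is hypothetical, the configuration file is reduced to a flat dict of keys, and only the console and file outputs are modelled.

```python
def effective_logging(env, conf):
    """Combine a $RABBITMQ_LOGS override with levels from the config file."""
    logs = env.get("RABBITMQ_LOGS")
    if not logs:
        return None  # no override: the configuration file applies as-is
    if logs == "-":
        result = {"output": "console"}
        level_key = "log.console.level"
    else:
        result = {"output": "file", "file": logs}
        level_key = "log.file.level"
    # The environment picks the output, but a level set in the
    # configuration file for that output is kept.
    if level_key in conf:
        result["level"] = conf[level_key]
    return result
```

With the Docker use case above, `effective_logging({"RABBITMQ_LOGS": "-"}, {"log.console.level": "debug"})` returns the console output with the `debug` level.
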
dcorbacho a41ece3950 Make ranch parameter `num_conns_sups` configurable
Defaults to 1
rabbit - num_conns_sup
rabbitmq_mqtt - num_conns_sup
rabbitmq_stomp - num_conns_sup
2021-03-18 21:38:13 +01:00
kjnilsson 52f745dcde Update rabbitmq-components.mk
use v1.x branch of ra
2021-03-18 15:14:40 +00:00
dcorbacho 75e37ce1db Filter out stream queues from rebalance command
It's not yet supported by streams, so avoid them altogether to avoid crashes
2021-03-17 23:26:29 +01:00
dcorbacho 9b3b5d48ec Run most stream tests in parallel
The test suite isn't faster (I guess due to some contention on the coordinator),
but it is finding some bugs.
2021-03-17 21:32:42 +01:00
kjnilsson cbf0107605 Stream coordinator bug fix
Fix issue where a deleted replica could be restarted if the leader went
down whilst the replica was still running its start phase.
2021-03-17 13:54:28 +00:00
kjnilsson 9d83e0c5d9 Add logging to config decryption test
To possibly get a bit more information on failure reasons on GH Actions.
2021-03-16 16:28:41 +00:00
Karl Nilsson 1b7379d266
Merge pull request #2876 from rabbitmq/stream-coord-refactor
Stream Coordinator refactor
2021-03-16 11:10:44 +00:00
kjnilsson eb91d50fd4 stream coordinator fall back to consistent query 2021-03-16 09:01:56 +00:00
Arnaud Cogoluègnes e46216b5a8 Check PID on leader lookup in stream plugin
To make sure the PID is alive, as the mnesia record can go stale after a
failure.

Also make the local PID lookup in the stream coordinator do a consistent
query over the cluster if the PID is not alive.

Co-authored-by: Karl Nilsson <kjnilsson@users.noreply.github.com>
2021-03-12 15:04:40 +00:00
kjnilsson 3a26cf8654 Stream coordinator: handle commands for unknown streams
To avoid crashing.
2021-03-12 15:04:40 +00:00
kjnilsson 1709208105 Throw resource error when no local stream member
As well as some additional tests
2021-03-12 15:04:40 +00:00
dcorbacho e19aca8075 Use right map fields to compute streams info 2021-03-12 15:04:40 +00:00
kjnilsson 7fa3f6b6e1 Stream Coordinator: primitive backoff
Sleep for 5s after a failure due to a node being down before reporting
back to stream coordinator (which will immediately retry).

stream coordinator: correct command type spec

tidy up

fix rabbit_fifo_prop tests

stream coord: add function for member state query
2021-03-12 15:03:47 +00:00
kjnilsson bb3e0a7674 Move stream coordinator unit tests into ct suite 2021-03-12 15:03:10 +00:00
kjnilsson 9fb2e6d2dd Stream Coordinator refactor 2021-03-12 15:03:08 +00:00
Loïc Hoguin d5e3bdd623
Add ADDITIONAL_PLUGINS variable
This allows including additional applications or third party
plugins when creating a release, running the broker locally,
or just building from the top-level Makefile.

To include Looking Glass in a release, for example:

$ make package-generic-unix ADDITIONAL_PLUGINS="looking_glass"

A Docker image can then be built using this release and will
contain Looking Glass:

$ make docker-image

Beware macOS users! Applications such as Looking Glass include
NIFs. NIFs must be compiled in the right environment. If you
are building a Docker image then make sure to build the NIF
on Linux! In the two steps above, this corresponds to Step 1.

To run the broker with Looking Glass available:

$ make run-broker ADDITIONAL_PLUGINS="looking_glass"

This commit also moves Looking Glass dependency information
into rabbitmq-components.mk so it is available at all times.
2021-03-12 12:29:28 +01:00
David Ansari 18ba5b803f Avoid unnecessary network calls
by flipping the two list comprehension conditions.
If not is_local_to_node, then is_down will not be evaluated.
This saves (R-1) * Q network calls every 2 minutes where R is the number
of replicas per quorum queue and Q is the number of quorum queues in the
RabbitMQ cluster.
2021-03-11 16:29:05 +01:00
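
The saving described in the commit above comes from short-circuit evaluation. The Python sketch below counts the calls under two assumptions that are not stated in the commit message: every queue has exactly one replica on the local node, and the filter selects members that are both local and down. All names are hypothetical.

```python
def make_members(replicas_per_queue, queue_count):
    """Return (queue, node) pairs; node 0 hosts one replica of every queue."""
    return [(q, n) for q in range(queue_count) for n in range(replicas_per_queue)]


class Probe:
    """Stands in for the `is_down` check and counts each (network) call."""

    def __init__(self):
        self.network_calls = 0

    def is_down(self, member):
        self.network_calls += 1
        return False


def is_local_to_node(member, node):
    """Cheap local check: no network involved."""
    return member[1] == node


def filter_before(members, node, probe):
    # Expensive check first: is_down runs for every member.
    return [m for m in members if probe.is_down(m) and is_local_to_node(m, node)]


def filter_after(members, node, probe):
    # Cheap check first: is_down only runs for local members.
    return [m for m in members if is_local_to_node(m, node) and probe.is_down(m)]
```

With R = 3 replicas and Q = 4 queues, the first ordering makes 12 calls and the second makes 4, a saving of (R-1) * Q = 8 per run.
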
Michael Klishin 97ff62d3b2
Drop trailing newlines from logged messages where possible
Lager strips trailing newline characters but OTP logger with the default
formatter adds a newline at the end. To avoid unintentional multi-line log
messages we have to revisit most messages logged.

Some log entries are intentionally multiline, others
are printed to stdout directly: newlines are required there
for sensible formatting.
2021-03-11 15:17:37 +01:00
Michael Klishin b67c030953
Use Cuttlefish master for now
it no longer depends on Lager
2021-03-11 15:17:37 +01:00
Jean-Sébastien Pédron cdcf602749
Switch from Lager to the new Erlang Logger API for logging
The configuration remains the same for the end-user. The only exception
is the log root directory: it is now set through the `log_root`
application env. variable in `rabbit`. People using the Cuttlefish-based
configuration file are not affected by this exception.

The main change is how the logging facility is configured. It now
happens in `rabbit_prelaunch_logging`. The `rabbit_lager` module is
removed.

The supported outputs remain the same: the console, text files, the
`amq.rabbitmq.log` exchange and syslog.

The message text format slightly changed: the timestamp is more precise
(now to the microsecond) and the level can be abbreviated to always be
4-character long to align all messages and improve readability. Here is
an example:

    2021-03-03 10:22:30.377392+01:00 [dbug] <0.229.0> == Prelaunch DONE ==
    2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>
    2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>  Starting RabbitMQ 3.8.10+115.g071f3fb on Erlang 23.2.5
    2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>  Licensed under the MPL 2.0. Website: https://rabbitmq.com

The example above also shows that multiline messages are supported and
each line is prepended with the same prefix (the timestamp, the level
and the Erlang process PID).

JSON is also supported as a message format and now for any outputs.
Indeed, it is possible to use it with e.g. syslog or the exchange. Here
is an example of a JSON-formatted message sent to syslog:

    Mar  3 11:23:06 localhost rabbitmq-server[27908] <0.229.0> - {"time":"2021-03-03T11:23:06.998466+01:00","level":"notice","msg":"Logging: configured log handlers are now ACTIVE","meta":{"domain":"rabbitmq.prelaunch","file":"src/rabbit_prelaunch_logging.erl","gl":"<0.228.0>","line":311,"mfa":["rabbit_prelaunch_logging","configure_logger",1],"pid":"<0.229.0>"}}

For quick testing, the values accepted by the `$RABBITMQ_LOGS`
environment variable were extended:
  * `-` still means stdout
  * `-stderr` means stderr
  * `syslog:` means syslog on localhost
  * `exchange:` means logging to `amq.rabbitmq.log`

`$RABBITMQ_LOG` was also extended. It now accepts a `+json` modifier (in
addition to the existing `+color` one). With that modifier, messages are
formatted as JSON instead of plain text.

The `rabbitmqctl rotate_logs` command is deprecated. The reason is
Logger does not expose a function to force log rotation. However, it
will detect when a file was rotated by an external tool.

From a developer point of view, the old `rabbit_log*` API remains
supported, though it is now deprecated. It is implemented as regular
modules: there is no `parse_transform` involved anymore.

In the code, it is recommended to use the new Logger macros. For
instance, `?LOG_INFO(Format, Args)`. If possible, messages should be
augmented with some metadata. For instance (note the map after the
message):

    ?LOG_NOTICE("Logging: switching to configured handler(s); following "
                "messages may not be visible in this log output",
                #{domain => ?RMQLOG_DOMAIN_PRELAUNCH}),

Domains in Erlang Logger parlance are the way to categorize messages.
Some predefined domains, matching previous categories, are currently
defined in `rabbit_common/include/logging.hrl` or headers in the
relevant plugins for plugin-specific categories.

At this point, very few messages have been converted from the old
`rabbit_log*` API to the new macros. It can be done gradually when
working on a particular module or logging.

The Erlang builtin console/file handler, `logger_std_h`, has been forked
because it lacks date-based file rotation. The configuration of
date-based rotation is identical to Lager. Once the dust has settled for
this feature, the goal is to submit it upstream for inclusion in Erlang.
The forked module is called `rabbit_logger_std_h` and is based on
`logger_std_h` in Erlang 23.0.
2021-03-11 15:17:36 +01:00
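
The multiline behaviour shown in the commit above (every line of a message carries the same timestamp, level and PID prefix) can be sketched in Python. The function name is hypothetical and the layout is only modelled on the sample output in the commit message.

```python
def format_event(timestamp, level, pid, message):
    """Prefix every line of a possibly multiline message identically."""
    prefix = "{} [{}] {}".format(timestamp, level, pid)
    # rstrip() drops the separator space on lines that are otherwise empty.
    return "\n".join(
        (prefix + " " + line).rstrip() for line in message.split("\n")
    )


print(format_event(
    "2021-03-03 10:22:30.377860+01:00", "info", "<0.229.0>",
    "\n Starting RabbitMQ\n Licensed under the MPL 2.0.",
))
```
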
Michael Klishin cca6b22720
This message arguably belongs to the channel category
Since it documents something that happens to channels.
There are other messages that document the fact that
this connection is being closed.
2021-03-11 08:20:11 +03:00
Michael Klishin f6e8320fc9
Merge branch 'otp-24-ranch' 2021-03-10 07:37:51 +03:00
dcorbacho 61f7b2a723 Update to ranch 2.0 2021-03-08 23:11:05 +01:00
Michael Klishin 3eee3eecff
Cuttlefish 2.7.0 for Erlang 24 compatibility 2021-03-08 10:53:25 +01:00
Michael Klishin b6c4831e75
Bump Lager to 3.9.1 2021-03-04 04:36:39 +03:00
dcorbacho 82512ce206
Clean up rabbit_fifo_usage table on queue.delete 2021-03-04 04:06:40 +03:00
Michael Klishin ea41934d9c
Require Erlang/OTP 23 for RabbitMQ 3.9 2021-03-04 04:06:40 +03:00
Michael Klishin 3127a2f5d6
Remove an stdout log entry 2021-03-01 21:55:27 +03:00
Michael Klishin 4ab0c2e44f
Restore Erlang 22.3 compatibility for direct reply-to 2021-03-01 21:55:27 +03:00
Michal Kuratczyk a0aa8139b1
Replace Async threads warning with Dirty I/O
Async threads are basically not used these days.
Dirty I/O schedulers, on the other hand, are used a lot.
2021-03-01 21:55:27 +03:00
Loïc Hoguin 66ac1bf5e9
Bump observer_cli to 1.6.1
More responsive when the system is overloaded with file calls.
2021-03-01 21:55:27 +03:00
Michael Klishin ffcc590238
Update heartbeat timeout docs in rabbitmq.conf.example
Per suggestion from @adamhooper in #2852
2021-03-01 21:55:26 +03:00
Michael Klishin 4b98a0db7e
Adaptation to Lager 3.9 2021-02-26 04:23:04 +03:00
Michael Klishin 8fe3df9343
Upgrade Lager to 3.9.0 for OTP 24 compatibility
`lager_util:expand_path/1` use changes are
due to erlang-lager/lager#540
2021-02-26 00:52:15 +03:00
Michael Klishin a2f98f25e9
Merge pull request #2804 from rabbitmq/rabbitmq-server-2756
Add federation support for quorum queues
2021-02-25 19:10:15 +03:00
Michael Klishin 17b082abeb
Merge pull request #2843 from rabbitmq/consumer-capacity
Rename consumer_utilisation to consumer_capacity
2021-02-25 16:17:09 +03:00
dcorbacho 6778c1fea3 Small code enhancements 2021-02-25 11:27:40 +01:00
Michael Klishin 5f72779d57
Squash a warning on OTP 24 2021-02-25 01:06:41 +03:00
Michael Klishin cd1a271499
As of Lager 3.8.2, Lager has a log_root default
so override it unconditionally.
2021-02-25 00:43:02 +03:00
Michael Klishin 046db4be92
OTP 22 compatibility
(cherry picked from commit 652ffd2a15)
2021-02-24 22:37:55 +03:00
Michael Klishin f183ee4609
Adapt to the types used by string:lexemes/2 2021-02-24 20:41:32 +03:00
Michael Klishin 7ba2bde260
Correctly use string:lexemes/2 2021-02-24 20:31:10 +03:00
Michael Klishin 33d8ac4f79
Revisit type signatures that tripped up Dialyzer 2021-02-24 20:17:20 +03:00
Michael Klishin a11f98ccd8
Fall back to v1 direct reply-to encoding 2021-02-24 20:03:02 +03:00
Michael Klishin 00b7a84191
Limit direct reply-to identifier length growth
as node names grow.

Prior to this change, direct reply-to consumer channels
were encoded using term_to_binary/1, which means the result
would grow together with node name (since node name
is one of the components of an Erlang pid type).

This means that with long enough hostnames, reply-to
identifiers could overflow the 255 character limit of
the message property field type, shortstr.

With this change, the encoded value uses a hash of the node name
and then locates the actual node name from a map of
hashes to current cluster members.

In addition, instead of generating non-predictable "secure"
GUIDs the feature now generates "regular" predictable GUIDs
which compensates for some of the additional PID pre- and post-processing
outlined above.
2021-02-24 18:21:26 +03:00
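
The idea in the commit above, encoding a fixed-size hash of the node name and resolving it against current cluster members, can be sketched in Python. The hash function (truncated SHA-256), the `hash.local_id` layout and all names are assumptions made for illustration; the actual encoding is not shown in the commit message.

```python
import hashlib


def node_hash(node_name):
    """Fixed-size stand-in for the node name (assumed hash function)."""
    digest = hashlib.sha256(node_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def encode(node_name, local_id):
    """Encode a reply-to identifier whose length is independent of the node name."""
    return "{}.{}".format(node_hash(node_name), local_id)


def decode(encoded, cluster_members):
    """Resolve the hash against current cluster members; None if unknown."""
    by_hash = {node_hash(n): n for n in cluster_members}
    hash_part, _, local_id = encoded.partition(".")
    node = by_hash.get(int(hash_part))
    if node is None:
        return None
    return node, local_id
```

However long the hostname is, the encoded node part stays at most 10 digits, so the identifier remains well below the 255 character limit.
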
Michael Klishin 129a57dcef
Extract direct reply-to PID encoding into a new module 2021-02-24 18:21:26 +03:00
dcorbacho 930c78795c Rename consumer_utilisation to consumer_capacity
Capacity is 100% when there are online consumers and no messages
2021-02-24 16:20:52 +01:00
Michael Klishin f73e851f9c
Bump observer_cli to 1.6.0 2021-02-24 12:53:55 +03:00
Michael Klishin a5098b28a7
Bump Lager to 3.8.2 for OTP 24 compatibility 2021-02-24 12:53:30 +03:00
dcorbacho 8592291afa Fix notify_decorators and policy notifications 2021-02-21 22:56:06 +01:00
dcorbacho 0e63a7d79c Select applicable policies from exclusion list
It's not possible to know all applicable policies since plugins can extend
these, e.g. federation. Thus, we'll exclude the known inapplicable core policies
and allow through any other policy.
2021-02-19 16:45:57 +01:00
Michael Klishin f37b3ca5d0
Ditto for ERTS_MINIMUM
(cherry picked from commit b78f45260f)
2021-02-19 17:33:35 +03:00
Michael Klishin fda5c0745c
Bump validated Erlang/OTP version minimum to 22.3
After several releases and several months of
docs saying that 22.3 is the minimum supported
version, we pull the trigger.

This is required by rabbitmq/credentials-obfuscation#10
as it uses the new crypto API available
in 22.1+ only.

(cherry picked from commit 36d6693776)
2021-02-19 16:48:09 +03:00
dcorbacho 699cd1ab29 Add federation support for quorum queues 2021-02-18 17:15:47 +01:00
Michael Klishin 7af5802a0e
rabbit_looking_glass: add an xref ignore
Looking Glass is a conditionally added dependency that
must be excluded in CI environments.
2021-02-18 18:26:48 +03:00
Michael Klishin f58a8370c8
Bump Cuttlefish to 2.6.0 2021-02-18 18:01:01 +03:00
Michael Klishin 5c84cf80ab
Make it possible to start Looking Glass without starting tracing
by exporting RABBITMQ_TRACER="true"
2021-02-16 14:09:46 +03:00
Michael Klishin d52f8da763
Correct TLS settings not compatible with TLSv1.3 2021-02-14 01:35:08 +03:00
Michael Klishin 6d4c005731
rabbitmq.conf.example: clarify TLS settings incompatible with TLSv1.3 2021-02-14 01:08:24 +03:00
Michael Klishin 36262814fe
Document TLSv1.3-specific cipher suites in rabbitmq.conf.example 2021-02-14 00:01:23 +03:00
Michael Klishin 0e4436ff37
More TLS-related edits in rabbitmq.conf.example 2021-02-13 23:25:12 +03:00
Michael Klishin b6f7a58487
Drive-by change: correctly format connection name 2021-02-13 23:23:41 +03:00
Gabriele Santomaggio 68120f3467 fix space 2021-02-13 14:14:48 +01:00
Gabriele Santomaggio 1ac2e22f54 Add tls info 2021-02-13 14:12:32 +01:00
Gabriele Santomaggio b6635ab52e Add tls info 2021-02-13 14:09:30 +01:00
Michael Klishin a3713aae54
Merge pull request #2803 from carlhoerberg/default-no-busy-wait
Disable Erlang busy wait threshold by default
2021-02-10 21:21:50 +03:00
D Corbacho 09f74b3a47
Merge pull request #2773 from rabbitmq/is-unresponsive-stream
Implement `is_unresponsive` for stream queues
2021-02-10 14:09:29 +00:00
Carl Hörberg 413bfe7b37 Disable Erlang busy wait by default
By disabling the Erlang busy wait threshold, CPU usage with 5000 idle connections
drops from 110% to 14%. Throughput does not seem to be affected at all;
if anything it actually goes up a bit when you have 5000 idle connections
(because fewer CPU cycles are wasted polling idle connections).

rabbitmq-perf-test-2.13.0/bin/runjava com.rabbitmq.perf.PerfTest -s 8000 -z 15

With default erlang busy wait threshold:
id: test-115706-497, sending rate avg: 39589 msg/s
id: test-115706-497, receiving rate avg: 39570 msg/s

With busy wait disabled:
id: test-115807-719, sending rate avg: 40340 msg/s
id: test-115807-719, receiving rate avg: 40301 msg/s

rabbitmq-diagnostics runtime_thread_stats output while running the
PerfTest:

with default busy wait threshold:

Stats per type:
         async    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%
           aux    0.01%    0.00%    0.00%    0.00%    0.00%    0.00%   99.98%
dirty_cpu_sche    0.00%    0.00%    0.00%    0.03%    0.05%    0.00%   99.92%
dirty_io_sched    0.00%    0.00%    0.00%    0.00%    0.01%    0.00%   99.99%
          poll    0.00%    0.67%    0.00%    0.00%    0.00%    0.00%   99.33%
     scheduler    0.69%    0.18%   28.41%    5.49%    9.50%    7.43%   48.29%

without busy wait threshold:

Stats per type:
         async    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%
           aux    0.01%    0.00%    0.00%    0.00%    0.01%    0.00%   99.98%
dirty_cpu_sche    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%
dirty_io_sched    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%
          poll    0.00%    0.77%    0.00%    0.00%    0.00%    0.00%   99.23%
     scheduler    0.70%    0.14%   28.29%    5.41%    0.86%    7.22%   57.38%
2021-02-10 12:35:12 +01:00
Michael Klishin 8e250ae7c4
x-arg validation fixes, improved error reporting for queue declaration
Part of #2798.
2021-02-10 07:30:14 +03:00
Michael Klishin 927a9ddb52
Make it possible to specify optional queue arguments for dynamic Shovels
when shovels declare queues, it is currently not possible to declare
a quorum queue.

Closes #2798.
2021-02-10 06:13:18 +03:00
Michael Klishin 68c04358a5
Drive-by change: improve wording used by 'rabbitmq-queues rebalance' 2021-02-09 21:02:52 +03:00
Michael Klishin 686d462035
Avoid unintentional matching here 2021-02-09 20:32:00 +03:00
Michael Klishin f177a12cca
Handle queue deletions in the middle of leadership transfers
If we cannot transfer a queue because its record is gone
(e.g. it was a queue with TTL or an auto-delete queue),
skip it and move on.

Closes #2796
2021-02-09 19:39:45 +03:00
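The "skip it and move on" behaviour described in the commit above can be sketched as follows. This is a hypothetical Python illustration, not the Erlang code from the commit: `transfer_all`, `lookup`, and `transfer` are assumed names, and a missing record is modelled as `None`.

```python
from typing import Callable, Iterable, List, Optional


def transfer_all(
    queue_names: Iterable[str],
    lookup: Callable[[str], Optional[dict]],
    transfer: Callable[[dict], None],
) -> List[str]:
    """Transfer every queue whose record still exists; skip the rest."""
    transferred = []
    for name in queue_names:
        record = lookup(name)
        if record is None:
            # The queue was deleted in the meantime (for example a TTL or
            # auto-delete queue): nothing to transfer, so move on.
            continue
        transfer(record)
        transferred.append(name)
    return transferred


# Usage: "q2" disappeared between listing the queues and transferring them.
records = {"q1": {"name": "q1"}, "q3": {"name": "q3"}}
done = transfer_all(["q1", "q2", "q3"], records.get, lambda record: None)
print(done)
```

The point of the sketch is that a vanished record is treated as a normal outcome rather than an error that aborts the whole batch.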
Michael Klishin ad20bfbc40
Use new crypto API cipher name here
References rabbitmq/credentials-obfuscation#10
2021-02-09 11:22:48 +03:00
Michael Klishin 83e3f75f01
Import runtime parameters after the topology
To reduce the likelihood of Shovel startup racing with
queue startup. See #2799 for details.

Closes #2799.
2021-02-08 20:47:32 +03:00
Michael Klishin e7a3f30fd5
Don't consider exclusive classic queues to be mirrored
Because they are not, and should not be considered for operations
or features that assume mirroring, such as queue rebalancing.

Closes #2795.
2021-02-08 16:59:23 +03:00
Michael Klishin 0939cec51a
Exclude aes_ige256 in one more test suite 2021-02-08 11:21:16 +03:00
Michael Klishin b11a79cccf
Bump (c) year in header files 2021-02-04 07:04:58 +03:00
Michael Klishin d99e56173a
Merge pull request #2771 from rabbitmq/issue-2715
New command to close all connections of a user
2021-02-01 20:35:20 +03:00
Michael Klishin 6cb015933b
Correct a type spec 2021-02-01 20:25:48 +03:00
Michael Klishin 394d36ab76
Use a record here
Accessing record/tuple elements by index
increases the risk of code breakage when
the record changes.
2021-02-01 20:19:12 +03:00
Michael Klishin 0d5aa1b0f3
Compile 2021-02-01 18:56:32 +03:00
dcorbacho bfaea09df9 Implement `is_unresponsive` for stream queues 2021-02-01 16:49:10 +01:00
Michal Kuratczyk ea1f4a355a New command: `rabbitmqctl close_all_user_connections` 2021-02-01 16:04:16 +01:00
dcorbacho 6220b454cb Avoid federation crashes for non-classic queue types 2021-02-01 16:02:28 +01:00
Michael Klishin 20984b9a07
Wording 2021-02-01 15:37:31 +03:00
Michal Kuratczyk ecd2d738c0
Check whether the file is readable
Since the validation fails with "or isn't readable", we should actually
check whether we can read the file. This way, when configuring TLS for
example, you get early feedback if the cert files are not readable.
2021-02-01 15:19:57 +03:00
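The idea in the commit above, checking readability by actually trying to read rather than only checking existence, can be illustrated with a short Python sketch. The function name `is_readable_file` is an assumption for illustration; the real validator is part of RabbitMQ's Erlang configuration schema and is not reproduced here.

```python
import os


def is_readable_file(path: str) -> bool:
    """Return True only if `path` is a regular file we can open for reading."""
    if not os.path.isfile(path):
        return False
    try:
        # Opening the file is the most direct readability test: it fails
        # on missing permissions even when the file exists.
        with open(path, "rb"):
            return True
    except OSError:
        return False
```

Running such a check at configuration time surfaces an unreadable certificate or key file immediately, instead of at the first TLS handshake.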
Michael Klishin 589352c31a
Remove a leftover binding
The line that used it was removed in #2765
2021-02-01 15:05:07 +03:00
Michal Kuratczyk 7cc2c7889d
Remove a log line related to CMQ leader transfer
We've removed this functionality in c7b9c39352
so we shouldn't log that we would do that.
2021-01-28 20:55:11 +01:00
Michael Klishin ecd3df13f5
Merge pull request #2759 from rabbitmq/default-quorum-cluster-size
Use `quorum_cluster_size` as the default cluster size parameter from configuration
2021-01-28 15:48:44 +03:00
Michael Klishin 1a8975bb80
Be extra defensive when fetching initial quorum group size
If the value is mistakenly left unset, the quorum queue will fail
to start.
2021-01-28 15:35:36 +03:00
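A defensive lookup of this kind might look like the Python sketch below. It is an assumption-laden illustration, not the Erlang implementation: the configuration is modelled as a plain dict, `initial_cluster_size` is a made-up name, and the fallback of 3 is taken from the "Set default cluster size to 3" commit in this log.

```python
# Fallback value, assumed from the "Set default cluster size to 3" commit.
DEFAULT_QUORUM_CLUSTER_SIZE = 3


def initial_cluster_size(env: dict) -> int:
    """Return the configured size, or the default if it is unset or invalid."""
    value = env.get("quorum_cluster_size")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return DEFAULT_QUORUM_CLUSTER_SIZE


print(initial_cluster_size({"quorum_cluster_size": 5}))
print(initial_cluster_size({}))
```

Falling back to a known-good default keeps a missing or malformed setting from preventing the queue from starting.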
Michael Klishin 93388d55bd
Correctly fetch 'rabbit.quorum_cluster_size' 2021-01-28 15:05:22 +03:00
dcorbacho dd986b17d9 Fix capture of `application:get_env` return value 2021-01-28 13:03:27 +01:00
Michal Kuratczyk 5a967affdd WIP: close_all_user_connections command 2021-01-28 12:57:18 +01:00
dcorbacho 477f542ad2 Set default cluster size to 3 2021-01-28 10:46:28 +01:00
dcorbacho c3cc5568b2 Fix default cluster size config name 2021-01-28 10:38:51 +01:00
Michael Klishin 44ee2305d0
Merge pull request #2755 from rabbitmq/mk-remove-cmq-leadership-transfer-from-maintenance-mode
Don't perform CMQ leadership transfer when entering maintenance mode
2021-01-27 19:12:10 +03:00
Michael Klishin c7b9c39352
Don't perform CMQ leadership transfer when entering maintenance mode
The time this operation can take in clusters with a lot of classic
mirrored queues (say, 10s or 100s of thousands) can be prohibitive for
upgrades.

Upgrades can use a health check to ensure that there are in-sync
replicas before entering maintenance mode, in which case
the transfer is not really necessary.

All of the above is more obvious with the recent changes in #2749.
2021-01-27 19:11:26 +03:00
Arnaud Cogoluègnes b921ac11a8
Merge pull request #2712 from rabbitmq/rabbitmq-stream-prometheus
Add stream prometheus plugin
2021-01-27 16:46:37 +01:00
Michael Klishin 3a44bca803
Improve a log message
Not only does it use unfortunate terms to refer to secondaries,
it is also not particularly clear.
2021-01-26 15:58:23 +03:00
Michael Klishin bc87b9a1bb
Less intrusive CMQ leadership transfer
Since we only consider nodes hosting
in-sync replicas for transfer candidates,
we can drop only one mirror instead of N,
and reduce the load caused by this operation.

This does not affect CMQ leadership
transfer when performed in the context
of 'rabbitmq-queues rebalance'.

Pair: @dcorbacho, @mkuratczyk
2021-01-26 15:35:16 +03:00
Michael Klishin c15024844d
rabbit_amqqueue:maybe_rebalance/4: only consider nodes not under maintenance 2021-01-26 15:32:47 +03:00
Michael Klishin 429f87913e
Synchronously add mirrors when transferring ownership
of classic mirrored queues.

There are cases when asynchronously adding mirrors makes
a lot of sense: e.g. when a new node joins the cluster.
When transferring ownership, however, adding mirrors asynchronously
will race with the step that removes mirrors.
As a result, we can end up with a queue that decides
that it has no promotable replicas => data loss
from the transfer.

Closes #2749.

Pairs: @dcorbacho, @mkuratczyk
2021-01-26 14:47:15 +03:00
Michael Klishin 52479099ec
Bump (c) year 2021-01-22 09:00:14 +03:00
Michael Klishin 42f326c125
Avoid double resolving client hostname
when reverse_dns_lookups is set to true.

Closes #2730.
2021-01-20 20:08:04 +03:00
kjnilsson 6d1f3a160b rabbit_fifo: handle unhandled commands
To avoid crashes.
2021-01-20 14:27:37 +00:00
kjnilsson f2418cfe4c Fix crash bug in QQ state conversion
When there are consumers in the service queue.
2021-01-20 14:19:33 +00:00
Arnaud Cogoluègnes b5315c0166
Merge branch 'master' into rabbitmq-stream-prometheus 2021-01-18 11:26:06 +01:00
kjnilsson 9835a43b99 type name fix 2021-01-13 17:12:14 +00:00
kjnilsson ca53234ce3 Remove debug log message 2021-01-13 12:54:59 +00:00
kjnilsson 03ed11c055 bugfix 2021-01-13 12:09:47 +00:00
kjnilsson 2f0dba45d8 Stream: Channel resend on leader change
Detect when a new stream leader is elected and make stream_queues
re-send any unconfirmed, pending messages to ensure they did not get
lost during the leader change. This is done using the osiris
deduplication feature to ensure the resend does not create duplicates of
messages in the stream.
2021-01-13 12:09:44 +00:00
kjnilsson 61de203fc5 Remove use of non-deterministic function
Inside stream coordinator.
2021-01-13 10:40:44 +00:00
kjnilsson 9b8d38d2b9 stream coord: fixes
Make use of rabbit_stream_queue:update_stream_conf deterministic in the
state machine.
2021-01-12 15:09:02 +00:00
dcorbacho 9ef9dde6ce Apply retention policy in all osiris members 2021-01-12 12:18:13 +00:00
dcorbacho e5a2eaaa0d Update retention when only stream retention policy has changed
In any other case, the worker needs to be restarted
2021-01-12 12:18:13 +00:00
Karl Nilsson 015ec8b47f
Merge pull request #2681 from rabbitmq/stream-coordinator-restart-failure
Ensure leader is deleted from supervisor in case of re-election
2021-01-12 12:17:04 +00:00
kjnilsson dfa0775914 Remove unused value 2021-01-12 12:14:42 +00:00
Arnaud Cogoluègnes bf72683eb2
Add stream prometheus plugin 2021-01-11 16:49:56 +01:00
Jean-Sébastien Pédron a0cd2e5fd0
rabbit: Run plugins' boot steps during rabbit start/2
This restores the behavior prior to the commit making `rabbit` closer to a
standard Erlang application.

Plugins are still actually started after rabbit is started (because they
depend on the `rabbit` application). Only the execution of their boot
steps was moved earlier.

With the behavior restored, it also means that a plugin's dependencies
are not started yet when its boot steps are executed.

V2: Move the maintenance mode reset before the plugin boot steps run.

V3: Add a `core_started` boot state. That state is reached at the end of
    the `rabbit` app start function. It indicates when the RabbitMQ core
    is started but the full service is not yet ready.

    We now use this state in direct connection code to determine if
    clients can open a direct connection. We have to do that because
    some plugins open a direct connection as part of their own startup
    (i.e. they can't wait for the `ready` boot state which comes later).
2021-01-08 12:31:25 +01:00
Arnaud Cogoluègnes cbd3c8dfdd
Merge branch 'master' into rabbitmq-stream-management 2021-01-04 09:50:47 +01:00
Michal Kuratczyk 6a81589c11 Expose `bypass_pem_cache` through rabbitmq.conf
Bypassing PEM cache may speed up TLS handshakes in some cases as described
here:
https://blog.heroku.com/how-we-sped-up-sni-tls-handshakes-by-5x
2020-12-17 16:53:14 +01:00
Arnaud Cogoluègnes e3bbdfe6df
Merge pull request #2676 from rabbitmq/rabbitmq-server-2667
Definition export: change user tags to a JSON array
2020-12-15 14:14:23 +01:00
dcorbacho 43ee7d45b5 Ensure leader is deleted from supervisor in case of re-election
If the supervisor returns {error, already_present}, we can't assume
it is the same pid as the one stored, since the process is dead
2020-12-14 15:37:53 +01:00
Michael Klishin 8abe0c4328
rabbitmq-upgrade(8): add missing commands
References rabbitmq/rabbitmq-website#1109
2020-12-13 20:58:15 +03:00
Arnaud Cogoluègnes c4d07467da
Merge branch 'master' into rabbitmq-stream-management 2020-12-09 12:00:56 +01:00
Michael Klishin 4ea9ce1c0b
Clarify what version will be the first to use this format 2020-12-09 12:48:56 +03:00
Michael Klishin ab0ade0e4c
Remove a stray debug logging line 2020-12-09 12:47:42 +03:00
Michael Klishin bcf6ac0515
Export user tags as a list
instead of a comma-separated list in a string.

When importing, both formats are now supported (as of
the previous commit).

Closes #2667.
2020-12-09 11:17:55 +03:00
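Accepting both tag formats on import can be sketched in Python as below. This is an illustration only: `normalize_tags` is a hypothetical helper, and the actual import code is written in Erlang and may differ in how it trims or validates tags.

```python
from typing import List, Union


def normalize_tags(tags: Union[str, List[str]]) -> List[str]:
    """Accept either a comma-separated string or a list; return a list."""
    if isinstance(tags, str):
        # Legacy format: "administrator,monitoring"
        return [tag.strip() for tag in tags.split(",") if tag.strip()]
    # New format: ["administrator", "monitoring"]
    return [str(tag) for tag in tags]


print(normalize_tags("administrator,monitoring"))
print(normalize_tags(["administrator", "monitoring"]))
```

Both calls produce the same list, so definitions exported in either format import identically.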
Michael Klishin e4c37db689
Support importing users with arrays of tags
as opposed to a comma-separated binary.

Part of #2667.
2020-12-08 18:22:56 +03:00
Arnaud Cogoluègnes e5ed53c5e2
Merge branch 'master' into rabbitmq-stream-management 2020-12-08 11:41:22 +01:00
Luke Bakken 7a3bd539d3
Pass RABBITMQ_NODENAME via Windows service environment
Without this change using anything other than `rabbit` or the `rabbitmq-env-conf.bat` file will result in `erlang_dist_running_with_unexpected_nodename`

Follow-up to #2673

cc @dumbbell @michaelklishin
2020-12-07 12:03:06 -08:00
Luke Bakken 4306902309
Merge pull request #2673 from luos/allow_configuring_rabbitmq_base_multiple_times2
Windows service: allow overriding base service location
2020-12-07 10:04:00 -08:00
Lajos Gerecs 8fdbc222e3 allow configuring multiple rmq bases for multiple services
Currently RABBITMQ_BASE is always dynamically picked up from the
environment. This change would fix it at the time of configuration
of the service allowing multiple RabbitMQ services to be configured.
2020-12-05 22:13:00 +01:00
Arnaud Cogoluègnes 224e9914b2
Merge branch 'master' into rabbitmq-stream-management 2020-12-04 10:26:42 +01:00
Arnaud Cogoluègnes db5a5f57e8
Send shutdown message to non network/direct connection
Connections to the stream plugin do not have a type, so they can
trigger some function_clause errors. This was the case when trying to
close a connection from rabbit_connection_tracking module. The function
now falls back to a simple gen_server call to the connection process for
connections without a type.
2020-12-04 09:54:21 +01:00
kjnilsson 6fdb7d29ec Handle errors in crashing_queues_SUITE
The connection may crash during the previous declaration, in which case a caught
error is returned by amqp_connection:open_channel/1; this wasn't
handled previously. Exactly how things fail in this test is most likely
very timing dependent and may vary.

Also fixes mqtt test where the process that set up a mock auth ETS table
was transient when an rpc timeout was introduced
2020-12-03 13:56:09 +00:00
Luke Bakken 8525a65970
Handle undefined args case
Fixes #2668
2020-12-02 12:33:02 -08:00
Luke Bakken ccf624211a
Add test that fails prior to the change for #2668 2020-12-02 12:33:02 -08:00
Arnaud Cogoluègnes 08891a734e
Merge branch 'master' into rabbitmq-stream-management 2020-11-30 09:42:54 +01:00
Arnaud Cogoluègnes ffd66027af
Merge pull request #2506 from rabbitmq/stream-timestamp-offset
Support timestamp offsets for stream consumers
2020-11-27 14:49:38 +01:00
Arnaud Cogoluègnes 43cfb45a74
Convert AMQP 091 timestamp to millisecond
For start offset in stream queue.
2020-11-27 14:47:36 +01:00
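AMQP 0-9-1 timestamps are expressed in whole seconds since the Unix epoch, while the commit above converts them to milliseconds for the stream start offset. A minimal Python sketch of that conversion follows; the function name is illustrative and the real conversion lives in the Erlang code.

```python
def amqp091_timestamp_to_millis(seconds: int) -> int:
    """Convert an AMQP 0-9-1 timestamp (seconds) to milliseconds."""
    return seconds * 1000


# Example with an arbitrary epoch value in seconds.
print(amqp091_timestamp_to_millis(1606484856))
```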
Arnaud Cogoluègnes 8f97ea400a
Start adding publishing dedup support for streams 2020-11-24 17:48:41 +01:00
Arnaud Cogoluègnes c8249a304f
Filter stream connections where metrics are disabled
This implied defining the protocol field in tracked connection to be
able to filter out non-stream connections.
2020-11-20 09:29:55 +01:00
kjnilsson ea7c9e9b61 QQ: Emit release cursor for empty basic gets
Otherwise an application that frequently polls an empty quorum queue using basic.get
would never cause a snapshot to be taken, resulting in unlimited
log growth.
2020-11-19 15:59:51 +00:00
Arnaud Cogoluègnes 23d7e8114c
Introduce stream management plugin 2020-11-19 14:48:25 +01:00
dcorbacho f23a51261d Merge remote-tracking branch 'origin/master' into stream-timestamp-offset 2020-11-18 14:27:41 +00:00
Jean-Sébastien Pédron 778e8dad5c
rabbit_common: Remove the rabbitmq-github-actions Erlang.mk plugin
This is unused after the switch to the "monorepository".
2020-11-17 15:29:05 +01:00
Jean-Sébastien Pédron 47686ee1f0
Remove unused .github directories
They were valid until the switch to the "monorepository" when everything
was merged into a single Git repository.
2020-11-17 13:33:16 +01:00
Michael Klishin 5b8dba5e2f
Move issue and PR templates to monorepo root 2020-11-17 14:23:58 +03:00
Philip Kuryloski 8ff5273827 Remove unused function 2020-11-16 10:45:10 +01:00
kjnilsson d88b623c18 Use correct credit mode x-credit
When the x-credit consumer arg is defined, Quorum Queues should use
credit mode `credited` and not `simple_prefetch`.
2020-11-16 10:45:10 +01:00
Philip Kuryloski a1fe3ab061 Change repo "root" to deps/rabbit
rabbit must not be the monorepo root application, as other applications depend on it
2020-11-13 14:34:42 +01:00