Commit Graph

2225 Commits

Author SHA1 Message Date
Péter Gömöri 321039c353
fixup: Add rabbit_misc:process_info/2 that also works for remote PIDs
Co-authored-by: Luke Bakken <lukerbakken@gmail.com>
(cherry picked from commit 095f702093)
2024-12-09 22:33:54 -05:00
Péter Gömöri e777c0b263
Add rabbit_misc:process_info/2 that also works for remote PIDs
(cherry picked from commit 7b7708f367)
2024-12-09 22:33:38 -05:00
Michael Klishin 9b1953994e
One more test case for #12888 2024-12-03 23:07:58 -05:00
Ace Breakpoint ee41983e84 Fix crash caused by mishandling of non-ASCII amqp_error explanation
`rabbit_binary_generator:map_exception/3` will crash when there are
Unicode characters in the `explanation` field of the `Reason#amqp_error`
parameter. The explanation string (list) is assumed to be ASCII, with
each character/member in the range of a byte. Any character outside
that range will trigger a `badarg` crash of `list_to_binary/1` in
`rabbit_binary_generator:amqp_exception_explanation/2`.

An AMQP 0-9-1 shovel crash caused by this was reported in
https://github.com/rabbitmq/rabbitmq-server/discussions/12874
When a queue used as a shovel source/destination does not exist and its
name contains non-ASCII characters, the explanation of the amqp_error
will be like `no queue non_ascii_name_😍 in vhost /`. This
subsequently triggers the crash and can even affect the management console.

To fix this, `unicode:characters_to_binary/1` is used instead of
`list_to_binary/1`, and Unicode-safe truncation of long explanations
with the `chars_limit` option of `io_lib:format/3` replaces direct byte truncation.
2024-12-04 10:39:16 +08:00
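The failure mode and the fix described in the commit above can be illustrated outside Erlang. The following is a minimal Python analogy, not the RabbitMQ code: `pack_as_bytes` is a hypothetical stand-in for `list_to_binary/1` (one byte per list element), `encode_utf8` stands in for `unicode:characters_to_binary/1`, and `truncate_text` stands in for character-based truncation. The cut-off positions (25 characters, 26 bytes) are chosen only to make the emoji land on the boundary.

```python
def pack_as_bytes(text: str) -> bytes:
    """Hypothetical stand-in for list_to_binary/1: one byte per character.

    Raises ValueError when a code point does not fit in a byte, which is
    analogous to the badarg crash described above.
    """
    return bytes(ord(ch) for ch in text)


def encode_utf8(text: str) -> bytes:
    """Stand-in for unicode:characters_to_binary/1: encode as UTF-8."""
    return text.encode("utf-8")


def truncate_text(text: str, max_chars: int) -> str:
    """Truncate by characters (code points), never in the middle of one."""
    return text[:max_chars]


explanation = "no queue non_ascii_name_😍 in vhost /"

# 1. Packing one byte per character fails on the emoji.
try:
    pack_as_bytes(explanation)
    crashed = False
except ValueError:
    crashed = True

# 2. Truncating characters first, then encoding, always yields valid UTF-8.
safe = encode_utf8(truncate_text(explanation, 25))

# 3. Cutting the encoded bytes directly can split the 4-byte emoji.
try:
    encode_utf8(explanation)[:26].decode("utf-8")
    byte_cut_ok = True
except UnicodeDecodeError:
    byte_cut_ok = False
```

The ordering matters: truncate while the text is still a sequence of characters, and only then convert to bytes.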
Michael Davis 38091430b5
make: Suppress Elixir charlist warning for dialyze target
The `:io.format/2` call was originally passed a single-quote string
(i.e. a charlist in Elixir terminology) which emits a warning in more
recent Elixir versions:

    warning: single-quoted strings represent charlists. Use ~c"" if you indeed want a charlist or use "" instead
    └─ nofile:1:12

This warning would pop up a few times when using `make dialyze` within
a deps directory. To resolve it we can switch the quoting so that the
eval string is wrapped in single quotes (equivalent for shell since this
line doesn't use variables) and the format argument is wrapped in double
quotes. This uses a binary in Elixir instead, but that's ok because
the `io:format()` parameter of `io:format/2` may be an atom, string,
or binary.

This trick was copied from Makefile:49 which uses the same quoting.
2024-12-03 12:02:25 -05:00
Michael Klishin d3a3acee16
Refactor: as_list/1 belongs to rabbit_data_coercion 2024-11-25 12:33:33 -05:00
David Ansari 3db4a97cfb Expose AMQP connection metrics
Expose the same metrics for AMQP 1.0 connections as for AMQP 0.9.1 connections.

Display the following AMQP 1.0 metrics on the Management UI:
* Network bytes per second from/to client on connections page
* Number of sessions/channels on connections page
* Network bytes per second from/to client graph on connection page
* Reductions graph on connection page
* Garbage collection info on connection page

Expose the following AMQP 1.0 per-object Prometheus metrics:
* rabbitmq_connection_incoming_bytes_total
* rabbitmq_connection_outgoing_bytes_total
* rabbitmq_connection_process_reductions_total
* rabbitmq_connection_incoming_packets_total
* rabbitmq_connection_outgoing_packets_total
* rabbitmq_connection_pending_packets
* rabbitmq_connection_channels

The rabbit_amqp_writer proc:
* notifies the rabbit_amqp_reader proc if it sent frames
* hibernates eventually if it doesn't send any frames

The rabbit_amqp_reader proc:
* does not emit stats (update ETS tables) if no frames are received
or sent to save resources when there are many idle connections.
2024-11-02 19:08:24 +01:00
Michal Kuratczyk 2c0fc70135
Abort restart-cluster if something goes wrong
For example, if the first restarted node doesn't start,
don't try to restart the other nodes. This mimics what
orchestrators such as Kubernetes or BOSH would do
(although they perform this check differently)
2024-10-30 12:58:35 +01:00
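The abort-on-first-failure behavior described in the commit above can be sketched as follows. This is a Python illustration of the control flow only, not the make target itself; `restart_and_wait` and the node names are hypothetical.

```python
from typing import Callable, Iterable, List


def restart_cluster(
    nodes: Iterable[str],
    restart_and_wait: Callable[[str], bool],
) -> List[str]:
    """Restart nodes one at a time and stop at the first failure.

    `restart_and_wait` is a hypothetical callback that restarts one node
    and returns True once that node is running again.
    """
    restarted: List[str] = []
    for node in nodes:
        if not restart_and_wait(node):
            # Do not touch the remaining nodes: the cluster keeps whatever
            # capacity it still has while the failure is investigated.
            raise RuntimeError(f"node {node} did not come back; aborting")
        restarted.append(node)
    return restarted


# Demonstration with a fake restart function whose first node fails.
attempted: List[str] = []


def fake_restart(node: str) -> bool:
    attempted.append(node)
    return node != "rabbit-1"


try:
    restart_cluster(["rabbit-1", "rabbit-2", "rabbit-3"], fake_restart)
    aborted = False
except RuntimeError:
    aborted = True
```

After the run, `attempted` contains only the first node, which is the point of the change: a broken restart is not repeated across the whole cluster.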
Jean-Sébastien Pédron c0be3c0648
rabbitmq-run.mk: Restart nodes in a cluster sequentially
... not in parallel.
2024-10-29 11:41:20 +01:00
Jean-Sébastien Pédron 624d9bae0c
rabbitmq-run.mk: Use a 60-second timeout for `rabbitmqctl wait`
... not 60 milliseconds.
2024-10-29 11:37:50 +01:00
Loïc Hoguin f68fc8bb94
Make CI: Add mixed version testing
This is enabled on main and for pull requests. Bazel is still
used on previous branches.
2024-10-25 13:50:05 +02:00
Lois Soto Lopez 3ff7e82c5c Provide specific function to fix client ssl options
Provides a specific function to fix client ssl options, i.e. it applies all
the fixes that previous versions applied for TLS listeners and clients, and
also sets the `cacerts` option to the CA certificates obtained by
`public_key:cacerts_get`, but only when no `cacertfile` or `cacerts` is
provided.
2024-10-21 18:00:06 -04:00
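The defaulting rule in the commit above (fill in `cacerts` only when the caller supplied neither `cacertfile` nor `cacerts`) can be sketched as below. This is a Python illustration under stated assumptions, not the Erlang function: options are modeled as a plain dictionary, and `load_system_cacerts` is a hypothetical stand-in for `public_key:cacerts_get`.

```python
from typing import Any, Callable, Dict, List


def with_default_cacerts(
    ssl_options: Dict[str, Any],
    load_system_cacerts: Callable[[], List[bytes]],
) -> Dict[str, Any]:
    """Return a copy of the options with `cacerts` filled in when needed.

    The loader is only called when neither `cacertfile` nor `cacerts`
    was provided, so explicit configuration always wins.
    """
    if "cacertfile" in ssl_options or "cacerts" in ssl_options:
        return dict(ssl_options)
    return {**ssl_options, "cacerts": load_system_cacerts()}


def fake_loader() -> List[bytes]:
    # Placeholder certificates for the demonstration.
    return [b"system-ca-1", b"system-ca-2"]


defaulted = with_default_cacerts({"verify": "verify_peer"}, fake_loader)
explicit = with_default_cacerts({"cacertfile": "/path/to/ca.pem"}, fake_loader)
```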
Karl Nilsson 465b19e8e8 Adjust vheap sizes for message handling processes in OTP 27
OTP 27 reset all assumptions on how the VM reacts to processes that
buffer and process a lot of large binaries.

Substantially increasing the vheap sizes for such processes restores
most of the same performance by allowing processes to hold more binary
data before major garbage collections are triggered.

This introduces a new module to capture process flag configurations.

The new vheap sizes are only applied when running on OTP 27 or
above.
2024-10-09 20:08:34 -04:00
Jean-Sébastien Pédron 9b2c6d95f8
rabbit_env: Drop $RABBITMQ_LOG_FF_REGISTRY
[Why]
Its use was removed when the registry was converted from a compiled
module to a persistent_term.
2024-10-07 14:02:50 +02:00
Jean-Sébastien Pédron 6a0008b06c
rabbit_feature_flags: Accept "+feature1,-feature2" in $RABBITMQ_FEATURE_FLAGS
[Why]
Before this patch, the $RABBITMQ_FEATURE_FLAGS environment variable took
an exhaustive list of feature flags to enable. This list overrode the
default of enabling all stable feature flags.

It made it inconvenient when a user wanted to enable an experimental
feature flag like `khepri_db` while still leaving the default behavior.

[How]
$RABBITMQ_FEATURE_FLAGS now accepts the following syntax:

    RABBITMQ_FEATURE_FLAGS=+feature1,-feature2

This will start RabbitMQ with all stable feature flags, plus `feature1`,
but without `feature2`.

For users setting `forced_feature_flags_on_init` in the config, the
corresponding syntax is:

    {forced_feature_flags_on_init, {rel, [feature1], [feature2]}}
2024-10-07 14:02:50 +02:00
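The semantics of the relative `+feature1,-feature2` syntax from the commit above can be sketched as a small resolver. This is a Python illustration, not the `rabbit_feature_flags` implementation. Two behaviors are assumptions because the commit message does not describe them: an empty value keeps the default, and a list mixing plain and `+`/`-` entries is rejected.

```python
from typing import Iterable, Set


def resolve_feature_flags(value: str, stable_flags: Iterable[str]) -> Set[str]:
    """Return the set of feature flags to enable at startup."""
    entries = [entry.strip() for entry in value.split(",") if entry.strip()]
    if not entries:
        # Assumption: an empty value keeps the default (all stable flags).
        return set(stable_flags)

    relative = [entry for entry in entries if entry[0] in "+-"]
    if not relative:
        # Older syntax: an exhaustive list that overrides the default.
        return set(entries)
    if len(relative) != len(entries):
        # Assumption: mixed lists are not described in the source.
        raise ValueError("cannot mix plain and +/- entries in this sketch")

    # Relative syntax: start from the stable flags, then add and remove.
    enabled = set(stable_flags)
    for entry in entries:
        sign, name = entry[0], entry[1:]
        if sign == "+":
            enabled.add(name)
        else:
            enabled.discard(name)
    return enabled


stable = {"feature2", "feature3"}
relative_result = resolve_feature_flags("+feature1,-feature2", stable)
absolute_result = resolve_feature_flags("khepri_db", stable)
```

With the relative form, enabling an experimental flag no longer requires spelling out every stable flag as well.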
Loïc Hoguin ae984cc364
make: Set CT_LOGS_DIR to top-level logs/ directory
All CT logs will now be under <toplevel>/logs. An improved
test workflow would be to always keep the logs/all_runs.html
page open in the browser and refresh it whenever tests are
run in any of the rabbit applications.
2024-09-30 12:35:43 +02:00
Loïc Hoguin ec95c1a88d
rabbit_common: Remove 'cover' related code from 'rabbit_misc'
This is very old code that is likely no longer used. Removing
it helps avoid depending on cover.
2024-09-30 12:35:42 +02:00
Loïc Hoguin 861943835f
Fix OTP-27 Dialyzer errors in rabbit_common 2024-09-30 12:35:42 +02:00
Loïc Hoguin 9f8c17f587
make: Fix build errors for apps that have rabbit in TEST_DEPS
We want them to install CLI scripts only for the test build,
otherwise Dialyzer or others will fail in a clean run.
2024-09-30 12:35:42 +02:00
Loïc Hoguin f95c87082a
make: Include rabbitmq_cli ebin in code path only if in deps 2024-09-30 12:35:42 +02:00
Loïc Hoguin a17fb13a03
make: Initial work on using ct_master to run tests
Because `ct_master` is yet another Erlang node that is used
to run multiple CT nodes, it is part of a cluster of CT
nodes. As a result, the tests that change the net_ticktime
could no longer work properly: net_ticktime must have
the same value across the cluster.

The same value had to be set for all tests in order to solve
this. This is why it was changed to 5s across the board. The
lower net_ticktime was used in most places to speed up tests
that must deal with cluster failures, so that value is good
enough for these cases.

One test in amqp_client was using the net_ticktime to test
the behavior of the direct connection timeout with varying
net_ticktime configurations. The test now mocks the
`net_kernel:get_net_ticktime()` function to achieve the
same result.
2024-08-29 15:23:31 +02:00
Loïc Hoguin 7ad8e2856b
make: Restrict Erlang.mk plugin inclusion
This has no real impact on performance[1] but should
make it clear which application can run the broker
and/or publish to Hex.pm. In particular, applications
that we can't run the broker from will now give up
early if we try to.

Note that while the broker can't normally run from the
amqp_client application's directory, it can run from
tests and some of the tests start the broker.

[1] on my machine
2024-08-29 15:19:50 +02:00
Loïc Hoguin 445f3c9270
make: Move rabbitmq-early-test.mk to rabbitmq-early-plugin.mk
No real need to have two files, especially since it contains
only a few variable definitions. Plan is to only keep
separate files for larger features such as dist or run.
2024-08-29 15:19:50 +02:00
Loïc Hoguin 7421d4d15f
make: Additional cleanups 2024-08-29 15:19:50 +02:00
Loïc Hoguin d4222f8216
make: Remove emptied rabbitmq-tools.mk 2024-08-29 15:19:14 +02:00
Loïc Hoguin 7cb0c1b217
make: Refactor PROJECT_VERSION computation 2024-08-29 15:19:14 +02:00
Loïc Hoguin 48795d7cf3
make: Remove update-contributor-code-of-conduct target
The relevant files have been symlinked to the root file
for the past two years.
2024-08-29 15:19:14 +02:00
Loïc Hoguin d9d74d0964
make: Remove ct-logs-archive target
Hasn't been used for a long time.
2024-08-29 15:19:14 +02:00
Loïc Hoguin f3d0d4e113
make: Remove sync-gitremote sync-gituser targets
They are not useful for the monorepo.
2024-08-29 15:19:14 +02:00
Loïc Hoguin a5cfb1ea9a
make: Remove show-upstream-git-fetch-url and co
They haven't been necessary for quite some time.
2024-08-29 15:19:14 +02:00
Loïc Hoguin 4e8ad90cd0
make: Remove commits-since-release
This was only relevant before the monorepo.
2024-08-29 15:19:14 +02:00
Loïc Hoguin 31409e86b0
make: Remove show-branch target
Not useful in the monorepo.
2024-08-29 15:19:14 +02:00
Loïc Hoguin b8bcd5c27c
make: Remove sync-gitignore-from-main target
No longer relevant because of the monorepo
2024-08-29 15:19:14 +02:00
Loïc Hoguin e947e098bd
make: Remove rabbitmq-deps.mk related targets 2024-08-29 15:19:14 +02:00
Loïc Hoguin 7e7e6feb9d
make: Remove rabbitmq-tests.mk
Everything in this file seems to be dead code except
ct-slow/ct-fast, which have been replaced by their
equivalent in the rabbit Makefile.
2024-08-29 15:19:13 +02:00
David Ansari ffefefba0f Run with default wal_sync_method
...which is `datasync`

RA never pre-allocates the WAL anymore unless explicitly configured to.
2024-08-22 16:24:07 +00:00
Michael Davis 0dd26f0c52
rabbit_db_queue: Transactionally delete transient queues from Khepri
The prior code skirted transactions because the filter function might
cause Khepri to call itself. We want to use the same idea as the old
code - get all queues, filter them, then delete them - but we want to
perform the deletion in a transaction and fail the transaction if any
queues changed since we read them.

This fixes a bug - that the call to `delete_in_khepri/2` could return
an error tuple that would be improperly recognized as `Deletions` -
but should also make deleting transient queues atomic and fast.
Each call to `delete_in_khepri/2` needed to wait on Ra to replicate
because the deletion is an individual command sent from one process.
Performing all deletions at once means we only need to wait for one
command to be replicated across the cluster.

We also bubble up any errors to delete now rather than storing them as
deletions. This fixes a crash that occurs on node down when Khepri is
in a minority.
2024-08-13 11:40:18 -04:00
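The pattern described in the commit above (read all queues, filter them outside the transaction, then delete the matches in one transaction that fails if anything changed in between) can be sketched with a toy in-memory store. This is a Python illustration only: `ToyStore`, its version counters and `ConflictError` are hypothetical and are not the Khepri API.

```python
from typing import Callable, Dict, List, Tuple


class ConflictError(Exception):
    """Raised when a record changed between the read and the delete."""


class ToyStore:
    """Toy in-memory store; each record carries a version counter."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[dict, int]] = {}

    def put(self, name: str, queue: dict) -> None:
        _, version = self._records.get(name, ({}, 0))
        self._records[name] = (queue, version + 1)

    def snapshot(self) -> Dict[str, Tuple[dict, int]]:
        return dict(self._records)

    def delete_if_unchanged(self, expected: Dict[str, int]) -> List[str]:
        """Delete every named record, or none if any of them changed."""
        for name, version in expected.items():
            current = self._records.get(name)
            if current is None or current[1] != version:
                raise ConflictError(name)
        for name in expected:
            del self._records[name]
        return sorted(expected)


def delete_transient_queues(
    store: ToyStore, is_transient: Callable[[dict], bool]
) -> List[str]:
    # The filter runs outside the "transaction", on a snapshot.
    snapshot = store.snapshot()
    expected = {
        name: version
        for name, (queue, version) in snapshot.items()
        if is_transient(queue)
    }
    # One batched, all-or-nothing delete instead of one command per queue.
    return store.delete_if_unchanged(expected)


store = ToyStore()
store.put("q1", {"durable": False})
store.put("q2", {"durable": True})
store.put("q3", {"durable": False})
deleted = delete_transient_queues(store, lambda q: not q["durable"])
```

Errors surface to the caller as exceptions rather than being mistaken for a result, which mirrors the commit's point about not treating an error tuple as `Deletions`.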
Michael Davis 96c60a2de4
Move 'for_each_while_ok/2' helper to rabbit_misc 2024-07-22 16:02:03 -04:00
Lois Soto Lopez bb93e718c2 Prometheus: some per-exchange/per-queue metrics aggregated per-channel
Add copies of some per-object metrics that are currently labeled per
channel, aggregated across channels to reduce cardinality. These metrics are
valuable and easier to process when exposed on a per-exchange and per-queue basis.
2024-07-16 14:30:25 +02:00
Michal Kuratczyk f398892bda
Deprecate queue-master-locator (#11565)
* Deprecate queue-master-locator

This should not be a breaking change - all validation should still pass
* CQs can now use `queue-leader-locator`
* `queue-leader-locator` takes precedence over `queue-master-locator` if both are used
* regardless of which name is used, effectively there are only two values: `client-local` (default) or `balanced`
* other values (`min-masters`, `random`, `least-leaders`) are mapped to `balanced`
* Management UI no longer shows `master-locator` fields when declaring a queue/policy, but such arguments can still be used manually (unless not permitted)
* exclusive queues are always declared locally, as before
2024-07-12 13:22:55 +02:00
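The precedence and mapping rules listed in the commit above can be condensed into one resolver. This is a Python sketch of the rules as stated, not the broker's code. Queue arguments are modeled as a plain dictionary, and rejecting unknown values with `ValueError` is an assumption standing in for the broker's validation.

```python
from typing import Mapping

# Values that are mapped to `balanced` according to the commit message.
_BALANCED_VALUES = {"balanced", "min-masters", "random", "least-leaders"}


def effective_locator(args: Mapping[str, str], exclusive: bool = False) -> str:
    """Resolve the effective locator strategy for a queue declaration."""
    if exclusive:
        # Exclusive queues are always declared locally.
        return "client-local"

    # `queue-leader-locator` takes precedence over `queue-master-locator`.
    value = args.get("queue-leader-locator", args.get("queue-master-locator"))
    if value is None or value == "client-local":
        return "client-local"  # the default
    if value in _BALANCED_VALUES:
        return "balanced"
    # Assumption: anything else fails validation.
    raise ValueError(f"unsupported locator value: {value}")


legacy = effective_locator({"queue-master-locator": "min-masters"})
both = effective_locator(
    {"queue-leader-locator": "client-local", "queue-master-locator": "random"}
)
```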
Michael Klishin 0700e1cdc4 Revert "Provide per-exchange/queue metrics w/out channelID"
This reverts commit 3ed2e30e3a.
2024-07-11 21:34:52 -04:00
Lois Soto Lopez ec5e258825 Provide per-exchange/queue metrics w/out channelID 2024-07-11 17:34:18 -04:00
Loïc Hoguin bbfa066d79
Cleanup .gitignore files for the monorepo
We don't need to duplicate so many patterns in so many
files since we have a monorepo (and want to keep it).

If I managed to miss something or remove something that
should stay, please put it back. Note that monorepo-wide
patterns should go in the top-level .gitignore file.
Other .gitignore files are for application or folder-
specific patterns.
2024-06-28 12:00:52 +02:00
Loïc Hoguin 18f8ee1457
Merge pull request #11549 from rabbitmq/loic-make-cleanups
Various make cleanup/consolidation
2024-06-27 11:42:24 +02:00
Loïc Hoguin af49f5c526
make: Remove FAST_RUN_BROKER; make normal run-broker fast
The DIST step used rsync for copying files; changing this
to using cp/rm provides a noticeable speed boost.

Before this commit the situation was as follows. With
FAST_RUN_BROKER=1 we are pretty fast but don't benefit
from parallel make:

  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=1
    2,04s user 1,57s system 90% cpu 4,016 total
  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=1 -j8
    2,08s user 1,55s system 89% cpu 4,069 total

With FAST_RUN_BROKER=0 we are slow; on the other hand
we greatly benefit from parallel make:

  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0
    3,29s user 1,93s system 81% cpu 6,425 total
  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0 -j8
    3,36s user 1,90s system 142% cpu 3,695 total

This method achieves such a result because
the DIST step that takes a lot of time can be run in
parallel. In addition, this method results in only
the necessary plugins being available in the path,
therefore it doesn't discover unrelated plugins
during node startup, saving time.

By changing rsync to cp/rm, we get great results even
without parallel make:

  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0
    3,28s user 1,64s system 105% cpu 4,684 total
  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0 -j8
    3,27s user 1,65s system 135% cpu 3,640 total

We are within 1s of FAST_RUN_BROKER=1 by default, and
faster than FAST_RUN_BROKER=1 with parallel make. On
top of that, we greatly benefit when rebuilding as the
DIST files do not need to be rebuilt every time:

  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0
    2,94s user 1,40s system 107% cpu 4,035 total
  make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0 -j8
    2,85s user 1,51s system 138% cpu 3,140 total

Therefore it only makes sense to remove FAST_RUN_BROKER,
and instead use the old method which is both more correct
and has more potential for optimisation.
2024-06-26 16:53:07 +02:00
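The two comparisons drawn in the commit above ("within 1s of FAST_RUN_BROKER=1 by default" and "faster than FAST_RUN_BROKER=1 with parallel make") can be checked against the quoted wall-clock totals. The short Python calculation below uses only the figures from the commit message; the variable names are made up for the illustration, and the `seconds` helper exists because the timings use a decimal comma.

```python
def seconds(text: str) -> float:
    """Parse a wall-clock figure that uses a decimal comma, e.g. '4,016'."""
    return float(text.replace(",", "."))


# FAST_RUN_BROKER=1 (before the change)
fast_serial = seconds("4,016")
fast_parallel = seconds("4,069")

# FAST_RUN_BROKER=0 after switching from rsync to cp/rm
new_serial = seconds("4,684")
new_parallel = seconds("3,640")

# How far the new default is behind the old fast path without -j8.
gap_serial = round(new_serial - fast_serial, 3)

# How far the new default is ahead of the old fast path with -j8.
gain_parallel = round(fast_parallel - new_parallel, 3)

print(f"serial gap: {gap_serial:.3f}s, parallel gain: {gain_parallel:.3f}s")
```

The serial gap comes out at roughly two thirds of a second and the parallel run is a few tenths of a second ahead, which is consistent with both statements.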
Loïc Hoguin a64d1e67fc
Remove looking_glass
It has largely been superseded by `perf`. It is no longer
generally useful. It can always be added to BUILD_DEPS for
the rare cases it is needed, or installed locally and
pointed to by setting its path to ERL_LIBS.
2024-06-26 09:56:46 +02:00
Loïc Hoguin 2b03233ac1
make: Remove rabbitmq-macros.mk
It hasn't been used for some time. If compare_version
becomes necessary again in the future, it's in the history.
2024-06-25 13:39:38 +02:00
Loïc Hoguin 9f15e978b1
make: Remove xrefr
It is no longer used by Erlang.mk.
2024-06-25 13:08:08 +02:00
Loïc Hoguin 7e9cac3d00
make: Remove Travis-specific targets/config
This should no longer be used.
2024-06-24 14:12:02 +02:00
Loïc Hoguin 881ebc6138
make: Remove ANT variables
This should no longer be used.
2024-06-24 14:06:48 +02:00