Commit Graph

521 Commits

Author SHA1 Message Date
Michael Klishin 61dcfd5fa6
Use the standard 'undefined' here 2025-06-04 12:31:27 +04:00
Michael Klishin e9fc656241
Wrap TLS options password into a function in more places
A follow-up to #13958 #13999.

Pair: @dcorbacho.
2025-06-04 12:24:45 +04:00
Michal Kuratczyk c0368a0d24
[skip ci] Update dashboards for RabbitMQ 4.1
Key changes:
- endpoint variable to handle scraping multiple endpoints
- message size panels (new metric in 4.1)
- panels at the top of the Overview dashboard should be more up to date
  (they show the latest value)
- values should be accurate if multiple endpoints are scraped
  (previously, many would be doubled)
- Nodes table shows fewer volumns and shows node uptime
2025-04-16 17:48:21 +02:00
Michal Kuratczyk f0976b48b2
queue info metric: guard against whereis returning `undefined` (#13646) 2025-03-28 12:37:42 +01:00
Michal Kuratczyk 2a93bbcebd
RMQ-1460: Emit queue_info metric (#13583)
To allow filtering on queue type or membership status,
we need an info metric for queues; see
https://grafana.com/blog/2021/08/04/how-to-use-promql-joins-for-more-effective-queries-of-prometheus-metrics-at-scale/#info-metrics

With this change, per-object metrics and the detailed metrics
(if queue-related families are requested) will contain
rabbitmq_queue_info / rabbitmq_detailed_queue_info with a value of 1
and labels including the queue name, vhost, queue type and membership
status.
2025-03-27 15:54:26 +01:00
Arnaud Cogoluègnes b8244f70f4
Pull from socket up to 10 times in stream test utils (#13588)
To make sure to have enough data to complete a command.
2025-03-24 09:13:31 +01:00
Loïc Hoguin c5d150a7ef
Use Erlang.mk's native Elixir support for CLI
This avoids using Mix while compiling which simplifies
a number of things and let us do further build improvements
later on.

Elixir is only enabled from within rabbitmq_cli currently.

Eunit is disabled since there are only Elixir tests.

Dialyzer will force-enable Elixir in order to process
Elixir-compiled beam files.

This commit also includes a few changes that are
related:

 * The Erlang distribution will now be started for parallel-ct

 * Many unnecessary PROJECT_MOD lines have been removed

 * `eunit_formatters` has been removed, it provides little value

 * The new `maybe_flock` Erlang.mk function is used where possible

 * Build test deps when testing rabbitmq_cli (Mix won't do it anymore)

 * rabbitmq_ct_helpers now use the early plugins to have Dialyzer
   properly set up
2025-03-18 10:02:49 +01:00
Aitor Perez 07adc3e571
Remove Bazel files 2025-03-13 13:42:34 +00:00
Tony Lewis Hiroaki URAHAMA 3c5f4d3d39
Bump Prometheus Version 2025-03-01 18:21:51 +00:00
Michal Kuratczyk 703ee8529e
Add rabbitmq_endpoint label to rabbitmq_identity_info 2025-02-07 15:51:40 +01:00
Michal Kuratczyk 16700f9f19
Better metric description 2025-01-30 11:33:53 +01:00
Arnaud Cogoluègnes b3b0940024
Fix wait-for-confirms sequence in stream test utils
And refine the implementation and its usage.
2025-01-21 17:38:58 +01:00
Michael Klishin 968eefa1bb
Bump (c) line year
There are no functional changes to this massive diff.
2025-01-01 17:54:10 -05:00
Diana Parra Corbacho 40cb4f46e8 Tests: rabbit_prometheus_http_SUITE longer wait 2024-12-16 11:58:05 +01:00
Péter Gömöri bbc902ef23
Add test for stream consumer max offset lag prometheus metric
(cherry picked from commit 0c76054a0c)
2024-11-19 19:14:12 -05:00
markus812498 085ec75253
Expose max offset lag of stream consumers via Prometheus
Supports both per stream (detailed) and aggregated (metrics) values.

(cherry picked from commit e82058e872)
2024-11-19 19:14:06 -05:00
Anh Nguyen dc9311a561 Update Erlang Distribution dashboard panel and instance filtering
- Modified metric expression and legend format in State of distribution links
- Changed panel type from 'flant-statusmap-panel' to 'status-history' for Process state
2024-11-14 11:04:07 +07:00
Anh Nguyen b9dc0ea3b4 Add instance filtering to Erlang BEAM Grafana dashboard metrics
- Updated metric expressions to include instance filtering with {instance=\"$node\"}
  for the following metrics:
  - erlang_vm_statistics_run_queues_length
  - erlang_vm_statistics_dirty_io_run_queue_length
  - erlang_vm_statistics_dirty_cpu_run_queue_length
- Added 'DS_PROMETHEUS' as a templated data source variable
2024-11-13 20:20:02 +07:00
Jean-Sébastien Pédron d6024e30f4
rabbit_prometheus_http_SUITE: Start broker once in `special_chars` group
`init_per_group/3`, which starts the broker, was already called earlier
in the function.

This fixes a bug where the node can't be stopped in `end_per_group/2`,
attecting the next group ability to start one.
2024-10-30 10:08:56 +01:00
Luke Bakken 3d668fda46
Grafana: add a runtime/Erlang/BEAM dashboard (#12456)
* Add BEAM dashboard

Also update the other dashboards by opening in Grafana v11.2.2 and ensuring they work as expected.

* Update the Erlang-Distributions-Compare dashboard

* Update the RabbitMQ-Overview dashboard

* Update the RabbitMQ-Quorum-Queues-Raft dashboard

* Update the RabbitMQ-Stream dashboard

* Update distribution link status panel

---------

Co-authored-by: Michal Kuratczyk <mkuratczyk@vmware.com>
2024-10-17 07:10:54 -07:00
Michael Klishin 80f4797e76 Remove multiple mentions of global prefetch
As suggested by @johanrhodin in #12454.

This keeps the Prometheus plugin part but
marks it as deprecated. We can remove it in
4.1.
2024-10-04 20:47:37 -04:00
David Ansari 1e3f4e5db9 Emit histogram metric for received message sizes per protocol (#12342)
* Add global histogram metrics for received message sizes per-protocol

fixup: add new files to bazel

fixup: expose message_size_bytes as prometheus classic histogram type

`rabbit_msg_size_metrics` does not use `seshat` any more, but
`counters` directly.

fixup: add msg_size_metrics unit test

* Improve message size histogram

1.
Avoid unnecessary time series emitted for stream protocol
The stream protocol cannot observe message sizes.
This commit ensures that the following time series are omitted:
```
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="64"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="256"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1024"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4096"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16384"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="65536"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="262144"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1048576"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4194304"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16777216"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="67108864"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="268435456"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="+Inf"} 0
rabbitmq_global_message_size_bytes_count{protocol="stream"} 0
rabbitmq_global_message_size_bytes_sum{protocol="stream"} 0
```

This reduces the number of time series by 15.

2.
Further reduce the number of time series by reducing the number of
buckets. Instead of 13 bucktes, emit only 9 buckets. Buckets are not
free, each is an extra time series stored.

Prior to this commit:
```
curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l
      92
```

After this commit:
```
curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l
      57
```

3.
The emitted metric should be called
`rabbitmq_message_size_bytes_bucket` instead of `rabbitmq_global_message_size_bytes_bucket`.
The latter is poor naming. There is no need to use `global` in
the metric name given that this metric doesn't exist in the old flawed
aggregated metrics.

4.
This commit simplies module `rabbit_global_counters`.

5.
Avoid garbage collecting the 10-elements list of buckets per message
being received.

---------

Co-authored-by: Péter Gömöri <peter@84codes.com>
2024-09-25 09:00:44 -04:00
Lois Soto Lopez 8377eda336 Comment added label clause to clarify need for it 2024-09-25 11:04:25 +02:00
Lois Soto Lopez 79f04c23f3 Avoid duplicate vhost label for prometh. queue-exchange metrics
Adds a specific clause on the
`prometheus_rabbitmq_core_metrics_collector:labels` function when the
associated metric item is a Queue + Exchange combo (`{Queue, Exchange}`)
2024-09-24 10:38:38 +02:00
Michael Davis 512f8838fd
Add prometheus tags for raft_cluster to non-QQ raft metrics
By default Ra will use the cluster name as the metrics key. Currently
atom values are ignored by the prometheus plugin's tag rendering
functions, so if you have a QQ and Khepri running and request the
`/metrics/per-object` or `/metrics/detailed` endpoints you'll see values
that don't have labels set for the `ra_metrics` metrics:

    # TYPE rabbitmq_raft_term_total counter
    # HELP rabbitmq_raft_term_total Current Raft term number
    rabbitmq_raft_term_total{vhost="/",queue="qq"} 9
    rabbitmq_raft_term_total 10

With this change we map the name of the Ra cluster to a "raft_cluster"
tag, so instead an example metric might be:

    # TYPE rabbitmq_raft_term_total counter
    # HELP rabbitmq_raft_term_total Current Raft term number
    rabbitmq_raft_term_total{vhost="/",queue="qq"} 9
    rabbitmq_raft_term_total{raft_cluster="rabbitmq_metadata"} 10

This affects metrics for Khepri and the stream coordinator.
2024-08-30 12:41:37 -04:00
Michal Kuratczyk 9b828c08b7
Remove HiPE 2024-08-28 09:18:28 +02:00
Michal Kuratczyk 116ab4f6fe
Remove memory_high_watermark_paging_ratio 2024-08-28 08:12:49 +02:00
Simon Unge 2766122836 Move shovel prometheus to its own plugin 2024-08-08 01:26:49 -04:00
Michael Klishin 1f1d422fa2 rabbitmq_shovel is a runtime dependency of rabbitmq_prometheus now 2024-08-02 23:15:28 -04:00
Simon Unge 4c44ebd8eb Add dynamic and static promethues metric gauge 2024-08-02 22:19:20 +00:00
Michal Kuratczyk 618f695645
Move memory breakdown metrics to new endpoint
Collecting them on a large system (tens of thousands of processes
or more) can be time consuming as we iterate over all processes.
By putting them on a separate endpoint, we make that opt-in
2024-07-23 10:17:37 +02:00
Michael Klishin 0caea225c6 Assertions for #11743 2024-07-18 21:32:42 -04:00
Michael Klishin e9b5f52512 Prometheus: expose memory breakdown metrics
Closes #11743.
2024-07-18 21:32:42 -04:00
Lois Soto Lopez bb93e718c2 Prometheus: some per-exchange/per-queue metrics aggregated per-channel
Add copies of some per-object metrics that are labeled per-channel
aggregated to reduce cardinality. These metrics are valuable and
easier to process if exposed on per-exchange and per-queue basis.
2024-07-16 14:30:25 +02:00
Michael Klishin 0700e1cdc4 Revert "Provide per-exchange/queue metrics w/out channelID"
This reverts commit 3ed2e30e3a.
2024-07-11 21:34:52 -04:00
Michael Klishin 2bd3a2d307 Revert "Update deps/rabbitmq_prometheus/src/collectors/prometheus_rabbitmq_core_metrics_collector.erl"
This reverts commit 64e0812ced.
2024-07-11 21:34:46 -04:00
Michael Klishin 6b1e003afe Revert "New metrics return on detailed only"
This reverts commit 1aec73b21c.
2024-07-11 21:34:40 -04:00
Lois Soto Lopez 18e667fc8f New metrics return on detailed only
Make new metrics return on detailed only and adjust some of the
help messages.
2024-07-11 17:34:18 -04:00
LoisSotoLopez cb2de0d9ea Update deps/rabbitmq_prometheus/src/collectors/prometheus_rabbitmq_core_metrics_collector.erl
Co-authored-by: Péter Gömöri <gomoripeti@users.noreply.github.com>
2024-07-11 17:34:18 -04:00
Lois Soto Lopez ec5e258825 Provide per-exchange/queue metrics w/out channelID 2024-07-11 17:34:18 -04:00
Loïc Hoguin bbfa066d79
Cleanup .gitignore files for the monorepo
We don't need to duplicate so many patterns in so many
files since we have a monorepo (and want to keep it).

If I managed to miss something or remove something that
should stay, please put it back. Note that monorepo-wide
patterns should go in the top-level .gitignore file.
Other .gitignore files are for application or folder-
specific patterns.
2024-06-28 12:00:52 +02:00
Loïc Hoguin cd35f7e7fa
Remove sockets_used/sockets_total metrics from UIs
Part of the removal of file_handle_cache.

The Prometheus endpoint was updated but the Grafana dashboard
was not.

The FD stats are using the system's state rather than
file_handle_cache so there's no need to remove them.
2024-06-24 12:07:51 +02:00
Michael Klishin 9e97c5d8e7 rabbitmq_prometheus.schema: wording 2024-06-21 21:58:43 -04:00
Michal Kuratczyk 141659a638 OTP27 support (#11366)
* "maybe" is now a keyword
* Bump horus to 0.2.5 and switch to hex
* Get rid of some deprecated callbacks/functions
2024-06-21 21:46:33 -04:00
Lois Soto Lopez 30ecf6af75 fix: bring back schema mapping
Issue #11315
2024-06-20 09:53:30 +02:00
Lois Soto Lopez 27a0b0f191 fix: Remove queues filter for prometh. collector
Remove queue filters for prometheus_rabbitmq_core_metrics_collector

Closes #11315
2024-06-13 11:09:44 +02:00
Michael Klishin a8d9b9c1db
Merge pull request #11245 from cloudamqp/expose_segment_count_prometheus
Prometheus: add segment file count to queue_metrics and expose
2024-05-24 21:24:57 -04:00
Michal Kuratczyk cfa3de4b2b
Remove unused imports (thanks elp!) 2024-05-23 16:36:08 +02:00
markus812498 3b1ff80f0a added segment file count to queue_metrics ets and exposed in /metrics endpoint 2024-05-23 09:50:24 +12:00
Rin Kuryloski c427bd0ea0 Match bazel & make for rabbitmq_prometheus 2024-03-26 16:27:10 +01:00