[Why]
The `stream_pub_sub_metrics` test failed at least once in CI because the
`rabbitmq_stream_consumer_max_offset_lag` metric was 4 instead of the
expected 3 on line 815.
I have not been able to reproduce the problem so far.
[How]
The test case now logs the initial value of that metric at the beginning
of the test function. Hopefully this will give us some clue for the day
it fails again.
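For illustration, a minimal sketch of that logging, assuming a
hypothetical `get_metric/2` helper that scrapes the Prometheus endpoint
and parses out a single sample:
```erlang
%% Log the metric's value before the test body runs, so a future
%% failure shows whether the lag was already non-zero at the start.
log_initial_offset_lag(Config) ->
    Lag = get_metric(Config, "rabbitmq_stream_consumer_max_offset_lag"),
    ct:pal("Initial rabbitmq_stream_consumer_max_offset_lag: ~p", [Lag]),
    Lag.
```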
[Why]
It looks like `erlang_vm_dist_node_queue_size_bytes` is not always
present, even though other Erlang-specific metrics are present.
[How]
The goal is to ensure Erlang metrics are present in the output, so use
another metric that is more likely to be there.
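Roughly, the assertion becomes something like this, using an
EUnit-style assert; `Body` is the scraped /metrics payload, and the
substitute metric name below is an assumption:
```erlang
%% Check that some reliably-emitted Erlang VM metric is in the output.
?assertNotEqual(nomatch,
                binary:match(Body, <<"erlang_vm_memory_bytes_total">>)).
```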
Switch from ra_metrics to ra_counters
* Expose many more metrics (they are also up to date)
* Bump Seshat, Ra, Osiris, Prometheus.erl
* Switch from proplists to maps (see the sketch below)
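For illustration, the shape of the proplists-to-maps switch; the key
name is an assumption, not an actual Ra counter key:
```erlang
%% Before: the lookup went through a proplist.
term_from_proplist(Metrics) ->
    proplists:get_value(term, Metrics, 0).

%% After: the same lookup against a map, with a default.
term_from_map(Metrics) ->
    maps:get(term, Metrics, 0).
```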
Key changes:
- endpoint variable to handle scraping multiple endpoints
- message size panels (new metric in 4.1)
- panels at the top of the Overview dashboard should be more up to date
(they show the latest value)
- values should be accurate if multiple endpoints are scraped
(previously, many would be doubled)
- The Nodes table shows fewer columns and now shows node uptime
This avoids using Mix while compiling, which simplifies a number of
things and lets us make further build improvements later on.
Elixir is only enabled from within rabbitmq_cli currently.
EUnit is disabled since there are only Elixir tests.
Dialyzer will force-enable Elixir in order to process
Elixir-compiled beam files.
This commit also includes a few changes that are
related:
* The Erlang distribution will now be started for parallel-ct
* Many unnecessary PROJECT_MOD lines have been removed
* `eunit_formatters` has been removed, as it provides little value
* The new `maybe_flock` Erlang.mk function is used where possible
* Build test deps when testing rabbitmq_cli (Mix won't do it anymore)
* rabbitmq_ct_helpers now uses the early plugins so that Dialyzer is
  properly set up
- Modified metric expression and legend format in State of distribution links
- Changed panel type from 'flant-statusmap-panel' to 'status-history' for Process state
- Updated metric expressions to include instance filtering with {instance="$node"}
for the following metrics:
- erlang_vm_statistics_run_queues_length
- erlang_vm_statistics_dirty_io_run_queue_length
- erlang_vm_statistics_dirty_cpu_run_queue_length
- Added 'DS_PROMETHEUS' as a templated data source variable
`init_per_group/3`, which starts the broker, was already called earlier
in the function.
This fixes a bug where the node can't be stopped in `end_per_group/2`,
affecting the next group's ability to start one.
* Add BEAM dashboard
Also update the other dashboards by opening in Grafana v11.2.2 and ensuring they work as expected.
* Update the Erlang-Distributions-Compare dashboard
* Update the RabbitMQ-Overview dashboard
* Update the RabbitMQ-Quorum-Queues-Raft dashboard
* Update the RabbitMQ-Stream dashboard
* Update distribution link status panel
---------
Co-authored-by: Michal Kuratczyk <mkuratczyk@vmware.com>
* Add global histogram metrics for received message sizes per-protocol
fixup: add new files to bazel
fixup: expose message_size_bytes as prometheus classic histogram type
`rabbit_msg_size_metrics` does not use `seshat` any more, but the
`counters` module directly (sketched below).
fixup: add msg_size_metrics unit test
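For illustration, a rough sketch of keeping a histogram with the
`counters` module directly; the index layout is an assumption:
```erlang
-define(BUCKET_COUNT, 9).

%% One slot per bucket, plus a trailing slot for the running sum.
new_histogram() ->
    counters:new(?BUCKET_COUNT + 1, [write_concurrency]).

%% Bump the matched bucket and accumulate the total byte count.
observe(Ref, BucketIx, Size) ->
    counters:add(Ref, BucketIx, 1),
    counters:add(Ref, ?BUCKET_COUNT + 1, Size).
```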
* Improve message size histogram
1. Avoid unnecessary time series emitted for the stream protocol.
The stream protocol cannot observe message sizes.
This commit ensures that the following time series are omitted:
```
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="64"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="256"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1024"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4096"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16384"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="65536"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="262144"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1048576"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4194304"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16777216"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="67108864"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="268435456"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="+Inf"} 0
rabbitmq_global_message_size_bytes_count{protocol="stream"} 0
rabbitmq_global_message_size_bytes_sum{protocol="stream"} 0
```
This reduces the number of time series by 15.
2. Further reduce the number of time series by reducing the number of
buckets: instead of 13 buckets, emit only 9. Buckets are not free; each
one is an extra time series stored.
Prior to this commit:
```
curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l
92
```
After this commit:
```
curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l
57
```
3. The emitted metric should be called `rabbitmq_message_size_bytes_bucket`
instead of `rabbitmq_global_message_size_bytes_bucket`.
The latter is poor naming. There is no need to use `global` in
the metric name given that this metric doesn't exist in the old flawed
aggregated metrics.
4. This commit simplifies module `rabbit_global_counters`.
5. Avoid garbage collecting a 10-element list of buckets for every
message received (see the sketch below).
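For illustration, one way to pick a bucket without building a list per
message; the bounds below are assumptions, not the shipped boundaries:
```erlang
-define(BOUNDS, {64, 256, 1024, 4096, 16384, 65536, 262144, 1048576}).

%% Walk a constant tuple of upper bounds; no per-message allocation.
bucket_index(Size) ->
    bucket_index(Size, 1).

bucket_index(Size, Ix) when Ix =< tuple_size(?BOUNDS) ->
    case Size =< element(Ix, ?BOUNDS) of
        true  -> Ix;
        false -> bucket_index(Size, Ix + 1)
    end;
bucket_index(_Size, Ix) ->
    Ix. %% overflow (+Inf) bucket
```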
---------
Co-authored-by: Péter Gömöri <peter@84codes.com>
Adds a specific clause to the
`prometheus_rabbitmq_core_metrics_collector:labels` function for when
the associated metric item is a Queue + Exchange combo (`{Queue, Exchange}`).
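A minimal sketch of the clause's shape, assuming `labels/1` already
renders queue and exchange resources individually (other clauses
elided):
```erlang
%% Emit labels for both members of the {Queue, Exchange} tuple.
labels({Queue, Exchange}) ->
    labels(Queue) ++ labels(Exchange);
```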
By default Ra will use the cluster name as the metrics key. Currently
atom values are ignored by the prometheus plugin's tag rendering
functions, so if you have a QQ and Khepri running and request the
`/metrics/per-object` or `/metrics/detailed` endpoints you'll see values
that don't have labels set for the `ra_metrics` metrics:
```
# TYPE rabbitmq_raft_term_total counter
# HELP rabbitmq_raft_term_total Current Raft term number
rabbitmq_raft_term_total{vhost="/",queue="qq"} 9
rabbitmq_raft_term_total 10
```
With this change we map the name of the Ra cluster to a "raft_cluster"
tag, so instead an example metric might be:
```
# TYPE rabbitmq_raft_term_total counter
# HELP rabbitmq_raft_term_total Current Raft term number
rabbitmq_raft_term_total{vhost="/",queue="qq"} 9
rabbitmq_raft_term_total{raft_cluster="rabbitmq_metadata"} 10
```
This affects metrics for Khepri and the stream coordinator.
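The mapping itself could look roughly like this; the clause shape and
placement are assumptions:
```erlang
%% Render atom Ra cluster names as a raft_cluster label instead of
%% dropping them.
labels(ClusterName) when is_atom(ClusterName) ->
    [{raft_cluster, atom_to_binary(ClusterName, utf8)}];
```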
Collecting them on a large system (tens of thousands of processes
or more) can be time-consuming, as we iterate over all processes.
By putting them on a separate endpoint, we make that opt-in.
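For a sense of the cost, the collection is effectively a full scan
along these lines (simplified; not the actual collector code):
```erlang
%% Sum memory across all processes; processes may exit mid-scan.
total_process_memory() ->
    lists:foldl(fun(Pid, Acc) ->
                        case erlang:process_info(Pid, memory) of
                            {memory, M} -> Acc + M;
                            undefined   -> Acc
                        end
                end, 0, erlang:processes()).
```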
Add copies of some per-object metrics that are labeled per-channel,
aggregated to reduce cardinality. These metrics are valuable and
easier to process if exposed on a per-exchange and per-queue basis.
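Conceptually, the aggregation collapses per-channel samples onto their
queue label, along these lines (the input shape is an assumption):
```erlang
%% Sum per-channel samples into one value per queue.
aggregate_by_queue(Samples) ->
    lists:foldl(fun({_ChannelPid, QueueName, Value}, Acc) ->
                        maps:update_with(QueueName,
                                         fun(V) -> V + Value end,
                                         Value, Acc)
                end, #{}, Samples).
```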
We don't need to duplicate so many patterns in so many
files since we have a monorepo (and want to keep it).
If I managed to miss something or remove something that
should stay, please put it back. Note that monorepo-wide
patterns should go in the top-level .gitignore file.
Other .gitignore files are for application- or folder-specific
patterns.
Part of the removal of file_handle_cache.
The Prometheus endpoint was updated but the Grafana dashboard
was not.
The FD stats now use the system's state rather than
file_handle_cache, so there's no need to remove them.