rabbitmq-server

Commit Graph

Author	SHA1	Message	Date
Michal Kuratczyk	2a93bbcebd	RMQ-1460: Emit queue_info metric (#13583) To allow filtering on queue type or membership status, we need an info metric for queues; see https://grafana.com/blog/2021/08/04/how-to-use-promql-joins-for-more-effective-queries-of-prometheus-metrics-at-scale/#info-metrics With this change, per-object metrics and the detailed metrics (if queue-related families are requested) will contain rabbitmq_queue_info / rabbitmq_detailed_queue_info with a value of 1 and labels including the queue name, vhost, queue type and membership status.	2025-03-27 15:54:26 +01:00
Arnaud Cogoluègnes	b8244f70f4	Pull from socket up to 10 times in stream test utils (#13588) To make sure to have enough data to complete a command.	2025-03-24 09:13:31 +01:00
Arnaud Cogoluègnes	b3b0940024	Fix wait-for-confirms sequence in stream test utils And refine the implementation and its usage.	2025-01-21 17:38:58 +01:00
Michael Klishin	968eefa1bb	Bump (c) line year There are no functional changes to this massive diff.	2025-01-01 17:54:10 -05:00
Diana Parra Corbacho	40cb4f46e8	Tests: rabbit_prometheus_http_SUITE longer wait	2024-12-16 11:58:05 +01:00
Péter Gömöri	bbc902ef23	Add test for stream consumer max offset lag prometheus metric (cherry picked from commit `0c76054a0c`)	2024-11-19 19:14:12 -05:00
Jean-Sébastien Pédron	d6024e30f4	rabbit_prometheus_http_SUITE: Start broker once in `special_chars` group `init_per_group/3`, which starts the broker, was already called earlier in the function. This fixes a bug where the node can't be stopped in `end_per_group/2`, affecting the next group's ability to start one.	2024-10-30 10:08:56 +01:00
David Ansari	960808e6b2	Emit histogram metric for received message sizes per protocol (#12342) * Add global histogram metrics for received message sizes per-protocol fixup: add new files to bazel fixup: expose message_size_bytes as prometheus classic histogram type `rabbit_msg_size_metrics` does not use `seshat` any more, but `counters` directly. fixup: add msg_size_metrics unit test * Improve message size histogram 1. Avoid unnecessary time series emitted for stream protocol The stream protocol cannot observe message sizes. This commit ensures that the following time series are omitted: ``` rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="64"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="256"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1024"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4096"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16384"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="65536"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="262144"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1048576"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4194304"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16777216"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="67108864"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="268435456"} 0 rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="+Inf"} 0 rabbitmq_global_message_size_bytes_count{protocol="stream"} 0 rabbitmq_global_message_size_bytes_sum{protocol="stream"} 0 ``` This reduces the number of time series by 15. 2. Further reduce the number of time series by reducing the number of buckets. Instead of 13 buckets, emit only 9 buckets. Buckets are not free, each is an extra time series stored. 
Prior to this commit: ``` curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l 92 ``` After this commit: ``` curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l 57 ``` 3. The emitted metric should be called `rabbitmq_message_size_bytes_bucket` instead of `rabbitmq_global_message_size_bytes_bucket`. The latter is poor naming. There is no need to use `global` in the metric name given that this metric doesn't exist in the old flawed aggregated metrics. 4. This commit simplifies module `rabbit_global_counters`. 5. Avoid garbage collecting the 10-element list of buckets per message being received. --------- Co-authored-by: Péter Gömöri <peter@84codes.com>	2024-09-24 18:08:24 +02:00
Simon Unge	2766122836	Move shovel prometheus to its own plugin	2024-08-08 01:26:49 -04:00
Simon Unge	4c44ebd8eb	Add dynamic and static prometheus metric gauge	2024-08-02 22:19:20 +00:00
Michal Kuratczyk	618f695645	Move memory breakdown metrics to new endpoint Collecting them on a large system (tens of thousands of processes or more) can be time consuming as we iterate over all processes. By putting them on a separate endpoint, we make that opt-in	2024-07-23 10:17:37 +02:00
Michael Klishin	0caea225c6	Assertions for #11743	2024-07-18 21:32:42 -04:00
Lois Soto Lopez	bb93e718c2	Prometheus: some per-exchange/per-queue metrics aggregated per-channel Add copies of some per-object metrics that are labeled per-channel aggregated to reduce cardinality. These metrics are valuable and easier to process if exposed on per-exchange and per-queue basis.	2024-07-16 14:30:25 +02:00
Michael Klishin	0700e1cdc4	Revert "Provide per-exchange/queue metrics w/out channelID" This reverts commit `3ed2e30e3a`.	2024-07-11 21:34:52 -04:00
Lois Soto Lopez	ec5e258825	Provide per-exchange/queue metrics w/out channelID	2024-07-11 17:34:18 -04:00
Michal Kuratczyk	cfa3de4b2b	Remove unused imports (thanks elp!)	2024-05-23 16:36:08 +02:00
Iliia Khaprov	8925dfa916	Close #10345. Add prometheus_rabbitmq_federation_collector. rabbitmq_federation_links gauge metric with status label.	2024-03-14 09:29:01 +01:00
Michael Klishin	f414c2d512	More missed license header updates #9969	2024-02-05 11:53:50 -05:00
Michael Klishin	01092ff31f	(c) year bumps	2024-01-01 22:02:20 -05:00
Péter Gömöri	fec09c0792	Escape prometheus core metric label values For example special characters like double quotes are allowed in queue names, in which case detailed metrics could produce unparsable text format output.	2023-12-03 01:14:44 +01:00
Michael Klishin	1b642353ca	Update (c) according to [1] 1. https://investors.broadcom.com/news-releases/news-release-details/broadcom-and-vmware-intend-close-transaction-november-22-2023	2023-11-21 23:18:22 -05:00
Simon Unge	8b3ca4c972	See #8605. Add authentication support to prometheus.	2023-06-23 13:54:45 -07:00
Chunyi Lyu	4ddb0c2038	Support TLS-only listener for Prometheus - tcp listener can be turned off by setting 'prometheus.tcp.listener = none' - config schema follows web_mqtt and web_stomp	2023-05-05 15:44:53 +01:00
Michal Kuratczyk	510415f8b9	Update prometheus.erl to 4.10.0 Since 4.10.0 was released specifically to address an issue we encountered in RabbitMQ integration with prometheus.erl, new test was added to validate this functionality in the future.	2023-01-13 10:24:41 +01:00
Michael Klishin	ec4f1dba7d	(c) year bump: 2022 => 2023	2023-01-01 23:17:36 -05:00
Luke Bakken	7fe159edef	Yolo-replace format strings Replaces `~s` and `~p` with their unicode-friendly counterparts. ``` git ls-files *.erl | xargs sed -i.ORIG -e s/~s>/~ts/g -e s/~p>/~tp/g ```	2022-10-10 10:32:03 +04:00
Loïc Hoguin	73dd0acf01	rabbit_prometheus_http_SUITE: Update tests for new CQs CQs without consumers will have only one message in memory.	2022-09-27 12:00:10 +02:00
Jean-Sébastien Pédron	6e9ee4d0da	Remove test code which depended on the `quorum_queue` feature flags These checks are now irrelevant as the feature flag is required.	2022-08-01 12:41:30 +02:00
Michael Klishin	7c47d0925a	Revert "Correct a double quote introduced in #4603" This reverts commit `6a44e0e2ef`. That wiped a lot of files unintentionally	2022-04-20 16:05:56 +04:00
Michael Klishin	6a44e0e2ef	Correct a double quote introduced in #4603	2022-04-20 16:01:29 +04:00
Michael Klishin	c38a3d697d	Bump (c) year	2022-03-21 01:21:56 +04:00
Alexey Lebedeff	7676ed9685	Use `rabbitmq_cluster_` prefix for cluster-wide metrics	2021-11-24 16:49:43 +01:00
Alexey Lebedeff	6e3012aaf9	Add optional metrics for vhost and exchange count These can make sense in some scenarios, e.g. when vhost/exchanges are created using self-service automation	2021-11-24 11:00:41 +01:00
Alexey Lebedeff	b9ebfb8980	Fix ssl port handling in prometheus plugin All ssl options were stored in the same proplist, and the code was then trying to determine whether an option actually belongs to ranch ssl options or not. Some keys landed in the wrong place, like it did happen in #2975 - different ports were mentioned in listener config (default at top-level, and non-default in `ssl_opts`). Then `ranch` and `rabbitmq_web_dispatch` were treating this differently. This change just moves all ranch ssl opts into proper place using schema, removing any need for guessing in code. The only downside is that advanced config compatibility is broken.	2021-10-20 14:55:33 +02:00
Michael Klishin	3826a0df25	Compile #3561	2021-10-13 01:27:16 +03:00
Johannes Würbach	84de860b4c	feat(prom): expose cluster id in identity	2021-10-12 15:43:46 +02:00
Alexey Lebedeff	989a299720	Emit identity info in prometheus /metrics/detailed endpoint This is needed to make filtering metrics on a cluster name possible.	2021-09-28 19:35:02 +02:00
Alexey Lebedeff	5501d07b8b	Use rabbitmq_ct_helpers to allocate prometheus port This test always used the standard port 15692 before, which was causing conflicts with e.g. local `make run-broker`.	2021-09-22 15:23:35 +02:00
Alexey Lebedeff	4bb2262140	Allow selective querying for prometheus plugin	2021-09-20 14:59:17 +02:00
dcorbacho	c9305d948a	Use number of publishing channels as global publishers in amqp091	2021-06-29 08:10:42 +01:00
Gerhard Lazu	c7971252cd	Global counters per protocol + protocol AND queue_type This way we can show how many messages were received via a certain protocol (stream is the second real protocol besides the default amqp091 one), as well as by queue type, which is something that many asked for a really long time. The most important aspect is that we can also see them by protocol AND queue_type, which becomes very important for Streams, which have different rules from regular queues (e.g. for example, consuming messages is non-destructive, and deep queue backlogs - think billions of messages - are normal). Alerting and consumer scaling due to deep backlogs will now work correctly, as we can distinguish between regular queues & streams. This has gone through a few cycles, with @mkuratczyk & @dcorbacho covering most of the ground. @dcorbacho had most of this in https://github.com/rabbitmq/rabbitmq-server/pull/3045, but the main branch went through a few changes in the meantime. Rather than resolving all the conflicts, and then making the necessary changes, we (@gerhard + @kjnilsson) took all learnings and started re-applying a lot of the existing code from #3045. We are confident in this approach and would like to see it through. We continued working on this with @dumbbell, and the most important changes are captured in https://github.com/rabbitmq/seshat/pull/1. We expose these global counters in rabbitmq_prometheus via a new collector. We don't want to keep modifying the existing collector, which grew really complex in parts, especially since we introduced aggregation, but start with a new namespace, `rabbitmq_global_`, and continue building on top of it. The idea is to build in parallel, and slowly transition to the new metrics, because semantically the changes are too big since streams, and we have been discussing protocol-specific metrics with @kjnilsson, which makes me think that this approach is least disruptive and... simple. 
While at this, we removed redundant empty return value handling in the channel. The function called no longer returns this. Also removed all DONE / TODO & other comments - we'll handle them when the time comes, no need to leave TODO reminders. Pairs @kjnilsson @dcorbacho @dumbbell (this is multiple commits squashed into one) Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>	2021-06-22 14:14:21 +01:00
Gerhard Lazu	f3f3e8aae9	Always show aggregated auth_attempts, add detailed when per object enabled The metrics have different names now, so we can't end up with duplicate TYPEs. Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>	2021-01-22 16:38:44 +00:00
Gerhard Lazu	5a6e3f235b	Single auth_attempts declarations when per-object metrics enabled Closes #2740 Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>	2021-01-22 11:36:42 +00:00
Michael Klishin	52479099ec	Bump (c) year	2021-01-22 09:00:14 +03:00
Mirah Gary	fe9881687c	Change per-object endpoint to `/metrics/per-object`. This conforms with other http endpoints.	2020-11-26 10:35:26 +01:00
Michal Kuratczyk	8b8a66cf0b	Add /metrics/per_object endpoint Regardless of the value of `return_per_object_metrics`, this endpoint always returns per-object metrics. This allows scraping both endpoints at different intervals or scraping per-object metrics only during debugging. Co-authored-by: Mirah Gary <mgary@vmware.com>	2020-11-19 18:00:42 +01:00
Michael Klishin	898a46d7bc	Switch to MPL2	2020-07-14 16:42:52 +03:00
Gerhard Lazu	cab99c29f0	Add failing test for erlang_vm_dist_node_queue_size_bytes Have to force prometheus.erl to a version that does not have this feature, otherwise the test would succeed. pwd /Users/gerhard/github.com/rabbitmq/3.9.x/deps/rabbitmq_prometheus rm -fr ../prometheus.erl make tests open logs/index.html Pull request content: Expose & visualise distribution buffer busy limit - zdbbl > This will be closed after TGIR S01E04 gets recorded. > The goal is to demonstrate how to do this, and then let an external contributor have a go. Before this patch, the Data buffered in the distribution links queue graph was empty. This is what that graph looks like after this gets applied: ![image](https://user-images.githubusercontent.com/3342/80223464-3bf28580-8640-11ea-8851-8f33f1c4fd4f.png) ## References - [RabbitMQ Runtime Tuning - Inter-node Communication Buffer Size](https://www.rabbitmq.com/runtime.html#distribution-buffer) - [erl +zdbbl](https://erlang.org/doc/man/erl.html#+zdbbl) Fixes #39 Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>	2020-06-24 16:34:48 +01:00
Gerhard Lazu	db2f70753e	Add tests for product name & version Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>	2020-06-18 11:51:47 +01:00
Gerhard Lazu	cba6aa06f4	Fix test that was made to fail on purpose Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>	2020-04-25 00:14:05 +01:00

87 Commits