Since metrics are now aggregated by default, it makes more sense to
invert the meaning of disabling aggregation and name the option after a
positive, explicit action: return_per_object_metrics.
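As a sketch (assuming the option is wired into the standard rabbitmq.conf
schema under the plugin's prometheus.* namespace), opting back into
per-object metrics would look like this:
# hypothetical rabbitmq.conf snippet: opt back into per-object metrics
prometheus.return_per_object_metrics = true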
Naming pair: @michaelklishin
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Having talked to @michaelklishin, we've decided to enable metrics
aggregation by default so that RabbitMQ nodes with many objects serve
the same, fixed amount of metrics quickly rather than taking many
seconds and transferring many MBs of data on every scrape.
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
We want to keep the same metric type regardless of whether we aggregate or
not. If we had used a histogram type, the ~12 buckets that we added
would have meant 12 extra metrics per queue, resulting in an explosion
of metrics. We are therefore keeping the gauge type and aggregating
latencies across all members.
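For illustration only (the metric name, buckets and values below are made
up for this example), a histogram emits one series per bucket per queue,
whereas the aggregated gauge stays a single series:
# a histogram would emit ~12 series like these for every single queue:
rabbitmq_raft_commit_latency_seconds_bucket{queue="q1",le="0.001"} 3
rabbitmq_raft_commit_latency_seconds_bucket{queue="q1",le="0.005"} 17
# ...10 more buckets for q1, and the same again for every other queue
# versus a single gauge aggregated across all members:
rabbitmq_raft_commit_latency_seconds 0.004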
re https://github.com/rabbitmq/rabbitmq-prometheus/pull/28
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
This is a follow-up to https://github.com/rabbitmq/ra/pull/160
Had to introduce mf_convert/3 so that METRICS_REQUIRING_CONVERSIONS
proplist does not clash with METRICS_RAW proplists that have the same
number of elements. This is begging to be refactored, but I know that
@dcorbacho is working on https://github.com/rabbitmq/rabbitmq-prometheus/issues/26
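A minimal sketch of the idea, using hypothetical shapes rather than the
plugin's actual mf_convert/3: entries that need a unit conversion are
tagged explicitly, so they can never be confused with raw entries that
happen to have the same tuple size.
-module(mf_convert_sketch).
-export([convert/1]).

%% Hypothetical illustration only, not the plugin's real code: raw entries
%% pass through untouched, converted entries carry their own conversion fun.
convert({raw, Name, Value}) ->
    {Name, Value};
convert({convert, Name, Value, Fun}) ->
    {Name, Fun(Value)}.
For example, mf_convert_sketch:convert({convert, latency_seconds, 4000,
fun(Us) -> Us / 1000000 end}) returns {latency_seconds, 0.004}.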
Also modified the RabbitMQ-Quorum-Queues-Raft dashboard
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Some properties had queue_ appended, while others used messages_ instead
of message_. This meant that metrics such as rabbitmq_queue_consumers
were not reported correctly, as captured in https://github.com/rabbitmq/rabbitmq-prometheus/issues/9#issuecomment-558233464
The test needs fixing before this can be merged; it's currently failing with:
$ make ct-rabbit_prometheus_http t=with_metrics:metrics_test
== rabbit_prometheus_http_SUITE ==
* [with_metrics]
rabbit_prometheus_http_SUITE > with_metrics
{error,
  {shutdown,
    {gen_server,call,
      [<0.245.0>,
       {call,
         {'basic.cancel',<<"amq.ctag-uHUunE5EoozMKYG8Bf6s1Q">>,
          false},
         none,<0.252.0>},
       infinity]}}}
Closes #19
Some metrics were of type gauge when they should have been of type
counter. Thanks @brian-brazil for making the distinction clear. This is
now captured as a comment above the metric definitions.
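For illustration (the metric names and values below are made up for this
example), counters only ever increase and conventionally carry a _total
suffix, while gauges report a current value that can go up or down:
# TYPE rabbitmq_connections_opened_total counter
rabbitmq_connections_opened_total 42
# TYPE rabbitmq_connections gauge
rabbitmq_connections 7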
Because all metrics are from RabbitMQ's perspective, cached for up to 5
seconds by default (configurable), we prepend `rabbitmq_` to all metrics
emitted by this collector. While some metrics are for Erlang (erlang_),
Mnesia (schema_db_) or the system (io_), they are all observed & cached
by RabbitMQ, hence the prefix.
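For example (illustrative names and values), subsystem metrics keep their
subsystem prefix but still gain the rabbitmq_ prefix in front:
rabbitmq_io_read_ops_total 0
rabbitmq_schema_db_ram_tables 4
rabbitmq_erlang_processes_used 430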
This is the last PR which started in the context of prometheus/docs#1414
[#167846096]
As described in
https://prometheus.io/docs/instrumenting/writing_clientlibs/#process-metrics.
Until prometheus.erl has the prometheus_process_collector functionality
built in (this may never happen), we are exposing a subset of those
metrics via rabbitmq_core_metrics_collector, so we are going to stick to
the expected naming conventions.
This commit supersedes the thought process captured in
1e5f4de4cb
[#167846096]
While `process_open_fds` would have been the ideal name, the value is
cached within RabbitMQ and computed differently across platforms, so it
is important to keep it distinct from, say, what the kernel reports
just-in-time.
I am also capturing the Erlang context by adding `erlang_` to the
relevant metrics. The full context is: RabbitMQ observed this Erlang VM
process metric to be X, which is why some metrics are prefixed with
`rabbitmq_erlang_process_`.
There is a difference between what the RabbitMQ limits are set to,
e.g. `rabbitmq_memory_used_limit_bytes`, and what RabbitMQ reports about
the Erlang process, e.g. `rabbitmq_erlang_process_memory_used_bytes`.
This is the best that we can do while staying honest about what is being
reported. cc @brian-brazil
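Side by side, with illustrative values only, the two metrics named above
would be scraped as:
rabbitmq_memory_used_limit_bytes 419430400
rabbitmq_erlang_process_memory_used_bytes 104857600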
[#167846096]
This started in the context of prometheus/docs#1414, specifically
https://github.com/prometheus/docs/pull/1414#issuecomment-520505757
Rather than labelling all metrics with the same label, we are
introducing 2 new metrics: rabbitmq_build_info & rabbitmq_identity_info.
I suspect that we may want to revert deadtrickster/prometheus.erl#91
when we agree that the proposed alternative is better.
We have yet to follow through with changes to the Grafana dashboards. I
am most interested in what the updated queries will look like and, more
importantly, whether we will have the same panels as we do now. More
commits to follow shortly; I wanted to get this out the door first.
In summary, this commit changes output like this:
# TYPE erlang_mnesia_held_locks gauge
# HELP erlang_mnesia_held_locks Number of held locks.
erlang_mnesia_held_locks{node="rabbit@920f1e3272af",cluster="rabbit@920f1e3272af",rabbitmq_version="3.8.0-alpha.806",erlang_version="22.0.7"} 0
# TYPE erlang_mnesia_lock_queue gauge
# HELP erlang_mnesia_lock_queue Number of transactions waiting for a lock.
erlang_mnesia_lock_queue{node="rabbit@920f1e3272af",cluster="rabbit@920f1e3272af",rabbitmq_version="3.8.0-alpha.806",erlang_version="22.0.7"} 0
...
To this:
# TYPE erlang_mnesia_held_locks gauge
# HELP erlang_mnesia_held_locks Number of held locks.
erlang_mnesia_held_locks 0
# TYPE erlang_mnesia_lock_queue gauge
# HELP erlang_mnesia_lock_queue Number of transactions waiting for a lock.
erlang_mnesia_lock_queue 0
...
# TYPE rabbitmq_build_info untyped
# HELP rabbitmq_build_info RabbitMQ & Erlang/OTP version info
rabbitmq_build_info{rabbitmq_version="3.8.0-alpha.809",prometheus_plugin_version="3.8.0-alpha.809-2019.08.15",prometheus_client_version="4.4.0",erlang_version="22.0.7"} 1
# TYPE rabbitmq_identity_info untyped
# HELP rabbitmq_identity_info Node & cluster identity info
rabbitmq_identity_info{node="rabbit@bc7aeb0c2564",cluster="rabbit@bc7aeb0c2564"} 1
...
[#167846096]
Includes Erlang node-to-colour pinning
Adds a few make targets to help with repetitive docker-compose commands
& Grafana dashboard updates.
Split Overview & Distribution Docker deployments
re deadtrickster/prometheus.erl#92
[finishes #166004512]
This includes the global_labels feature introduced in deadtrickster/prometheus.erl#91
To test, run `docker-compose up` in the docker dir, then navigate to
localhost:15692/metrics & localhost:3000/dashboards (admin:admin) to see
the Grafana RabbitMQ Overview dashboard.
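A quick sketch of those steps (assuming the docker-compose files live in
the repository's docker/ directory, as described above):
$ cd docker
$ docker-compose up
# metrics:  http://localhost:15692/metrics
# Grafana:  http://localhost:3000/dashboards  (admin:admin)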
Bumping all prometheus-related deps to the latest stable versions.
Defining them in rabbitmq-components.mk, so that they can be promoted to
all deps in the umbrella.
rabbitmq_management_agent is required for alarm-related metrics to be
available.
Added the node label to most `rabbitmq_` metrics. I need help adding it
to mfa_totals; the metrics_node_label_test test currently fails. The new
unit tests ensure that label/0 behaves as expected in all cases, which
made refactoring easy. Run unit tests via:
gmake eunit EUNIT_MODS=prometheus_rabbitmq_core_metrics_collector
Updating to the latest erlang.mk makes running eunit tests much faster:
2s vs 10s. To do this, comment out the `ERLANG_MK_*` lines in the
Makefile and run `gmake erlang-mk`.