Commit Graph

225 Commits

Author SHA1 Message Date
Michal Kuratczyk c0368a0d24
[skip ci] Update dashboards for RabbitMQ 4.1
Key changes:
- endpoint variable to handle scraping multiple endpoints
- message size panels (new metric in 4.1)
- panels at the top of the Overview dashboard should be more up to date
  (they show the latest value)
- values should be accurate if multiple endpoints are scraped
  (previously, many would be doubled)
- Nodes table shows fewer columns and shows node uptime
2025-04-16 17:48:21 +02:00
Tony Lewis Hiroaki URAHAMA 3c5f4d3d39
Bump Prometheus Version 2025-03-01 18:21:51 +00:00
Anh Nguyen dc9311a561 Update Erlang Distribution dashboard panel and instance filtering
- Modified metric expression and legend format in State of distribution links
- Changed panel type from 'flant-statusmap-panel' to 'status-history' for Process state
2024-11-14 11:04:07 +07:00
Anh Nguyen b9dc0ea3b4 Add instance filtering to Erlang BEAM Grafana dashboard metrics
- Updated metric expressions to include instance filtering with {instance="$node"}
  for the following metrics:
  - erlang_vm_statistics_run_queues_length
  - erlang_vm_statistics_dirty_io_run_queue_length
  - erlang_vm_statistics_dirty_cpu_run_queue_length
- Added 'DS_PROMETHEUS' as a templated data source variable
2024-11-13 20:20:02 +07:00
Luke Bakken 3d668fda46
Grafana: add a runtime/Erlang/BEAM dashboard (#12456)
* Add BEAM dashboard

Also update the other dashboards by opening in Grafana v11.2.2 and ensuring they work as expected.

* Update the Erlang-Distributions-Compare dashboard

* Update the RabbitMQ-Overview dashboard

* Update the RabbitMQ-Quorum-Queues-Raft dashboard

* Update the RabbitMQ-Stream dashboard

* Update distribution link status panel

---------

Co-authored-by: Michal Kuratczyk <mkuratczyk@vmware.com>
2024-10-17 07:10:54 -07:00
Michal Kuratczyk 9b828c08b7
Remove HiPE 2024-08-28 09:18:28 +02:00
Michal Kuratczyk 116ab4f6fe
Remove memory_high_watermark_paging_ratio 2024-08-28 08:12:49 +02:00
Michal Kuratczyk 41a4d1711d
OTP27 support (#11366)
* "maybe" is now a keyword
* Bump horus to 0.2.5 and switch to hex
* Get rid of some deprecated callbacks/functions
2024-06-18 07:32:58 +02:00
Lajos Gerecs 82e25af5d5
Grafana: make sure dashboards do not break when detailed metrics are used (#5945)
* Fix broken dashboards if detailed metrics are used

If detailed metrics are pulled into the same prometheus, then
we get an error in Grafana:

execution: many-to-many matching not allowed:
matching labels must be unique on one side

This is because both endpoints provide `rabbit_identity_info`
which is not unique to the endpoint.

* add detailed metric scraper to prometheus config

---------

Co-authored-by: Michal Kuratczyk <michal.kuratczyk@broadcom.com>
2023-12-27 15:44:05 +01:00
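
The many-to-many error described in 82e25af5d5 comes down to two series sharing the same values for the labels used in a match. The sketch below is a simplified model of that uniqueness condition, not Prometheus code. The helper name, the label names and the label values (including `endpoint`) are all hypothetical, and it does not show how the commit itself resolved the problem.

```python
from collections import Counter
from typing import Dict, List


def duplicated_match_keys(series: List[Dict[str, str]], on: List[str]) -> List[tuple]:
    """Return the label-value tuples that occur more than once for the labels in `on`."""
    counts = Counter(tuple(labels.get(name, "") for name in on) for labels in series)
    return [key for key, count in counts.items() if count > 1]


# Hypothetical label sets: the same node scraped through two endpoints,
# each of which exposes its own copy of the identity series.
identity_info = [
    {"instance": "rmq0:15692", "rabbitmq_node": "rabbit@rmq0", "endpoint": "aggregated"},
    {"instance": "rmq0:15692", "rabbitmq_node": "rabbit@rmq0", "endpoint": "detailed"},
]

# Matching on `instance` alone is ambiguous: two series share the same key.
print(duplicated_match_keys(identity_info, ["instance"]))

# Adding a label that differs per endpoint makes every key unique again.
print(duplicated_match_keys(identity_info, ["instance", "endpoint"]))
```
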
Johan Rhodin 0b2a94c1ec
Update RabbitMQ-Overview.json
Global counters for producers added in https://github.com/rabbitmq/rabbitmq-server/pull/3127 but never made it to this dashboard
2023-11-01 13:23:41 -05:00
Michal Kuratczyk c56f2e2678
Remove the query threshold
The graph looks empty or broken when values are sometimes
above and sometimes below the 5000 limit. I think it's better
to just show everything.
2023-09-07 11:33:35 +02:00
Rin Kuryloski 609171ec70 Rename the tanzu cli scope to vmware
And update other references to commercial editions
2023-02-16 13:49:54 +01:00
Connor Rogers 6ee0a318e8
Move message rate metrics from channel/queue aggregation to global counters 2022-08-08 16:19:01 +01:00
Connor Rogers c88326ef23
Add README.md for creating/updating dashboards 2022-08-05 17:41:47 +01:00
Connor Rogers e35fd65ff3
Fix overview graphs in Grafana 9
'-1' is no longer accepted as of Grafana 9, and causes a console error when rendering
2022-08-05 17:09:34 +01:00
Connor Rogers 9ac6862e06
Fix dist link graph
Both directions of the link were showing as one entry instead of two.

This is because of https://github.com/flant/grafana-statusmap/issues/277
2022-08-05 16:51:21 +01:00
Connor Rogers 42f30ba7c3
Set time series to show all series in tooltip 2022-08-05 16:14:29 +01:00
Connor Rogers 40767cdae4
Take dashboard definitions straight from exported Grafana for simplicity 2022-08-05 16:09:30 +01:00
Connor Rogers 4d28eef0f8
Migrate from deprecated panels in Grafana 2022-08-05 15:46:27 +01:00
Connor Rogers 8e404ecd04
Update to supported Grafana and Prometheus versions 2022-08-05 12:52:26 +01:00
Gerhard Lazu 62d82e1660
Break down metrics by node in all RabbitMQ-Stream pie charts
Otherwise we won't be able to see which nodes are running "hot"

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-08-11 13:39:30 +01:00
David Ansari 4b774db5c1 Use same threshold color for "Errors since boot" 2021-08-02 17:05:17 +02:00
David Ansari c99ee6961e Use same colorMode in all RabbitMQ-Stream panels
Co-authored-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-08-02 13:33:00 +02:00
David Ansari ea18c31288 Make RabbitMQ-Stream dashboard work via ConfigMap
Before this commit, importing the dashboard via ConfigMap as seen in
1eb1dc618e
didn't work because DS_PROMETHEUS variable was undefined in Grafana.

Related to https://github.com/rabbitmq/rabbitmq-server/pull/3250

Co-authored-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-08-02 13:12:48 +02:00
Gerhard Lazu 65afbb931b
Ensure RabbitMQ-Stream dashboard works correctly after import
This breaks the docker-compose integration, but we need to move away
from it anyways, the whole dev flow needs revisiting after our focus on
K8s.

$__rate_interval does not work with irate, dropping it in favour of 60s,
same as all other dashboards.

This is a follow-up to https://github.com/rabbitmq/rabbitmq-server/pull/3250

Thanks @ansd for mentioning the post-import issues.

It was uploaded as https://grafana.com/api/dashboards/14798/revisions/3/download

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-07-30 13:53:02 +01:00
Gerhard Lazu 35a6369327
Restart stream-perf-test on-failure
This handles the scenario where rmq2 is not available, and
stream-perf-test exits with a non-zero exit code. Good spot @ansd!

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-07-30 11:25:36 +01:00
David Ansari 47d572908d Convert string to integer for ulimits.nofile
Before this commit:

> make overview metrics
services.rmq1.ulimits.nofile.hard must be a integer
make: *** [Makefile:68: overview] Error 15

According to the docs
https://docs.docker.com/compose/compose-file/compose-file-v3/#ulimits
this must be an integer.
2021-07-30 09:46:38 +02:00
Gerhard Lazu 6f5c4118ea
Publish RabbitMQ-Stream dashboard to grafana.com
Removed the Dockerfile and slimmed down the Makefile, all of this is now
handled by https://github.com/rabbitmq/rabbitmq-server/blob/master/.github/workflows/oci.yaml
cc @Zerpet @pjk25

More details here (including the steps used to publish to grafana.com):
https://github.com/rabbitmq/release-engineering/issues/11#issuecomment-887627938

I don't want to hold up this PR, will invest in automating the
steps described in the previous link another time. Time to 🚀

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-07-29 19:34:05 +01:00
Gerhard Lazu 1e5708b0c5
Fix Grafana dashboards when importing from URL
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-03-22 19:27:13 +00:00
Gerhard Lazu c18ad7a5b6
Fix colors for node names that include digits in Grafana dashboards
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-03-08 13:19:14 +00:00
Gerhard Lazu 6adb2449b4
Add inet_tcp_metrics Grafana dashboard & cluster example
It uses the commercial edition of RabbitMQ, requires a valid Tanzu
Network account.  Learn more: https://rabbitmq.com/tanzu

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-02-05 12:50:32 +00:00
Gerhard Lazu 0ce95075ef
Bump all Grafana dashboards dep versions to latest
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-01-18 18:59:11 +00:00
David Ansari 377a933f4c
Filter Grafana dashboards by namespace (#2719)
So that clusters with the same rabbitmq_cluster name in different K8s namespaces don't clash

Namespace filter comes first, because the order of the layers is namespace -> cluster -> node

Tested with the latest 3.9.0 dev build

We had to account for plugin changes from .ez to directories & the management.load_definitions deprecation which would prevent a node from booting (fixed in 07a0dd7438). This commit didn't make it through the 3.9.x pipeline yet, so there is no 3.9.0 dev build with this fix yet. The simplest fix is to drop `management.` from the load_definitions config.

The next manual step is to generate all dashboards using e.g. `make RabbitMQ-Overview.json > ~/Downloads/RabbitMQ-Overview.json` and upload them to https://grafana.com/orgs/rabbitmq

Great contribution @ansd, thank you 👏🏻
2021-01-18 18:45:05 +00:00
Gerhard Lazu 4e31a176c9 Upgrade RabbitMQ Overview dashboard to Grafana 7
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-11-13 12:55:05 +00:00
Gerhard Lazu 530de03e38 Merge pull request #61 from rabbitmq/grafana-publisher-fix
Prevent non-zero publisher count in Grafana when aggregating metrics
2020-11-13 12:45:33 +00:00
Gerhard Lazu 3f6f54eb02 Bump Grafana, Prometheus & Node Exporter to latest
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-11-13 12:15:06 +00:00
Connor Rogers 5b9f77a5f2 Prevent non-zero publisher count when aggregating metrics
In the case where there are 0 channels (and as such 0 publishers), the
dashboard reports there are actually `n` publishers in an `n`-node
cluster. This changes the calculation of publishers to be number of
channels (which is always known) minus the number of consumers (which is
always known).
2020-11-12 15:26:39 +00:00
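
The calculation described in 5b9f77a5f2 (publishers as channels minus consumers) can be sketched as plain arithmetic. This is only an illustration under assumed inputs: the function name and the per-node tuples are hypothetical, and the dashboard's actual query is not reproduced here.

```python
from typing import Iterable, Tuple


def publisher_count(per_node: Iterable[Tuple[int, int]]) -> int:
    """Derive publishers from (channels, consumers) pairs, one pair per node."""
    total_channels = 0
    total_consumers = 0
    for channels, consumers in per_node:
        total_channels += channels
        total_consumers += consumers
    # Both totals are always known, so the difference is well defined
    # even when every node reports zero channels.
    return total_channels - total_consumers


# A 3-node cluster with no channels yields 0 publishers, not 3.
print(publisher_count([(0, 0), (0, 0), (0, 0)]))
```
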
Gerhard Lazu 8f7953438e Fix Erlang cookie when running with Docker Compose on Windows
Context:
9452cf179b (commitcomment-40660523)

Thanks @wainwrightmark!

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-07-17 10:28:14 +01:00
Gerhard Lazu 9452cf179b Mount .erlang.cookie file
Context: we want to move away from environment variables and use either
config files or env files (such as the rabbitmq-env.conf).

Since .erlang.cookie is neither, the official RabbitMQ Docker image
handles this by writing the value from the RABBITMQ_ERLANG_COOKIE env
var into the file if it does not exist. The problem is that if this file
exists, and the value is different from the RABBITMQ_ERLANG_COOKIE env
var, CLI tools will not be able to communicate with the rabbit node, as
described here: https://github.com/rabbitmq/rabbitmq-cli/issues/443

The only gotcha is that this file must be owned by the user, and
privileges should not be too open (git should have captured this). If
not, RabbitMQ will fail to boot. This is somewhat similar to how OpenSSH
reacts when private key permissions are too open.

re https://github.com/docker-library/rabbitmq/pull/422#issuecomment-650074731

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-07-01 17:07:16 +01:00
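
The ownership and permission requirement described in 9452cf179b can be checked before mounting the file. The sketch below is a minimal check for POSIX systems. It assumes that "too open" means any group or other permission bit is set, which is an interpretation and not something the commit spells out; the function name is hypothetical.

```python
import os
import stat


def cookie_permissions_ok(path: str) -> bool:
    """Return True if `path` is owned by the current user and grants no group/other access."""
    info = os.stat(path)
    owned_by_user = info.st_uid == os.getuid()  # POSIX only
    # Assumption: any group or other permission bit counts as "too open".
    no_group_or_other = (info.st_mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    return owned_by_user and no_group_or_other
```

A file with mode 600 owned by the current user passes this check, while the same file with mode 644 does not.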
Gerhard Lazu b28f6e64ba Fix metric name & description in zdbbl graph, Erlang Distribution
open http://localhost:3000/dashboards # select Erlang-Distribution
    e > Metrics; General > Description # when on Data buffered in the distribution links queue
    Save Dashboard > Export > +Export for sharing externally > Save to file
    pwd
    /Users/gerhard/github.com/rabbitmq/3.9.x/deps/rabbitmq_prometheus
    vimdiff docker/grafana/dashboards/Erlang-Distribution.json ~/Downloads/Erlang-Distribution*.json

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-06-24 16:36:04 +01:00
Gerhard Lazu a6f6244c85 Build Docker image from latest 3.9 dev release + this PR
Update OTP to latest stable, 23.0.2

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-06-18 11:52:16 +01:00
Gerhard Lazu 850a30653d Use Grafana 6.7.2 schema defaults in Erlang-Distribution
This will make future diffs smaller

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
(cherry picked from commit e1a08d6ae752181177cbcc411219a8dd780359d2)
2020-04-27 15:22:01 +01:00
Gerhard Lazu eca19f7dd9 Bump versions across a number of deps
- RabbitMQ latest 3.9 dev build
- OpenSSL - https://github.com/docker-library/rabbitmq/pull/403
- OTP, PerfTest, Prometheus & Grafana latest GA

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
(cherry picked from commit b25b4e897337d97edbf6a826b0f12d20ea7cf914)
2020-04-22 18:12:21 +01:00
aakcht 729dd14f9b Color labelling grafana fix 2020-04-14 18:20:04 +04:00
Gerhard Lazu 1222018e50 Make Erlang-Memory-Allocators dashboard look better on light
Use the latest Grafana schema improvements to simplify the dashboard
definition.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-02-24 13:21:46 +00:00
Gerhard Lazu 8d40bf85a2 Bump Prometheus & Grafana to latest stable
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-02-24 12:45:13 +00:00
Gerhard Lazu 9c112b5718 Sum resident set size on Erlang-Memory-Allocators
Otherwise the singlestat panel will return a 'Only queries that return
single series/table is supported' error if the node changed some
properties, like the instance label because the IP changed.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-02-24 12:09:47 +00:00
Gerhard Lazu d64361658a Bump to latest unverified generic-unix 3.9 dev build
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-02-12 19:00:00 +00:00
Gerhard Lazu 92ef32d022 Build image with latest RabbitMQ 3.9.0 dev + local
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-02-11 13:29:29 +00:00
Gerhard Lazu 19683fc2c9 Clean up NODES table on RabbitMQ-Overview
Hiding "all other" values stopped working since Grafana v6.6.1, need to
be explicit about which values should be hidden. Picked up a few other
changes from Grafana after Save JSON to file.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-02-11 13:26:20 +00:00