Before this commit, importing the dashboard via ConfigMap as seen in
1eb1dc618e
didn't work because DS_PROMETHEUS variable was undefined in Grafana.
Related to https://github.com/rabbitmq/rabbitmq-server/pull/3250
Co-authored-by: Gerhard Lazu <gerhard@lazu.co.uk>
This breaks the docker-compose integration, but we need to move away
from it anyways, the whole dev flow needs revisiting after our focus on
K8s.
$__rate_interval does not work with irate, dropping it in favour of 60s,
same as all other dashboards.
This is a follow-up to https://github.com/rabbitmq/rabbitmq-server/pull/3250
Thanks @ansd for mentioning about the post-import issues.
It was uploaded as https://grafana.com/api/dashboards/14798/revisions/3/download
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
It uses the commercial edition of RabbitMQ, requires a valid Tanzu
Network account. Learn more: https://rabbitmq.com/tanzu
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
So that clusters with the same rabbitmq_cluster name in different K8s namespaces don't clash
Namespace filter comes first, because the order of the layers is namespace -> cluster -> node
Tested with the latest 3.9.0 dev build
We had to account for plugin changes from .ez to directories & the management.load_definitions deprecation which would prevent a node from booting (fixed in 07a0dd7438). This commit didn't make it through the 3.9.x pipeline yet, so there is no 3.9.0 dev build with this fix yet. The simplest fix is to drop `management.` from the load_definitions config.
The next manual step is to generate all dashboards using e.g. `make RabbitMQ-Overview.json > ~/Downloads/RabbitMQ-Overview.json` and upload them to https://grafana.com/orgs/rabbitmq
Great contribution @ansd, thank you 👏🏻
In the case where there are 0 channels (and as such 0 publishers), the
dashboard reports there are actually `n` publishers in an `n`-node
cluster. This changes the calculation of publishers to be number of
channels (which is always known) minus the number of consumers (which is
always known).
open http://localhost:3000/dashboards # select Erlang-Distribution
e > Metrics; General > Description # when on Data buffered in the distribution links queue
Save Dashboard > Export > +Export for sharing externally > Save to file
pwd
/Users/gerhard/github.com/rabbitmq/3.9.x/deps/rabbitmq_prometheus
vimdiff docker/grafana/dashboards/Erlang-Distribution.json ~/Downloads/Erlang-Distribution*.json
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
This will make future diffs smaller
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
(cherry picked from commit e1a08d6ae752181177cbcc411219a8dd780359d2)
Otherwise the singlestat panel will return a 'Only queries that return
single series/table is supported' error if the node changed some
properties, like the instance label because the IP changed.
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Hiding "all other" values stopped working since Grafana v6.6.1, need to
be explicit about which values should be hidden. Picked up a few other
changes from Grafana after Save JSON to file.
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
This is a follow-up to https://github.com/rabbitmq/ra/pull/160
Had to introduce mf_convert/3 so that METRICS_REQUIRING_CONVERSIONS
proplist does not clash with METRICS_RAW proplists that have the same
number of elements. This is begging to be refactored, but I know that
@dcorbacho is working on https://github.com/rabbitmq/rabbitmq-prometheus/issues/26
Also modified the RabbitMQ-Quorum-Queues-Raft dashboard
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Grafan will keep failing with the following error message otherwise:
failed to load dashboard from /dashboards/__inputs.json Dashboard title cannot be empty
It captures the Quorum-Queues Raft, so let's be specific, especially
since we know that there will be other Raft implementations in RabbitMQ,
not just Quorum Queues.
[#166926415]
It is essential to know which RabbitMQ & Erlang/OTP version the cluster
is running, as well as how many nodes there are in the cluster. We now
have a table which lists this information, right under all singlestat
panels.
The singlestat panels have been re-organized to make room for 2 new
ones: Nodes & Publishers. Classic & Quorum Queues would be great to
have, as would VHosts. The last singlestats that I would add are Alarms
& Partitions. This would bring the total number of singlestat panels to
14 (we currently have 10). While 14 feel overwhelming, it captures all
the important information that I believe is worth knowing about any
RabbitMQ cluster.
All message-related sections now display 2 graph panels instead of 3.
While 3 panels look good on 27" screens, they don't work as well on 15"
screens, which is what the majority will be using. Also the 3rd panel
would always be for anti-pattern graphs (e.g. unroutable messages,
polling operations, etc.) and would be mostly empty in the majority of
cases. Fitting fewer panels per row not only helps focusing and
understanding what is being displayed, but it also makes it easier to
compare when viewing 2 panels side-by-side, on 27" screens. Nodes &
churn sections still have 3 panels, which works well when 1 panel is
more important than the others. The compromise that we need to make is
between giving enough horizontal space to equally important panels vs
making the dashboard page too long. RabbitMQ-Overview has always been a
comprehensive dashboard which captures a lot of imformation, it was
always tough balancing the important vs the complete.
[finishes #167836027]
9.313226 GiB is a lot harder to read than 9.31 GiB, and therefore less
useful. Observing other people use this made it obvious that limiting
the precision was the human-friendly thing to do.
* explains source of metrics via row names
* makes tables slightly wider to mitigate long names line wrapping
* do not limit entries in tables, refresh resets table pagination
[finishes #168734621]
The yardstick for all Grafana dashboards should be 1920 x 1200, the
screen format most common in our team. If the dashboards look good on
our screens, they will look good on other screens too. Smalle
resolutions won't look too crammed, and bigger resolutions can be split
in half (e.g. 27" iMacs).
Some take-aways from optimising the layout of this dashboard:
* limit horizontal graph panels to 3
* limit horizontal panels to 2 if the information is dense (e.g. table + graph)
* use the same width for graph panels that need comparing, stack vertically