35 Commits

Author SHA1 Message Date
Gerhard Lazu 1e5708b0c5
Fix Grafana dashboards when importing from URL
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-03-22 19:27:13 +00:00
Gerhard Lazu c18ad7a5b6
Fix colors for node names that include digits in Grafana dashboards
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-03-08 13:19:14 +00:00
Gerhard Lazu 0ce95075ef
Bump all Grafana dashboards dep versions to latest
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-01-18 18:59:11 +00:00
David Ansari 377a933f4c
Filter Grafana dashboards by namespace (#2719)
So that clusters with the same rabbitmq_cluster name in different K8s namespaces don't clash

Namespace filter comes first, because the order of the layers is namespace -> cluster -> node

Tested with the latest 3.9.0 dev build

We had to account for plugin changes from .ez to directories & the management.load_definitions deprecation which would prevent a node from booting (fixed in 07a0dd7438). This commit didn't make it through the 3.9.x pipeline yet, so there is no 3.9.0 dev build with this fix yet. The simplest fix is to drop `management.` from the load_definitions config.

The next manual step is to generate all dashboards using e.g. `make RabbitMQ-Overview.json > ~/Downloads/RabbitMQ-Overview.json` and upload them to https://grafana.com/orgs/rabbitmq

Great contribution @ansd, thank you 👏🏻
2021-01-18 18:45:05 +00:00
Gerhard Lazu b28f6e64ba Fix metric name & description in zdbbl graph, Erlang Distribution
open http://localhost:3000/dashboards # select Erlang-Distribution
    e > Metrics; General > Description # when on Data buffered in the distribution links queue
    Save Dashboard > Export > +Export for sharing externally > Save to file
    pwd
    /Users/gerhard/github.com/rabbitmq/3.9.x/deps/rabbitmq_prometheus
    vimdiff docker/grafana/dashboards/Erlang-Distribution.json ~/Downloads/Erlang-Distribution*.json

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2020-06-24 16:36:04 +01:00
Gerhard Lazu 850a30653d Use Grafana 6.7.2 schema defaults in Erlang-Distribution
This will make future diffs smaller

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
(cherry picked from commit e1a08d6ae752181177cbcc411219a8dd780359d2)
2020-04-27 15:22:01 +01:00
aakcht 729dd14f9b Color labelling grafana fix 2020-04-14 18:20:04 +04:00
Gerhard Lazu 400ebdf9f8 Publish Erlang-Distribution Grafana dashboard to grafana.com
https://grafana.com/grafana/dashboards/11352

[finishes #166355345]
2019-12-04 21:12:16 +00:00
Gerhard Lazu a356e2f630 Set all datasources to null, simplify dashboard tags
When exporting dashboards, all datasources are set to a dynamic
datasource, otherwise use the default local one (prometheus).
2019-10-14 09:18:43 +01:00
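The datasource reset described in a356e2f630 could be sketched roughly as follows. This is an illustration only, not the tooling used in the repository: it assumes a dashboard JSON whose panels sit in a top-level `panels` list, and the helper name `null_datasources` is hypothetical. A `null` datasource makes Grafana fall back to the default one.

```python
import copy
import json


def null_datasources(dashboard: dict) -> dict:
    """Return a copy of the dashboard with every top-level panel's datasource set to None.

    Assumption: panels live in a top-level "panels" list. Nested panels
    (e.g. inside collapsed rows) and template variables are not handled here.
    """
    result = copy.deepcopy(dashboard)
    for panel in result.get("panels", []):
        # None serialises to JSON null, which Grafana treats as "use the default datasource".
        panel["datasource"] = None
    return result


if __name__ == "__main__":
    # Hypothetical two-panel dashboard, only for demonstration.
    example = {
        "title": "Example",
        "panels": [
            {"title": "a", "datasource": "prometheus"},
            {"title": "b", "datasource": "${DS_PROMETHEUS}"},
        ],
    }
    print(json.dumps(null_datasources(example), indent=2))
```

The function works on a copy, so the original dictionary can still be diffed against the result.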
Gerhard Lazu e5fc8b18c8 Fix Erlang-Distributions-Compare title, reset time_options 2019-10-04 22:08:27 +01:00
Gerhard Lazu 71692f2dbf Remove shared __requires & update-dashboards make target
__requires differs across dashboards

update-dashboards is not as useful anymore, vimdiffing most of the time.
2019-10-04 21:54:14 +01:00
Gerhard Lazu 19cbbbf755 Update tags for all Grafana dashboards 2019-10-03 17:39:19 +01:00
Gerhard Lazu 402aa4722f Extract __requires from Grafana dashboards, template all datasources 2019-10-03 17:32:40 +01:00
Gerhard Lazu dae49b5c08 Extract __inputs from Grafana dashboards
While __inputs are required for the dashboards to work in environments
where Prometheus is not the default datasource, it breaks the local
development flow. In other words, 9aa22e1895 prevents
`make metrics overview` from working as designed.

We are going to add shortly a simple way of converting the local
dashboards into a format that can be imported in Grafana and will work
when Prometheus is not the default datasource (e.g. when using
https://github.com/coreos/kube-prometheus)

Long-term, these dashboards will be available via grafana.com, which is
the preferred way of consuming them.

cc @mkuratczyk
2019-10-02 12:51:33 +01:00
Michal Kuratczyk 9aa22e1895 Make the datasource configurable for all dashboards 2019-09-24 15:40:18 +02:00
Gerhard Lazu b3336da844 Finish updating Erlang-Distribution dashboard to use new info metric
[#167846096]
2019-09-03 10:48:17 +01:00
Gerhard Lazu 6639f5f68f Start updating Erlang-Distribution dashboard to use new info metric
[#167846096]
2019-09-02 22:40:24 +01:00
Gerhard Lazu d5c83792bc Increase Prometheus scrape to 15s & match across all metrics
We want to use a consistent range for all metrics that use rate() and a
safe value (4x the Prometheus scrape interval):
https://www.robustperception.io/what-range-should-i-use-with-rate

This also prompted a change in RabbitMQ's default
collect_statistics_interval, so that we don't update metrics
unnecessarily. We are OK if the Management UI doesn't update on every 5s
auto-refresh.

Related a929f22233

[#167846096]
2019-08-13 17:20:49 +01:00
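The arithmetic behind d5c83792bc can be sketched as follows: with a 15s scrape interval and the 4x rule of thumb from the commit message, every `rate()` query uses the same 60s range. The helper names and the metric name `example_messages_total` are hypothetical, used only for illustration.

```python
def rate_range_seconds(scrape_interval_s: int, factor: int = 4) -> int:
    """Range for rate(): a multiple of the scrape interval (4x per the commit message)."""
    return scrape_interval_s * factor


def rate_expr(metric: str, scrape_interval_s: int) -> str:
    """Build a PromQL rate() expression that uses the shared range."""
    return f"rate({metric}[{rate_range_seconds(scrape_interval_s)}s])"


if __name__ == "__main__":
    # example_messages_total is a made-up counter name.
    print(rate_expr("example_messages_total", 15))
```

Deriving the range from the scrape interval in one place keeps all panels consistent if the interval changes again.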
Gerhard Lazu 57b6092348 Remove duplicate filter
Thanks @mkuratczyk for spotting it!
2019-08-05 18:29:16 +01:00
Gerhard Lazu c92e551007 Improve Erlang Dist & Overview dashboards based on recent learnings
Learned a couple of new things while building RabbitMQ-Raft, applied
them here.
2019-06-24 18:29:20 +01:00
Gerhard Lazu 4b78d41055 Improve node naming, standardise the colour pinning regex 2019-06-17 22:20:54 +01:00
Gerhard Lazu 6daccf9b88 Improve node colour pinning
* start from 0, not 1
* fix colour pinning for nodes with numbers - e.g. rmq-gcp-38
2019-06-17 19:04:28 +01:00
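One way to picture the problem fixed in 6daccf9b88 is a sketch like the one below. It is not the regex used in the dashboards. It assumes node names end in a numeric index that starts at 0, and it uses a made-up palette. Anchoring the pattern to the end of the name means digits earlier in the name (such as the 38 in rmq-gcp-38-0) do not influence the colour.

```python
import re

# Hypothetical palette; the real dashboards define their own colours.
PALETTE = ["green", "yellow", "blue", "orange", "red", "purple"]

# Match only the digits at the very end of the node name.
TRAILING_INDEX = re.compile(r"(\d+)$")


def colour_for(node: str) -> str:
    """Pin a node to a colour based on its trailing index (assumed to start at 0)."""
    match = TRAILING_INDEX.search(node)
    if match is None:
        # No trailing index: fall back to the first colour.
        return PALETTE[0]
    return PALETTE[int(match.group(1)) % len(PALETTE)]


if __name__ == "__main__":
    for name in ["rabbit@rmq-gcp-38-0", "rabbit@rmq-gcp-38-1", "rabbit@rmq-gcp-38-2"]:
        print(name, colour_for(name))
```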
Gerhard Lazu d5b1a03648 Increase erlang_vm_dist_node_queue_size threshold to 64MB & expand info
[#166037004]
2019-06-17 17:20:04 +01:00
Gerhard Lazu 8a60eef9a3 Fix erlang_vm_dist_node_queue_size graph
It's not a rate, it's the actual buffered data

[#166037004]
2019-06-17 17:01:27 +01:00
Gerhard Lazu cf339a49e8 Respond to learnings from a LRE PromStack & Erlang Distribution metrics
re deadtrickster/prometheus.erl#94
re erlang/otp#2270

[#166574772]
2019-06-11 19:03:50 +01:00
Gerhard Lazu 773b8f8670 Show legends on process states
It's hard to understand what the different colours mean otherwise. Also,
yellow is preferable to purple when it comes to displaying runnable
processes - those stuck in the run queue.

cc @michaelklishin
2019-06-03 22:00:43 +01:00
Gerhard Lazu 0945511e7f Capture learnings from ERL-959 into Erlang Distribution Grafana dashboard
It explains the correlation between inet packets & TCP packets, and why
the inet packet size varies when TLS is used for inter-node
communication.

[finishes 166419953]
2019-06-03 18:04:07 +01:00
Gerhard Lazu 112254ed96 Enable filtering Erlang Distribution metrics in Grafana by cluster
[finishes #165818813]
2019-05-30 13:58:56 +01:00
Gerhard Lazu 7b632674c9 Update Distribution & Overview dashboard tags 2019-05-30 09:37:24 +01:00
Gerhard Lazu 2645082738 Finish Erlang Distribution Grafana dashboard
Includes Erlang node to colour pinning

Adds a few make targets to help with docker-compose repetitive commands
& Grafana dashboard updates.

Split Overview & Distribution Docker deployments

re deadtrickster/prometheus.erl#92

[finishes #166004512]
2019-05-29 18:19:09 +01:00
Gerhard Lazu 4e81af4cfc Pin RabbitMQ nodes to colours in all Grafana panels
Regex is greedy, need to look into non-greedy matching, especially for
Erlang Distribution metrics.

[#166004512]
2019-05-20 22:19:43 +01:00
Gerhard Lazu d1460d5b44 Stress Erlang Distribution metrics on OTP 21
We (+@essen) have answered a bunch of questions (see the story) and
improved the metrics + dashboard in the process. Added some improvements
to the RabbitMQ Overview metrics as well.

[#166004104]
2019-05-20 21:41:27 +01:00
Gerhard Lazu 1f333ebed6 Display the number of Erlang Distribution links
[#166004104]
2019-05-20 10:12:22 +01:00
Gerhard Lazu ebde2ff663 Default Erlang Distribution Grafana dashboard to 10 minutes
It's the same as RabbitMQ Overview
2019-05-15 17:05:07 +01:00
Gerhard Lazu 7652799e05 Add Grafana dashboard for Erlang Distribution
Just the first version, imperfect in many ways, but better than nothing.

[#166004512]
2019-05-14 16:17:04 +01:00