170 Commits

Author SHA1 Message Date
Pepe Cano 7ed46fd321
docs(alerting): `alertingSaveStateCompressed` is enabled by default (#111897) 2025-10-03 09:02:58 +02:00
Seunghun Shin 512c292e04
Alerting: Add jitter support for periodic alert state storage to reduce database load spikes (#111357)
What is this feature?

This PR implements a jitter mechanism for periodic alert state storage to distribute database load over time instead of processing all alert instances simultaneously. When enabled via the state_periodic_save_jitter_enabled configuration option, the system spreads batch write operations across 85% of the save interval window, preventing database load spikes in high-cardinality alerting environments.

Why do we need this feature?

In production environments with high alert cardinality, the current periodic batch storage can cause database performance issues by processing all alert instances simultaneously at fixed intervals. Even when using periodic batch storage to improve performance, concentrating all database operations at a single point in time can overwhelm database resources, especially in resource-constrained environments.

Rather than performing all INSERT operations at once during the periodic save, distributing these operations across the time window until the next save cycle can maintain more stable service operation within limited database resources. This approach prevents resource saturation by spreading the database load over the available time interval, allowing the system to operate more gracefully within existing resource constraints.

For example, with 200,000 alert instances using a 5-minute interval and 4,000 batch size, instead of executing 50 batch operations simultaneously, the jitter mechanism distributes these operations across approximately 4.25 minutes (85% of 5 minutes), with each batch executed roughly every 5.2 seconds.
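The arithmetic in the example above can be checked with a short sketch. It is not the PR's implementation: the function name `jitter_schedule` is hypothetical, and it assumes batches are spaced evenly with the first at offset 0 and the last at the end of the 85% window, which is the spacing that reproduces the roughly 5.2-second gap quoted above.

```python
import math


def jitter_schedule(total_instances, batch_size, interval_seconds, window_fraction=0.85):
    """Return the start offset (in seconds) of each batch within one save interval.

    Assumption (not confirmed by the PR text): batches are evenly spaced, with the
    first batch at offset 0 and the last batch at the end of the jitter window.
    """
    if total_instances <= 0:
        return []
    batches = math.ceil(total_instances / batch_size)
    window = interval_seconds * window_fraction
    if batches == 1:
        return [0.0]
    # N batches leave N - 1 gaps between the first and the last batch.
    return [window * i / (batches - 1) for i in range(batches)]


# Numbers from the commit message: 200,000 instances, 4,000 per batch, 5-minute interval.
offsets = jitter_schedule(200_000, 4_000, 5 * 60)
print(len(offsets))             # number of batches
print(offsets[1] - offsets[0])  # gap between consecutive batches, in seconds
print(offsets[-1])              # offset of the last batch, in seconds
```

Under these assumptions the 50 batches are spread over 255 seconds (4.25 minutes) with 49 gaps of about 5.2 seconds each, instead of all starting at offset 0.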

This PR provides system-level protection against such load spikes by distributing operations across time, reducing peak resource usage while maintaining the benefits of periodic batch storage. The jitter mechanism is particularly valuable in resource-constrained environments where maintaining consistent database performance is more critical than precise timing of state updates.
2025-09-29 11:22:36 +02:00
Alexander Akhmetov c827ddf790
Alerting: Add meta-monitoring documentation for GRAFANA_ALERTS (#108785) 2025-07-29 21:54:44 +02:00
renovate[bot] c94f930950
Update dependency prettier to v3.6.2 (#108689)
* Update dependency prettier to v3.6.2

* run prettier

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
2025-07-25 17:47:44 +01:00
Pepe Cano 61efc8b609
docs(alerting): clarify usage of different Alertmanagers and fix misleading details (#107498)
* docs(alerting): clarify usage of different Alertmanagers and fix misleading details

* address review changes
2025-07-02 21:46:29 +02:00
Pepe Cano f5b79fca55
docs(alerting): performance considerations minor clarifications (#107333) 2025-06-30 07:42:26 +00:00
Alexander Akhmetov f4b0e793aa
Alerting: Document label sanitization in GRAFANA_ALERTS (#107285)
* Alerting: Document label sanitization in GRAFANA_ALERTS
2025-06-27 23:33:42 +02:00
Pepe Cano f14492baf8
docs(alerting): prom backend to write ALERTS metric (#107006)
* docs(alerting): prom backend to write ALERTS metric

* add enterprise label

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/configure-alert-state-history/index.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

* Update docs/sources/alerting/set-up/meta-monitoring.md

Co-authored-by: Alexander Akhmetov <me@alx.cx>

---------

Co-authored-by: Alexander Akhmetov <me@alx.cx>
2025-06-24 10:38:52 +02:00
Jack Baldry 244ffad99d
Fix all the old usage of admonition syntax (#106984) 2025-06-19 17:31:13 +01:00
Vadim Stepanov 5137995830
Alerting: Add support for Redis Sentinel for Alerting HA (#106322)
* Alerting: Add support for Redis Sentinel

* docs

* docs

* Use minisentinel in test

* Apply suggestions from code review

Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>
Co-authored-by: Fayzal Ghantiwala <114010985+fayzal-g@users.noreply.github.com>

* "address(es)" -> "address or addresses"

* make update-workspace

* make lint-go-diff

---------

Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>
Co-authored-by: Fayzal Ghantiwala <114010985+fayzal-g@users.noreply.github.com>
2025-06-05 15:02:40 +01:00
Pepe Cano b2d317cc23
docs(alerting): add notes for Redis HA setup (#106144)
* docs(alerting): add note for Redis HA setup

* run prettier
2025-05-29 11:21:32 -05:00
Pepe Cano f023fcc68a
docs(alerting): New Alertmanager contact point docs (#103782) 2025-04-21 12:08:41 +02:00
Yuri Tseretyan 1bafd5c807
Docs: Remove mention of alertingApiServer flag from alerting documentation (#104131) 2025-04-18 11:51:38 -04:00
Moustafa Baiou b604fdf6f8
Alerting: Update docs for RBAC (#104005)
* Alerting: Update docs for RBAC

This updates the documentation for RBAC to match the changes from 032299011a

* add note about RBAC permissions with datasource permissions

* writers toolkit update
2025-04-15 15:34:19 -04:00
William Wernert a8f60de620
Alerting: Remove feature toggles relating to Loki Alert State History (#103540)
* Remove feature toggles relating to Loki Alert State History
2025-04-08 09:50:27 -04:00
Pepe Cano d44a9953d3
docs(alerting): add missing port setting for the HA k8s example (#103017) 2025-03-27 22:05:41 +01:00
Johnny Kartheiser 558773ed7f
docs: add note about alert migration (#102741)
* docs: add note about alert migration

added note about alert migration in 11.6.0.

* docs: add note about alert migration

adding documentation about the migrations to the performance limitation docs

* title edit

<-- vale = NO -->
2025-03-26 15:39:32 -05:00
Alexander Akhmetov ef5cc12b33
Alerting: Add HMAC signature config to the webhook integration (#100960)
Adds HMAC-SHA256 signature support to webhook notifications, providing a way to verify the authenticity and integrity of webhook requests. The implementation lets you specify the header in which the signature is sent. The signature is calculated from the request body.

An optional timestamp header name can be provided. If set, the HMAC signature is calculated over the timestamp, a ":" and the request body, concatenated as {timestamp}:{body}. The timestamp is also sent in the provided header.
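On the receiving side, the scheme described above could be verified roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and the commit message does not state how the signature is encoded, so hex encoding is assumed here. Reading the signature and timestamp from the configured HTTP headers is left to the caller.

```python
import hashlib
import hmac
from typing import Optional


def compute_signature(secret: bytes, body: bytes, timestamp: Optional[str] = None) -> str:
    """Compute an HMAC-SHA256 signature over the body, or over "{timestamp}:{body}".

    Assumption: the signature is hex-encoded; the commit message does not say.
    """
    payload = body if timestamp is None else timestamp.encode() + b":" + body
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received: str,
                     timestamp: Optional[str] = None) -> bool:
    """Return True if the received signature matches the recomputed one."""
    expected = compute_signature(secret, body, timestamp)
    # compare_digest runs in constant time, which avoids leaking where a mismatch occurs.
    return hmac.compare_digest(expected.encode(), received.encode())


# Illustrative values only; the secret and payload are made up.
secret = b"example-shared-secret"
body = b'{"status": "firing"}'
signature = compute_signature(secret, body, timestamp="1700000000")
print(verify_signature(secret, body, signature, timestamp="1700000000"))
```

A receiver that uses the timestamp variant would typically also reject requests whose timestamp is too old, which limits replay of captured requests; that check is not shown here.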
2025-03-14 07:22:41 +01:00
Pepe Cano 5bfe046da9
docs(alerting): clarify behaviour when provisioning the policy tree (#101937) 2025-03-11 15:58:25 +01:00
Robby Milo 13cf67de53
Remove relref shortcodes (#101694)
* manually replace all shared relrefs

* relref replace - grafana next

* Merge branch 'master' into robbymilo/relref-replace-grafana-next

* manual fixes

* remove ref shortcode

* Merge branch 'master' into robbymilo/relref-replace-grafana-next

* prettier

* fix test

* update readme
2025-03-06 13:59:08 +01:00
Alexander Akhmetov a9ce930634
Alerting: Promote alertingSaveStateCompressed flag to public preview (#99935) 2025-02-06 18:09:43 +01:00
Garret Wyman cf177776bf
Alerting: Adding color option for slack receiver (#99615) 2025-01-30 00:12:16 +02:00
Pepe Cano 6178320257
Alerting docs/internal: leaf bundle page becomes a branch bundle (#98391) 2024-12-30 11:49:11 +01:00
Pepe Cano 6976908597
Alerting docs: update configuring and using additional Alertmanagers in Grafana Alerting (#98363)
* Alerting docs: Alertmanager data source

* Alertmanager data source: minor intro changes

* Update `Intro > Notifications`

* Complete Configure alertmanager update

* Configure alertmanager

* minor final edits
2024-12-27 10:16:31 +01:00
Pepe Cano 001cc853a5
Alerting docs: update `Monitor status` section (#98179)
* Minor updates to `Declare incidents` docs

* change URL

* View alert state history + restructuring

* Complete alert state

* alert state

* change heading

* View alert rules

* Monitor alerts
2024-12-18 13:23:15 +01:00
Alexander Akhmetov 1f8f9a45d7
Alerting: Add state_periodic_save_batch_size config option (#98019)
* Alerting: Add state_periodic_save_batch_size config option

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2024-12-16 15:30:38 +01:00
brendamuir 52f6f69d4b
Alerting docs: adds new export alert rule definition (#97028)
* Alerting docs: adds new export alert rule definition

* update

* Update button text for exporting new rule definition

* Update docs/sources/alerting/set-up/provision-alerting-resources/export-alerting-resources/index.md

Co-authored-by: Sonia Aguilar <33540275+soniaAguilarPeiron@users.noreply.github.com>

* Update docs/sources/alerting/set-up/provision-alerting-resources/export-alerting-resources/index.md

Co-authored-by: Sonia Aguilar <33540275+soniaAguilarPeiron@users.noreply.github.com>

* definition

* type

* definition 2

* pepes feedback

* removes will

---------

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
Co-authored-by: Sonia Aguilar <33540275+soniaAguilarPeiron@users.noreply.github.com>
2024-11-26 19:20:55 +01:00
Pepe Cano 706300e9b7
Alerting: notification template group (#96447)
Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
2024-11-22 14:40:20 +02:00
Pepe Cano 154a2e0d06
Alerting docs: Move `Meta monitoring` to `Additional configuration` section (#96150) 2024-11-11 11:08:12 +01:00
Pepe Cano b953190328
Alerting docs: reuse `Additional configuration` page for Cloud docs (#96101) 2024-11-08 16:03:47 +01:00
brendamuir 86bc087257
Alerting docs: changes advanced to additional (#96083) 2024-11-08 13:29:10 +01:00
brendamuir b2af163dc5
Alerting docs: adds advanced config section (#96013)
* Alerting docs: adds advanced config section

* corrects ref

* feedback from pepe

* renames detect and respond
2024-11-07 15:59:00 +01:00
Alessandro Chitarrini c490b29d34
Update Discord contact point documentation for use_discord_username type (#95011) 2024-11-04 10:02:19 +00:00
Yuri Tseretyan c5bad9f843
Alerting: Update documentation to include new permissions for routes (#95437)
* update documentation

* Update index.md
2024-10-30 10:09:32 +01:00
Tito Lins 71d04a326b
Alerting: Support tls config for webhook receiver (#93513)
Adds the ability to configure tls settings on the webhook receiver (e.g. to skip server certificate validation)
2024-10-22 12:44:32 +02:00
Yuri Tseretyan ced5497ba1
Docs: Update alerting notifications documentation (#93944)
* add new permissions and fixed roles

* Apply suggestions from code review

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* address comments

* add actions to complete list

* fmt

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2024-10-15 16:16:15 +02:00
Pepe Cano 75900139ae
Alerting docs: update `Configure Alertmanagers` (#93712)
* Alerting docs: update `Configure Alertmanagers`
- clarify alertmanager types
- specify that each alertmanager manages its own alerting resources

* Update docs/sources/alerting/set-up/configure-alertmanager/index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/configure-alertmanager/index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/configure-alertmanager/index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* use `Alertmanager` - capitalize the first letter

* Small copy changes

* Minor cosmetic updates to `Add an Alertmanager` section

* Update docs/sources/alerting/set-up/configure-alertmanager/index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2024-09-25 12:44:17 +02:00
brendamuir 5a1a3204c9
Alerting docs: adds ha-advertise-address to alerting docs (#93303)
* Alerting docs: adds ha-advertise-address to alerting docs

* Change description of setting [ha_advertise_address]

---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2024-09-23 19:14:34 +02:00
Alexander Akhmetov e59ea00518
Alerting: Add TLS, QoS and retain options to the MQTT receiver (#92331) 2024-09-17 21:11:16 +02:00
brendamuir 98f766a50d
Alerting docs: adds recording rule info (#93204)
* Alerting docs: adds recording rule info

* ran prettier

* Updates with feedback from pepe and removes external reference

* couple of minor edits

* removes reference

* feedback from sonia

* adds links per gilles

* adds correct version link
2024-09-12 12:06:23 +02:00
Pepe Cano 7eb7b51dce
Docs: update IAM links for Cloud docs (#93147) 2024-09-10 09:33:45 +02:00
Pepe Cano 15de549093
Alerting docs: HA - configure alertmanager to prevent duplicated notifications (#92611)
* Alerting docs: HA - configure shared alertmanager

* Update docs/sources/alerting/set-up/configure-high-availability/_index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/configure-high-availability/_index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Apply minor content suggestion

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2024-09-03 06:31:09 +02:00
brendamuir 667cbd626c
Alerting docs: adds display names to fixed roles for RBAC (#92357)
* Alerting docs: adds display names to fixed roles for RBAC

* ran prettier

* updates descriptions

* preposition update

* ran prettier
2024-08-27 09:23:04 +02:00
Pepe Cano 2ba930ab1f
Alerting Docs: Monitor your high availability setup (#92063)
* Alerting Docs: Monitor your high availability setup

* Update docs/sources/alerting/set-up/configure-high-availability/_index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/configure-high-availability/_index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/configure-high-availability/_index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/configure-high-availability/_index.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Shorten links

* Update/reorder a bit the description about alertmanager gossiping

* Update `alertmanager_peer_position` description

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2024-08-27 08:01:14 +02:00
Alexander Akhmetov 832bb01f36
Alerting: Add MQTT notifications receiver (#91487)
* Alerting: Add MQTT notifications receiver
* Update alerting to 9daa6239cc41dc42bff0e916c8d0d27766caa8b9 (main)
---------

Co-authored-by: Jack Baldry <jack.baldry@grafana.com>
Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2024-08-22 16:47:48 +02:00
brendamuir f833b782b6
Alerting docs: adds silence RBAC 11.1 (#89176)
* Alerting docs: adds silence RBAC 11.1

* ran prettier

* Improve docs with new rule-specific silence RBAC information

* Apply suggestions from code review

Co-authored-by: Jack Baldry <jack.baldry@grafana.com>

* Apply suggestions from code review

Co-authored-by: Jack Baldry <jack.baldry@grafana.com>

* prettier

---------

Co-authored-by: Matt Jacobson <matthew.jacobson@grafana.com>
Co-authored-by: Jack Baldry <jack.baldry@grafana.com>
2024-06-27 10:10:34 +02:00
Jacob Valdemar eb76ea47a0
Alerting: Add ha_reconnect_timeout configuration option (#88823)
* Docs: Update "Configure high availability" guide with ha_reconnect_timeout configuration

---------

Co-authored-by: Christopher Moyer <35463610+chri2547@users.noreply.github.com>
2024-06-11 13:25:48 -04:00
Pepe Cano 3af72bdeee
Alerting docs: fix `Alertmanagers` title (#88625) 2024-06-04 09:10:28 +02:00
Pepe Cano 68c44f1dd9
Alerting docs: Update Alertmanager docs (#88567)
* Remove alertmanager page and set redirects

* Update internal alertmanager links

* Update `Alertmanager` docs

* Change heading to `Configure Alertmanagers`
2024-06-03 11:00:52 +02:00
William Wernert 7a744a746b
Alerting: Update docs with rule read RBAC changes (#88565)
* Remove ref to `datasources:query` for rule read

* Remove more refs to `datasources:query`

* Run prettier
2024-05-31 11:50:44 -04:00