[release-11.6.1] docs(alerting): clarify recovery threshold for pending state (#104238)

docs(alerting): clarify recovery threshold for pending state (#102780)

Alerting docs: clarify recovery threshold on pending state

(cherry picked from commit 536ff2fc3d)
Pepe Cano 2025-04-22 22:19:01 +02:00 committed by GitHub
parent 35f90c5f77
commit 044fb387c6
2 changed files with 12 additions and 9 deletions


@@ -29,6 +29,11 @@ refs:
 destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/state-and-health/#alert-instance-state
 - pattern: /docs/grafana-cloud/
 destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rule-evaluation/state-and-health/#alert-instance-state
+recovery-threshold:
+- pattern: /docs/grafana/
+destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/queries-conditions/#recovery-threshold
+- pattern: /docs/grafana-cloud/
+destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rules/queries-conditions/#recovery-threshold
 modify-the-no-data-or-error-state:
 - pattern: /docs/grafana/
 destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/state-and-health/#modify-the-no-data-or-error-state
@@ -192,9 +197,9 @@ You can toggle between **Default** and **Advanced** options. If the [Default vs.
 b. Click **Preview** to verify that the expression is successful.
-1. To add a recovery threshold, turn the **Custom recovery threshold** toggle on and fill in a value for when your alert rule should stop firing.
-You can only add one recovery threshold in a query and it must be the alert condition.
+1. To add a [recovery threshold](ref:recovery-threshold), enable the **Custom recovery threshold** toggle and enter a value that defines when the alert should recover—transition to `Normal` state from the `Alerting` or `Pending` state.
+You can only add one recovery threshold, and it must be part of the alert condition.
 1. Click **Set as alert condition** on the query or expression you want to set as your [alert condition](ref:alert-condition).
 {{< /collapse >}}


@@ -122,13 +122,11 @@ A threshold returns `0` when the condition is false and `1` when true.
 If the threshold is set as the alert condition, the alert fires when the threshold returns `1`.
-#### Recovery threshold
+### Recovery threshold
-To reduce the noise from flapping alerts, you can set a recovery threshold different to the alert threshold.
+To reduce the noise from flapping alerts, you can set a recovery threshold so that the alert returns to the `Normal` state only after the recovery threshold is crossed.
-Flapping alerts occur when a metric hovers around the alert threshold condition and may lead to frequent state changes, resulting in too many notifications.
-The value of a flapping metric can continually go above and below a threshold, resulting in a series of firing-resolved-firing notifications and a noisy alert state history.
+Flapping alerts occur when the query value repeatedly crosses above and below the alert threshold, causing frequent state changes. This results in a series of firing-resolved-firing notifications and a noisy alert state history.
 For example, if you have an alert for latency with a threshold of 1000ms and the number fluctuates around 1000 (say 980 -> 1010 -> 990 -> 1020, and so on), then each of those might trigger a notification:
@@ -138,8 +136,8 @@ For example, if you have an alert for latency with a threshold of 1000ms and the
 To prevent this, you can set a recovery threshold to define two thresholds instead of one:
-1. An alert is triggered when the first threshold is crossed.
-1. An alert is resolved only when the second (recovery) threshold is crossed.
+1. An alert transitions to the `Pending` or `Alerting` state when the alert threshold is crossed.
+1. An alert transitions back to `Normal` state only after the recovery threshold is crossed.
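Reviewer note: the two-threshold behaviour in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Grafana's implementation. The function name `evaluate_states` and its parameters are hypothetical, the `Pending` state and pending period are ignored, and the comparison is assumed to be "is above" for alerting and "is below" for recovery.

```python
from typing import List, Optional


def evaluate_states(
    values: List[float],
    alert_threshold: float,
    recovery_threshold: Optional[float] = None,
) -> List[str]:
    """Return the (simplified) alert state after each evaluation.

    Hypothetical helper for illustration only: it models just the
    `Normal` and `Alerting` states and skips the pending period.
    """
    firing = False
    states = []
    for value in values:
        if firing and recovery_threshold is not None:
            # While firing, recover only once the value falls below
            # the recovery threshold.
            firing = not (value < recovery_threshold)
        else:
            # Single threshold: state follows the alert condition directly.
            firing = value > alert_threshold
        states.append("Alerting" if firing else "Normal")
    return states


# Latency samples from the example in the docs, plus one clear recovery.
samples = [980, 1010, 990, 1020, 890]

without_recovery = evaluate_states(samples, alert_threshold=1000)
with_recovery = evaluate_states(samples, alert_threshold=1000, recovery_threshold=900)

print(without_recovery)
print(with_recovery)
```

With a single threshold, the state flips on every sample that crosses 1000. With the recovery threshold at 900, the sketch stays in `Alerting` from the first crossing until the value drops to 890.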
 In the previous example, setting the recovery threshold to 900ms means the alert only resolves when the latency falls below 900ms: