mirror of https://github.com/grafana/grafana.git
[release-11.6.1] docs(alerting): clarify recovery threshold for pending state (#104238)
docs(alerting): clarify recovery threshold for pending state (#102780)
Alerting docs: clarify recovery threshold on pending state
(cherry picked from commit 536ff2fc3d)

parent 35f90c5f77
commit 044fb387c6
@@ -29,6 +29,11 @@ refs:
       destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/state-and-health/#alert-instance-state
     - pattern: /docs/grafana-cloud/
       destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rule-evaluation/state-and-health/#alert-instance-state
+  recovery-threshold:
+    - pattern: /docs/grafana/
+      destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/queries-conditions/#recovery-threshold
+    - pattern: /docs/grafana-cloud/
+      destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rules/queries-conditions/#recovery-threshold
   modify-the-no-data-or-error-state:
     - pattern: /docs/grafana/
       destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/state-and-health/#modify-the-no-data-or-error-state
@@ -192,9 +197,9 @@ You can toggle between **Default** and **Advanced** options. If the [Default vs.
    b. Click **Preview** to verify that the expression is successful.
 
-1. To add a recovery threshold, turn the **Custom recovery threshold** toggle on and fill in a value for when your alert rule should stop firing.
+1. To add a [recovery threshold](ref:recovery-threshold), enable the **Custom recovery threshold** toggle and enter a value that defines when the alert should recover—transition to `Normal` state from the `Alerting` or `Pending` state.
 
-   You can only add one recovery threshold in a query and it must be the alert condition.
+   You can only add one recovery threshold, and it must be part of the alert condition.
 
 1. Click **Set as alert condition** on the query or expression you want to set as your [alert condition](ref:alert-condition).
 
 {{< /collapse >}}
@@ -122,13 +122,11 @@ A threshold returns `0` when the condition is false and `1` when true.
 
 If the threshold is set as the alert condition, the alert fires when the threshold returns `1`.
 
-#### Recovery threshold
+### Recovery threshold
 
-To reduce the noise from flapping alerts, you can set a recovery threshold different to the alert threshold.
+To reduce the noise from flapping alerts, you can set a recovery threshold so that the alert returns to the `Normal` state only after the recovery threshold is crossed.
 
-Flapping alerts occur when a metric hovers around the alert threshold condition and may lead to frequent state changes, resulting in too many notifications.
-
-The value of a flapping metric can continually go above and below a threshold, resulting in a series of firing-resolved-firing notifications and a noisy alert state history.
+Flapping alerts occur when the query value repeatedly crosses above and below the alert threshold, causing frequent state changes. This results in a series of firing-resolved-firing notifications and a noisy alert state history.
 
 For example, if you have an alert for latency with a threshold of 1000ms and the number fluctuates around 1000 (say 980 -> 1010 -> 990 -> 1020, and so on), then each of those might trigger a notification:
 
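A note on the semantics in this hunk: the threshold expression returns `0`/`1`, and a recovery threshold makes that output sticky. Below is a minimal Go sketch of that behavior, using the 1000ms/900ms values from the example; the function name `thresholdEval` is illustrative, not Grafana's actual expression engine API.

```go
package main

import "fmt"

// thresholdEval sketches the threshold expression described above: it returns
// 1 when the condition is true and 0 when false. With a recovery threshold the
// result is "sticky": once the value crosses above alertThreshold it keeps
// returning 1 until the value drops below recoveryThreshold.
func thresholdEval(prev int, value, alertThreshold, recoveryThreshold float64) int {
	if prev == 0 {
		if value > alertThreshold {
			return 1
		}
		return 0
	}
	if value < recoveryThreshold {
		return 0
	}
	return 1
}

func main() {
	series := []float64{980, 1010, 990, 1020} // the latency samples (ms) from the example above

	// Single threshold (recovery == alert threshold): the result flaps
	// 0 -> 1 -> 0 -> 1, i.e. firing-resolved-firing notifications.
	out := 0
	for _, v := range series {
		out = thresholdEval(out, v, 1000, 1000)
		fmt.Printf("value=%v firing=%d\n", v, out)
	}

	// Recovery threshold at 900ms: 990 is not below 900, so the condition
	// stays 1 and the alert does not resolve prematurely.
	out = 0
	for _, v := range series {
		out = thresholdEval(out, v, 1000, 900)
		fmt.Printf("value=%v firing=%d\n", v, out)
	}
}
```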
@@ -138,8 +136,8 @@ For example, if you have an alert for latency with a threshold of 1000ms and the
 
 To prevent this, you can set a recovery threshold to define two thresholds instead of one:
 
-1. An alert is triggered when the first threshold is crossed.
-1. An alert is resolved only when the second (recovery) threshold is crossed.
+1. An alert transitions to the `Pending` or `Alerting` state when the alert threshold is crossed.
+1. An alert transitions back to `Normal` state only after the recovery threshold is crossed.
 
 In the previous example, setting the recovery threshold to 900ms means the alert only resolves when the latency falls below 900ms:
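The two numbered transitions above describe a small hysteresis state machine over the alert instance states. Here is a self-contained Go sketch under the same assumptions: the `next` function is hypothetical, and an evaluation counter stands in for the rule's pending ("for") period, which the real scheduler tracks as a duration.

```go
package main

import "fmt"

// State names mirror the alert instance states used in the doc text.
type State string

const (
	Normal   State = "Normal"
	Pending  State = "Pending"
	Alerting State = "Alerting"
)

// next applies the two transitions from the list above: crossing the alert
// threshold moves a Normal instance to Pending (then Alerting once the
// pending period elapses), and only crossing the recovery threshold moves
// it back to Normal.
func next(cur State, pendingLeft *int, value, alertThreshold, recoveryThreshold float64) State {
	switch {
	case cur != Normal && value < recoveryThreshold:
		return Normal
	case cur == Normal && value > alertThreshold:
		*pendingLeft = 2 // e.g. a pending period of two evaluation intervals
		return Pending
	case cur == Pending:
		*pendingLeft--
		if *pendingLeft <= 0 {
			return Alerting
		}
		return Pending
	default:
		return cur
	}
}

func main() {
	latencies := []float64{980, 1010, 990, 1020, 1030, 850} // ms
	state, left := Normal, 0
	for _, v := range latencies {
		state = next(state, &left, v, 1000, 900)
		fmt.Printf("latency=%vms -> %s\n", v, state)
	}
	// With the 900ms recovery threshold, the dip to 990ms keeps the instance
	// in Pending/Alerting; only the drop to 850ms returns it to Normal.
}
```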