This tutorial is a continuation of the [Get started with Grafana Alerting - Alert routing](http://www.grafana.com/tutorials/alerting-get-started-pt2/) tutorial.
Grouping in Grafana Alerting reduces notification noise by combining related alert instances into a single, concise notification. This is useful for on-call engineers, ensuring they focus on resolving incidents instead of sorting through a flood of notifications.
- Alternatively, you can try out this example in our interactive learning environment: [Get started with Grafana Alerting - Grouping](https://killercoda.com/grafana-labs/course/grafana/alerting-get-started-pt3/). It's a fully configured environment with all the dependencies already installed.
1. Change to the directory where you cloned the repository:
<!-- INTERACTIVE exec START -->
```
cd tutorial-environment
```
<!-- INTERACTIVE exec END -->
1. Run the Grafana stack:
<!-- INTERACTIVE ignore START -->
```
docker compose up -d
```
<!-- INTERACTIVE ignore END -->
{{< docs/ignore >}}
<!-- INTERACTIVE exec START -->
```bash
docker-compose up -d
```
<!-- INTERACTIVE exec END -->
{{< /docs/ignore >}}
The first time you run `docker compose up -d`, Docker downloads all the necessary resources for the tutorial. This might take a few minutes, depending on your internet connection.
<!-- INTERACTIVE ignore START -->
{{< admonition type="note" >}}
If you already have Grafana, Loki, or Prometheus running on your system, you might see errors, because the Docker image is trying to use ports that your local installations are already using. If this is the case, stop the services, then run the command again.
{{< /admonition >}}
<!-- INTERACTIVE ignore END -->
{{< docs/ignore >}}
NOTE:
If you already have Grafana, Loki, or Prometheus running on your system, you might see errors, because the Docker image is trying to use ports that your local installations are already using. If this is the case, stop the services, then run the command again.
{{< /docs/ignore >}}
<!-- INTERACTIVE page step1.md END -->
<!-- INTERACTIVE page step2.md START -->
## How alert rule grouping works
Alert notification grouping is configured with **labels** and **timing options**:
- **Labels** map the alert rule with the notification policy and define the grouping.
- **Timing options** control when and how often notifications are sent.
{{< figure src="/media/docs/alerting/alerting-notification-policy-diagram-with-labels-v3.png" max-width="750px" alt="A diagram about the components of a notification policy, including labels and groups" >}}
You’re monitoring metrics like CPU usage, memory utilization, and network latency across multiple regions. Some of these alert rules include labels such as `region: us-west` and `region: us-east`. If multiple alert rules trigger across these regions, they can result in notification floods.
- **Group interval**: determines how often updates for the same alert group are sent. By default, this interval is 5 minutes, but you can customize it to be shorter or longer based on your needs.
Following the above example, [notification policies](ref:notification-policies) are created to route alert instances that carry a `region` label to a specific contact point. The goal is to receive one consolidated notification per region. To demonstrate how grouping works, alert notifications for the East Coast team are not grouped, while the West Coast policy defines its own grouping and notification schedule. These settings override the parent policy's settings to fine-tune the behavior for specific labels (i.e., regions).
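To make the routing logic concrete, here is a minimal sketch in Python of how a child policy's label matchers and the **Continue matching** flag decide where an alert instance goes. This is a simplified model, not Grafana's implementation, and the policy structure and receiver names are illustrative assumptions:

```python
# Simplified model of notification policy routing (illustrative only;
# not Grafana's implementation). Each alert instance is checked against
# the child policies in order; "continue" lets evaluation move on to
# sibling policies after a match, and unmatched instances fall back to
# the default policy's contact point.

def route(alert_labels, child_policies, default_receiver):
    """Return the contact points a single alert instance is routed to."""
    receivers = []
    for policy in child_policies:
        if all(alert_labels.get(k) == v for k, v in policy["matchers"].items()):
            receivers.append(policy["receiver"])
            if not policy.get("continue", False):
                break  # stop at the first match unless "continue" is set
    return receivers or [default_receiver]

child_policies = [
    {"matchers": {"region": "us-west"}, "receiver": "webhook-us-west", "continue": True},
    {"matchers": {"region": "us-east"}, "receiver": "webhook-us-east", "continue": True},
]

print(route({"region": "us-west", "service": "web-server-1"}, child_policies, "default"))
# ['webhook-us-west']
print(route({"region": "eu-central"}, child_policies, "default"))
# ['default'] -- no child policy matches, so the default policy applies
```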
## Create notification policies
1. Sign in to Grafana:
- **Grafana Cloud** users: Log in via Grafana Cloud.
- **OSS users**: Go to [http://localhost:3000](http://localhost:3000).
1. Navigate to **Notification Policies**:
- Go to **Alerts & IRM > Alerting > Notification Policies**.
1. Add a child policy:
- In the Default policy, click **+ New child policy**.
- **Label**: `region`
- **Operator**: `=`
- **Value**: `us-west`
This label matcher matches alert instances where the `region` label is `us-west`.
1. Choose a **Contact point**:
- Select **Webhook**.
If you don’t have any contact points, add a [Contact point](https://grafana.com/docs/grafana/latest/alerting/configure-notifications/manage-contact-points/#add-a-contact-point). To inspect the notification payloads locally, you can use the minimal receiver sketched after these steps.
1. Enable Continue matching:
- Turn on **Continue matching subsequent sibling nodes** so the evaluation continues even after one or more labels (i.e., the region label) match.
1. Override grouping settings:
- Toggle **Override grouping**.
- **Group by**: `region`.
**Group by** consolidates alerts that share the same grouping label into a single notification. For example, all alerts with `region=us-west` are combined into one notification, making them easier to manage and reducing alert fatigue.
1. Set custom timing:
- Toggle **Override general timings**.
- **Group interval**: `2m`. This ensures follow-up notifications for the same alert group will be sent at intervals of 2 minutes. While the default is 5 minutes, we chose 2 minutes here to provide faster feedback for demonstration purposes.
**Timing options** control how often notifications are sent and can help balance timely alerting with minimizing noise.
- Repeat the steps above for `region = us-east` but without overriding grouping and timing options. Use a different webhook endpoint as the contact point.
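If you want to see what the webhook contact point actually receives, the following minimal Python receiver (a local testing aid, not part of the tutorial stack) prints one line per notification. The payload fields it reads (`status`, `groupLabels`, `alerts`) follow Grafana's documented webhook payload, where grouped alert instances arrive together in the `alerts` array:

```python
# Minimal local webhook receiver for inspecting notifications
# (a testing aid, not part of the tutorial environment).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class Receiver(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Grafana sends one POST per alert group; grouped instances
        # arrive together in the "alerts" array.
        print(f'status={payload.get("status")} '
              f'group={payload.get("groupLabels")} '
              f'alerts={len(payload.get("alerts", []))}')
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Receiver).serve_forever()
```

Note that if Grafana runs inside the Docker stack, `http://localhost:8080` may not be reachable from the container; depending on your Docker setup, an address such as `http://host.docker.internal:8080` is a common workaround.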
{{< figure src="/media/docs/alerting/notificaiton-policies-region.png" max-width="750px" alt="Two nested notification policies to route and group alert notifications" >}}
These nested policies should route alert instances where the region label is either us-west or us-east. Only the us-west region team should receive grouped alert notifications.
Label matchers are combined using the `AND` logical operator. This means that all matchers must be satisfied for a rule to be linked to a policy. If you attempt to use the same label key (e.g., region) with different values (e.g., us-west and us-east), the condition will not match, because it is logically impossible for a single key to have multiple values simultaneously.
However, `region!=us-east && region!=us-west` can match. For example, it would match a label set where `region=eu-central`.
Alternatively, for identical label keys use regular expression matchers (e.g., `region=~us-west|us-east`).
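The following short Python sketch, a simplified model rather than Grafana's matcher engine, illustrates why equality matchers on the same key can never both hold, while negative and regular expression matchers can:

```python
import re

def matches(labels, matchers):
    """All matchers must hold (logical AND) for a policy to match."""
    for key, op, value in matchers:
        actual = labels.get(key, "")
        ok = {
            "=":  actual == value,
            "!=": actual != value,
            "=~": re.fullmatch(value, actual) is not None,
        }[op]
        if not ok:
            return False
    return True

# One key cannot equal two different values at once: never matches.
print(matches({"region": "us-west"},
              [("region", "=", "us-west"), ("region", "=", "us-east")]))  # False
# Two negative matchers can both hold, e.g. for region=eu-central.
print(matches({"region": "eu-central"},
              [("region", "!=", "us-east"), ("region", "!=", "us-west")]))  # True
# A regular expression matcher covers both regions with one matcher.
print(matches({"region": "us-east"}, [("region", "=~", "us-west|us-east")]))  # True
```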
## Create an alert rule
1. Enter an alert rule name. Make it short and descriptive, as this appears in your alert notification. For instance, `High CPU usage - Multi-region`.
### Define query and alert condition
In this section, we use the default options for Grafana-managed alert rule creation. The default options let us define the query, an expression (used to manipulate the data, the `WHEN` field in the UI), and the condition that must be met for the alert to be triggered (in the default mode, this is the threshold).
Grafana includes a [test data source](https://grafana.com/docs/grafana/latest/datasources/testdata/) that creates simulated time series data. This data source is included in the demo environment for this tutorial. If you're working in Grafana Cloud or your own local Grafana instance, you can add the data source through the **Connections** menu.
1. From the drop-down menu, select the **TestData** data source.
1. From **Scenario** select **CSV Content**.
1. Copy in the following CSV data:
```csv
region,cpu-usage,service,instance
us-west,35,web-server-1,server-01
us-west,81,web-server-1,server-02
us-east,79,web-server-2,server-03
us-east,52,web-server-2,server-04
us-west,45,db-server-1,server-05
us-east,77,db-server-2,server-06
us-west,82,db-server-1,server-07
us-east,93,db-server-2,server-08
```
The returned data simulates a data source returning multiple time series, each of which leads to the creation of a separate alert instance.
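As a rough illustration in plain Python (not how Grafana evaluates queries), this is how the `IS ABOVE 75` threshold used in the next step plays out against the CSV data, one alert instance per series:

```python
import csv
import io
from collections import defaultdict

CSV_DATA = """region,cpu-usage,service,instance
us-west,35,web-server-1,server-01
us-west,81,web-server-1,server-02
us-east,79,web-server-2,server-03
us-east,52,web-server-2,server-04
us-west,45,db-server-1,server-05
us-east,77,db-server-2,server-06
us-west,82,db-server-1,server-07
us-east,93,db-server-2,server-08"""

# Each row is a separate series, and therefore a separate alert instance.
firing = defaultdict(list)
for row in csv.DictReader(io.StringIO(CSV_DATA)):
    if float(row["cpu-usage"]) > 75:  # the IS ABOVE 75 condition
        firing[row["region"]].append(row["instance"])

for region, instances in sorted(firing.items()):
    print(region, instances)
# us-east ['server-03', 'server-06', 'server-08']
# us-west ['server-02', 'server-07']
```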
- Keep `Last` as the value for the reducer function (`WHEN`), and `IS ABOVE 75` as the threshold value. This is the value above which the alert rule should trigger.
1. Set the **pending period** to `0s` (zero seconds), so the alert rule fires the moment the condition is met (this minimizes the waiting time for the demonstration).
1. Click **Preview routing** to ensure correct matching.
{{< figure src="/media/docs/alerting/region-notification-policy-routing-preview.png" max-width="750px" alt="Preview of alert instance routing with the region label matcher" >}}
The preview should show that the `region` label from our data source successfully matches the notification policies we created earlier, thanks to the label matchers we configured.
When the configured alert rule detects CPU or memory usage higher than 75% across multiple regions, it evaluates the metric every minute. If the condition persists, notifications are grouped together, with a Group wait of 30 seconds before the first notification is sent. Follow-up notifications for the same alert group are then sent every 2 minutes for us-west alert instances (the override increases the frequency of the grouped notifications), while follow-ups for us-east instances use the default interval of 5 minutes. If the condition continues for an extended period, a Repeat interval of 4 hours ensures that the alert is only resent if the issue persists.
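To make the schedule tangible, here is a small Python sketch (a simplified model of the timing options, not Grafana's scheduler) of when notifications for the `us-west` group would go out:

```python
# Simplified model of notification timing for one alert group
# (illustrative; Grafana's actual scheduler is more involved).
from datetime import timedelta

group_wait = timedelta(seconds=30)     # delay before the first notification
group_interval = timedelta(minutes=2)  # us-west override (default is 5m)
repeat_interval = timedelta(hours=4)   # re-send cadence when nothing changes

t = group_wait
print(f"first grouped notification after {t}")
for n in range(1, 3):  # while instances keep joining or leaving the group
    t += group_interval
    print(f"update {n} sent after {t}")
# If the group stops changing, the same notification repeats much later.
print(f"unchanged group re-sent after {t + repeat_interval}")
```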
As a result, our notification policies should route three notifications: one grouped notification combining both CPU and memory alert instances from the us-west region, and two separate notifications with alert instances from the us-east region.
By configuring **notification policies** and using **labels** (such as _region_), you can group alert notifications based on specific criteria and route them to the appropriate teams. Fine-tuning **timing options**, including group wait, group interval, and repeat interval, can further reduce noise and ensure notifications remain actionable without overwhelming on-call engineers.
In [Get started with Grafana Alerting: Template your alert notifications](http://www.grafana.com/tutorials/alerting-get-started-pt4/) you learn how to use templates to create customized and concise notifications.