Commit Graph

57 Commits

Author SHA1 Message Date
Yuri Tseretyan 15fab1cb99
Alerting: Update integration schema to support versions (#109969)
* add VersionedNotifierPlugin and method that converts NotifierPlugin to it

* return new schema if query parameter version=2

* add version to k8s model of integration

* fix open api snapshot

* add version to IntegrationConfig

* use current version on conversion

* create versioned integrations for test
2025-08-28 14:46:30 -04:00
Moustafa Baiou c73b3ccf6e Alerting: Fix copying of recording rule fields
Recording rule fields were not being copied correctly when duplicating an alert rule. This manifests as missing `TargetDataSourceUID` fields from the `Record` part of the rule when rules in a group are re-ordered.

Added some additional tests to ensure we cover the generation of recording rules in tests and fixed the copying logic to ensure all fields are copied correctly.
2025-08-28 14:07:00 -04:00
Moustafa Baiou 16f8359d35
Alerting: Update Alert Rule to use int64 for MissingSeriesEvalsToResolve (#109306) 2025-08-06 21:45:48 -04:00
Yuri Tseretyan 3e2296acd3
Alerting: Support for active time intervals in notification policies (#104252)
Backend Code Checks / Validate Backend Configs (push) Waiting to run Details
Backend Unit Tests / Grafana (push) Waiting to run Details
Backend Unit Tests / Grafana Enterprise (push) Waiting to run Details
CodeQL checks / Analyze (go) (push) Waiting to run Details
CodeQL checks / Analyze (javascript) (push) Waiting to run Details
CodeQL checks / Analyze (python) (push) Waiting to run Details
Lint Frontend / Verify i18n (push) Waiting to run Details
Lint Frontend / Lint (push) Waiting to run Details
Lint Frontend / Typecheck (push) Waiting to run Details
Lint Frontend / Betterer (push) Waiting to run Details
golangci-lint / lint-go (push) Waiting to run Details
Crowdin Upload Action / upload-sources-to-crowdin (push) Waiting to run Details
Coverage / Backend Unit Tests (push) Waiting to run Details
End-to-end tests / Build & Package Grafana (push) Waiting to run Details
End-to-end tests / ${{ matrix.suite }} (dashboards-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (panels-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (smoke-tests-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (various-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (old arch) (old-arch/dashboards-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (old arch) (old-arch/panels-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (old arch) (old-arch/smoke-tests-suite) (push) Blocked by required conditions Details
End-to-end tests / ${{ matrix.suite }} (old arch) (old-arch/various-suite) (push) Blocked by required conditions Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (1) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (2) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (3) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (4) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (5) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (6) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (7) (push) Waiting to run Details
Frontend tests / Unit tests (${{ matrix.chunk }} / 8) (8) (push) Waiting to run Details
Integration Tests / Sqlite (push) Waiting to run Details
Integration Tests / MySQL (push) Waiting to run Details
Integration Tests / Postgres (push) Waiting to run Details
Run dashboard schema v2 e2e / dashboard-schema-v2-e2e (push) Waiting to run Details
Dispatch sync to mirror / dispatch-job (push) Waiting to run Details
Trivy Scan / trivy-scan (push) Waiting to run Details
publish-kinds-next / config (push) Has been cancelled Details
publish-kinds-next / main (push) Has been cancelled Details
* add active_time_intervals to route model

* update k8s compat layer

* update notification policies service to validate active time intervals

* update integration tests

* update openapi

* add active time interval to model

* update route generator to include active time interval

* Update storage list and rename methods to handle active intervals

* update api model

* update provisioning and export models

* update ui to allow active timing config

* update i18n

* fix snapshots for ui tests

* run prettier

* Alerting: Active time intervals UI naming (#104402)

* update naming in UI

* update naming in the edit page title

* update translations

* update alerting module

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Sonia Aguilar <33540275+soniaAguilarPeiron@users.noreply.github.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2025-05-07 19:19:33 -04:00
Mariell Hoversholm 757be6365a
CI: Bump golangci-lint to 2.0.2 (#103572) 2025-04-10 14:42:23 +02:00
maicon d8c5c2d3b8
K8s: Folders: Modify GetChildren to return only Folder References (#103072)
* Return FolderReference instead of Folder on GetChildren

Signed-off-by: Maicon Costa <maiconscosta@gmail.com>

---------

Signed-off-by: Maicon Costa <maiconscosta@gmail.com>
2025-04-02 01:30:17 -03:00
Alexander Akhmetov 695ac91290
Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
Alexander Akhmetov 7dd6f52630
Alerting: Add MissingSeriesEvalsToResolve option to the AlertRule (#101184) 2025-03-11 22:12:06 +01:00
Yuri Tseretyan 879b121136
Alerting: Add GUID to alert rule tables (#101321)
* add column guid to alert rule table and rule_guid to rule version table
+ populate the new field with UUID
* update storage and domain models
* patch GUID
* ignore GUID in fingerprint tests
2025-02-28 09:47:25 -05:00
Alexander Akhmetov 6eb335a8ce
Alerting: API to read rule groups using mimirtool (#100674) 2025-02-25 15:49:08 +01:00
Yuri Tseretyan 4cac3158c7
Alerting: Fix alert rule copy to include metadata (#100212)
* copy metadata

* add tests for copy and generator

* extract copy rule to a production method and update usages

* fix tests
2025-02-11 09:46:02 -05:00
Yuri Tseretyan 1b8db233a7
Alerting: Rule Version API to Ignore versions without diff (#100093) 2025-02-10 09:20:35 -05:00
Yuri Tseretyan ac41c19350
Alerting: Rule version history API (#99041)
* implement store method to read rule versions

* implement request handler

* declare a new endpoint

* fix fake to return correct response

* add tests

* add integration tests

* rename history to versions

* apply diff from swagger CI step

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-02-03 13:26:18 -05:00
Yuri Tseretyan 92d6762a3a
Alerting: Store information about user that created\updated alert rule (#99395)
* introduce new fields created_by in rule tables
* update domain model and compat layer to support UpdatedBy
* add alert rule generator mutators for UpdatedBy
* ignore UpdatedBy in diff and hash calculation
* Add user context to alert rule insert/update operations
  Updated InsertAlertRules and UpdateAlertRules methods to accept a user context parameter. This change ensures auditability and better tracking of user actions when creating or updating alert rules. Adjusted all relevant calls and interfaces to pass the user context accordingly.

* set UpdatedBy in PreSave because this is where Updated is set
* Use nil userID for system-initiated updates
This ensures differentiation between system and user-initiated changes for better traceability and clarity in update origins.

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-01-24 12:09:17 -05:00
Santiago f60caf6932
Alerting: Fix alert rules unpausing after moving rule to different folder (#97580)
Alerting: Fix alert rules unpaused after moving rule to different folder
2024-12-06 14:33:13 -03:00
Alexander Akhmetov 324503ee8b
Alerting: Add simplified_notifications_section field to the alert rule metadata (#95988) 2024-11-14 12:55:54 +01:00
Alexander Akhmetov 0a4e6ff86b
Alerting: Add SaveAlertInstancesForRule instance store method (#94505)
Alerting: Add SaveAlertInstancesForRule method to the InstanceStore interface
2024-10-11 13:47:44 +02:00
Alexander Akhmetov 9f5b05f936
Alerting: Add metadata field with editor_settings to alert rule (#93245) 2024-09-19 16:43:41 +02:00
Matthew Jacobson 32f06c6d9c
Alerting: Receiver API complete core implementation (#91738)
* Replace global authz abstraction with one compatible with uid scope

* Replace GettableApiReceiver with models.Receiver in receiver_svc

* GrafanaIntegrationConfig -> models.Integration

* Implement Create/Update methods

* Add optimistic concurrency to receiver API

* Add scope to ReceiversRead & ReceiversReadSecrets

migrates existing permissions to include implicit global scope

* Add receiver create, update, delete actions

* Check if receiver is used by rules before delete

* On receiver name change update in routes and notification settings

* Improve errors

* Linting

* Include read permissions are requirements for create/update/delete

* Alias ngalert/models to ngmodels to differentiate from v0alpha1 model

* Ensure integration UIDs are valid, unique, and generated if empty

* Validate integration settings on create/update

* Leverage UidToName to GetReceiver instead of GetReceivers

* Remove some unnecessary uses of simplejson

* alerting.notifications.receiver -> alerting.notifications.receivers

* validator -> provenanceValidator

* Only validate the modified receiver

stops existing invalid receivers from preventing modification of a valid
receiver.

* Improve error in Integration.Encrypt

* Remove scope from alert.notifications.receivers:create

* Add todos for receiver renaming

* Use receiverAC precondition checks in k8s api

* Linting

* Optional optimistic concurrency for delete

* make update-workspace

* More specific auth checks in k8s authorize.go

* Add debug log when delete optimistic concurrency is skipped

* Improve error message on authorizer.DecisionDeny

* Keep error for non-forbidden errutil errors
2024-08-26 10:47:53 -04:00
Alexander Weaver 34ab5fe1f3
Alerting: Restart rule routines if the type changes (#90867)
* Restart when types change

* Wire up test hooks correctly

* testing
2024-08-14 14:57:47 -05:00
Matthew Jacobson ba800692c6
Alerting: Persist AlertInstance ResolvedAt & LastSentAt (#89135)
* Alerting: Persist AlertInstance ResolvedAt & LastSentAt

* Fix test

* Modify existing tests

* Fix merge conflicts from nullable LastSentAt & ResolvedAt
2024-07-12 12:26:58 -04:00
William Wernert d359591dac
Alerting: Support recording rule struct in provisioning API (#87849)
* Support record struct in provisioning API

* Update api spec

* Use record field

* Restrict API endpoints following toggle

* Fix swagger spec

* Add recording rule validation to store validator
2024-06-06 21:05:02 +03:00
Matthew Jacobson 31d5dd0a12
Alerting: Prevent updating rule uid matcher for silences (#88519)
Prevents updating the `__alert_rule_uid__` equality matcher (used for rule-specific silences) on existing silences
2024-06-03 17:39:06 -04:00
Alexander Weaver 65793440d3
Alerting: Test infrastructure for recording rules (#88200)
* Add test rule generator support for recording rules

* Remove accidental add

* Recording rules appear in GetRulesForScheduling

* A couple more tests, updates, count

* No need to capture rule defs
2024-05-23 16:27:07 -05:00
Matthew Jacobson babfa2beac
Alerting: Hook up GMA silence APIs to new authentication handler (#86625)
This PR connects the new RBAC authentication service to existing alertmanager API silence endpoints.
2024-05-03 15:32:30 -04:00
Yuri Tseretyan 052082a927
Alerting: Refactor Alert Rule Generators (#86813) 2024-04-29 21:52:15 -04:00
Yuri Tseretyan 9735a8a080
Alerting: Distinguish conflict violation errors (#86634)
* update generator to set ID = 0 and do not set 0 if unique is needed
* return proper message when the constraint violation
2024-04-22 12:28:46 -04:00
Yuri Tseretyan 1eebd2a4de
Alerting: Support for simplified notification settings in rule API (#81011)
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated

* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.

* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings

* update GET API so only admins can see auto-gen
2024-02-15 09:45:10 -05:00
Alexander Weaver 00a260effa
Alerting: Add setting to distribute rule group evaluations over time (#80766)
* Simple, per-base-interval jitter

* Add log just for test purposes

* Add strategy approach, allow choosing between group or rule

* Add flag to jitter rules

* Add second toggle for jittering within a group

* Wire up toggles to strategy

* Slightly improve comment ordering

* Add tests for offset generation

* Rename JitterStrategyFrom

* Improve debug log message

* Use grafana SDK labels rather than prometheus labels
2024-01-18 12:48:11 -06:00
Yuri Tseretyan f6a46744a6
Alerting: Support hysteresis command expression (#75189)
Backend: 

* Update the Grafana Alerting engine to provide feedback to HysteresisCommand. The feedback information is stored in state.Manager as a fingerprint of each state. The fingerprint is persisted to the database. Only fingerprints that belong to Pending and Alerting states are considered as "loaded" and provided back to the command.
   - add ResultFingerprint to state.State. It's different from other fingerprints we store in the state because it is calculated from the result labels.
  -  add rule_fingerprint column to alert_instance
   - update alerting evaluator to accept AlertingResultsReader via context, and update scheduler to provide it.
   - add AlertingResultsFromRuleState that implements the new interface in eval package
   - update getExprRequest to patch the hysteresis command.

* Only one "Recovery Threshold" query is allowed to be used in the alert rule and it must be the Condition.


Frontend:

* Add hysteresis option to Threshold in UI. It's called "Recovery Threshold"
* Add test for getUnloadEvaluatorTypeFromCondition
* Hide hysteresis in panel expressions

* Refactor isInvalid and add test for it
* Remove unnecesary React.memo
* Add tests for updateEvaluatorConditions

---------

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2024-01-04 11:47:13 -05:00
Matthew Jacobson ce90a1f2be
Alerting: Apply query optimization to eval endpoints (#78566)
* Alerting: Apply query optimization to eval endpoints

Previously, query optimization was applied to alert queries when scheduled but
not when ran through `api/v1/eval` or `/api/v1/rule/test/grafana`. This could
lead to discrepancies between preview and scheduled alert results.
2023-11-28 19:44:28 -05:00
Yuri Tseretyan 7cec741bae
Alerting: Extract alerting rules authorization logic to a service (#77006)
* extract alerting authorization logic to separate package
* convert authorization logic to service
2023-11-15 18:54:54 +02:00
Yuri Tseretyan 85425b2194
Alerting: Fix flaky test TestExportRules (#77519)
* fix test to correclty mock data store

* Update pkg/services/ngalert/api/api_ruler_export_test.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

* Update pkg/services/ngalert/api/api_ruler_export_test.go

---------

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2023-11-01 21:35:04 +02:00
Yuri Tseretyan 027bd9356f
Alerting: Rule Modify Export APIs (#75322)
* extend RuleStore interface to get namespace by UID
* add new export API endpoints
* implement request handlers
* update authorization and wire handlers to paths
* add folder error matchers to errorToResponse
* add tests for export methods
2023-10-02 11:47:59 -04:00
Yuri Tseretyan 0053b07885
Alerting: Refactor of state manager tests (#72849)
* calculate cacheID instead of literals
   * use mocked clocks
   * advance clocks with the eval results
   * use clearer timestamp aliases
   * make expected state labels be more clear to read
Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2023-08-04 13:39:49 -04:00
Yuri Tseretyan b57ef1f2c7
Alerting: Fix TestIntegration_GetAlertRulesForScheduling to make sure rules are created in different org (#69088)
make sure rules are created in different org
2023-05-25 13:51:38 -04:00
ismail simsek 91221bc436
Expressions: Fixes the issue showing expressions editor (#62510)
* Use suggested value for uid

* update the snapshot

* use __expr__

* replace all -100 with __expr__

* update snapshot

* more changes

* revert redundant change

* Use expr.DatasourceUID where it's possible

* generate files
2023-01-31 18:50:10 +01:00
Alex Moreno 531b439cf1
Alerting: Add alert pausing feature (#60734)
* Add field in alert_rule model, add state to alert_instance model, and state to eval

* Remove paused state from eval package

* Skip paused alert rules in scheduler

* Add migration to add is_paused field to alert_rule table

* Convert to postable alerts only if not normal, pernding, or paused

* Handle paused eval results in state manager

* Add Paused state to eval package

* Add paused alerts logic in scheduler

* Skip alert on scheduler

* Remove paused status from eval package

* Apply suggestions from code review

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Remove state

* Rethink schedule and manager for paused alerts

* Change return to continue

* Remove unused var

* Rethink alert pausing

* Paused alerts storing annotations

* Only add one state transition

* Revert boolean method renaming refactor

* Revert take image refactor

* Make registry errors public

* Revert method extraction for getting a folder title

* Revert variable renaming refactor

* Undo unnecessary changes

* Revert changes in test

* Remove IsPause check in PatchPartiLAlertRule function

* Use SetNormal to set state

* Fix text by returning to old behaviour on alert rule deletion

* Add test in schedule_unit_test.go to test ticks with paused alerts

* Add coment to clarify usage of context.Background()

* Add comment to clarify resetStateByRuleUID method usage

* Move rule get to a more limited scope

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* rum gofmt on pkg/services/ngalert/schedule/schedule.go

* Remove defer cancel for context

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/testing.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* skip scheduler rule state clean up on paused alert rule

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Fix mock in test

* Add (hopefully) final suggestions

* Use error channel from recordAnnotationsSync to cancel context

* Run make gen-cue

* Place pause alert check in channel update after version check

* Reduce branching un update channel select

* Add if for error and move code inside if in state manager ResetStateByRuleUID

* Add reason to logs

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Do not delete alert rule routine, just exit on eval if is paused

* Reduce branching and create-close a channel to avoid deadlocks

* Separate state deletion and state reset (includes history saving)

* Add current pause state in rule route in scheduler

* Split clearState and bring errCh closer to RecordStatesAsync call

* Change rule to ruleMeta in RecordStatesAsync

* copy state to be able to modify it

* Add timeout to context creation

* Shorten the timeout

* Use resetState is rule is paused and deleteState if rule is not paused

* Remove Empty state reason

* Save every rule change in historian

* Add tests for DeleteStateByRuleUID and ResetStateByRuleUID

* Remove useless line

* Remove outdated comment

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
2023-01-26 18:29:10 +01:00
Santiago e5920c211e
Chore: Fix random indices for slices in test files (#61884)
* Fix random indices for slices in test files

* Empty commit
2023-01-24 15:07:37 -03:00
Matthew Jacobson 23e05373a7
Alerting: Fix flaky TestIntegrationUpdateAlertRules (#61641)
Prevents random OrgID=0 in test alert generation causing invalid alert rule.
2023-01-17 19:09:46 +00:00
Yuri Tseretyan abb49d96b5
Alerting: update state manager to return StateTransition instead of State (#58867)
* improve test for stale states
* update state manager return StateTransition
* update scheduler to accept state transitions
2022-12-06 13:07:39 -05:00
idafurjes 080ea88af7
Nested Folders: Support getting of nested folder in folder service wh… (#58597)
* Nested Folders: Support getting of nested folder in folder service when feature flag is set

* Fix lint

* Fix some tests

* Fix ngalert test

* ngalert fix

* Fix API tests

* Fix some tests and lint

* Fix lint 2

* Fix library elements and panels

* Add access control to get folder

* Cleanup and minor test change
2022-11-11 14:28:24 +01:00
Yuri Tseretyan bad4f28d0d
Alerting: update test TestAlertingTicker to not rely on clock (#58544)
* extract method processTick
* make processTick return scheduled rules
* move state manager tests to state manager
* update test
* move all tests into one file
* remove unused fields
2022-11-09 15:08:57 -05:00
Joe Blubaugh b476ae62fb
Alerting: Write and Delete multiple alert instances. (#55350)
Prior to this change, all alert instance writes and deletes happened
individually, in their own database transaction. This change batches up
writes or deletes for a given rule's evaluation loop into a single
transaction before applying it.

These new transactions are off by default, guarded by the feature toggle "alertingBigTransactions"

Before:

```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8           398           2991381 ns/op         1133537 B/op      27703 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.619s
```

After:

```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8          1440            816484 ns/op          352297 B/op       6529 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.383s
```

So we cut time by about 75% and memory allocations by about 60% when
storing and deleting 100 instances.
2022-10-06 14:22:58 +08:00
Yuriy Tseretyan 2d38664fe6
Alerting: Improve validation of query and expressions on rule submit (#53258)
* Improve error messages of server-side expression 
* move validation of alert queries and a condition to eval package
2022-09-21 15:14:11 -04:00
Joe Blubaugh 22c937340e
Revert "Alerting: Write and Delete multiple alert instances. (#54072)" (#54885)
This reverts commit 5e4fd94413.
2022-09-09 17:44:06 +02:00
Joe Blubaugh 5e4fd94413
Alerting: Write and Delete multiple alert instances. (#54072)
Prior to this change, all alert instance writes and deletes happened
individually, in their own database transaction. This change batches up
writes or deletes for a given rule's evaluation loop into a single
transaction before applying it.

Before:
```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8           398           2991381 ns/op         1133537 B/op      27703 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.619s
```

After:
```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8          1440            816484 ns/op          352297 B/op       6529 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.383s
```

So we cut time by about 75% and memory allocations by about 60% when
storing and deleting 100 instances.

This change also updates some of our tests so that they run successfully against postgreSQL - we were using random Int64s, but postgres integers, which our tables use, max out at 2^31-1
2022-09-02 11:17:20 +08:00
Yuriy Tseretyan 41bd36eb97
Alerting: Update rules delete endpoint to handle rules in group (#53790)
* update RouteDeleteAlertRules rules to update as a group
* remove expecter from scheduler mock to support variadic function
* create function to check for provisioning status + tests

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2022-08-24 15:33:33 -04:00
Yuriy Tseretyan 5fb778814c
Alerting: Update rules version when folder title is updated (#53013)
* remove support for bus from scheduler
* rename event to FolderTitleUpdated and fire only if title has changed
* add method to increase version of all rules that belong to a folder
* update ngalert service to subscribe to folder title change event call data store and update scheduler
* add tests
2022-08-01 19:28:38 -04:00
Yuriy Tseretyan 054fe54b03
Alerting: Split Scheduler and AlertRouter tests (#52416)
* move fake FakeExternalAlertmanager to sender package
* move tests from scheduler to router
* update alerts router to have all fields private
* update scheduler tests to use sender mock
2022-07-19 09:32:54 -04:00