What is this feature?
This PR implements a jitter mechanism for periodic alert state storage to distribute database load over time instead of processing all alert instances simultaneously. When enabled via the `state_periodic_save_jitter_enabled` configuration option, the system spreads batch write operations across 85% of the save interval window, preventing database load spikes in high-cardinality alerting environments.
Why do we need this feature?
In production environments with high alert cardinality, the current periodic batch storage can cause database performance issues by processing all alert instances simultaneously at fixed intervals. Even when using periodic batch storage to improve performance, concentrating all database operations at a single point in time can overwhelm database resources, especially in resource-constrained environments.
Rather than performing all INSERT operations at once during the periodic save, distributing these operations across the time window until the next save cycle can maintain more stable service operation within limited database resources. This approach prevents resource saturation by spreading the database load over the available time interval, allowing the system to operate more gracefully within existing resource constraints.
For example, with 200,000 alert instances using a 5-minute interval and 4,000 batch size, instead of executing 50 batch operations simultaneously, the jitter mechanism distributes these operations across approximately 4.25 minutes (85% of 5 minutes), with each batch executed roughly every 5.2 seconds.
This PR provides system-level protection against such load spikes by distributing operations across time, reducing peak resource usage while maintaining the benefits of periodic batch storage. The jitter mechanism is particularly valuable in resource-constrained environments where maintaining consistent database performance is more critical than precise timing of state updates.
* feat(plugins): add a way to expose core apis only to certain plugins
* review: update naming
* review: update the owners of the feature toggle
* feat: share the restricted apis with extensions
* fix: linters
* feat: remove the `addPanel` api
* chore: fix linting and betterer issue
* tests: use `@ts-expect-error` for more clarity
* Add setting to disable username based brute force login protection
* Use new DisableUsernameLoginProtection setting in tests where appropriate
* Update documentation for other brute force directives
* Avoid unnecessary database calls
* Add test cases for username and IP protection settings
* Revert "Revert: Future-proofing query and data source model in Dashboard Sche… (#107985)"
This reverts commit 13a89d4ae3.
* Revert "Revert "Schema V2: Simplify annotations v1<->v2 conversions" (#107984)"
This reverts commit 2b8c5bea1a.
* make gen apps
* e2e update
* Use v2alpha2 by default (#108177)
* Use v2alpha2 by default
* Apply only DS changes to alpha2
* Use v2alpha2 by default except to query
* Create a v2 index in @grafana/schema
* Update path and apply lint
* Update tests
* Update imports to v2 status
* Fix failing openapi test
* Schemav2 breaking changes: conversion implementation (#108224)
* provision v2alpha1 dashboard
* Run conversions for DS refactor
* Run snapshot testing on conversions
* Normalize output name
* Update snapshots to include all panel and variable cases
* fix lint
* fix lint
* fix test and go lint
* more go lint
---------
Co-authored-by: Ivan Ortega <ivanortegaalba@gmail.com>
Co-authored-by: Haris Rozajac <haris.rozajac12@gmail.com>
* Schema v2: Introduce group/datasource convention to GroupBy and AdHoc variable (#108237)
* Schema v2: Introduce group/datasource convention to GroupBy and AdHoc variables
* add conversion
* App Installer: Authorizer support (#108419)
* Chore: use `satisfies` and remove a load of `any`s (#108397)
use satisfies and remove a load of anys
* improve logging and fail unified-storage migration with more than 0 errors (#108471)
improve logging and fail unified-storage migration with more than 0 errors
* fix conversion test
* Secrets: Create more granular fixed roles for SecureValues (#108382)
* Provisioning: Fix bug in job progress recording (#108440)
Fix bug in job progress recording
* Provisioning: Fix ImportAllPanelsFromLocalRepository test (#108441)
* Provisioning: Skip flaky test
* Fix flaky provisioning test
* Fix lint
---------
Co-authored-by: Roberto Jimenez Sanchez <roberto.jimenez@grafana.com>
* BulkDeleteProvisionedResource: Move progress bar into a second step (#108417)
* Move progress bar into a second step
---------
Co-authored-by: Alex Khomenko <Clarity-89@users.noreply.github.com>
* [Dashboard Schema Codegen] Move dashboard CUE codegen block back up into kind body (#108476)
[Dashboard Schema Codegen] Move dashboard CUE codegen block back up into kind body to make sure new versions have the same settings.
---------
Co-authored-by: Haris Rozajac <haris.rozajac12@gmail.com>
Co-authored-by: Todd Treece <360020+toddtreece@users.noreply.github.com>
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
Co-authored-by: Will Assis <35489495+gassiss@users.noreply.github.com>
Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>
Co-authored-by: Roberto Jiménez Sánchez <jszroberto@gmail.com>
Co-authored-by: Roberto Jimenez Sanchez <roberto.jimenez@grafana.com>
Co-authored-by: Yunwen Zheng <yunwen.zheng@grafana.com>
Co-authored-by: Alex Khomenko <Clarity-89@users.noreply.github.com>
Co-authored-by: Austin Pond <IfSentient@users.noreply.github.com>
Co-authored-by: Ivan Ortega <ivanortegaalba@gmail.com>
* Dashboard Schema V2: Refactor VizConfigKind to follow DataQueryKind convention (#108148)
* Dashboards API: Register v2alpha2 API
* Prepare conversion functions
* Fix test
* Refactor VizConfigKind to follow DataQueryKind convention
* fix tests
* use new dataquerykind convention alpha 2
* add conversion
* fix tests
* fix tests
* fix another test
* Fix merge
---------
Co-authored-by: Dominik Prokop <dominik.prokop@grafana.com>
* fix k8s codegen
* Update e2e-playwright/dashboards/TestV2Dashboard.json
* Update e2e/dashboards/TestV2Dashboard.json
* revert app generation for non-related apps
* try again
* another try
* also revert folder and secret app generation
* v2alpha1 provisioned dashboard
* Fix kind
* Fix conversion snapshots
* Update API discovery registry
* Rename to v2beta1
* Rename migrations
* Update apps/dashboard/pkg/apis/dashboard/v2beta1/doc.go
Co-authored-by: Stephanie Hingtgen <stephanie.hingtgen@grafana.com>
* Ensure conditional rendering and other non changed properties
---------
Co-authored-by: Ivan Ortega <ivanortegaalba@gmail.com>
Co-authored-by: Haris Rozajac <haris.rozajac12@gmail.com>
Co-authored-by: Todd Treece <360020+toddtreece@users.noreply.github.com>
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
Co-authored-by: Will Assis <35489495+gassiss@users.noreply.github.com>
Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>
Co-authored-by: Roberto Jiménez Sánchez <jszroberto@gmail.com>
Co-authored-by: Roberto Jimenez Sanchez <roberto.jimenez@grafana.com>
Co-authored-by: Yunwen Zheng <yunwen.zheng@grafana.com>
Co-authored-by: Alex Khomenko <Clarity-89@users.noreply.github.com>
Co-authored-by: Austin Pond <IfSentient@users.noreply.github.com>
Co-authored-by: Haris Rozajac <58232930+harisrozajac@users.noreply.github.com>
Co-authored-by: Stephanie Hingtgen <stephanie.hingtgen@grafana.com>
* Cloud migrations: store snapshots in the database
* update github.com/grafana/grafana-cloud-migration-snapshot to v1.9.0
* make update-workspace
* use new field name in test
* return error after call to fmt.Errorf
* create methods for readability / fix session deletion not deleting snapshots
* remove debugging changes
* update sample.ini
* update tests to include OrgID in ListSnapshotsQuery
* lint
* lint
* Update pkg/services/cloudmigration/cloudmigrationimpl/snapshot_mgmt.go
Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>
* remove TODO
* Update pkg/services/cloudmigration/cloudmigrationimpl/snapshot_mgmt.go
Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>
* remove one of the debug logs
---------
Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>
* wip
* docker compose dev setup
* commit new tilt stuff
* move files into own dir
* reset files back to main
* use just one nginx container for both gateway and cdn
* update proxy service name
* make it all work when in subdir
* rename more things
* reset more changes
* fix config
* add makefile command, fix ws upgrade
* add local check script
* tidy
* tidy up, comments, readme
* codeowners
* change cdn host to localhost to avoid adding host.docker.internal to /etc/hosts
* route POST /login to backend
* Build nginx container with config baked in so it can be live_update-ed
* fix headers
**What is this feature?**
This PR implements a new Prometheus historian backend that allows Grafana alerting to write alert state history as Prometheus-compatible `ALERTS` metrics to remote Prometheus-compatible data sources.
The metric includes a few additional labels:
* `grafana_alertstate`: Grafana's full alert state, more granular than Prometheus.
* `grafana_rule_uid`: Grafana's alert rule UID.
Grafana states are included in the `grafana_alertstate` label and are also mapped to Prometheus-compatible `alertstate` values:
| Grafana alert state | `alertstate` | `grafana_alertstate` |
|---------------------|-----------------------|-----------------------|
| `Alerting` | `firing` | `alerting` |
| `Recovering` | `firing` | `recovering` |
| `Pending` | `pending` | `pending` |
| `Error` | `firing` | `error` |
| `NoData` | `firing` | `nodata` |
| `Normal` | _(no metric emitted)_ | _(no metric emitted)_ |
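The mapping in the table can be expressed as a small lookup. This is a sketch only: `promAlertState` is a hypothetical name, the state strings are taken from the table, and treating unrecognized states as "no metric emitted" is an assumption rather than the behavior of the historian backend.

```go
package main

import "fmt"

// promAlertState maps a Grafana alert state to the values of the
// `alertstate` and `grafana_alertstate` labels from the table above.
// emit is false when no ALERTS metric should be written.
// Hypothetical helper; not the function used by the historian backend.
func promAlertState(grafanaState string) (alertstate, grafanaAlertstate string, emit bool) {
	switch grafanaState {
	case "Alerting":
		return "firing", "alerting", true
	case "Recovering":
		return "firing", "recovering", true
	case "Pending":
		return "pending", "pending", true
	case "Error":
		return "firing", "error", true
	case "NoData":
		return "firing", "nodata", true
	default:
		// "Normal" emits no metric; unknown states are skipped too (assumption).
		return "", "", false
	}
}

func main() {
	for _, s := range []string{"Alerting", "Recovering", "Pending", "Error", "NoData", "Normal"} {
		prom, grafana, emit := promAlertState(s)
		fmt.Printf("%s: emit=%v alertstate=%q grafana_alertstate=%q\n", s, emit, prom, grafana)
	}
}
```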
Adds a new "Allow as recording rules target" toggle to Prometheus datasource configuration that controls whether the datasource can be selected as a target for writing recording rules.
---------
Co-authored-by: ismail simsek <ismailsimsek09@gmail.com>
Co-authored-by: Konrad Lalik <konradlalik@gmail.com>
* Alerting: Add support for Redis Sentinel
* docs
* docs
* Use minisentinel in test
* Apply suggestions from code review
Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>
Co-authored-by: Fayzal Ghantiwala <114010985+fayzal-g@users.noreply.github.com>
* "address(es)" -> "address or addresses"
* make update-workspace
* make lint-go-diff
---------
Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>
Co-authored-by: Fayzal Ghantiwala <114010985+fayzal-g@users.noreply.github.com>
* feat: preinstall_sync config - process and installation logic
* ref: add preinstall_sync list to preinstalled plugins of frontendsettings
* fix: conf blank line for sections
* ref: remove plugins async flag, and rename PreinstallPlugins
* docs: default installed plugin list