Commit Graph

16217 Commits

Author SHA1 Message Date
machine424 ba14bc49db
chore: deprecate prometheus_remote_storage_{samples,exemplars,histograms}_in_total and prometheus_remote_storage_highest_timestamp_in_seconds
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-01 13:19:28 +02:00
machine424 184c7eb918
storage/remote: compute highestTimestamp and dataIn at QueueManager level
Because of relabelling, an endpoint can only select a subset of series
that go through WriteStorage

Having a highestTimestamp at WriteStorage level yields wrong values
if the corresponding sample won't even make it to a remote queue.

Currently PrometheusRemoteWriteBehind is based on that, and would fire
if an endpoint is only interested in a subset of series that take time
to appear.

A "prometheus_remote_storage_queue_highest_timestamp_seconds" that only
takes into account samples in the queue is introduced, and used in
PrometheusRemoteWriteBehind and dashboards in documentation/prometheus-mixin

Same applies to samplesIn/dataIn, QueueManager should know more about
when to update those; when data is enqueued.

That makes dataDropped unnecessary, thus help simplify the logic
in QueueManager.calculateDesiredShards()

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-01 13:19:24 +02:00
Ayoub Mrini 2cbeef6d95
Merge pull request #17090 from machine424/bump_go
chore: add a "make bump-go-version" to handle Go bumps across all files
2025-09-01 13:05:29 +02:00
Bartlomiej Plotka 18626a99c4
fix(rules.Manager): ensure non-nil context (#17103)
Saw some panic on main due to lack of defaulting:
https://github.com/prometheus/prometheus/actions/runs/17317373582/job/49162760911

Signed-off-by: bwplotka <bwplotka@gmail.com>
2025-08-29 10:43:59 +01:00
bwplotka 172cde8af1 Revert "feat(storage): add new CombinedAppender interface and compatibility layer"
This reverts commit 2fb680a229.
2025-08-29 08:16:39 +01:00
bwplotka 794bf774c2 Reapply "prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)"
This reverts commit f5fab47577.
2025-08-29 08:16:37 +01:00
bwplotka f5fab47577 Revert "prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)"
This reverts commit c808a71e18.
2025-08-29 08:15:28 +01:00
bwplotka 2fb680a229 feat(storage): add new CombinedAppender interface and compatibility layer
Signed-off-by: bwplotka <bwplotka@gmail.com>
2025-08-29 08:14:34 +01:00
Jonathan c808a71e18
prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)
* chore: send Unit and Type when feature flag is enabled

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused code and comments

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unreal scenario

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused if

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused labels

Signed-off-by: perebaj <perebaj@gmail.com>

* linter

Signed-off-by: perebaj <perebaj@gmail.com>

* enable type and unit through remotewrite config

Signed-off-by: perebaj <perebaj@gmail.com>

* remove test comment and capture type and unit when flag is enabled

Signed-off-by: perebaj <perebaj@gmail.com>

* gofumpt

Signed-off-by: perebaj <perebaj@gmail.com>

* modelTypeToWriteV2Type

Signed-off-by: perebaj <perebaj@gmail.com>

* use NewMetadataFromLabels

Signed-off-by: perebaj <perebaj@gmail.com>

* capture feature flag from main

Signed-off-by: perebaj <perebaj@gmail.com>

* simplifying logic

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused function

Signed-off-by: perebaj <perebaj@gmail.com>

* formatting code

Signed-off-by: perebaj <perebaj@gmail.com>

* gofumpt

Signed-off-by: perebaj <perebaj@gmail.com>

* remove public var: EnableTypeAndUnitLabels

Signed-off-by: perebaj <perebaj@gmail.com>

* remove enableTypeAndUnitLabels from TestPopulateV2TimeSeries_typeAndUnitLabels

Signed-off-by: perebaj <perebaj@gmail.com>

* remove enableTypeAndUnitLabels from main

Signed-off-by: perebaj <perebaj@gmail.com>

* use schema helper to populate metadata

Signed-off-by: perebaj <perebaj@gmail.com>

* remove metadata since nil is the default value

Signed-off-by: perebaj <perebaj@gmail.com>

* add TestPopulateV2TimeSeries_UnexpectedMetadata

Signed-off-by: perebaj <perebaj@gmail.com>

* Update storage/remote/queue_manager_test.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

---------

Signed-off-by: perebaj <perebaj@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2025-08-29 04:10:01 +00:00
Björn Rabenstein ba808d1736
Merge pull request #17092 from prometheus/beorn7/cleanup
Apply analyzer modernize to the whole codebase
2025-08-28 00:42:33 +02:00
Bryan Boreham 2fb50b12cd
[PERF] TSDB: Optimize appender creation on empty chunks (#16922)
Skip creating an iterator and walking all through any existing values,
when we can easily tell there are no existing values.

This is the normal case - the TSDB head creates an appender immediately
after creating every chunk.

Remove redundant handling of empty chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-27 17:11:08 +01:00
Bryan Boreham 4a782634a4
Merge pull request #17093 from prometheus/beorn7/histogram
tsdb: Remove unused `Layout()` methods
2025-08-27 16:26:43 +01:00
beorn7 23f1d3ba25 tsdb: Remove unused `Layout()` methods
Both `HistogramChunk` and `FloatHistogramChunk` have a `Layout()`
method for historical reasons. As it has turned out, these methods are
unused and also buggy. This commit simply removes them.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 17:01:23 +02:00
beorn7 71c21fb9e4 Fix minor issues after applying analyzer "modernize"
- The tool left an empty line behind that we don't need anymore, see
  https://github.com/prometheus/prometheus/pull/17092. (Arguably not a
  bug in the tool but just our stricter style about empty lines.)

- In tsdb/index/postings_test.go , our (admittedly somewhat
  convoluted) code structure tricked the tool so it spit out something
  that wouldn't even compile.

- storage/remote/queue_manager_test.go is just a minor formatting
  nit.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 15:44:11 +02:00
beorn7 747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Ayoub Mrini 9cbb3a66c9
Merge pull request #17063 from machine424/muttt
test(notifier): add a test showing an alert mutation bug between alertmanager_config and fix it
2025-08-27 11:13:00 +02:00
pipiland2612 0246aa22f4 Parralel test
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-27 10:47:39 +02:00
Darkknight 9fc4212214
revert unexpected metadata metric fopr RWV2 and add log on unexpected metadata instead. (#17082)
Signed-off-by: leegin <leegin.t@gmail.com>
2025-08-26 11:54:14 -07:00
SuperQ b1802bae0c
Add changlog entry
Add changelog entry to pull in #16925 to v3.6.0.

Signed-off-by: SuperQ <superq@gmail.com>
2025-08-26 18:16:42 +02:00
Ganesh Vernekar b98cc631a2
Restore stale series count from chunk snapshots
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:53 +02:00
Ganesh Vernekar c3789ff547
Restore stale series count on WAL replay
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:45 +02:00
Ganesh Vernekar 787fe92e86
Test the stale series tracking in Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:39 +02:00
Ganesh Vernekar 4a37fd886f
Track stale series in the Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:32 +02:00
Lukasz Mierzwa 31282d67b7 Log when GC / block write starts
Right now Prometheus only logs when these operations are completed.
It's a bit surprising to see suddenly a message saying "I was busy doing X for the past N minutes"
so let's add a message when the operation starts, so it's easier to understand what Prometheus was doing at any point in time
when reading logs.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-08-26 10:30:22 +01:00
bragi92 20580b6ba8
remote_write azure auth : add workload identity support (#16788)
* initial changes

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* .

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* fix comments

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* fix tenantid test

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* style

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* pr feedback

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

---------

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2025-08-26 07:14:47 +01:00
machine424 8f79470ca9
fix(notifier): create a new alert when relabeling alters labels
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-26 07:37:45 +02:00
SuperQ b87cbf0294
Fixup err nil checks
Cleanup double `if` statements for errors being nil / not-nil.

Signed-off-by: SuperQ <superq@gmail.com>
2025-08-25 17:37:02 +02:00
machine424 bd725fd6b8
test(notifier): add a test showing an alert mutation bug between alertmanager_config (alertmanagersets)
The alert_relabel_configs should only apply to the corresponding alertmanagerset

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-25 17:04:14 +02:00
Darkknight 7cf585527f
remote_write: add metric for unexpected metadata in populateV2TimeSeries (#17034)
add metric to track unexpected metadata seen in populateV2TimeSeries, which would indicate metadata incorrectly routed in queue_manager code paths

---------

Signed-off-by: leegin <leegin.t@gmail.com>
Signed-off-by: Darkknight <leegin.t@gmail.com>
2025-08-22 10:33:52 -07:00
Bryan Boreham 153cdb2b0b [PERF] PromQL: Replace Fprintf %f with AppendFloat
The combination of `AvailableBuffer`` followed by `Write` is optimised
inside `bytes.Buffer`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-22 10:58:01 +01:00
Minh Nguyen c8deefb038
[tsdb] Add CounterResetHint: CounterReset to synthetic zero sample (#17011)
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-21 23:26:01 +02:00
Marco Pracucci 954cad35b2
Optimise concurrent rule evaluation for rules querying ALERTS and ALERTS_FOR_STATE (#17064)
* Optimise concurrent rule evaluation for rules querying ALERTS and ALERTS_FOR_STATE

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Further optimised the case of ALERTS and ALERTS_FOR_STATE without alertname label matcher

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2025-08-21 16:57:57 +02:00
machine424 9855613435 fix PR number
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-21 15:28:03 +02:00
machine424 94b4c49a76 apply bboreham's suggestions
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-21 15:28:03 +02:00
machine424 157ed00d9d chore: prepare release 3.6.0-rc.0
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-21 15:28:03 +02:00
Bryan Boreham b8d2d505f5 [PERF] PromQL: Replace some Sprintf with bytes.Buffer
Goes faster due to reduced memory allocation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:38:05 +01:00
Bryan Boreham 49d9261693 [PERF] PromQL: Replace some simple Sprintf with string concat
This goes faster because there is no runtime format parsing.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:05:49 +01:00
Bryan Boreham e44ee2f182 [TESTS] PromQL: Add BenchmarkExprString
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:05:38 +01:00
Bryan Boreham 66fbea97bb [TESTS] Check expr with function call in TestExprString
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:04:46 +01:00
Bryan Boreham 8b3f59e9c3
Merge pull request #16593 from bboreham/ast-child-iter
[PERF] PromQL: Reduce allocations when walking syntax tree
2025-08-21 09:14:41 +01:00
cuiweixie 8e48f43e06 discovery: refactor to use reflect.TypeFor
Signed-off-by: cuiweixie <cuiweixie@gmail.com>
2025-08-21 13:52:30 +08:00
Björn Rabenstein 7e37066994
Merge pull request #17051 from juliusmh/nh_counter_reset_hint_collision_annotation
Histograms: set annotation when adding or subtracting histograms that have `not_reset` and `reset` hints.
2025-08-20 17:24:12 +02:00
Julius Hinze cdf7208478
annotations: histogram counter reset warning includes operation
Signed-off-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-20 15:14:21 +02:00
Julius Hinze 77b5c3f217
Histograms: set annotation when adding or subtracting histograms that have `not_reset` and `reset` hints.
Signed-off-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-20 15:00:45 +02:00
Bryan Boreham 498f63e60b
Merge pull request #17029 from pr00se/wal-checkpoint-dropped-samples
TSDB: use timestamps rather than WAL segment numbers to track how long deleted series should be retained in checkpoints
2025-08-20 11:15:10 +01:00
Bartlomiej Plotka 5dc3c976b4
Merge pull request #17061 from prometheus/not-parallel
[TESTS] remote-write: Make TestShutdown non-parallel to reduce flakes.
2025-08-20 09:03:45 +01:00
Ganesh Vernekar a86d9a3858
Merge pull request #16925 from prometheus/codesome/stale-series-tracking
tsdb: Track stale series in the Head block based on stale sample
2025-08-19 15:35:19 -07:00
Patryk Prus bbc9e47e42
Add comment about differences between agent mode and regular Prometheus
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-19 18:33:52 -04:00
Ganesh Vernekar 3904b3cd5f Restore stale series count from chunk snapshots
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar b29ce3e489 Restore stale series count on WAL replay
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00