Commit Graph

16205 Commits

Author SHA1 Message Date
beorn7 23f1d3ba25 tsdb: Remove unused `Layout()` methods
Both `HistogramChunk` and `FloatHistogramChunk` have a `Layout()`
method for historical reasons. As it has turned out, these methods are
unused and also buggy. This commit simply removes them.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 17:01:23 +02:00
beorn7 71c21fb9e4 Fix minor issues after applying analyzer "modernize"
- The tool left an empty line behind that we don't need anymore, see
  https://github.com/prometheus/prometheus/pull/17092. (Arguably not a
  bug in the tool but just our stricter style about empty lines.)

- In tsdb/index/postings_test.go , our (admittedly somewhat
  convoluted) code structure tricked the tool so it spit out something
  that wouldn't even compile.

- storage/remote/queue_manager_test.go is just a minor formatting
  nit.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 15:44:11 +02:00
beorn7 747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Ayoub Mrini 9cbb3a66c9
Merge pull request #17063 from machine424/muttt
test(notifier): add a test showing an alert mutation bug between alertmanager_config and fix it
2025-08-27 11:13:00 +02:00
pipiland2612 0246aa22f4 Parralel test
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-27 10:47:39 +02:00
Darkknight 9fc4212214
revert unexpected metadata metric fopr RWV2 and add log on unexpected metadata instead. (#17082)
Signed-off-by: leegin <leegin.t@gmail.com>
2025-08-26 11:54:14 -07:00
SuperQ b1802bae0c
Add changlog entry
Add changelog entry to pull in #16925 to v3.6.0.

Signed-off-by: SuperQ <superq@gmail.com>
2025-08-26 18:16:42 +02:00
Ganesh Vernekar b98cc631a2
Restore stale series count from chunk snapshots
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:53 +02:00
Ganesh Vernekar c3789ff547
Restore stale series count on WAL replay
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:45 +02:00
Ganesh Vernekar 787fe92e86
Test the stale series tracking in Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:39 +02:00
Ganesh Vernekar 4a37fd886f
Track stale series in the Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-26 15:45:32 +02:00
Lukasz Mierzwa 31282d67b7 Log when GC / block write starts
Right now Prometheus only logs when these operations are completed.
It's a bit surprising to see suddenly a message saying "I was busy doing X for the past N minutes"
so let's add a message when the operation starts, so it's easier to understand what Prometheus was doing at any point in time
when reading logs.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-08-26 10:30:22 +01:00
bragi92 20580b6ba8
remote_write azure auth : add workload identity support (#16788)
* initial changes

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* .

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* fix comments

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* fix tenantid test

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* style

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* Update storage/remote/azuread/azuread.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>

* pr feedback

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

---------

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2025-08-26 07:14:47 +01:00
machine424 8f79470ca9
fix(notifier): create a new alert when relabeling alters labels
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-26 07:37:45 +02:00
SuperQ b87cbf0294
Fixup err nil checks
Cleanup double `if` statements for errors being nil / not-nil.

Signed-off-by: SuperQ <superq@gmail.com>
2025-08-25 17:37:02 +02:00
machine424 bd725fd6b8
test(notifier): add a test showing an alert mutation bug between alertmanager_config (alertmanagersets)
The alert_relabel_configs should only apply to the corresponding alertmanagerset

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-25 17:04:14 +02:00
Darkknight 7cf585527f
remote_write: add metric for unexpected metadata in populateV2TimeSeries (#17034)
add metric to track unexpected metadata seen in populateV2TimeSeries, which would indicate metadata incorrectly routed in queue_manager code paths

---------

Signed-off-by: leegin <leegin.t@gmail.com>
Signed-off-by: Darkknight <leegin.t@gmail.com>
2025-08-22 10:33:52 -07:00
Bryan Boreham 153cdb2b0b [PERF] PromQL: Replace Fprintf %f with AppendFloat
The combination of `AvailableBuffer`` followed by `Write` is optimised
inside `bytes.Buffer`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-22 10:58:01 +01:00
Minh Nguyen c8deefb038
[tsdb] Add CounterResetHint: CounterReset to synthetic zero sample (#17011)
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-21 23:26:01 +02:00
Marco Pracucci 954cad35b2
Optimise concurrent rule evaluation for rules querying ALERTS and ALERTS_FOR_STATE (#17064)
* Optimise concurrent rule evaluation for rules querying ALERTS and ALERTS_FOR_STATE

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Further optimised the case of ALERTS and ALERTS_FOR_STATE without alertname label matcher

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2025-08-21 16:57:57 +02:00
machine424 9855613435 fix PR number
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-21 15:28:03 +02:00
machine424 94b4c49a76 apply bboreham's suggestions
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-21 15:28:03 +02:00
machine424 157ed00d9d chore: prepare release 3.6.0-rc.0
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-08-21 15:28:03 +02:00
Bryan Boreham b8d2d505f5 [PERF] PromQL: Replace some Sprintf with bytes.Buffer
Goes faster due to reduced memory allocation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:38:05 +01:00
Bryan Boreham 49d9261693 [PERF] PromQL: Replace some simple Sprintf with string concat
This goes faster because there is no runtime format parsing.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:05:49 +01:00
Bryan Boreham e44ee2f182 [TESTS] PromQL: Add BenchmarkExprString
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:05:38 +01:00
Bryan Boreham 66fbea97bb [TESTS] Check expr with function call in TestExprString
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-21 11:04:46 +01:00
Bryan Boreham 8b3f59e9c3
Merge pull request #16593 from bboreham/ast-child-iter
[PERF] PromQL: Reduce allocations when walking syntax tree
2025-08-21 09:14:41 +01:00
cuiweixie 8e48f43e06 discovery: refactor to use reflect.TypeFor
Signed-off-by: cuiweixie <cuiweixie@gmail.com>
2025-08-21 13:52:30 +08:00
Björn Rabenstein 7e37066994
Merge pull request #17051 from juliusmh/nh_counter_reset_hint_collision_annotation
Histograms: set annotation when adding or subtracting histograms that have `not_reset` and `reset` hints.
2025-08-20 17:24:12 +02:00
Julius Hinze cdf7208478
annotations: histogram counter reset warning includes operation
Signed-off-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-20 15:14:21 +02:00
Julius Hinze 77b5c3f217
Histograms: set annotation when adding or subtracting histograms that have `not_reset` and `reset` hints.
Signed-off-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-20 15:00:45 +02:00
Bryan Boreham 498f63e60b
Merge pull request #17029 from pr00se/wal-checkpoint-dropped-samples
TSDB: use timestamps rather than WAL segment numbers to track how long deleted series should be retained in checkpoints
2025-08-20 11:15:10 +01:00
Bartlomiej Plotka 5dc3c976b4
Merge pull request #17061 from prometheus/not-parallel
[TESTS] remote-write: Make TestShutdown non-parallel to reduce flakes.
2025-08-20 09:03:45 +01:00
Ganesh Vernekar a86d9a3858
Merge pull request #16925 from prometheus/codesome/stale-series-tracking
tsdb: Track stale series in the Head block based on stale sample
2025-08-19 15:35:19 -07:00
Patryk Prus bbc9e47e42
Add comment about differences between agent mode and regular Prometheus
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-19 18:33:52 -04:00
Ganesh Vernekar 3904b3cd5f Restore stale series count from chunk snapshots
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar b29ce3e489 Restore stale series count on WAL replay
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar 0c3d3d7466 Test the stale series tracking in Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar 7a947d3629 Track stale series in the Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:27 -07:00
Bryan Boreham a3c4a9bd18 [TESTS] remote-write: Make TestShutdown non-parallel to reduce flakes.
Resolves #17045.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-19 18:04:20 +01:00
Justin Jung 0f98dcbc07
Engine: Allow error response code to be customized (#16257)
Currently the API always returns http code 422 for engine execution error, and

This PR allows the error code to be overriden, based on the ErrorType and the error itself.

Signed-off-by: Justin Jung <jungjust@amazon.com>
Signed-off-by: Justin Jung <justinjung04@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
2025-08-19 16:43:47 +01:00
Bartlomiej Plotka 93bbf4bc90
Merge pull request #17041 from bernot-dev/remove-queue-manager-startup-benchmark
test: remove obsolete queue manager test
2025-08-18 17:06:39 +01:00
Arve Knudsen 0a40df33fb
Make metric/label name validation scheme explicit (#16928)
* Parameterize metric/label name validation scheme

Parameterized metric/label name validation scheme

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-18 08:09:00 +00:00
Arve Knudsen 68d0d3eee3
Remote write: Return after writing error response for invalid compression (#17050)
* Remote write: Return after writing error response for invalid compression

Fix remote write HTTP handler to return after writing error response for
invalid compression (non-Snappy).

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-08-17 15:18:47 +00:00
Bryan Boreham af588174b0
Merge pull request #17052 from vicwicker/fix-chunk-encoding-docs
docs: Fix chunk format documentation for `varint` encoding
2025-08-17 09:35:19 +01:00
Adam Bernot 575a60ec92
test: fix flaky test
A race condition in TestSendSamplesWithBackoffWithSampleAgeLimit was
observed in CI where the sample age limit was too close to the backoff
time, causing samples to be dropped intermittently. Increasing the
SampleAgeLimit resolves the problem.

Signed-off-by: Adam Bernot <bernot@google.com>
2025-08-15 18:58:27 +00:00
Björn Rabenstein e8f650b00c
Merge pull request #17049 from prometheus/beorn7/doc
docs: counter vs. gauge histogram behavior with `+`/`-`
2025-08-15 11:19:51 +02:00
Victor Herrero Otal 0cbbc9b7d3 docs: Fix chunk format documentation for `varint` encoding
While preparing PR #16701, we identified an inconsistency in the chunk
format documentation. The `varint` encoding can require up to 10 bytes
for a 64-bit integer, such as when timestamps are encoded. However, the
chunk length field is a 32-bit integer, which requires at most 5 bytes
in `varint` encoding.

This is reflected in the code, where a maximum of 5 bytes are read when
parsing the chunk length.

    50ba25f273/tsdb/chunks/chunks.go (L709-L711)

    50ba25f273/tsdb/chunks/chunks.go (L47-L48)

Co-authored-by: Istvan Zoltan Ballok <istvan.zoltan.ballok@sap.com>
Signed-off-by: Victor Herrero Otal <victor.herrero.otal@sap.com>
2025-08-15 10:56:21 +02:00
Björn Rabenstein 1c002c5669
Merge pull request #17048 from prymitive/parserErr
ENHANCEMENT: Refactor TestParseExpressions to be more explicit about errors
2025-08-14 16:04:16 +02:00