Commit Graph

304 Commits

Author SHA1 Message Date
György Krajcsovits d11ee103ac
perf(tsdb): reuse map of sample types to speed up head appender
While investigating +10% CPU in v3.7 release, found that ~5% is from
expanding the types map. Try reuse.

Also fix a linter error.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-10-07 08:31:20 +02:00
Bryan Boreham 0d3ec05056
Merge pull request #17043 from machine424/ffl
chore: allow seamless use of testing/synctest for >=go1.24
2025-09-30 12:11:12 +01:00
Jan Fajerski c9e0e36701
Add comments clarifying why promql.Querylogger exists (#17231)
buf.build / lint and publish (push) Waiting to run Details
CI / Go tests (push) Waiting to run Details
CI / More Go tests (push) Waiting to run Details
CI / Go tests with previous Go version (push) Waiting to run Details
CI / UI tests (push) Waiting to run Details
CI / Go tests on Windows (push) Waiting to run Details
CI / Mixins tests (push) Waiting to run Details
CI / Build Prometheus for common architectures (0) (push) Waiting to run Details
CI / Build Prometheus for common architectures (1) (push) Waiting to run Details
CI / Build Prometheus for common architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (0) (push) Waiting to run Details
CI / Build Prometheus for all architectures (1) (push) Waiting to run Details
CI / Build Prometheus for all architectures (10) (push) Waiting to run Details
CI / Build Prometheus for all architectures (11) (push) Waiting to run Details
CI / Build Prometheus for all architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (3) (push) Waiting to run Details
CI / Build Prometheus for all architectures (4) (push) Waiting to run Details
CI / Build Prometheus for all architectures (5) (push) Waiting to run Details
CI / Build Prometheus for all architectures (6) (push) Waiting to run Details
CI / Build Prometheus for all architectures (7) (push) Waiting to run Details
CI / Build Prometheus for all architectures (8) (push) Waiting to run Details
CI / Build Prometheus for all architectures (9) (push) Waiting to run Details
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions Details
CI / Check generated parser (push) Waiting to run Details
CI / golangci-lint (push) Waiting to run Details
CI / fuzzing (push) Waiting to run Details
CI / codeql (push) Waiting to run Details
CI / Publish main branch artifacts (push) Blocked by required conditions Details
CI / Publish release artefacts (push) Blocked by required conditions Details
CI / Publish UI on npm Registry (push) Blocked by required conditions Details
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run Details
And why we only have one implementation in this code base.

Fixes: https://github.com/prometheus/prometheus/issues/15869

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2025-09-26 15:33:09 +01:00
machine424 365409d3be
chore: allow seamless use of testing/synctest for >=go1.24
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-19 22:48:25 +02:00
Björn Rabenstein 5b153f80a5
Merge pull request #17094 from prometheus/beorn7/histogram
promql: Fix when to emit a HistogramCounterResetCollisionWarning
2025-09-04 13:55:41 +02:00
beorn7 2628320292 promql: Fix when to emit a `HistogramCounterResetCollisionWarning`
So far, we emitted a `HistogramCounterResetCollisionWarning` when
encountering conflicting counter resets in the calculation of (i)rate
and friends. We even tested for that. However, in the rate
calculation, we are not interested in those collisions. They are
actually expected.

On the other hand, we did not warn about those collisions when doing a
`sum` aggregation, where such a warning would be appropriate.

This commit removes the warning in the former case and adds it in the
latter. Sadly, we cannot really test this as we still remove the
counter reset hint for the first sample in a chunk. (And that's the
only sample where we could get a `NotCounterReset` hint.)

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-03 18:00:03 +02:00
Owen Williams 6ee965c255
common: Update to prom/common v0.66.0, fix TextParser creation (#17139)
TextParser as of prom/common v0.66.0 requires an explicit validation scheme.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2025-09-03 11:20:04 -04:00
beorn7 747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Julius Hinze cdf7208478
annotations: histogram counter reset warning includes operation
Signed-off-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-20 15:14:21 +02:00
Julius Hinze 77b5c3f217
Histograms: set annotation when adding or subtracting histograms that have `not_reset` and `reset` hints.
Signed-off-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-20 15:00:45 +02:00
Arve Knudsen 0a40df33fb
Make metric/label name validation scheme explicit (#16928)
* Parameterize metric/label name validation scheme

Parameterized metric/label name validation scheme

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-18 08:09:00 +00:00
Neeraj Gartia 2c0de4e7c2
Fix `histogram_quantile` annotation in range query when delayed name removal is disabled (#16794)
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2025-08-13 18:06:48 +02:00
Matthieu MOREL cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
Carrie Edwards 44b0fbba1e
No info annotation for rate/increase when type is histogram (#16915)
buf.build / lint and publish (push) Waiting to run Details
CI / Go tests (push) Waiting to run Details
CI / More Go tests (push) Waiting to run Details
CI / Go tests with previous Go version (push) Waiting to run Details
CI / UI tests (push) Waiting to run Details
CI / Go tests on Windows (push) Waiting to run Details
CI / Mixins tests (push) Waiting to run Details
CI / Build Prometheus for common architectures (0) (push) Waiting to run Details
CI / Build Prometheus for common architectures (1) (push) Waiting to run Details
CI / Build Prometheus for common architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (0) (push) Waiting to run Details
CI / Build Prometheus for all architectures (1) (push) Waiting to run Details
CI / Build Prometheus for all architectures (10) (push) Waiting to run Details
CI / Build Prometheus for all architectures (11) (push) Waiting to run Details
CI / Build Prometheus for all architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (3) (push) Waiting to run Details
CI / Build Prometheus for all architectures (4) (push) Waiting to run Details
CI / Build Prometheus for all architectures (5) (push) Waiting to run Details
CI / Build Prometheus for all architectures (6) (push) Waiting to run Details
CI / Build Prometheus for all architectures (7) (push) Waiting to run Details
CI / Build Prometheus for all architectures (8) (push) Waiting to run Details
CI / Build Prometheus for all architectures (9) (push) Waiting to run Details
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions Details
CI / Check generated parser (push) Waiting to run Details
CI / golangci-lint (push) Waiting to run Details
CI / fuzzing (push) Waiting to run Details
CI / codeql (push) Waiting to run Details
CI / Publish main branch artifacts (push) Blocked by required conditions Details
CI / Publish release artefacts (push) Blocked by required conditions Details
CI / Publish UI on npm Registry (push) Blocked by required conditions Details
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run Details
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2025-07-24 10:11:37 -07:00
George Krajcsovits 5b7ff92d95
fix(promql): histogram_quantile and histogram_fraction NaN observed in native histogram (#16724)
buf.build / lint and publish (push) Waiting to run Details
CI / Go tests (push) Waiting to run Details
CI / More Go tests (push) Waiting to run Details
CI / Go tests with previous Go version (push) Waiting to run Details
CI / UI tests (push) Waiting to run Details
CI / Go tests on Windows (push) Waiting to run Details
CI / Mixins tests (push) Waiting to run Details
CI / Build Prometheus for common architectures (0) (push) Waiting to run Details
CI / Build Prometheus for common architectures (1) (push) Waiting to run Details
CI / Build Prometheus for common architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (0) (push) Waiting to run Details
CI / Build Prometheus for all architectures (1) (push) Waiting to run Details
CI / Build Prometheus for all architectures (10) (push) Waiting to run Details
CI / Build Prometheus for all architectures (11) (push) Waiting to run Details
CI / Build Prometheus for all architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (3) (push) Waiting to run Details
CI / Build Prometheus for all architectures (4) (push) Waiting to run Details
CI / Build Prometheus for all architectures (5) (push) Waiting to run Details
CI / Build Prometheus for all architectures (6) (push) Waiting to run Details
CI / Build Prometheus for all architectures (7) (push) Waiting to run Details
CI / Build Prometheus for all architectures (8) (push) Waiting to run Details
CI / Build Prometheus for all architectures (9) (push) Waiting to run Details
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions Details
CI / Check generated parser (push) Waiting to run Details
CI / golangci-lint (push) Waiting to run Details
CI / fuzzing (push) Waiting to run Details
CI / codeql (push) Waiting to run Details
CI / Publish main branch artifacts (push) Blocked by required conditions Details
CI / Publish release artefacts (push) Blocked by required conditions Details
CI / Publish UI on npm Registry (push) Blocked by required conditions Details
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run Details
* fix(promql): histogram_quantile NaN observed in native histogram

Fixes: #16578

See the issue for detailed explanation.
When a histogram had only NaN observations and no normal observations,
we returned 0 from the quantile, which is completely wrong. If there were
normal observations but we went over them, we returned the upper bound of
the existing buckets, however that contradicts expectations on
histogram_fraction. Now we return NaN if the quantile is calculated to be
over all normal observations, falling into NaNs (in a virtual +Inf bucket).

We also return info level annotations if we see any NaN observations.
The annotation calls out if we returned NaN or even if we took the
virtual +Inf bucket into account.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* fix(promql): histogram_fraction NaN observed in native histogram

Fixes: #16580

According to the specification we should not take NaN observations
into account when calculating the fraction. This commit fixes that
and adds an info level annotation to let the user know about this.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-06-25 13:37:43 +02:00
Björn Rabenstein ea75bc18e9
Merge pull request #16697 from Konstantinov-Innokentii/export-query-samples-fields
buf.build / lint and publish (push) Waiting to run Details
CI / Go tests (push) Waiting to run Details
CI / More Go tests (push) Waiting to run Details
CI / Go tests with previous Go version (push) Waiting to run Details
CI / UI tests (push) Waiting to run Details
CI / Go tests on Windows (push) Waiting to run Details
CI / Mixins tests (push) Waiting to run Details
CI / Build Prometheus for common architectures (0) (push) Waiting to run Details
CI / Build Prometheus for common architectures (1) (push) Waiting to run Details
CI / Build Prometheus for common architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (0) (push) Waiting to run Details
CI / Build Prometheus for all architectures (1) (push) Waiting to run Details
CI / Build Prometheus for all architectures (10) (push) Waiting to run Details
CI / Build Prometheus for all architectures (11) (push) Waiting to run Details
CI / Build Prometheus for all architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (3) (push) Waiting to run Details
CI / Build Prometheus for all architectures (4) (push) Waiting to run Details
CI / Build Prometheus for all architectures (5) (push) Waiting to run Details
CI / Build Prometheus for all architectures (6) (push) Waiting to run Details
CI / Build Prometheus for all architectures (7) (push) Waiting to run Details
CI / Build Prometheus for all architectures (8) (push) Waiting to run Details
CI / Build Prometheus for all architectures (9) (push) Waiting to run Details
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions Details
CI / Check generated parser (push) Waiting to run Details
CI / golangci-lint (push) Waiting to run Details
CI / fuzzing (push) Waiting to run Details
CI / codeql (push) Waiting to run Details
CI / Publish main branch artifacts (push) Blocked by required conditions Details
CI / Publish release artefacts (push) Blocked by required conditions Details
CI / Publish UI on npm Registry (push) Blocked by required conditions Details
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run Details
stats: Export QuerySamples StartTimestamp and Interval fields
2025-06-10 15:05:22 +02:00
Arthur Silva Sens 8d0a747237
PROM-39: Provide PromQL info annotations when rate()/increase() over series without `__type__`="counter" label (#16632)
* Provide PromQL info annotations when rate()/increase() over series without counter label

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

* Address comments

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

---------

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-06-06 10:01:20 -03:00
Innokentii Konstantinov cd5bd11bd8 Export QuerySamples StartTimestamp and Interval to enable custom statistics implementations to populate these fields as needed.
Signed-off-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
2025-06-06 15:31:37 +08:00
Julius Volz 13564c03ef Standardize doc page title handling
See https://groups.google.com/g/prometheus-developers/c/cwL3cM66Em8

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-05-28 21:37:27 +02:00
jub0bs 4bc8df0f54
util/httputil: Always add Vary header in SetCORS
Closes #15406

Signed-off-by: jub0bs <jcretel-infosec+github@protonmail.com>
2025-05-19 11:42:44 +02:00
Arve Knudsen e7e3ab2824
Fix linting issues found by golangci-lint v2.0.2 (#16368)
buf.build / lint and publish (push) Waiting to run Details
CI / Go tests (push) Waiting to run Details
CI / More Go tests (push) Waiting to run Details
CI / Go tests with previous Go version (push) Waiting to run Details
CI / UI tests (push) Waiting to run Details
CI / Go tests on Windows (push) Waiting to run Details
CI / Mixins tests (push) Waiting to run Details
CI / Build Prometheus for common architectures (0) (push) Waiting to run Details
CI / Build Prometheus for common architectures (1) (push) Waiting to run Details
CI / Build Prometheus for common architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (0) (push) Waiting to run Details
CI / Build Prometheus for all architectures (1) (push) Waiting to run Details
CI / Build Prometheus for all architectures (10) (push) Waiting to run Details
CI / Build Prometheus for all architectures (11) (push) Waiting to run Details
CI / Build Prometheus for all architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (3) (push) Waiting to run Details
CI / Build Prometheus for all architectures (4) (push) Waiting to run Details
CI / Build Prometheus for all architectures (5) (push) Waiting to run Details
CI / Build Prometheus for all architectures (6) (push) Waiting to run Details
CI / Build Prometheus for all architectures (7) (push) Waiting to run Details
CI / Build Prometheus for all architectures (8) (push) Waiting to run Details
CI / Build Prometheus for all architectures (9) (push) Waiting to run Details
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions Details
CI / Check generated parser (push) Waiting to run Details
CI / golangci-lint (push) Waiting to run Details
CI / fuzzing (push) Waiting to run Details
CI / codeql (push) Waiting to run Details
CI / Publish main branch artifacts (push) Blocked by required conditions Details
CI / Publish release artefacts (push) Blocked by required conditions Details
CI / Publish UI on npm Registry (push) Blocked by required conditions Details
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run Details
* Fix linting issues found by golangci-lint v2.0.2

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-05-03 19:05:13 +02:00
Arthur Silva Sens 95f49dd84b
Bump prometheus/common to v0.63.0 (#16210)
* Bump prometheus/common to v0.63.0

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

* nolint usage of deprecated model.NameValidationScheme

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

---------

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-13 20:42:42 +01:00
Bartlomiej Plotka 7a7bc65237
Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156)
# Conflicts:
#	tsdb/db_test.go

Apply suggestions from code review




tmp



Addressed comments.



Update util/compression/buffers.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-10 10:36:26 +00:00
Matthieu MOREL c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
jub0bs 329ec6831a
util/httputil: reduce heap allocations in newCompressedResponseWriter (#16001)
* util/httputil: Benchmark newCompressedResponseWriter

This benchmark illustrates that newCompressedResponseWriter incurs a
prohibitive amount of heap allocations when handling a request containing a
malicious Accept-Encoding header.¬

Signed-off-by: jub0bs <jcretel-infosec+github@protonmail.com>

* util/httputil: Improve newCompressedResponseWriter

This change dramatically reduces the heap allocations (in bytes)
incurred when handling a request containing a malicious Accept-Encoding header.

Below are some benchmark results; for conciseness, I've omitted the name of the
benchmark function (BenchmarkNewCompressionHandler_MaliciousAcceptEncoding):

```
goos: darwin
goarch: amd64
pkg: github.com/prometheus/prometheus/util/httputil
cpu: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
│     old     │                 new                 │
│   sec/op    │   sec/op     vs base                │
  18.60m ± 2%   13.54m ± 3%  -27.17% (p=0.000 n=10)

│       old        │                 new                 │
│       B/op       │    B/op     vs base                 │
  16785442.50 ± 0%   32.00 ± 0%  -100.00% (p=0.000 n=10)

│    old     │                new                 │
│ allocs/op  │ allocs/op   vs base                │
  2.000 ± 0%   1.000 ± 0%  -50.00% (p=0.000 n=10)
```

Signed-off-by: jub0bs <jcretel-infosec+github@protonmail.com>

---------

Signed-off-by: jub0bs <jcretel-infosec+github@protonmail.com>
2025-02-11 14:14:55 +01:00
Neeraj Gartia 8be416a67c
[FIX] PromQL: Updates annotation for bin op between incompatible histograms (#15895)
PromQL: Updates annotation for bin op between incompatible histograms

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2025-02-01 22:57:29 +01:00
Jan Fajerski 54cf0d6879
Merge pull request #15472 from tjhop/ref/jsonfilelogger-slog-handler
ref: JSONFileLogger slog handler, add scrape.FailureLogger interface
2025-01-27 20:17:46 +01:00
crystalstall 616914abe2 Signed-off-by: crystalstall <crystalruby@qq.com>
refactor: using slices.Contains to simplify the code

Signed-off-by: crystalstall <crystalruby@qq.com>
2025-01-11 00:41:51 +08:00
Neeraj Gartia 0e99ca3e8c
[BUGFIX] PromQL: Fix `deriv`, `predict_linear` and `double_exponential_smoothing` with histograms (#15686)
PromQL: Fix deriv, predict_linear and double_exponential_smoothing with histograms

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-12-19 23:50:28 +01:00
Joel Beckmeyer bad1c75514 fix alignment of atomic uint64 on 32-bit
Signed-off-by: Joel Beckmeyer <joel@beckmeyer.us>
2024-12-16 10:44:50 -05:00
TJ Hoplock 437362a7a7 fix(deduper): use ptr to sync.RWMutex, fix panic during concurrent use
Resolves: #15559

As accurately noted in the issue description, the map is shared among
child loggers that get created when `WithAttr()`/`WithGroup()` are
called on the underlying handler, which happens via `log.With()` and
`log.WithGroup()` respectively.

The RW mutex was a value in the previous implementation that used
go-kit/log, and I should've updated it to use a pointer when I converted
the deduper.

Also adds a test.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-12-10 02:51:46 -05:00
TJ Hoplock e0104a6b7e ref: JSONFileLogger slog handler, add scrape.FailureLogger interface
Improves upon #15434, better resolves #15433.

This commit introduces a few changes to ensure safer handling of the
JSONFileLogger:
- the JSONFileLogger struct now implements the slog.Handler interface,
  so it can directly be used to create slog Loggers. This pattern more
closely aligns with upstream slog usage (which is generally based around
handlers), as well as making it clear that devs are creating a whole new
logger when interacting with it (vs silently modifying internal configs
like it did previously).
- updates the `promql.QueryLogger` interface to be a union of the
  methods of both the `io.Closer` interface and the `slog.Handler`
interface. This allows for plugging in/using slog-compatible loggers
other than the JSONFileLogger, if desired (ie, for downstream projects).
- introduces new `scrape.FailureLogger` interface; just like
  `promql.QueryLogger`, it is a union of `io.Closer` and `slog.Handler`
interfaces. Similar logic applies to reasoning.
- updates tests where needed; have the `FakeQueryLogger` from promql's
  engine_test implement the `slog.Handler`, improve JSONFileLogger test
suite, etc.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-11-28 23:14:31 -05:00
TJ Hoplock aa8e067f13 Merge release-3.0 into main 2024-11-27 17:51:51 -05:00
György Krajcsovits eb9ce70c3e Set hasCount after setting count to be consistent
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-11-26 13:56:22 +01:00
György Krajcsovits a48d05912d nhcb: optimize, do not recalculate suffixes multiple times
Reduce string manipulation by just cutting off the histogram suffixes from
the series name label once.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-11-25 15:37:38 +01:00
György Krajcsovits 41f051e9a4 Small improvement in handling cases without count
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-11-25 15:02:17 +01:00
TJ Hoplock 3e24e84172 fix!: stop unbounded memory usage from query log
Resolves: #15433

When I converted prometheus to use slog in #14906, I update both the
`QueryLogger` interface, as well as how the log calls to the
`QueryLogger` were built up in `promql.Engine.exec()`. The backing
logger for the `QueryLogger` in the engine is a
`util/logging.JSONFileLogger`, and it's implementation of the `With()`
method updates the logger the logger in place with the new keyvals added
onto the underlying slog.Logger, which means they get inherited onto
everything after. All subsequent calls to `With()`, even in later
queries, would continue to then append on more and more keyvals for the
various params and fields built up in the logger. In turn, this causes
unbounded growth of the logger, leading to increased memory usage, and
in at least one report was the likely cause of an OOM kill. More
information can be found in the issue and the linked slack thread.

This commit does a few things:

- It was referenced in feedback in #14906 that it would've been better
  to not change the `QueryLogger` interface if possible, this PR
proposes changes that bring it closer to alignment with the pre-3.0
`QueryLogger` interface contract
- reverts `promql.Engine.exec()`'s usage of the query logger to the
  pattern of building up an array of args to pass at once to the end log
call. Avoiding the repetitious calls to `.With()` are what resolve the
issue with the logger growth/memory usage.
- updates the scrape failure logger to use the update `QueryLogger`
  methods in the contract.
- updates tests accordingly
- cleans up unused methods

Builds and passes tests successfully. Tested locally and confirmed I
could no longer reproduce the issue/it resolved the issue.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-11-23 14:20:37 -05:00
TJ Hoplock 16a4354b48 chore(deduper): add compile check that slog.Handler int is satisfied
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-11-15 15:48:34 -05:00
Neeraj Gartia 789c9b1a5e
[BUGFIX] PromQL: Corrects the behaviour of some operator and aggregators with Native Histograms (#15245)
PromQL: Correct the behaviour of some operator and aggregators with Native Histograms

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-11-12 15:37:05 +01:00
Bartlomiej Plotka 76432aaf4e
Merge pull request #15220 from prometheus/nhcb-scrape-optimize
perf(nhcb): scrape optimize
2024-11-08 19:02:48 +01:00
Julien 5c05c89b8b Update prometheus/common
Signed-off-by: Julien <roidelapluie@o11y.eu>
2024-11-07 10:08:02 +01:00
Julien 28f22b4d80 Update prometheus/common
Signed-off-by: Julien <roidelapluie@o11y.eu>
2024-11-05 12:08:33 +01:00
György Krajcsovits eafe72a0d0 perf(nhcb): optimize away most allocations in convertnhcb
In general aim for the happy case when the exposer lists the buckets
in ascending order.

Use Compact(2) to compact the result of nhcb convert.

This is more in line with how client_golang optimizes spans vs
buckets.
aef8aedb4b/prometheus/histogram.go (L1485)

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-28 08:34:54 +01:00
TJ Hoplock 4f9e4dc016 ref: remove unused deduper log wrapper methods
I used these wrapper methods during initial development of the custom
handler that the deduper now implements. Since the deduper implements
slog.Handler and can be used directly as a logger, these wrapper methods
are no longer needed.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-24 22:31:37 -04:00
TJ Hoplock b602393473 fix: avoid data race in log deduper
This change should have been included in the initial prometheus slog
conversion, but I must've lost track of it in all the rebases involved
in that PR.

This changes the dedupe logger so that the only method that needs to use
the lock is the `Handle()` method that actually interacts with the
deduplication map.

Ex:
```
==================
WARNING: DATA RACE
Write at 0x00c000518bc0 by goroutine 29481:
  github.com/prometheus/prometheus/util/logging.(*Deduper).WithAttrs()
      /home/tjhop/go/src/github.com/prometheus/prometheus/util/logging/dedupe.go:89 +0xef
  log/slog.(*Logger).With()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/log/slog/logger.go:132 +0x106
  github.com/prometheus/prometheus/storage/remote.NewQueueManager()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/queue_manager.go:483 +0x7a9
  github.com/prometheus/prometheus/storage/remote.(*WriteStorage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/write.go:201 +0x102c
  github.com/prometheus/prometheus/storage/remote.(*Storage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage.go:92 +0xfd
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.func1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:172 +0x3e4
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.gowrap1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:174 +0x41

Previous read at 0x00c000518bc0 by goroutine 31261:
  github.com/prometheus/prometheus/util/logging.(*Deduper).Handle()
      /home/tjhop/go/src/github.com/prometheus/prometheus/util/logging/dedupe.go:82 +0x2b1
  log/slog.(*Logger).log()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/log/slog/logger.go:257 +0x228
  log/slog.(*Logger).Error()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/log/slog/logger.go:230 +0x3d4
  github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).loop()
      /home/tjhop/go/src/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:254 +0x2db
  github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).Start.gowrap1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:227 +0x33

Goroutine 29481 (running) created at:
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:164 +0xe4
  testing.tRunner()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /home/tjhop/.asdf/installs/golang/1.23.1/go/src/testing/testing.go:1743 +0x44

Goroutine 31261 (running) created at:
  github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).Start()
      /home/tjhop/go/src/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:227 +0x177
  github.com/prometheus/prometheus/storage/remote.(*QueueManager).Start()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/queue_manager.go:934 +0x304
  github.com/prometheus/prometheus/storage/remote.(*WriteStorage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/write.go:232 +0x151b
  github.com/prometheus/prometheus/storage/remote.(*Storage).ApplyConfig()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage.go:92 +0xfd
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.func1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:172 +0x3e4
  github.com/prometheus/prometheus/storage/remote.TestWriteStorageApplyConfigsDuringCommit.gowrap1()
      /home/tjhop/go/src/github.com/prometheus/prometheus/storage/remote/storage_test.go:174 +0x41
==================
--- FAIL: TestWriteStorageApplyConfigsDuringCommit (2.26s)
    testing.go:1399: race detected during execution of test
FAIL
FAIL    github.com/prometheus/prometheus/storage/remote 68.321s
```

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-24 22:30:38 -04:00
Vanshika cccbe72514
TSDB: Fix some edge cases when OOO is enabled (#14710)
Fix some edge cases when OOO is enabled

Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-10-23 17:34:28 +02:00
George Krajcsovits 55419275ab
Merge pull request #15159 from prometheus/nhcb-scrape-optimize1
convertnhcb: use CutSuffix instead of regex replace for histogram name
2024-10-16 10:25:18 +02:00
Neeraj Gartia d4b1f9eb33
Corrects the behaviour of binary opperators between histogram and float (#14726)
promql: corrects binary operators functioning for mixed sample with histogram and float

For invalid pairings of sample types, an annotation is added now.

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-10-15 14:44:36 +02:00
György Krajcsovits 78de9bd10f convertnhcb: use CutSuffix instead of regex replace for histogram name
This is much quicker.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-14 16:50:05 +02:00
George Krajcsovits 522149a2ae
model: move classic NHCB conversion into its own file (#15156)
* model: move classic to NHCB conversion into its own file

In preparation for #14978.

Author: Jeanette Tan <jeanette.tan@grafana.com>  2024-07-03 11:56:48
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Better naming from review comment

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add doc strings.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-10-14 14:23:11 +02:00