`histogram.Error` becomes the generic wrapper type for all histogram errors.
This makes it easier and less error prone, when adding new errors, to check
whether an error is a histogram error, and it also makes converting the
errors less error prone.
This changes the type of those specific sentinel errors from `error` to
`histogram.Error`, but that should almost never matter:
e.g., `errors.Is(err, ErrHistogram...)` still works out of the box.
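A minimal sketch of why existing checks keep working, assuming the wrapper implements Unwrap (the sentinel name and the wrapper's exact shape here are illustrative, not the real package code):

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the histogram package, for illustration only:
// a sentinel error that ends up wrapped in the generic Error type.
var ErrHistogramExample = errors.New("histogram: example validation failure")

// Error sketches the generic wrapper; the real histogram.Error may differ.
type Error struct{ err error }

func (e Error) Error() string { return e.err.Error() }
func (e Error) Unwrap() error { return e.err }

func main() {
	var err error = Error{err: fmt.Errorf("while validating: %w", ErrHistogramExample)}

	// errors.Is still matches the sentinel through the wrapper...
	fmt.Println(errors.Is(err, ErrHistogramExample)) // true
	// ...and errors.As now identifies any histogram error generically.
	var hErr Error
	fmt.Println(errors.As(err, &hErr)) // true
}
```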
Signed-off-by: Laurent Dufresne <laurent.dufresne@grafana.com>
Add logic to the target_info metric generation in the OTLP endpoint so that samples with the same timestamp for the same target_info series are de-duplicated. This comes out of a user's bug report about duplicated target_info samples in Grafana Mimir (which uses the Prometheus target_info generation logic).
If I'm not mistaken, duplicate target_info samples should stem from multiple resources in the same OTLP request being translated to the same target_info label set. It shouldn't be caused by a Prometheus bug.
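A rough sketch of the de-duplication under those assumptions (hypothetical helper, not the actual endpoint code): remember the last timestamp written per target_info label set and skip repeats within the request.

```go
package main

import "fmt"

// seenTargetInfo remembers, per target_info label set (keyed by its string
// form), the last timestamp that was already written.
type seenTargetInfo map[string]int64

// shouldWrite reports whether a target_info sample with the given label-set
// key and timestamp still needs to be appended, and records it if so.
func (s seenTargetInfo) shouldWrite(labelsKey string, ts int64) bool {
	if last, ok := s[labelsKey]; ok && last == ts {
		return false // same series, same timestamp: duplicate
	}
	s[labelsKey] = ts
	return true
}

func main() {
	seen := seenTargetInfo{}
	// Two OTLP resources translating to the same target_info label set.
	fmt.Println(seen.shouldWrite(`{instance="i", job="a"}`, 1000)) // true
	fmt.Println(seen.shouldWrite(`{instance="i", job="a"}`, 1000)) // false: de-duplicated
	fmt.Println(seen.shouldWrite(`{instance="i", job="a"}`, 2000)) // true: new timestamp
}
```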
* Add WriteProto method and tests for promtool metrics
This commit adds:
1. A WriteProto method in storage/remote/client.go that handles
   marshaling and compression of protobuf messages
2. An updated parseAndPushMetrics in cmd/promtool/metrics.go that uses
   the new WriteProto method
3. Comprehensive tests for the PushMetrics functionality
The WriteProto method provides a cleaner API for sending protobuf
messages without manually handling marshaling and compression.
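The flow the new method encapsulates looks roughly like this (a sketch under assumptions: the real WriteProto signature differs, and Remote Write uses snappy-compressed protobuf):

```go
package remote

import (
	"context"

	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
)

// writeProtoSketch marshals a protobuf message, snappy-compresses the result,
// and hands the payload to an existing raw-bytes send function. Illustrative
// only; it is not the actual WriteProto implementation.
func writeProtoSketch(ctx context.Context, msg proto.Message, send func(context.Context, []byte) error) error {
	raw, err := proto.Marshal(msg)
	if err != nil {
		return err
	}
	compressed := snappy.Encode(nil, raw)
	return send(ctx, compressed)
}
```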
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* use Write method from exp/api/remote
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix lint
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix test
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* nit fixed
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix lint
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
As a follow-up to #17344, add two configuration parameters for controlling label
name translation, both defaulting to on for backwards compatibility (currently
these behaviours are hardcoded as enabled); both are illustrated in the sketch
after the list:
* otlp.label_name_underscore_sanitization => Prefix label names starting with a
single underscore with key_ when translating OTel attribute names
* otlp.label_name_preserve_multiple_underscores => Keep multiple consecutive
underscores in label names when translating OTel attribute names
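An illustrative sketch of the two behaviours the parameters toggle (a simplified stand-in, not the otlptranslator code):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var multipleUnderscores = regexp.MustCompile("__+")

// translateLabelName sketches the two toggles: with sanitization on, names
// starting with a single underscore are prefixed with "key"; with
// preservation off, runs of underscores collapse into one.
func translateLabelName(name string, sanitize, preserveMultiple bool) string {
	if sanitize && strings.HasPrefix(name, "_") && !strings.HasPrefix(name, "__") {
		name = "key" + name
	}
	if !preserveMultiple {
		name = multipleUnderscores.ReplaceAllString(name, "_")
	}
	return name
}

func main() {
	fmt.Println(translateLabelName("_foo", true, true))  // "key_foo"
	fmt.Println(translateLabelName("a__b", true, true))  // "a__b"
	fmt.Println(translateLabelName("a__b", true, false)) // "a_b"
}
```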
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The upgrade to prometheus/otlptranslator@7f02967de0 fixes two label
name translation bugs when in legacy name translation mode:
* 'key' is no longer prefixed when label names start with an underscore
* Multiple consecutive underscores are combined into one
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The detailed plan for this is laid out in
https://github.com/prometheus/prometheus/issues/16572 .
This commit adds a global and local scrape config option
`scrape_native_histograms`, which has to be set to true to ingest
native histograms.
To ease the transition, the feature flag is changed to simply set the
default of `scrape_native_histograms` to true, as sketched below.
Further implications:
- The default scrape protocols now depend on the
`scrape_native_histograms` setting.
- Everywhere else, histograms are now "on by default".
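A minimal sketch of how the effective setting could be resolved (hypothetical names, not the actual config code):

```go
package main

import "fmt"

// scrapeNativeHistograms sketches the resolution order: an explicit
// scrape_native_histograms value wins; otherwise the native-histograms
// feature flag merely flips the default to true.
func scrapeNativeHistograms(explicit *bool, featureFlagEnabled bool) bool {
	if explicit != nil {
		return *explicit
	}
	return featureFlagEnabled
}

func main() {
	off := false
	fmt.Println(scrapeNativeHistograms(nil, true))  // true: flag sets the default
	fmt.Println(scrapeNativeHistograms(&off, true)) // false: explicit setting wins
	fmt.Println(scrapeNativeHistograms(nil, false)) // false: default without the flag
}
```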
Documentation beyond that for the feature flag and the scrape
config is deliberately left out. See
https://github.com/prometheus/prometheus/pull/17232 for that.
Signed-off-by: beorn7 <beorn@grafana.com>
We have always validated that none of the bucket counts are negative. We
should do the same for the count of observations and the zero bucket count.
Note that this was always implied in the protobuf exposition format
because a count or a zero bucket population is ignored if it is not
positive.
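A sketch of the extra check for a float histogram (illustrative helper, not the actual Validate code):

```go
package main

import (
	"errors"
	"fmt"
)

// checkNonNegative mirrors the idea: just as no bucket count may be negative,
// the overall count of observations and the zero bucket count must not be
// negative either.
func checkNonNegative(count, zeroCount float64) error {
	if count < 0 {
		return errors.New("histogram has a negative count of observations")
	}
	if zeroCount < 0 {
		return errors.New("histogram has a negative zero bucket count")
	}
	return nil
}

func main() {
	fmt.Println(checkNonNegative(10, 2)) // <nil>
	fmt.Println(checkNonNegative(-1, 0)) // error
}
```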
Signed-off-by: beorn7 <beorn@grafana.com>
If a sample read through remote read has too high a resolution,
reduce it to the maximum allowed.
This is a slow path, but we only expect it to happen if the server
side is a newer version that allows a higher resolution.
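The slow path boils down to something like this sketch (the maximum of 8 is assumed here for illustration):

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/histogram"
)

// maxAllowedSchema is the highest exponential schema we accept locally;
// the value 8 is an assumption for this sketch.
const maxAllowedSchema = int32(8)

// clampSchema reduces a float histogram's resolution if the remote server
// sent a higher schema than we allow: a sketch of the slow path.
func clampSchema(fh *histogram.FloatHistogram) *histogram.FloatHistogram {
	if fh.Schema > maxAllowedSchema {
		return fh.ReduceResolution(maxAllowedSchema)
	}
	return fh
}

func main() {
	fh := &histogram.FloatHistogram{Schema: 10}
	fmt.Println(clampSchema(fh).Schema) // 8
}
```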
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
When remote read returns chunks, the validation happens in tsdb/chunkenc.
However, when it returns samples, we need to modify the iterator to
validate them.
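Roughly, the wrapping looks like this (a sketch with a deliberately minimal, hypothetical iterator interface rather than the full chunkenc.Iterator):

```go
package remoteread

import "fmt"

// sampleIterator is a minimal stand-in for the real sample iterator
// interface; only what the sketch needs.
type sampleIterator interface {
	Next() bool
	Schema() int32 // schema of the histogram at the current position, if any
	Err() error
}

// validatingIterator wraps another iterator and stops iteration with an
// error when it encounters a histogram sample with an unsupported schema.
type validatingIterator struct {
	sampleIterator
	maxSchema int32
	err       error
}

func (it *validatingIterator) Next() bool {
	if it.err != nil || !it.sampleIterator.Next() {
		return false
	}
	if s := it.sampleIterator.Schema(); s > it.maxSchema {
		it.err = fmt.Errorf("histogram schema %d above maximum %d", s, it.maxSchema)
		return false
	}
	return true
}

func (it *validatingIterator) Err() error {
	if it.err != nil {
		return it.err
	}
	return it.sampleIterator.Err()
}
```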
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
control over time
The test became flaky after it was asked to run in parallel
and "fight" for resources.
Let's hide all of that.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
It's not possible to store a created timestamp at the same timestamp as
the current sample, so do not even try.
In the OpenTelemetry spec, if the start time is unknown, it is set to
the same timestamp as the first sample:
https://opentelemetry.io/docs/specs/otel/metrics/data-model/#cumulative-streams-handling-unknown-start-time
This means that we will get a lot of duplicate-sample-for-timestamp
errors, and we should not log those.
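A sketch of the resulting behaviour (hypothetical helper names, not the actual appender code):

```go
package main

import "fmt"

// maybeAppendCreatedTimestamp sketches the decision: only try to store the
// created timestamp if it is strictly before the sample's own timestamp.
// Otherwise appending could only fail with a duplicate-sample error, so we
// neither try nor log.
func maybeAppendCreatedTimestamp(ct, sampleTS int64, appendCT func(int64) error) {
	if ct == 0 || ct >= sampleTS {
		return // unknown start time, or start time equals the first sample (OTel spec)
	}
	if err := appendCT(ct); err != nil {
		fmt.Println("append created timestamp:", err) // genuine errors are still surfaced
	}
}

func main() {
	var appended []int64
	appendCT := func(t int64) error { appended = append(appended, t); return nil }

	maybeAppendCreatedTimestamp(1000, 1000, appendCT) // skipped: ct == sample timestamp
	maybeAppendCreatedTimestamp(500, 1000, appendCT)  // stored
	fmt.Println(appended)                             // [500]
}
```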
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Histogram.Validate and FloatHistogram.Validate now return an error on
unsupported schemas.
The scrape and remote-write handlers reduce the schema to the maximum
allowed if it is above that maximum but below the theoretical maximum of 52.
For scrape, the maximum is a configuration option; for remote write, it is 8.
Note: the OTLP endpoint already does the reduction, without checking that the
schema is below 52, as the spec does not specify a maximum.
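Taken together, the handler logic amounts to something like this sketch (the flow follows the text; the helper itself is illustrative):

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/histogram"
)

// normalizeSchema sketches the handler behaviour: schemas above the
// theoretical maximum of 52 are unsupported and rejected, schemas between
// the allowed maximum and 52 are reduced, anything else passes through.
func normalizeSchema(h *histogram.Histogram, maxSchema int32) (*histogram.Histogram, error) {
	const theoreticalMax = 52
	if h.Schema > theoreticalMax {
		return nil, fmt.Errorf("unsupported histogram schema %d", h.Schema)
	}
	if h.Schema > maxSchema {
		return h.ReduceResolution(maxSchema), nil
	}
	return h, nil
}

func main() {
	h := &histogram.Histogram{Schema: 10}
	reduced, err := normalizeSchema(h, 8) // 8 is the remote-write maximum
	fmt.Println(reduced.Schema, err)      // 8 <nil>
}
```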
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* OTLP writer writes directly to appender
Do not convert to the Remote-Write 1.0 protocol; convert to the TSDB Appender interface instead.
For downstream projects that still convert OTLP to something else (e.g. Mimir using
its own RW 1.0+2.0 compatible protocol), introduce a compatibility layer between
OTLP decoding and the TSDB Appender. This is the CombinedAppender, which hides the
implementation. The name is subject to change.
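Conceptually, the compatibility layer is a narrow append interface that OTLP decoding targets; Prometheus backs it with a TSDB storage.Appender and downstream projects back it with their own protocol. The method set below is a rough guess for illustration, not the actual CombinedAppender definition:

```go
package otlpappender

import (
	"github.com/prometheus/prometheus/model/exemplar"
	"github.com/prometheus/prometheus/model/histogram"
	"github.com/prometheus/prometheus/model/labels"
)

// combinedAppenderSketch illustrates the idea of the layer between OTLP
// decoding and the TSDB Appender; the real interface is richer.
type combinedAppenderSketch interface {
	AppendSample(ls labels.Labels, t int64, v float64) error
	AppendHistogram(ls labels.Labels, t int64, h *histogram.Histogram, fh *histogram.FloatHistogram) error
	AppendExemplar(ls labels.Labels, e exemplar.Exemplar) error
}
```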
---------
Signed-off-by: David Ashpole <dashpole@google.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: David Ashpole <dashpole@google.com>
Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Remote Write 1.0 currently attempts to send native histograms with
custom buckets, but these are not actually supported in the RW1 protocol.
Drop, measure, and log them instead.
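A sketch of the drop-and-count path (all names here are illustrative):

```go
package remotewrite1

// histogramSample is a stand-in carrying just what the sketch needs.
type histogramSample struct {
	usesCustomBuckets bool
	// plus the actual histogram payload in the real code
}

type queueSketch struct {
	droppedCustomBucketHistograms int
	logDrop                       func(n int)
}

// filterForRW1 keeps only the histograms RW1 can carry; custom-bucket
// histograms are dropped, counted, and reported to the logger.
func (q *queueSketch) filterForRW1(in []histogramSample) []histogramSample {
	out := make([]histogramSample, 0, len(in))
	for _, s := range in {
		if s.usesCustomBuckets {
			q.droppedCustomBucketHistograms++ // measure
			continue                          // drop
		}
		out = append(out, s)
	}
	if q.droppedCustomBucketHistograms > 0 && q.logDrop != nil {
		q.logDrop(q.droppedCustomBucketHistograms) // log
	}
	return out
}
```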
Fixes: #17140
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Because of relabelling, an endpoint may select only a subset of the series
that go through WriteStorage.
Having a highestTimestamp at the WriteStorage level yields wrong values
if the corresponding sample never even makes it to a remote queue.
Currently, PrometheusRemoteWriteBehind is based on that value and would fire
if an endpoint is only interested in a subset of series that take time
to appear.
A new metric, "prometheus_remote_storage_queue_highest_timestamp_seconds", that
only takes samples in the queue into account is introduced, and it is used in
PrometheusRemoteWriteBehind and in the dashboards in documentation/prometheus-mixin.
The same applies to samplesIn/dataIn: QueueManager should know better when to
update those, namely when data is enqueued.
That makes dataDropped unnecessary and thus helps simplify the logic
in QueueManager.calculateDesiredShards().
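A sketch of the queue-level tracking (metric name from above; the surrounding code is illustrative, not the real QueueManager):

```go
package remotequeue

import "github.com/prometheus/client_golang/prometheus"

// queueHighestTimestamp only ever moves when a sample is actually enqueued
// for this endpoint, so series dropped by relabelling cannot advance it.
var queueHighestTimestamp = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "prometheus_remote_storage_queue_highest_timestamp_seconds",
	Help: "Highest timestamp successfully enqueued for this remote write queue.",
})

type enqueueTracker struct {
	highestTS int64 // milliseconds
}

// observeEnqueued is called at enqueue time, which is also the right moment
// to update samplesIn/dataIn in the real QueueManager.
func (t *enqueueTracker) observeEnqueued(tsMillis int64) {
	if tsMillis > t.highestTS {
		t.highestTS = tsMillis
		queueHighestTimestamp.Set(float64(tsMillis) / 1000)
	}
}
```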
Signed-off-by: machine424 <ayoubmrini424@gmail.com>