158 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
|
0fc2547740
|
Handle error gracefully for the `desymbolizeLabels` function in prompb/io/prometheus/write/v2/symbols.go (#17160)
Signed-off-by: pipiland <user123@Minhs-Macbook.local> --------- Signed-off-by: pipiland <user123@Minhs-Macbook.local> Co-authored-by: pipiland <user123@Minhs-Macbook.local> |
|
|
43c1535bdf
|
fix(rw1): drop unsupported NHCB and log (#17146)
Remote Write one currently attempts to send native histograms with custom buckets, but these are not actually supported in RW1 protocol. Drop, measure and log instead. Fixes: #17140 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> |
|
|
184c7eb918
|
storage/remote: compute highestTimestamp and dataIn at QueueManager level
Because of relabelling, an endpoint can only select a subset of series that go through WriteStorage. Having a highestTimestamp at WriteStorage level yields wrong values if the corresponding sample won't even make it to a remote queue. Currently PrometheusRemoteWriteBehind is based on that, and would fire if an endpoint is only interested in a subset of series that take time to appear. A "prometheus_remote_storage_queue_highest_timestamp_seconds" metric that only takes into account samples in the queue is introduced, and used in PrometheusRemoteWriteBehind and dashboards in documentation/prometheus-mixin. The same applies to samplesIn/dataIn: QueueManager should know more about when to update those, namely when data is enqueued. That makes dataDropped unnecessary, which helps simplify the logic in QueueManager.calculateDesiredShards(). Signed-off-by: machine424 <ayoubmrini424@gmail.com> |
|
|
794bf774c2 |
Reapply "prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)"
This reverts commit
|
|
|
f5fab47577 |
Revert "prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)"
This reverts commit
|
|
|
c808a71e18
|
prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)
* chore: send Unit and Type when feature flag is enabled Signed-off-by: perebaj <perebaj@gmail.com> * remove unused code and comments Signed-off-by: perebaj <perebaj@gmail.com> * remove unreal scenario Signed-off-by: perebaj <perebaj@gmail.com> * remove unused if Signed-off-by: perebaj <perebaj@gmail.com> * remove unused labels Signed-off-by: perebaj <perebaj@gmail.com> * linter Signed-off-by: perebaj <perebaj@gmail.com> * enable type and unit through remotewrite config Signed-off-by: perebaj <perebaj@gmail.com> * remove test comment and capture type and unit when flag is enabled Signed-off-by: perebaj <perebaj@gmail.com> * gofumpt Signed-off-by: perebaj <perebaj@gmail.com> * modelTypeToWriteV2Type Signed-off-by: perebaj <perebaj@gmail.com> * use NewMetadataFromLabels Signed-off-by: perebaj <perebaj@gmail.com> * capture feature flag from main Signed-off-by: perebaj <perebaj@gmail.com> * simplifying logic Signed-off-by: perebaj <perebaj@gmail.com> * remove unused function Signed-off-by: perebaj <perebaj@gmail.com> * formatting code Signed-off-by: perebaj <perebaj@gmail.com> * gofumpt Signed-off-by: perebaj <perebaj@gmail.com> * remove public var: EnableTypeAndUnitLabels Signed-off-by: perebaj <perebaj@gmail.com> * remove enableTypeAndUnitLabels from TestPopulateV2TimeSeries_typeAndUnitLabels Signed-off-by: perebaj <perebaj@gmail.com> * remove enableTypeAndUnitLabels from main Signed-off-by: perebaj <perebaj@gmail.com> * use schema helper to populate metadata Signed-off-by: perebaj <perebaj@gmail.com> * remove metadata since nil is the default value Signed-off-by: perebaj <perebaj@gmail.com> * add TestPopulateV2TimeSeries_UnexpectedMetadata Signed-off-by: perebaj <perebaj@gmail.com> * Update storage/remote/queue_manager_test.go Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> --------- Signed-off-by: perebaj <perebaj@gmail.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> |
|
|
71c21fb9e4 |
Fix minor issues after applying analyzer "modernize"
- The tool left an empty line behind that we don't need anymore, see https://github.com/prometheus/prometheus/pull/17092. (Arguably not a bug in the tool but just our stricter style about empty lines.) - In tsdb/index/postings_test.go , our (admittedly somewhat convoluted) code structure tricked the tool so it spit out something that wouldn't even compile. - storage/remote/queue_manager_test.go is just a minor formatting nit. Signed-off-by: beorn7 <beorn@grafana.com> |
|
|
747c5ee2b1 |
Apply analyzer "modernize" to the whole codebase
See https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize for details. This ran into a few issues (arguably bugs in the modernize tool), which I will fix in the next commit, so that we have transparency what was done automatically. Beyond those hiccups, I believe all the changes applied are legitimate. Even where there might be no tangible direct gain, I would argue it's still better to use the "modern" way to avoid micro discussions in tiny style PRs later. Signed-off-by: beorn7 <beorn@grafana.com> |
|
|
7cf585527f
|
remote_write: add metric for unexpected metadata in populateV2TimeSeries (#17034)
add metric to track unexpected metadata seen in populateV2TimeSeries, which would indicate metadata incorrectly routed in queue_manager code paths --------- Signed-off-by: leegin <leegin.t@gmail.com> Signed-off-by: Darkknight <leegin.t@gmail.com> |
|
|
a3c4a9bd18 |
[TESTS] remote-write: Make TestShutdown non-parallel to reduce flakes.
Resolves #17045. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
93bbf4bc90
|
Merge pull request #17041 from bernot-dev/remove-queue-manager-startup-benchmark
test: remove obsolete queue manager test |
|
|
0a40df33fb
|
Make metric/label name validation scheme explicit (#16928)
* Parameterize metric/label name validation scheme Parameterized metric/label name validation scheme --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Julius Hinze <julius.hinze@grafana.com> |
|
|
575a60ec92
|
test: fix flaky test
A race condition in TestSendSamplesWithBackoffWithSampleAgeLimit was observed in CI where the sample age limit was too close to the backoff time, causing samples to be dropped intermittently. Increasing the SampleAgeLimit resolves the problem. Signed-off-by: Adam Bernot <bernot@google.com> |
|
|
8cf67d99ba
|
test: remove obsolete test
As mentioned in #16182, the BenchmarkStartup test for the queue manager covers an old API and uses settings that will not occur in production. Signed-off-by: Adam Bernot <bernot@google.com> |
|
|
fe1bb53372 |
parralell storage/remote
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> |
|
|
cef219c31c |
chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> |
|
|
a85618854a
|
Merge pull request #16721 from AxcelXander/fix-issue-14414-metadata-test
remote: Add metadata validation to TestSampleDelivery for v2 protocol |
|
|
472f0de661
|
Enhance TestDropOldTimeSeries to test both v1 and v2 protocols (#16709)
- Wrapped existing test logic in a loop to run with both protocol versions - Ensures consistent behavior across protocol versions for dropping old time series Signed-off-by: AxcelXander <tyz666@bu.edu> Co-authored-by: AxcelXander <tyz666@bu.edu> |
|
|
5fa1146e21
|
chore: enable gci linter (#16245)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> |
|
|
7a7bc65237
|
Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156)
# Conflicts: # tsdb/db_test.go Apply suggestions from code review tmp Addressed comments. Update util/compression/buffers.go Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com> |
|
|
c7d4b53ec1 |
chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> |
|
|
de23a9667c
|
prw2: Split PRW2.0 from metadata-wal-records feature (#16030)
Rationales: * metadata-wal-records might be deprecated and replaced going forward: https://github.com/prometheus/prometheus/issues/15911 * PRW 2.0 works without metadata just fine (although it sends untyped metrics as expected). Signed-off-by: bwplotka <bwplotka@gmail.com> |
|
|
9385f31147 |
scrape: Fix metadata in WAL not working for histograms and summaries.
There was a bug (due to confusion?) in the local metadata cache, which is keyed by metric family, not the series metric name. The fix is to NOT use that local cache at all (it's still needed for the current metadata API implementation; added a TODO on how we can get rid of it). I also renamed the Metric field in metadata structs to MetricFamily to make clear it's not always __name__. Signed-off-by: bwplotka <bwplotka@gmail.com> |
|
|
c8c128b0f1 |
fix TestDropOldTimeSeries on 32-bit
Signed-off-by: Joel Beckmeyer <joel@beckmeyer.us> |
|
|
0ef0b75a4f |
[TESTS] Remote-Write: Fix BenchmarkStartup
It was crashing due to uninitialized metrics, and not terminating due to incorrectly reading segment names. We need to export `SetMetrics` to avoid the first problem. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
af1a19fc78 |
enable errorf rule from perfsprint linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> |
|
|
6ebfbd2d54 |
chore!: adopt log/slog, remove go-kit/log
For: #14355 This commit updates Prometheus to adopt stdlib's log/slog package in favor of go-kit/log. As part of converting to use slog, several other related changes are required to get prometheus working, including: - removed unused logging util func `RateLimit()` - forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger - move some of the json file logging functionality to use prom/common package functionality - refactored some of the new json file logging for scraping - changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers - updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition - added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> |
|
|
005bd33fe2
|
support v2 proto for BenchmarkSampleSend (#14935)
Signed-off-by: Callum Styan <callumstyan@gmail.com> |
|
|
c328d5fc88
|
fix rwv2 build write request benchmark, also change how the memory usage (#14925)
is reported for these benchmarks to more accurately reflect what's actually allocated Signed-off-by: Callum Styan <callumstyan@gmail.com> |
|
|
50cd453c8f
|
chore: Fix typos (#14868)
* Fix typos --------- Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com> |
|
|
2110661121 |
fix: fix slice init length
Signed-off-by: cuishuang <imcusg@gmail.com> |
|
|
1561815732
|
remote write: increase time threshold for resharding (#14450)
Don't reshard if we haven't successfully sent a sample in the last shardUpdateDuration seconds. Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: kushagra Shukla <kushalshukla110@gmail.com> |
|
|
a60e5ce362
|
[PRW 2.0] Added Sender and RW Handler support for Response Stats. (#14444)
* [PRW 2.0] Added Sender support for Response Stats. Chained on top of https://github.com/prometheus/prometheus/pull/14427 Fixes https://github.com/prometheus/prometheus/issues/14359 Signed-off-by: bwplotka <bwplotka@gmail.com> * Addressed comments. Signed-off-by: bwplotka <bwplotka@gmail.com> * move write stats to it's own file Signed-off-by: Callum Styan <callumstyan@gmail.com> * Clean up header usage Signed-off-by: Callum Styan <callumstyan@gmail.com> * add missing license to new stats file Signed-off-by: Callum Styan <callumstyan@gmail.com> * Addressed all comments. Signed-off-by: bwplotka <bwplotka@gmail.com> --------- Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Callum Styan <callumstyan@gmail.com> |
|
|
caa71fb3c7 |
chore(storage/remote): collect maxTimestamp when value is 0 as well.
This change enables the PrometheusRemoteWriteBehind alert’s expression to be evaluated even when the remote endpoint has never been reached. As a result, PrometheusRemoteWriteBehind will fire to easily detect configuration mistakes (such as incorrect endpoint URLs) or unrecoverable connectivity issues. See https://github.com/prometheus/prometheus/issues/14350 for details. Signed-off-by: machine424 <ayoubmrini424@gmail.com> |
|
|
9198952f7c
|
[PRW 2.0] Merging `remote-write-2.0` feature branch to main (PRW 2.0 support + metadata in WAL) (#14395)
* Remote Write 1.1: e2e benchmarks (#13102) * Remote Write e2e benchmarks Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Prometheus ports automatically assigned Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * make dashboard editable + more modular to different job label values Signed-off-by: Callum Styan <callumstyan@gmail.com> * Dashboard improvements * memory stats * diffs look at counter increases Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * run script: absolute path for config templates Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * grafana dashboard improvements * show actual values of metrics * add memory stats and diff Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * dashboard changes Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Callum Styan <callumstyan@gmail.com> * replace snappy encoding library Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add new proto types Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add decode function for new write request proto Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add lookup table struct that is used to build the symbol table in new write request format Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Implement code paths for new proto format Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * update example server to include handler for new format Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Add new test client Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás 
Pazos <npazosmendez@gmail.com> * tests and new -> original proto mapping util Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add new proto support on receiver end Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Fix test Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * no-brainer copypaste but more performance write support Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove some comented code Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix mocks and fixture Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add basic reduce remote write handler benchmark Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * refactor out common code between write methods Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix: queue manager to include float histograms in new requests Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add sender-side tests and fix failing ones Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * refactor queue manager code to remove some duplication Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix build Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Improve sender benchmarks and some allocations Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Use github.com/golang/snappy Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * cleanup: remove hardcoded fake url for testing Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Add 1.1 version handling code Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Remove config, update proto Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * gofmt Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix NewWriteClient and change new flags wording Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fields rewording in handler Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remote write handler to checks version header Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix typo in log 
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * lint Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Add minmized remote write proto format Co-authored-by: Marco Pracucci <marco@pracucci.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add functions for translating between new proto formats symbol table and actual prometheus labels Co-authored-by: Marco Pracucci <marco@pracucci.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add functionality for new minimized remote write request format Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix minor things Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Make LabelSymbols a fixed32 Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove unused proto type Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * update tests Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix build for stringlabels tag Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Use two uint32 to encode (offset,leng) Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * manually optimize varint marshaling Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Use unsafe []byte->string cast to reuse buffer Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix writeRequestMinimizedFixture Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove all code from previous interning approach the 'minimized' version is now the only v1.1 version Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * minimally-tested exemplar support for rw 1.1 Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * refactor new version flag to make it easier to pick a specific 
format instead of having multiple flags, plus add new formats for testing Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * use exp slices for backwards compat. to go 1.20 plus add copyright header to test file Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix label ranging Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * Add bytes slice (instead of slice of 32bit vars) format for testing Co-authored-by: Nicolás Pazos <npazosmendez@gmail.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * test additional len and lenbytes formats Co-authored-by: Nicolás Pazos <npazosmendez@gmail.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove mistaken package lock changes Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove formats we've decided not to use Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove more format types we probably won't use Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * More cleanup Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * use require instead of assert in custom marshal test Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * cleanup; remove some unused functions Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * more cleanup, mostly linting fixes Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove package-lock.json change again 
Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * more cleanup, address review comments Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix test panic Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix minor lint issue + use labels Range function since it looks like the tests fail to do `range labels.Labels` on CI Signed-off-by: Callum Styan <callumstyan@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * new interning format based on []string indeces Co-authored-by: bwplotka <bwplotka@gmail.com> Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove all new rw formats but the []string one also adapt tests to the new format Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * cleanup rwSymbolTable Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * add some TODOs for later Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * don't reserve field 3 for new proto and add TODO Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix custom marshaling Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * lint Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * additional merge fixes Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * lint fixes Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * fix server example Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * revert package-lock.json changes Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * update example prometheus version Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * define separate proto types for remote write 2.0 Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * lint Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * rename new proto types and move to separate pkg Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * update prometheus version for 
example Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * make proto Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * make Metadata not nullable Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * remove old MinSample proto message Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com> * change enum names to fit buf build recommend enum naming and lint rules Signed-off-by: Callum Styan <callumstyan@gmail.com> * remote: Added test for classic histogram grouping when sending rw; Fixed queue manager test delay. (#13421) Signed-off-by: bwplotka <bwplotka@gmail.com> * Remote write v2: metadata support in every write request (#13394) * Approach bundling metadata along with samples and exemplars Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add first test; rebase with main Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Alternative approach: bundle metadata in TimeSeries protobuf Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * update go mod to match main branch Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix after rebase Signed-off-by: Callum Styan <callumstyan@gmail.com> * we're not going to modify the 1.X format anymore Signed-off-by: Callum Styan <callumstyan@gmail.com> * Modify AppendMetadata based on the fact that we be putting metadata into timeseries Signed-off-by: Callum Styan <callumstyan@gmail.com> * Rename enums for remote write versions to something that makes more sense + remove the added `sendMetadata` flag. 
Signed-off-by: Callum Styan <callumstyan@gmail.com> * rename flag that enables writing of metadata records to the WAL Signed-off-by: Callum Styan <callumstyan@gmail.com> * additional clean up Signed-off-by: Callum Styan <callumstyan@gmail.com> * lint Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix usage of require.Len Signed-off-by: Callum Styan <callumstyan@gmail.com> * some clean up from review comments Signed-off-by: Callum Styan <callumstyan@gmail.com> * more review fixes Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Paschalis Tsilias <paschalist0@gmail.com> * remote write 2.0: sync with `main` branch (#13510) * consoles: exclude iowait and steal from CPU Utilisation 'iowait' and 'steal' indicate specific idle/wait states, which shouldn't be counted into CPU Utilisation. Also see https://github.com/prometheus-operator/kube-prometheus/pull/796 and https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/667. Per the iostat man page: %idle Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request. %iowait Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request. %steal Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> * tsdb: shrink txRing with smaller integers 4 billion active transactions ought to be enough for anyone. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * tsdb: create isolation transaction slice on demand When Prometheus restarts it creates every series read in from the WAL, but many of those series will be finished, and never receive any more samples. 
By defering allocation of the txRing slice to when it is first needed, we save 32 bytes per stale series. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * add cluster variable to Overview dashboard Signed-off-by: Erik Sommer <ersotech@posteo.de> * promql: simplify Native Histogram arithmetics Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com> * Cut 2.49.0-rc.0 (#13270) * Cut 2.49.0-rc.0 Signed-off-by: bwplotka <bwplotka@gmail.com> * Removed the duplicate. Signed-off-by: bwplotka <bwplotka@gmail.com> --------- Signed-off-by: bwplotka <bwplotka@gmail.com> * Add unit protobuf parser Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it> * Go on adding protobuf parsing for unit Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it> * ui: create a reproduction for https://github.com/prometheus/prometheus/issues/13292 Signed-off-by: machine424 <ayoubmrini424@gmail.com> * Get conditional right Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it> * Get VM Scale Set NIC (#13283) Calling `*armnetwork.InterfacesClient.Get()` doesn't work for Scale Set VM NIC, because these use a different Resource ID format. Use `*armnetwork.InterfacesClient.GetVirtualMachineScaleSetNetworkInterface()` instead. This needs both the scale set name and the instance ID, so add an `InstanceID` field to the `virtualMachine` struct. `InstanceID` is empty for a VM that isn't a ScaleSetVM. 
Signed-off-by: Daniel Nicholls <daniel.nicholls@resdiary.com> * Cut v2.49.0-rc.1 Signed-off-by: bwplotka <bwplotka@gmail.com> * Delete debugging lines, amend error message for unit Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it> * Correct order in error message Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it> * Consider storage.ErrTooOldSample as non-retryable Signed-off-by: Daniel Kerbel <nmdanny@gmail.com> * scrape_test.go: Increase scrape interval in TestScrapeLoopCache to reduce potential flakiness Signed-off-by: machine424 <ayoubmrini424@gmail.com> * Avoid creating string for suffix, consider counters without _total suffix Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it> * build(deps): bump github.com/prometheus/client_golang Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.17.0 to 1.18.0. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](https://github.com/prometheus/client_golang/compare/v1.17.0...v1.18.0) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * build(deps): bump actions/setup-node from 3.8.1 to 4.0.1 Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3.8.1 to 4.0.1. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits]( |
|
|
00b110c65c
|
Fix data corruption in remote write if max_sample_age is applied (#14078)
* fix: try to reproduce the bug from https://github.com/prometheus/prometheus/issues/13979 in a test case Signed-off-by: David Vavra <sevenood@gmail.com> * fix: data corruption in remote write if max_sample_age is applied Signed-off-by: David Vavra <sevenood@gmail.com> * add benchmark for buildTimeSeries which does the filtering Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: David Vavra <sevenood@gmail.com> Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: David Vavra <sevenood@gmail.com> Co-authored-by: Callum Styan <callumstyan@gmail.com> |
|
|
35564c0cb0
|
Export remote.LabelsToLabelsProto() and remote.LabelProtosToLabels()
Signed-off-by: Marco Pracucci <marco@pracucci.com> |
|
|
f10c3454e9 |
Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com> |
|
|
6f595c6762
|
golangci-lint: enable whitespace linter (#13905)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> |
|
|
a09465baee
|
storage/remote: disable resharding during active retry backoffs (#13562)
* storage/remote: disable resharding during active retry backoffs Today, remote_write reshards based on pure throughput. This is problematic if throughput has been diminished because of HTTP 429s; increasing the number of shards due to backpressure will only exacerbate the problem. This commit disables resharding for twice the retry backoff, ensuring that resharding will never occur during an active backoff, and that resharding does not become enabled again until enough time has elapsed to allow any pending requests to be retried. Signed-off-by: Robert Fratto <robertfratto@gmail.com> * storage/remote: test that resharding is disabled on retry Signed-off-by: Robert Fratto <robertfratto@gmail.com> * storage/remote: address review feedback Signed-off-by: Robert Fratto <robertfratto@gmail.com> * storage/remote: track time where resharding initially got disabled This change introduces a second atomic int64 to roughly track when resharding got disabled. This int64 is only updated after updating the disabled timestamp if resharding was previously enabled. Signed-off-by: Robert Fratto <robertfratto@gmail.com> --------- Signed-off-by: Robert Fratto <robertfratto@gmail.com> |
|
|
2ac1632eec |
storage/remote: improve symbol-table handling
On the incoming path, `writeHandler.write()` creates a new table for each request. `labelProtosToLabels` takes a `ScratchBuilder` now. Call `NewScratchBuilder` as required in tests. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
8f525b4ba4 |
storage/remote tests: refactor: extract function newTestQueueManager
To reduce repetition. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
c0e36e6bb3 |
Standardise exemplar label as "trace_id"
This is consistent with the OpenTelemetry standard, and an example in OpenMetrics. https://github.com/open-telemetry/opentelemetry-specification/blob/89aa01348139/specification/metrics/data-model.md#exemplars https://github.com/OpenObservability/OpenMetrics/blob/138654493130/specification/OpenMetrics.md#exemplars-1 Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
17f48f2b3b |
Tests: use replacement DeepEquals in more places
Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
8655fe5401
|
Merge pull request #13491 from bboreham/faster-store-series
storage/remote: speed up StoreSeries by re-using labels.Builder |
|
|
b9fdf3dad1 |
storage/remote: document why two benchmarks are skipped
One was silently doing nothing; one was doing something but the work didn't go up linearly with iteration count. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
dcd024a095 |
storage/remote: speed up StoreSeries by re-using labels.Builder
Relabeling can take a pre-populated `Builder` instead of making a new one every time. This is much more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
d9483bb77c |
storage/remote: add BenchmarkStoreSeries
Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |
|
|
78c5ce3196
|
Drop old inmemory samples (#13002)
* Drop old inmemory samples Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Avoid copying timeseries when the feature is disabled Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Run gofmt Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Clarify docs Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Add more logging info Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Remove loggers Signed-off-by: Marc Tuduri <marctc@protonmail.com> * optimize function and add tests Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Simplify filter Signed-off-by: Marc Tuduri <marctc@protonmail.com> * rename var Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Update help info from metrics Signed-off-by: Marc Tuduri <marctc@protonmail.com> * use metrics to keep track of drop elements during buildWriteRequest Signed-off-by: Marc Tuduri <marctc@protonmail.com> * rename var in tests Signed-off-by: Marc Tuduri <marctc@protonmail.com> * pass time.Now as parameter Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Change buildwriterequest during retries Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Revert "Remove loggers" This reverts commit 54f91dfcae20488944162335ab4ad8be459df1ab. 
Signed-off-by: Marc Tuduri <marctc@protonmail.com> * use log level debug for loggers Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Fix linter Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove noisy debug-level logs; add 'reason' label to drop metrics Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove accidentally committed files Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Propagate logger to buildWriteRequest to log dropped data Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix docs comment Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make drop reason more specific Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove unnecessary pass of logger Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use snake_case for reason label Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix dropped samples metric Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> --------- Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> Signed-off-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> |
|
|
8065bef172 |
Move metric type definitions to common/model
They are used in multiple repos, so common is a better place for them. Several packages now don't depend on `model/textparse`, e.g. `storage/remote`. Also remove `metadata` struct from `api.go`, since it was identical to a struct in the `metadata` package. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> |