Commit Graph

278 Commits

Author SHA1 Message Date
Ganesh Vernekar 5b1e6fe398 Take the stale series compaction config from the config file
CI / Go tests (push) Has been cancelled Details
CI / More Go tests (push) Has been cancelled Details
CI / Go tests with previous Go version (push) Has been cancelled Details
CI / UI tests (push) Has been cancelled Details
CI / Go tests on Windows (push) Has been cancelled Details
CI / Mixins tests (push) Has been cancelled Details
CI / Build Prometheus for common architectures (0) (push) Has been cancelled Details
CI / Build Prometheus for common architectures (1) (push) Has been cancelled Details
CI / Build Prometheus for common architectures (2) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (0) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (1) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (10) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (11) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (2) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (3) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (4) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (5) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (6) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (7) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (8) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (9) (push) Has been cancelled Details
CI / Check generated parser (push) Has been cancelled Details
CI / golangci-lint (push) Has been cancelled Details
CI / fuzzing (push) Has been cancelled Details
CI / codeql (push) Has been cancelled Details
CI / Report status of build Prometheus for all architectures (push) Has been cancelled Details
CI / Publish main branch artifacts (push) Has been cancelled Details
CI / Publish release artefacts (push) Has been cancelled Details
CI / Publish UI on npm Registry (push) Has been cancelled Details
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-25 18:40:28 -07:00
Ganesh Vernekar 3b79fb207e Ignore compacted stale series from WAL
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-25 15:11:17 -07:00
Ganesh Vernekar 5349f2ceaa Add DB Options to trigger staleseries compactions
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-25 15:11:17 -07:00
Ganesh Vernekar de8ea977a9 Add support for stale series compaction to TSDB
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-25 15:11:17 -07:00
Bryan Boreham 498f63e60b
Merge pull request #17029 from pr00se/wal-checkpoint-dropped-samples
TSDB: use timestamps rather than WAL segment numbers to track how long deleted series should be retained in checkpoints
2025-08-20 11:15:10 +01:00
Ganesh Vernekar a86d9a3858
Merge pull request #16925 from prometheus/codesome/stale-series-tracking
tsdb: Track stale series in the Head block based on stale sample
2025-08-19 15:35:19 -07:00
Ganesh Vernekar 7a947d3629 Track stale series in the Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:27 -07:00
Patryk Prus 5cb0192626
Address linter errors
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:25:14 -04:00
Patryk Prus 0fea41ed53
Refactor keep function to work for both agent and non-agent implementations
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:47 -04:00
Patryk Prus 6875022873
Update head.walExpiries with record timestamps during WAL replay
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:47 -04:00
Patryk Prus 218558f543
Store mint rather than the last WAL segment in head.walExpiries during head GC
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:41 -04:00
Matthieu MOREL cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
Bryan Boreham a11772234d
Merge pull request #16333 from colega/fix-series-create-gc-race
fix: race condition between series creation and garbage collection
2025-04-17 12:15:11 +01:00
machine424 a825d448da feat(tsdb/(head|agent)): dereference the pools at the end of the WL replay to
not wait for an extra GC cycle until the built-in cleanup mechanism
kicks in

See https://github.com/prometheus/prometheus/pull/15778

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-04-17 13:06:08 +02:00
Ben Kochie a721daf981
Log WAL segment loading time (#16336)
Improve readability of "WAL segment loaded" by logging the duration
of each load. This helps make it easier to spot slow WAL file load
times.

Signed-off-by: SuperQ <superq@gmail.com>
2025-03-31 06:05:14 +02:00
Oleg Zaytsev e4fe8d8684
Create memSeries with pendingCommit=true
This fixes TestHead_RaceBetweenSeriesCreationAndGC.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2025-03-27 11:11:57 +01:00
Matthieu MOREL 5fa1146e21
chore: enable gci linter (#16245)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-03-22 15:46:13 +00:00
Fiona Liao 37c2ebb5fd
Make out-of-order native histograms flag a no-op and always enable (#16207)
* Remove experimental out-of-order native histogram flag

This feature has been available in Prometheus since September 2024,
and has no known issues. Therefore proposing to remove the flag
entirely and always have it on. Note that there are still two
settings that need to be configured (out-of-order time window > 0
and native histograms enabled) for this feature to work.

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Keep feature flag with warning

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update CHANGELOG.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Additional cleanup of comments and test names

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-03-18 10:59:02 +00:00
Patryk Prus 401dbacf2e
Add counters for unknown series references during WAL/WBL replay
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-03-17 15:17:53 -04:00
Patryk Prus 61aa82865d
TSDB: keep duplicate series records in checkpoints while their samples may still be present (#16060)
Renames the head's deleted map to walExpiries, and creates entries for any
duplicate series records encountered during WAL replay, with the expiry set
to the highest current WAL segment number. Any subsequent WAL
checkpoints will see the duplicate series entry in the walExpiries map, and
keep the series record until the last WAL segment that could contain its
samples is deleted.

Other considerations:

WBL: series records aren't written to the WBL, so there are no duplicates to deal with
agent mode: has its own WAL replay logic that handles duplicate series records differently, and is outside the scope of this PR
2025-03-05 13:45:08 -05:00
Arve Knudsen 7cbf749096
Upgrade to github.com/oklog/ulid/v2 (#16168)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-03-05 16:03:25 +01:00
Bryan Boreham 42d55505f9
Merge pull request #12659 from prymitive/memChunk
Short-cut common memChunk operations
2025-02-25 11:33:56 +00:00
machine424 d644324407
feat(tsdb/(head|agent)): reuse pools across segments to avoid generating garbage during WL replay
This is part of the "reduce WAL replay overhead/garbage" effort to help with https://github.com/prometheus/prometheus/issues/6934.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-02-10 22:40:24 +01:00
Alan Protasio 9d1abbb9ed
Call PostCreation callback only after the new series is added to the mempotings (#15579)
Signed-off-by: alanprot <alanprot@gmail.com>
2025-01-28 12:11:58 +01:00
Bryan Boreham ac4f8a5e23
[ENHANCEMENT] TSDB: Improve calculation of space used by labels (#13880)
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels

The labels for each series in the Head take up some some space in the
Postings index, but far more space in the `memSeries` structure.

Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.

Provide three implementations of this function for the three Labels
versions.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-16 09:42:52 +00:00
Pedro Tanaka bab587b9dc
Agent: allow for ingestion of CT samples (#15124)
* Remove unused option from HeadOptions

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Improve docs for appendable() method in head appender

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Ingest CT (float) samples in Agent DB

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* allow for ingestion of CT native histogram

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding some verification for ct ts

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Validating CT histogram before append and add newly created series to pending series

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* checking the wal for written samples

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Checking for samples in test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding case for validations

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* fixing comparison when dedupelabels is enabled

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* unite tests, use table testing

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Implement CT related methods in timestampTracker for write storage

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding error case to test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* removing unused fields

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Updating lastTs for series when adding CT to invalidate duplicates

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* making sure that updating the lastTS wont cause OOO later on in Commit();

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

---------

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-27 01:06:34 +01:00
Łukasz Mierzwa b6e22cd346 Short-cut common memChunk operations
memChunk is a linked list, speed up some common operations when there's no need to iterate all elements on the list.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2024-10-25 12:19:20 +01:00
TJ Hoplock 6ebfbd2d54 chore!: adopt log/slog, remove go-kit/log
For: #14355

This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-07 15:58:50 -04:00
György Krajcsovits 44ebbb8458 Fix missing histogram copy in sampleRing
The specialized version of sample add to the ring:
func addH(s hSample, buf []hSample, r *sampleRing) []hSample
func addFH(s fhSample, buf []fhSample, r *sampleRing) []fhSample
already correctly copy histogram samples from the reused hReader, fhReader
buffers, but the generic version does not. This means that the
data is overwritten on the next read if the sample ring has seen histogram
and float samples at the same time and switched to generic mode.

The `genericAdd` function (which was commented anyway) is by now quite
different from the specialized functions so that this commit deletes
it.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-02 13:57:28 +02:00
Carrie Edwards 14e3c05ce8
tsdb: Add support for ingestion of out-of-order native histogram samples (#14546)
Add support for ingesting OOO native histograms

* Add flag for enabling and disabling OOO native histogram ingestion
* Update OOO querying tests to include native histogram samples
* Add OOO head tests
* Add test for OOO native histogram counter reset headers

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored by: Fiona Liao <fiona.liao@grafana.com>
2024-09-17 11:19:06 +02:00
Nathan Baulch 50cd453c8f
chore: Fix typos (#14868)
* Fix typos

---------

Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>
2024-09-10 22:32:03 +02:00
George Krajcsovits 536d9f9ce9
BUGFIX: TSDB: panic in query during truncation with OOO head (#14831)
Check if headQuerier is nil before trying to use it.

* TestQueryOOOHeadDuringTruncate: unit test to check query during truncate
Regression test for #14822

* Simulate race between query and Compact()

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-05 17:17:42 +01:00
Marco Pracucci ef649d5968
Revert " Store `mmMaxTime` in same field as `seriesShard`"
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-08-26 08:56:16 +02:00
Oleg Zaytsev 0300ad58a9
Revert the option regardless of error
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 11:31:31 +02:00
Oleg Zaytsev d8e1b6bdfd
Store mmMaxTime in same field as seriesShard
We don't use seriesShard during DB initialization, so we can use the
same 8 bytes to store mmMaxTime, and save those during the rest of the
lifetime of the database.

This doesn't affect CPU performance.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 10:20:29 +02:00
Bryan Boreham d878146c70 TSDB: shrink memSeries by moving bools together
In each case the following member requires 8-byte alignment, so moving
one beside the other shrinks memSeries from 176 to 168 bytes, when
compiled with `-tags stringlabels`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-07-15 16:09:48 +01:00
Bryan Boreham 709c5d6fc3
TSDB: Lock around access to labels in head under -tags dedupelabels (#14322)
* TSDB: Document what needs locking in memSeries

* TSDB: Lock around access to series labels

So we can modify them to reset the symbol-table.

* TSDB: Make label locking conditional on build tag

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-07-05 10:11:32 +01:00
Oleg Zaytsev fd1a89b7c8
Pass affected labels to `MemPostings.Delete()` (#14307)
* Pass affected labels to MemPostings.Delete

As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.

This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc() where we'll pay the price of iterating
the labels of each one of the deleted series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 10:28:56 +00:00
Alan Protasio 8894d65cd6
Fix head stats and hooks when replaying a corrupted snapshot (#14079)
* Fixing head stats and hooks when replaying a corrupted snapshot

Signed-off-by: alanprot <alanprot@gmail.com>

* Fixing create/removed series metrics

Signed-off-by: alanprot <alanprot@gmail.com>

* Refactoring to have common code between gc and flush method

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-05-24 22:43:21 -04:00
Nick Pillitteri 481f14e1c0
TSDB: Don't rely on integer overflow in head compaction check (#13755)
* TSDB: Don't compact the head block when empty

Don't compact the Head block if there have not yet been any samples
appended.

Previously, the logic for determining if the head should be compacted
relied on the default values for min and max time and integer overflow
when they were checked in `Head.compactable()`. The check in
`Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64`
which overflowed and wrapped to `1`. Since `1` is less than `1.5`
times the chunk range, compaction did not happen. This was the correct
behavior but relying on overflow wrapping is surprising.

This change add a method for checking if the min and max time for the
head is unset and uses it to short-circuit compaction in that case.
It also replaces several explicit checks for the default value to
determine if the head has not yet had any samples added.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2024-03-26 12:17:38 +01:00
Ben Ye ceca6c4716
[ENHANCEMENT] TSDB: Log more statistics during startup (#13838)
* log chunk snapshot and mmap chunks replay duration together with total replay duration

Signed-off-by: Ben Ye <benye@amazon.com>
2024-03-26 11:16:27 +00:00
György Krajcsovits 4d4d822c36 Add native histograms to latency/duration metrics
Dogfood native histograms.
Allow dependent projects to migrate to native histograms.

I took the defaults from client_golang.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-01 14:44:38 +01:00
Bryan Boreham 93b72ec5dd tsdb: create SymbolTables for labels as required
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Fiona Liao 52389647b2 Add type label to outOfOrderSamplesAppended metric
Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
2024-02-19 15:24:39 +00:00
Bryan Boreham cd4562d3a6
Merge pull request #13473 from bboreham/pure-mutex
tsdb: use cheaper Mutex on series
2024-01-30 09:57:08 +00:00
Marco Pracucci 501bc6419e
Add ShardedPostings() support to TSDB (#10421)
This PR is a reference implementation of the proposal described in #10420.

In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing).

Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-29 11:57:27 +00:00
Bryan Boreham 66237c1996 tsdb: use cheaper Mutex on series
Mutex is 8 bytes; RWMutex is 24 bytes and much more complicated. Since
`RLock` is only used in two places, `UpdateMetadata` and `Delete`,
neither of which are hotspots, we should use the cheaper one.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-26 11:01:39 +00:00
Bryan Boreham b9eab6e4b8
tsdb: simplify internal series delete function (#13261)
Lifting an optimisation from Agent code, `seriesHashmap.del` can use
the unique series reference, doesn't need to check Labels.
Also streamline the logic for deleting from `unique` and `conflicts` maps,
and add some comments to help the next person.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-25 11:57:54 +01:00
Bryan Boreham 3f30ad3cc2
Merge pull request #13015 from bboreham/smaller-txring
tsdb: make transaction isolation data structures smaller
2024-01-25 10:48:15 +00:00
Matthieu MOREL 8f6cf3aabb tsdb: use Go standard errors
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-11 12:18:54 +00:00