Commit Graph

1441 Commits

Author SHA1 Message Date
Łukasz Mierzwa b880cea613 Fix locks in db.reloadBlocks()
This partially reverts ae3d392aa9.

ae3d392aa9 added a call to db.mtx.Lock() that lasts for the entire duration of db.reloadBlocks();
previously db.mtx would be locked only during the critical part of db.reloadBlocks().
The motivation was to protect against races:
9e0351e161 (r555699794)
The 'reloads' being mentioned are (I think) reloadBlocks() calls, rather than db.reload() or other methods.
TestTombstoneCleanRetentionLimitsRace was added to catch this but I wasn't able to ever get any error out of it, even after disabling all calls to db.mtx in reloadBlocks() and CleanTombstones().
To make things more complicated, CleanTombstones() itself calls reloadBlocks(), so it seems that the real issue is that we might have concurrent calls to reloadBlocks().

The problem with this change is that db.reloadBlocks() can take a very long time, because it might need to load very large blocks from disk, which is slow.
While db.mtx is locked a large chunk of the db is locked, including queries, since db.mtx read lock is needed for db.Querier() call.
One of the issues this manifests itself as is a gap in all metrics and blocked queries just after a large block compaction happens.
When compaction merges multiple day-or-more blocks into a week-or-more block it creates a single very big block.
After that block is written it needs to be loaded and that seems to be taking many seconds (30-45), during which mtx is held and everything is blocked.

Turns out that there is another lock that is more fine grained and aimed at this specific use case:

// cmtx ensures that compactions and deletions don't run simultaneously.
cmtx sync.Mutex

All calls to reloadBlocks() are wrapped inside cmtx lock. The only exception is db.reload() which this change fixes.
We can't add cmtx lock inside reloadBlocks() itself because it's called by a number of functions, some of which are already holding cmtx.

Looking at the code I think it is sufficient to hold cmtx and skip a reloadBlocks() wide mtx call.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2025-01-09 17:05:39 +00:00
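The locking arrangement described in the b880cea613 message can be sketched as follows. This is a minimal illustration, not the actual TSDB code: the `DB` struct, `loadBlocksFromDisk`, and the block names are hypothetical stand-ins, and only the two locks named in the message (`cmtx` and `mtx`) are modelled. The point it tries to show is that the slow load runs under `cmtx` alone, while `mtx` is held only for the short swap that queriers must wait on.

```go
package main

import (
	"fmt"
	"sync"
)

// DB is a hypothetical stand-in for the TSDB type; only the two locks and a
// block list are modelled here.
type DB struct {
	cmtx   sync.Mutex   // serializes compactions, deletions and reloads
	mtx    sync.RWMutex // guards blocks; queriers take the read lock
	blocks []string
}

// loadBlocksFromDisk stands in for the slow part of a reload (assumption).
func loadBlocksFromDisk() []string {
	return []string{"block-a", "block-b"}
}

// reloadBlocks assumes the caller already holds cmtx. The slow load happens
// without mtx; mtx is taken only to swap in the result.
func (db *DB) reloadBlocks() {
	loaded := loadBlocksFromDisk()

	db.mtx.Lock()
	db.blocks = loaded
	db.mtx.Unlock()
}

// reload wraps reloadBlocks in cmtx so that concurrent reloads cannot overlap.
func (db *DB) reload() {
	db.cmtx.Lock()
	defer db.cmtx.Unlock()
	db.reloadBlocks()
}

// Querier needs only the read lock, so it is blocked just for the swap.
func (db *DB) Querier() []string {
	db.mtx.RLock()
	defer db.mtx.RUnlock()
	return append([]string(nil), db.blocks...)
}

func main() {
	db := &DB{}
	db.reload()
	fmt.Println(db.Querier())
}
```

The lock is taken in `reload` rather than inside `reloadBlocks` for the reason given in the message: some callers of `reloadBlocks` already hold `cmtx`, and `sync.Mutex` is not reentrant.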
Arve Knudsen f030894c2c
Fix issues raised by staticcheck (#15722)
Fix issues raised by staticcheck

We are not enabling staticcheck explicitly, though, because it has too many false positives.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-01-09 17:51:26 +01:00
Ben Ye 919a5b657e
Expose ListPostings Length via Len() method (#15678)
tsdb: expose remaining ListPostings Length

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-07 17:58:26 +01:00
György Krajcsovits 1e420ef373 Merge branch 'main' into cedwards/nhcb-wal-wbl
# Conflicts:
#	tsdb/tsdbutil/histogram.go
2025-01-02 12:50:19 +01:00
György Krajcsovits a7ccc8e091 record_test.go: avoid captures, simply return test refs
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-01-02 12:45:20 +01:00
Bryan Boreham 096e2aa7bd
Merge pull request #14518 from bboreham/faster-listpostings-merge
TSDB: Optimization: Merge postings using concrete type
2025-01-02 10:43:45 +00:00
Bryan Boreham b2fa1c9524 TSDB benchmarks: Commit periodically to speed up init
When creating dummy data for benchmarks, call `Commit()` periodically to
avoid growing the appender to enormous size.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-30 17:42:56 +00:00
johncming 061400e31b
tsdb: export CheckpointPrefix constant (#15636)
Exported the CheckpointPrefix constant to be used in other packages.
Updated references to the constant in db.go and checkpoint.go files.
This change improves code readability and maintainability.

Signed-off-by: johncming <johncming@yahoo.com>
Co-authored-by: johncming <conjohn668@gmail.com>
2024-12-29 17:54:45 +01:00
Carrie Edwards 1508149184 Update benchmark test and comment 2024-12-27 09:09:13 -08:00
Bryan Boreham cfa32f3d28 TSDB: Move merge of head postings into index
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.

Remove the `Get` member which was not used for anything else, and fix up
tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 19:22:30 +00:00
Bryan Boreham 0a8779f46d TSDB: Make mergedPostings generic
Now we can call it with more specific types which is more efficient than
making everything go through the `Postings` interface.

Benchmark the concrete type.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 17:09:21 +00:00
Bryan Boreham 1b22242024 TSDB BenchmarkMerge: run fewer sizes
As long as we run small and big sizes, we don't need all the sizes in between.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 17:09:21 +00:00
Bryan Boreham e630ffdbed TSDB: extend BenchmarkMemPostings_PostingsForLabelMatching to check merge speed
We need to create more postings entries so the merger has some work to do.
Not material for the regexp ones as they match so few series.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 17:09:21 +00:00
Björn Rabenstein 318d6bc4bf
Merge pull request #15548 from TinfoilSubmarine/fix/386-test-failures
test: fixes for 32-bit archs
2024-12-18 15:49:30 +01:00
Björn Rabenstein ff398062cb
Merge pull request #15679 from colega/update-comment-on-mempostings-lvs
Update comment on MemPostings.lvs
2024-12-17 19:41:56 +01:00
Oleg Zaytsev c8359fcd6b
Fix bug in lbl!~".+" shortcut (#15684)
We were appending to the wrong slice, so instead of removing values, we
were adding them.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-12-17 17:34:24 +01:00
Oleg Zaytsev 17d5bc4e54
Update comment on MemPostings.lvs
There was a missing verb there.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-12-16 17:20:51 +01:00
Joel Beckmeyer 39f5a07236 fix TestOOOHeadChunkReader_Chunk on 32-bit
Signed-off-by: Joel Beckmeyer <joel@beckmeyer.us>
2024-12-16 10:45:07 -05:00
Bryan Boreham ac4f8a5e23
[ENHANCEMENT] TSDB: Improve calculation of space used by labels (#13880)
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels

The labels for each series in the Head take up some space in the
Postings index, but far more space in the `memSeries` structure.

Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.

Provide three implementations of this function for the three Labels
versions.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-16 09:42:52 +00:00
David Ashpole 953a873342
update links to openmetrics to reference the v1.0.0 release
Signed-off-by: David Ashpole <dashpole@google.com>
2024-12-13 21:32:27 +00:00
György Krajcsovits df88de5800 Fix lint for real
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 12:52:01 +01:00
György Krajcsovits cf36792e14 Fix unused import
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 12:49:28 +01:00
György Krajcsovits fdb1516af1 Fix lint errors
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 12:47:43 +01:00
György Krajcsovits d64d1c4c0a Benchmark encoding classic and nhcb
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 10:59:06 +01:00
György Krajcsovits a325ff142c fix(test): do not run automatic WAL truncate during test
Remove the 2 minute timeout as the default is 2 hours and wouldn't
interfere with the test. Otherwise the extra samples combined with
race detection can push the test over 2 minutes and make it fail.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 17:30:46 +01:00
György Krajcsovits 07276aeece fix(test): if we are dereferencing a slice we should check its len
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 16:25:50 +01:00
György Krajcsovits 8f572fe905 fix(lint): linter errors
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 16:25:20 +01:00
György Krajcsovits b94c87bea6 fix(test): TestCheckpoint segment size too low
The segment size was too low for the additional NHCB data, thus it created
more segments than expected. This meant that less were in the lower
numbered segments, which meant more was kept.

FAIL: TestCheckpoint (4.05s)
  FAIL: TestCheckpoint/compress=none (0.22s)
        checkpoint_test.go:361:
            	Error Trace:	/home/krajo/go/github.com/prometheus/prometheus/tsdb/wlog/checkpoint_test.go:361
            	Error:      	"0.8586956521739131" is not less than "0.8"
            	Test:       	TestCheckpoint/compress=none

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 16:16:46 +01:00
György Krajcsovits efdd0880c1 Merge branch 'main' into cedwards/nhcb-wal-wbl
# Conflicts:
#	tsdb/docs/format/wal.md
2024-12-10 14:33:35 +01:00
bwplotka eeef17ea0a docs: Added native histogram WAL record documentation.
Signed-off-by: bwplotka <bwplotka@gmail.com>
2024-12-09 11:47:28 +00:00
Carrie Edwards 1933ccc9be Fix test 2024-12-06 14:55:19 -08:00
Carrie Edwards a046417bc0 Use new record type only for NHCB 2024-12-06 13:46:20 -08:00
Carrie Edwards 45944c1847 Extend tsdb agent tests with custom bucket histograms 2024-12-05 09:21:47 -08:00
Carrie Edwards 6b44c1437f Fix comment and histogram record string 2024-12-05 09:21:47 -08:00
Carrie Edwards f8a39767a4 Update WAL doc to include native histogram encodings 2024-12-05 09:21:47 -08:00
Carrie Edwards 6684344026 Rename old histogram record type, use old names for new records 2024-12-05 09:21:47 -08:00
Carrie Edwards 454f6d39ca Add separate handling for histograms and custom bucket histograms 2024-12-05 09:21:47 -08:00
Carrie Edwards 37df50adb9 Attempt for record type 2024-12-05 09:21:47 -08:00
Carrie Edwards cfcd51538d Remove references to custom values record 2024-12-05 09:21:47 -08:00
Carrie Edwards 6d413fad36 Use histogram records for custom value handling 2024-12-05 09:21:47 -08:00
Carrie Edwards aa144b7263 Handle custom buckets in WAL and WBL 2024-12-05 09:21:47 -08:00
Antoine Pultier f1340bac64
documentation: put back trailing punctuation.
markdownlint wasn't happy about the trailing punctuation in the headings.

Signed-off-by: Antoine Pultier <antoine.pultier@sintef.no>
2024-12-03 14:36:56 +01:00
Antoine Pultier 5c2fd7988b
Merge remote-tracking branch 'upstream/main' into patch-2
Signed-off-by: Antoine Pultier <antoine.pultier@sintef.no>
2024-12-03 14:32:28 +01:00
Antoine Pultier 6046769941
tsdb documentation: Improve Chunk documentation
Signed-off-by: Antoine Pultier <45740+fungiboletus@users.noreply.github.com>

Signed-off-by: Antoine Pultier <45740+fungiboletus@users.noreply.github.com>
2024-12-03 14:24:50 +01:00
Oleg Zaytsev cd1f8ac129
MemPostings: keep a map of label values slices (#15426)
While investigating lock contention on `MemPostings`, we saw that lots
of locking is happening in `LabelValues` and
`PostingsForLabelsMatching`, both copying the label values slices while
holding the mutex.

This adds an extra map that holds an append-only label values slice for
each one of the label names. Since the slice is append-only, it can be
copied without holding the mutex.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-29 12:52:56 +01:00
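The append-only idea from the cd1f8ac129 message can be sketched as below. This is an illustration under assumptions, not the `MemPostings` implementation: the `labelValues` type and its methods are hypothetical. The reader takes the slice header under the read lock and copies the elements after releasing it, which is safe only because writers never modify existing elements.

```go
package main

import (
	"fmt"
	"sync"
)

// labelValues is a hypothetical illustration, not the MemPostings type.
type labelValues struct {
	mtx sync.RWMutex
	lvs map[string][]string // every slice here is append-only
}

// add appends a value under the write lock. Existing elements are never
// changed or removed.
func (l *labelValues) add(name, value string) {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	l.lvs[name] = append(l.lvs[name], value)
}

// values reads the slice header under the read lock and copies the elements
// after releasing it. A later append either writes past len(snapshot) or
// moves to a new backing array, so the copied range is never written to.
func (l *labelValues) values(name string) []string {
	l.mtx.RLock()
	snapshot := l.lvs[name]
	l.mtx.RUnlock()

	out := make([]string, len(snapshot))
	copy(out, snapshot)
	return out
}

func main() {
	l := &labelValues{lvs: map[string][]string{}}
	l.add("job", "api")
	l.add("job", "db")
	fmt.Println(l.values("job"))
}
```

The copy is the expensive part for labels with many values, so moving it outside the critical section shortens the time the mutex is held to a map lookup.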
Charles Korn 96adc410ba
tsdb/chunkenc: don't reuse custom value slices between histograms
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-11-29 16:28:09 +11:00
Oleg Zaytsev 9ad93ba8df
Optimize l=~".+" matcher (#15474)
Since dot is matching newline now, `l=~".+"` is "any non empty label
value", and #14144 added a specific method in the index for that so we
don't need to run the matcher on each one of the label values.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-27 12:33:20 +01:00
Bryan Boreham ca3119bd24 TSDB: eliminate one yolostring
When the only use of a []byte->string conversion is as a map key, Go
doesn't allocate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-26 17:21:55 +00:00
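The compiler behaviour mentioned in the ca3119bd24 message applies to lookups written in the form `m[string(b)]`. A minimal sketch, with a hypothetical `lookup` helper and made-up map contents:

```go
package main

import "fmt"

// lookup converts key to a string only inside the map index expression.
// The Go compiler recognizes this form and performs the lookup without
// allocating a new string, so no unsafe conversion is needed here.
func lookup(m map[string]int, key []byte) (int, bool) {
	v, ok := m[string(key)]
	return v, ok
}

func main() {
	symbols := map[string]int{"job": 1, "instance": 2}
	v, ok := lookup(symbols, []byte("instance"))
	fmt.Println(v, ok) // prints "2 true"
}
```

Assigning the converted string to a variable first, or using it to insert a new key, would generally require a real copy, so the conversion has to stay inside the index expression.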
Bryan Boreham e98c19c1ce [PERF] TSDB: Cache all symbols for compaction
Trade a bit more memory for a lot less CPU spent looking up symbols.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-26 17:21:55 +00:00
Oleg Zaytsev 9aa6e041d3
MemPostings: allocate ListPostings once in PFALV (#15465)
Same as #15427 but for the new method added in #14144

Instead of allocating each ListPostings one by one, allocate them all in
one go.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-26 16:03:45 +01:00
DC d535d501d1
[DOCS] Improve description of WAL record format (#14936)
Signed-off-by: DC <413331538@qq.com>
2024-11-26 11:48:17 +00:00
Bryan Boreham dd0252a774
Merge pull request #15380 from bboreham/improve-loadwbl
[BUGFIX] TSDB: Apply fixes from loadWAL to loadWBL
2024-11-25 17:31:49 +00:00
Bryan Boreham 7996a13fdd
Merge pull request #15403 from bboreham/fix-rw-benchmark-startup
[TESTS] Remote-Write: Fix BenchmarkStartup
2024-11-25 17:31:24 +00:00
Oleg Zaytsev cc390aab64
MemPostings: allocate ListPostings once in PFLM (#15427)
Instead of allocating ListPostings pointers one by one, allocate a slice
and take pointers from that. It's faster, and also generates less
garbage (NewListPostings is one of the top offenders in number of
allocations).

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-20 17:52:20 +01:00
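The allocation pattern described in the cc390aab64 message can be sketched as follows. The `listPostings` struct and the `newAll` helper are hypothetical stand-ins, not the real `ListPostings` type or its constructor. The structs are allocated as one slice, and the returned pointers point into it.

```go
package main

import "fmt"

// listPostings is a hypothetical stand-in for the real postings type.
type listPostings struct {
	list []uint64
}

// newAll allocates all structs in one backing slice and hands out pointers
// into it, instead of allocating each struct separately.
func newAll(lists [][]uint64) []*listPostings {
	backing := make([]listPostings, len(lists)) // one allocation for all structs
	out := make([]*listPostings, len(lists))
	for i, l := range lists {
		backing[i].list = l
		out[i] = &backing[i]
	}
	return out
}

func main() {
	ps := newAll([][]uint64{{1, 2}, {3}, {4, 5, 6}})
	for _, p := range ps {
		fmt.Println(p.list)
	}
}
```

One trade-off of this layout is that the whole backing slice stays reachable for as long as any single pointer into it is retained.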
Arve Knudsen 89bbb885e5
Upgrade to golangci-lint v1.62.0 (#15424)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-20 17:22:20 +01:00
Björn Rabenstein 384c5951ef
Merge pull request #14489 from harry671003/implement_metadata_limit
storage: Implement limit in mergeGenericQuerier
2024-11-19 17:32:16 +01:00
Arve Knudsen 06d54fcc6c
[PERF] TSDB: Optimize inverse matching (#14144)
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching.

Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.

The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage. 

Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-19 15:49:01 +00:00
Bryan Boreham 0ef0b75a4f [TESTS] Remote-Write: Fix BenchmarkStartup
It was crashing due to uninitialized metrics, and not terminating due to
incorrectly reading segment names.

We need to export `SetMetrics` to avoid the first problem.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-15 11:22:07 +00:00
Fiona Liao c599d37668
Always return unknown hint for first sample in non-gauge histogram chunk (#15343)
Always return unknown hint for first sample in non-gauge histogram chunk

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-11-12 15:14:06 +01:00
Bryan Boreham 5450e6d368 [BUGFIX] TSDB: Apply fixes from loadWAL to loadWBL
Move a couple of variables inside the scope of a goroutine, to avoid
data races.

Use `zeropool` to reduce garbage and avoid some lint warnings.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-11 18:41:33 +00:00
Ben Ye 140f4aa9ae
feat: Allow customizing TSDB postings decoder (#13567)
* allow customizing TSDB postings decoder

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-11 07:59:24 +01:00
Ben Ye f9057544cb
Fix AllPostings added twice (#13893)
* handle all postings added twice

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-10 18:17:21 +01:00
🌲 Harry 🌊 John 🏔 f9bc50b247 storage: Implement limit in mergeGenericQuerier
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-11-07 09:08:23 -08:00
Bryan Boreham f42b37ff2f
[BUGFIX] TSDB: Fix race on stale values in headAppender (#15322)
* [BUGFIX] TSDB: Fix race on stale values in headAppender

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Simplify

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-06 16:51:39 +01:00
Matthieu MOREL af1a19fc78 enable errorf rule from perfsprint linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-11-06 16:50:36 +01:00
Bryan Boreham 02aa6d1de6
Merge pull request #15338 from bboreham/cosmetic-tsdb
[COMMENT] Remove duplicate line
2024-11-05 12:03:04 +00:00
Oleg Zaytsev b1e4052682
MemPostings.Delete(): make pauses to unlock and let the readers read (#15242)
This introduces back some unlocking that was removed in #13286 but in a
more balanced way, as suggested by @pracucci.

For TSDBs with a lot of churn, Delete() can take a couple of seconds,
and while it's holding the mutex, reads and writes are blocked waiting
for that mutex, increasing the number of connections handled and memory
usage.

This implementation pauses every 4K labels processed (note that also
compared to #13286 we're not processing all the label-values anymore,
but only the affected ones, because of #14307), makes sure that it's
possible to get the read lock, and waits for a few milliseconds more.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2024-11-05 12:59:57 +01:00
Bryan Boreham 541c7fd9fe [COMMENT] Remove duplicate line
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-05 11:03:40 +00:00
Alban Hurtaud 4b56af7eb8
Add hidden flag for the delayed compaction random time window (#14919)
* Add hidden flag for the delayed compaction random time window

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Update cmd/prometheus/main.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>

* Update cmd/prometheus/main.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>

* Update tsdb/db.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>

* Fix flag name according to review - add test for delay

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Fix after main rebase

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Implement review comments

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Update generatedelaytest to try with limit values

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

---------

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
2024-11-04 08:26:26 +01:00
Bryan Boreham 2fbbfc3da8 Revert "Fix `MemPostings.Add` and `MemPostings.Get` data race (#15141)"
This reverts commit 50ef0dc954.

Memory allocation goes so high in Prombench that the system is unusable.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-03 12:30:34 +00:00
Bryan Boreham e2e01c1cff
Merge pull request #15216 from yeya24/log-last-series-labels
log last series labelset when hitting OOO series labels
2024-11-01 14:15:39 +00:00
Oleg Zaytsev ba11a55df4
Revert "Process `MemPostings.Delete()` with `GOMAXPROCS` workers"
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-10-29 17:13:40 +01:00
Nicolas Takashi b6c538972c
[REFACTOR] simplify appender commit (#15112)
* [REFACTOR] simplify appender commit

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-10-29 12:34:02 +00:00
Arve Knudsen 706dcfeecf
tsdb.CircularExemplarStorage: Avoid racing (#15231)
* tsdb.CircularExemplarStorage: Avoid racing

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-10-29 10:40:46 +01:00
Pedro Tanaka bab587b9dc
Agent: allow for ingestion of CT samples (#15124)
* Remove unused option from HeadOptions

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Improve docs for appendable() method in head appender

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Ingest CT (float) samples in Agent DB

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* allow for ingestion of CT native histogram

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding some verification for ct ts

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Validating CT histogram before append and add newly created series to pending series

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* checking the wal for written samples

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Checking for samples in test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding case for validations

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* fixing comparison when dedupelabels is enabled

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* unite tests, use table testing

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Implement CT related methods in timestampTracker for write storage

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding error case to test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* removing unused fields

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Updating lastTs for series when adding CT to invalidate duplicates

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* making sure that updating the lastTS won't cause OOO later on in Commit();

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

---------

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-27 01:06:34 +01:00
Ayoub Mrini 93db81dd3d
Merge pull request #14983 from machine424/dopp
fix(storage/mergeQuerier): fix a data race
2024-10-25 18:34:51 +02:00
Łukasz Mierzwa b6e22cd346 Short-cut common memChunk operations
memChunk is a linked list, speed up some common operations when there's no need to iterate all elements on the list.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2024-10-25 12:19:20 +01:00
Ben Ye 99882eec3b log last series labelset when hitting OOO series labels during compaction
Signed-off-by: Ben Ye <benye@amazon.com>
2024-10-24 09:27:15 -07:00
Vanshika cccbe72514
TSDB: Fix some edge cases when OOO is enabled (#14710)
Fix some edge cases when OOO is enabled

Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-10-23 17:34:28 +02:00
machine424 cebcdce78a
fix(storage/mergeQuerier): copy the matchers slice before passing it to queriers as
some of them may alter it.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-22 14:08:47 +02:00
machine424 eb523a6b29
fix(storage/mergeQuerier): add a reproducer for data race that occurs when one of the queriers alters the passed matchers and propose a fix
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-22 14:08:46 +02:00
György Krajcsovits a4083f14e8 Fix populateWithDelChunkSeriesIterator corrupting chunk meta
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely new.
Otherwise the ongoing chunk's meta will be later than the previously
written samples in it.

Same bug as https://github.com/prometheus/prometheus/pull/14629

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-18 10:34:22 +02:00
György Krajcsovits e6a682f046 Reproduce populateWithDelChunkSeriesIterator corrupting chunk meta
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely
new.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-18 10:34:22 +02:00
machine424 ab2475c426
test(tsdb): add a reproducer for https://github.com/prometheus/prometheus/issues/14422
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-10-15 20:39:25 +02:00
Bryan Boreham 1e1f6ab9df
Merge pull request #15120 from bboreham/floor-ino-mint
[BUGFIX] TSDB: Don't read in-order chunks from before head MinTime
2024-10-15 10:27:38 +01:00
George Krajcsovits b8867f8ead
Merge pull request #15142 from krajorama/fix-appendhistogram-race
bugfix: data race in head.Appender.AppendHistogram and Commit
2024-10-14 08:13:39 +02:00
Oleg Zaytsev 50ef0dc954
Fix `MemPostings.Add` and `MemPostings.Get` data race (#15141)
* Tests for Mempostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race

We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.

* Modify BenchmarkHeadStripeSeriesCreate to have common labels

If there are no common labels on the series, we don't exercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.

---------

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-10-11 15:21:15 +02:00
György Krajcsovits bb70370d72 TSDB head: fix race between AppendHistogram and Commit
Move writing memSeries lastHistogramValue and lastFloatHistogramValue
after series creation under lock.

The resulting code isn't totally correct in the sense that we're setting
these values before Commit() , so they might be overwritten/rolled back
later.

Also Append of stale sample checks the values without lock, so there's
still a potential race.

The correct solution would be to set these only in Commit() which we
actually do, but then Commit() would also need to process samples in
order and not floats first, then histograms, then float histograms - which
leads to not knowing what stale marker to write for histograms.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-10 16:59:15 +02:00
György Krajcsovits 631fadc4ca Unit test for data race in head.Appender.AppendHistogram
Two Appenders race when creating a series with a native histogram
as the memSeries will be common and the lastHistogram field is written
without lock.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-10 14:10:07 +02:00
beorn7 12c39d5421 docs: Some nitpicking in chunks.md
- `float histogram` → `floathistogram`, as it is used in the code.
- Actual link encodings to the code (to find the actual numerical values).
- `<bytes>` → `<data>` for consistency.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-10-09 14:32:12 +02:00
beorn7 a4cb52ff15 docs: Update chunk layout for NHCB
Signed-off-by: beorn7 <beorn@grafana.com>
2024-10-09 14:19:20 +02:00
Björn Rabenstein 02d0de9987
Merge pull request #14997 from fionaliao/fl/update-format-docs
Update chunk format docs with native histograms and OOO
2024-10-09 13:29:01 +02:00
TJ Hoplock 6ebfbd2d54 chore!: adopt log/slog, remove go-kit/log
For: #14355

This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-07 15:58:50 -04:00
Bryan Boreham 91de19fbef [BUGFIX] TSDB: Don't read in-order chunks from before head MinTime
Because we are reimplementing the `IndexReader` to fetch in-order and
out-of-order chunks together, we must reproduce the behaviour of
`Head.indexRange()`, which floors the minimum time queried at `head.MinTime()`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-10-07 13:50:03 +01:00
Matthieu MOREL ab64966e9d
fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()" (#15094)
* fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()"

---------

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-10-06 16:35:29 +00:00
György Krajcsovits 44ebbb8458 Fix missing histogram copy in sampleRing
The specialized versions of sample add to the ring:
func addH(s hSample, buf []hSample, r *sampleRing) []hSample
func addFH(s fhSample, buf []fhSample, r *sampleRing) []fhSample
already correctly copy histogram samples from the reused hReader, fhReader
buffers, but the generic version does not. This means that the
data is overwritten on the next read if the sample ring has seen histogram
and float samples at the same time and switched to generic mode.

The `genericAdd` function (which was commented out anyway) is by now quite
different from the specialized functions so that this commit deletes
it.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-02 13:57:28 +02:00
Bryan Boreham 54de4fb780
Merge pull request #14975 from colega/process-mempostings-delete-with-gomaxprocs-workers
Process `MemPostings.Delete()` with `GOMAXPROCS` workers
2024-09-29 07:58:42 +01:00
Fiona Liao fd62dbc291 Update chunk format docs with native histograms and OOO
Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
2024-09-27 18:57:58 +01:00
Ayoub Mrini 105ab2e95a
fix(test): adjust defer invocations (#14996)
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-09-27 17:13:51 +01:00
Oleg Zaytsev ada8a6ef10
Add some more tests for MemPostings_Delete
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-27 10:14:39 +02:00
Arthur Silva Sens d5f65cfce0
Merge pull request #14694 from prometheus/ct-histogram
Histogram CT Zero ingestion
2024-09-26 12:48:46 -03:00
Arthur Silva Sens 95a53ef982
Join tests for appending float and histogram CTs
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:31 -03:00
Arthur Silva Sens 6bd9b1a7cc
Histogram CT Zero ingestion
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:22 -03:00
Oleg Zaytsev 4fd2556baa
Extract processWithBoundedParallelismAndConsistentWorkers
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-26 15:43:19 +02:00
Oleg Zaytsev ccd0308abc
Don't do anything if MemPostings are empty
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 15:00:10 +02:00
Oleg Zaytsev 9c417aa710
Fix deadlock with empty MemPostings
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 14:08:50 +02:00
Bryan Boreham 5d8f0ef0c2
Merge pull request #14721 from bboreham/exp-grow-postings
[PERF] TSDB: Grow postings by doubling
2024-09-25 10:47:55 +01:00
Oleg Zaytsev e196b977af
Process MemPostings.Delete() with GOMAXPROCS workers
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.

This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 10:38:47 +02:00
Bryan Boreham ca673eb749 Merge remote-tracking branch 'origin/release-2.55' into merge-2.55-into-main
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-22 17:49:34 +01:00
Bryan Boreham 31c5760551
Neater string vs byte-slice conversions (#14425)
unsafe.Slice and unsafe.StringData were added in Go 1.20

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-21 12:19:21 +02:00
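For readers unfamiliar with the conversions mentioned in 31c5760551 above, here is a minimal sketch of zero-copy string and byte-slice conversion using `unsafe.String`, `unsafe.SliceData`, `unsafe.Slice` and `unsafe.StringData`. The function names `bytesToString` and `stringToBytes` are chosen for illustration and are not taken from the commit.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// The caller must not modify b while the string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes reinterprets s as a byte slice without copying.
// The returned slice must never be written to, because strings are immutable.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	fmt.Println(bytesToString([]byte("series")))
	fmt.Println(len(stringToBytes("labels")))
}
```

Both helpers trade safety for speed: they avoid an allocation, but correctness then depends on the caller respecting the aliasing rules noted in the comments.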
Bryan Boreham d42232e178
Merge pull request #14932 from bboreham/chunk-xor-combine-writebits
[PERF] TSDB: Chunk encoding: shorten some write sequences
2024-09-20 17:53:54 +01:00
Bryan Boreham 6f0d6038b7 [BUGFIX] TSDB: Only query chunks up to truncation time (#14948)
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-20 17:44:04 +01:00
Bryan Boreham 9215252221
[BUGFIX] TSDB: Only query chunks up to truncation time (#14948)
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-20 18:40:17 +02:00
Ganesh Vernekar 5ccb069414 Backward compatibility with upcoming index v3
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-09-19 10:27:52 +01:00
George Krajcsovits 0d22a91267 Merge pull request #14874 from krajorama/fix-panic-in-ooo-query2
BUGFIX: TSDB: panic in chunk querier
2024-09-19 10:03:53 +01:00
Bryan Boreham e8c2d916ec lint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-18 15:23:46 +01:00
Bryan Boreham 648a668835 [PERF] Chunk encoding: combine timestamp writes
Instead of a 2-bit write followed by a 14-bit write, do two 8-bit
writes, which goes much faster since it avoids looping.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-18 13:19:21 +01:00
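The equivalence that 648a668835 above relies on can be checked with a small sketch. The `bstream` type below is a stripped-down stand-in written for this example, not the Prometheus implementation, and the assumption that the 2-bit prefix is `10` is ours. It shows that writing a 2-bit prefix followed by a 14-bit value produces the same bytes as two whole-byte writes, whether or not the stream is byte-aligned.

```go
package main

import "fmt"

// bstream is a simplified bit stream writer used only for this illustration.
type bstream struct {
	stream []byte
	count  uint8 // free bits remaining in the last byte
}

func (b *bstream) writeBit(bit bool) {
	if b.count == 0 {
		b.stream = append(b.stream, 0)
		b.count = 8
	}
	if bit {
		b.stream[len(b.stream)-1] |= 1 << (b.count - 1)
	}
	b.count--
}

// writeBits writes the lowest nbits of u, one bit per loop iteration.
func (b *bstream) writeBits(u uint64, nbits int) {
	for i := nbits - 1; i >= 0; i-- {
		b.writeBit(u>>uint(i)&1 == 1)
	}
}

// writeByte writes 8 bits at once, with no loop.
func (b *bstream) writeByte(v byte) {
	if b.count == 0 {
		b.stream = append(b.stream, v)
		return
	}
	// Fill the free bits of the last byte, then append the remainder.
	b.stream[len(b.stream)-1] |= v >> (8 - b.count)
	b.stream = append(b.stream, v<<b.count)
}

// writeSeparate emits the assumed prefix 10 and then 14 bits of dod.
func writeSeparate(b *bstream, dod int64) {
	b.writeBits(0b10, 2)
	b.writeBits(uint64(dod), 14)
}

// writeCombined emits the same 16 bits as two whole bytes.
func writeCombined(b *bstream, dod int64) {
	b.writeByte(0b10<<6 | uint8(dod>>8)&(1<<6-1))
	b.writeByte(uint8(dod))
}

func main() {
	var a, c bstream
	writeSeparate(&a, 300)
	writeCombined(&c, 300)
	fmt.Printf("%08b\n%08b\n", a.stream, c.stream)
}
```

The combined form is faster in this sketch for the reason the commit gives: `writeByte` does a fixed amount of work, while the bit-by-bit path loops once per bit.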
Bryan Boreham b9a9689aae [PERF] Chunk encoding: simplify writeByte
Rather than append a zero then set the value at that position, append the value.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-18 13:19:04 +01:00
Bryan Boreham b65f1b6560 TSDB: Improve xor-chunk benchmarks
Benchmarks must do the same work N times.
Run 3 cases, where the values are constant, vary a bit, and vary a lot.

Also aim for 120 samples same as TSDB default.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-18 13:14:49 +01:00
Bryan Boreham bb47f78929
Merge pull request #14505 from marioferh/improve_performance_regex
[CHANGE] regexp . to match \n and optimize performance
2024-09-18 09:54:16 +01:00
Antoine Pultier d90d0976b5
fix(bstream/writeByte): ensure it appends only one byte (#14854)
fix(bstream/writeByte): ensure it appends only one byte

Signed-off-by: Antoine Pultier <antoine.pultier@sintef.no>
2024-09-17 16:28:33 +02:00
machine424 d1b4312f0a fix(wlog/watcher_test.go): make TestRun_AvoidNotifyWhenBehind more resilient
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-09-17 13:11:04 +02:00
Mario Fernandez 5814920601
Fix: optimize .* regexp performance
Shortcut for `.*` matches newlines as well.
Add preamble change ^(?s:
Add test
dotAll flag for the regex
Add and fix regex tests

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
2024-09-17 12:18:31 +02:00
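The effect of the `^(?s:` preamble mentioned in 5814920601 above can be seen with Go's standard `regexp` package. The sketch below is an illustration only: the `anchor` and `fullMatch` helpers are hypothetical and do not reproduce how Prometheus builds its matchers.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchor wraps a pattern so it must match the whole input.
// With dotAll set, the (?s: flag group lets `.` match "\n" as well.
func anchor(pattern string, dotAll bool) string {
	if dotAll {
		return "^(?s:" + pattern + ")$"
	}
	return "^(?:" + pattern + ")$"
}

// fullMatch reports whether input matches the anchored pattern.
func fullMatch(pattern, input string, dotAll bool) bool {
	return regexp.MustCompile(anchor(pattern, dotAll)).MatchString(input)
}

func main() {
	input := "line one\nline two"
	fmt.Println(fullMatch(".*", input, false)) // `.` stops at the newline
	fmt.Println(fullMatch(".*", input, true))  // `.` also matches the newline
}
```

Once `.` matches newlines, `.*` matches every possible string, which is what makes a shortcut for that pattern valid.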
Bryan Boreham d5f4fabd12
Merge pull request #14911 from bboreham/clarify-postings-benchmark
TSDB: Simplify benchmark regexps
2024-09-17 11:52:13 +02:00
Carrie Edwards 14e3c05ce8
tsdb: Add support for ingestion of out-of-order native histogram samples (#14546)
Add support for ingesting OOO native histograms

* Add flag for enabling and disabling OOO native histogram ingestion
* Update OOO querying tests to include native histogram samples
* Add OOO head tests
* Add test for OOO native histogram counter reset headers

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored by: Fiona Liao <fiona.liao@grafana.com>
2024-09-17 11:19:06 +02:00
Harry John 919dc0cbc6
storage: Update LabelQuerier interface to return sorted label values (#14849)
* Change LabelQuerier.LabelValues() to return sorted values

---------

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-09-17 08:55:02 +02:00
Bryan Boreham a8133f3e87 TSDB: Simplify benchmark regexps
Several regexps were coded like `"^.*$"`, which is an unnatural
formulation nobody is likely to use. Inside `NewMatcher`, `^` and `$`
are added anyway, which makes the form in the benchmark redundant.

It even printed it out in the expected way.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-16 17:48:05 +01:00
George Krajcsovits 5aa3d8260a
TSDB: OOO native histograms: prep for multiple ooo head chunks (#14850)
* tsdb: mmapCurrentOOOHeadChunk prepare for multiple ooo chunks

Currently float samples can only create a single ooo head chunk, but
native histograms can result in multiple due to counter resets, etc.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* tsdb: getOOOSeriesChunks prepare for multiple ooo chunks

Currently float samples can only create a single ooo head chunk, but
native histograms can result in multiple due to counter resets, etc.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-11 23:55:39 +01:00
Nathan Baulch 50cd453c8f
chore: Fix typos (#14868)
* Fix typos

---------

Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>
2024-09-10 22:32:03 +02:00
Bryan Boreham 16e5e99546
Merge pull request #14767 from bboreham/fix-encoding-comment
[Comment] Correct the comment on Decbuf.UvarintBytes
2024-09-09 12:52:36 +01:00
György Krajcsovits d3f4e7c223 Remove unnecessary conversion
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-09 12:51:02 +02:00
György Krajcsovits 60ab1cc5a5 BUGFIX: TSDB: panic in chunk querier
Followup to #14831

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-09 12:43:02 +02:00
George Krajcsovits 536d9f9ce9
BUGFIX: TSDB: panic in query during truncation with OOO head (#14831)
Check if headQuerier is nil before trying to use it.

* TestQueryOOOHeadDuringTruncate: unit test to check query during truncate
Regression test for #14822

* Simulate race between query and Compact()

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-05 17:17:42 +01:00
Antoine Pultier f5971bf292
tsdb documentation: More details about chunks
Signed-off-by: Antoine Pultier <45740+fungiboletus@users.noreply.github.com>
2024-09-04 14:57:30 +02:00
Joshua Hesketh f2064c7987
NH: Do not re-use spans between histograms (#14771)
promql, tsdb (histograms): Do not re-use spans between histograms

When multiple points exist with the same native histogram schemas they
share their spans.
This causes a problem when a native histogram (NH) schema is modified (for example, during
a Sum) then the other NH's with the same spans are also modified. As such,
we should create a new Span for each NH. This will ensure NH's interfaces
are safe to use without considering the effect on other histograms.

At the moment this doesn't present itself as a problem because in all
aggregations and functions operating on native histograms they are copied
by the promql query engine first.

Signed-off-by: Joshua Hesketh <josh@nitrotech.org>

---------

Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
2024-09-04 12:07:16 +02:00
George Krajcsovits 282fb1632a
Merge pull request #14772 from krajorama/fix-mockseriesiterator
Fix: chunkenc.MockSeriesIterator
2024-09-03 16:55:26 +02:00
Arthur Silva Sens 442f24e099
chore: Simplify TestHeadAppender_AppendCTZeroSample (#14812)
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-02 21:30:37 +01:00
Arve Knudsen 2cfc7b244a
Merge pull request #14700 from shandongzhejiang/main
Comments: fix some function names
2024-09-02 18:59:28 +02:00
Oleg Zaytsev ce7d830f1f
Bring back BenchmarkLoadRealWLs (#14757)
This was part of #14525 which was reverted.
I still think that having this benchmark committed into the repo is
useful.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-02 17:20:10 +01:00
György Krajcsovits a693dd19f2 Fix: chunkenc.MockSeriesIterator
Starts its index from 0, but users call Next() before the first sample,
so it needs to start from -1.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-30 16:44:36 +02:00
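The off-by-one described in a693dd19f2 above is a common iterator pitfall. A minimal sketch, using a hypothetical `sliceIterator` rather than the real `chunkenc.MockSeriesIterator`, shows why the index has to start at -1 when callers invoke `Next()` before reading the first value.

```go
package main

import "fmt"

// sliceIterator is a hypothetical iterator over a slice of values.
type sliceIterator struct {
	vals []float64
	idx  int
}

// newSliceIterator positions the iterator before the first element,
// because callers are expected to call Next() before At().
func newSliceIterator(vals []float64) *sliceIterator {
	return &sliceIterator{vals: vals, idx: -1}
}

// Next advances to the next element and reports whether one exists.
func (it *sliceIterator) Next() bool {
	if it.idx+1 >= len(it.vals) {
		return false
	}
	it.idx++
	return true
}

// At returns the current element; it is only valid after Next() returned true.
func (it *sliceIterator) At() float64 {
	return it.vals[it.idx]
}

func main() {
	it := newSliceIterator([]float64{1.5, 2.5, 3.5})
	for it.Next() {
		fmt.Println(it.At())
	}
}
```

Had `idx` started at 0, the first `Next()` call would have moved straight to the second element and the first sample would never be returned.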
Bryan Boreham 0a4f130b39 [Comment] Correct the comment on Decbuf.UvarintBytes
The value is valid when returned, but can become invalid later.

Return to previous wording.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-30 09:40:18 +01:00
Callum Styan a77f5007f9
fix bug with metadata for rw2 (#14766)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2024-08-30 08:14:20 +01:00
Bryan Boreham 1f38ae7bca [TESTS] TSDB: fix up OOO tests for new Series behaviour
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-29 10:59:09 +01:00
Bryan Boreham cde42f30e9 TSDB: streamline reading of overlapping head chunks
`getOOOSeriesChunks` was already finding sets of overlapping chunks; we
store those in a `multiMeta` struct so that `ChunkOrIterable` can
reconstruct an `Iterable` easily and predictably.

We no longer need a `MergeOOO` flag to indicate that this Meta should
be merged with other ones; this is explicit in the `multiMeta` structure.

We also no longer need `chunkMetaAndChunkDiskMapperRef`.

Add `wrapOOOHeadChunk` to defeat `chunkenc.Pool` - chunks are reset
during compaction, but if we wrap them (like `safeHeadChunk` was doing)
then this is skipped.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-29 10:57:29 +01:00
Bryan Boreham 838e49e7b8 [REFACTOR] TSDB: move chunkFromSeries from headChunkReader to head
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-29 10:51:48 +01:00
Björn Rabenstein 1d6e0071b7
Merge pull request #14751 from riskrole/main
chore: fix some comments
2024-08-28 16:38:39 +02:00
riskrole 406bf775aa chore: fix some comments
Signed-off-by: riskrole <yuhang@before.tech>
2024-08-28 11:26:57 +08:00
Marco Pracucci ef649d5968
Revert " Store `mmMaxTime` in same field as `seriesShard`"
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-08-26 08:56:16 +02:00
Bryan Boreham 33adbe47b1 [PERF] TSDB: Grow postings by doubling
Go's built-in append() grows larger slices with factor 1.3, which means we do a lot more allocating and copying for larger postings.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-24 11:16:58 +01:00
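Growing by doubling, as described in 33adbe47b1 above, can be sketched as follows. The helper `appendDoubling` is a hypothetical name for this example and is not claimed to match the function added in the commit.

```go
package main

import "fmt"

// appendDoubling appends v to s. When s is full it allocates a new backing
// array with twice the capacity instead of relying on append's own growth policy.
func appendDoubling(s []uint64, v uint64) []uint64 {
	if len(s) == cap(s) {
		newCap := 2 * cap(s)
		if newCap == 0 {
			newCap = 1
		}
		grown := make([]uint64, len(s), newCap)
		copy(grown, s)
		s = grown
	}
	return append(s, v)
}

func main() {
	var postings []uint64
	reallocations := 0
	for i := uint64(0); i < 1_000_000; i++ {
		before := cap(postings)
		postings = appendDoubling(postings, i)
		if cap(postings) != before {
			reallocations++
		}
	}
	fmt.Println(len(postings), cap(postings), reallocations)
}
```

The trade-off is memory for time: doubling can leave up to half of the backing array unused, but a long list is reallocated and copied far fewer times.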
György Krajcsovits 183bbc39a2 Make requesting merge with OOO head explicit in chunk.Meta
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-23 15:50:53 +02:00
György Krajcsovits 41c076196e New cases in Test_ChunkQuerier_OOOQuery and Test_Querier_OOOQuery
Case 1: OOO in-memory head chunk overlaps with first mmaped in-order chunk.

Query: |----------------------------------------------------------------|
InO:    |------mmap---------------||---------mem----------------------|
OOO:     |-----mem-----------|

This triggers ChunkOrIterableWithCopy not including OOO head chunks bug.

Similar to #14693 however testing the end of the interval doesn't
trigger the problem because there the in-order head chunk will be
trimmed with a tombstone, causing the code to switch to ChunkOrIterable
which was fixed.
See a36d1a8a92/tsdb/querier.go (L646)
where len(p.bufIter.Intervals) will be non zero, because it includes the
tombstone to trim the result to the query max time.

Thus a new test is added to check the overlap at the beginning of the
interval that has a separate chunk, which does not need trimming.

Note: same test doesn't fail for sample querier in Test_Querier_OOOQuery
as that doesn't use copy, that is copyHeadChunk is false in the if
condition above.

Case 2:

OOO mmaped head chunk overlaps with first mmaped in-order chunk.

Query: |----------------------------------------------------------------|
InO:    |------mmap---------------||---------mem----------------------|
OOO:     |-----mmap-----------|                             |--mem--|

In this case the meta contains the reference of the in-order chunk and
no indication that a merge is needed with the OOO mmaped chunk.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-23 15:50:47 +02:00
Arve Knudsen b0aba26ed5 tsdb: Fix ValNone typo in comment
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-23 08:20:20 +02:00
beorn7 0f760f63dd lint: Revamp our linting rules, mostly around doc comments
Several things done here:

- Set `max-issues-per-linter` to 0 so that we actually see all linter
  warnings and not just 50 per linter. (As we also set
  `max-same-issues` to 0, I assume this was the intention from the
  beginning.)

- Stop using the golangci-lint default excludes (by setting
  `exclude-use-default: false`). Those are too generous and don't match
  our style conventions. (I have re-added some of the excludes
  explicitly in this commit. See below.)

- Re-add the `errcheck` exclusion we have used so far via the
  defaults.

- Exclude the signature requirement `govet` has for `Seek` methods
  because we use non-standard `Seek` methods a lot. (But we keep other
  requirements, while the default excludes completely disabled the
  check for common method signatures.)

- Exclude warnings about missing doc comments on exported symbols. (We
  used to be pretty adamant about doc comments, but stopped that at
  some point in the past. By now, we have about 500 missing doc
  comments. We may consider reintroducing this check, but that's
  outside of the scope of this commit. The default excludes of
  golangci-lint essentially ignore doc comments completely.)

- By no longer using the default excludes, we now get warnings back on
  malformed doc comments. That's the most impactful change in this
  commit. It does not enforce doc comments (again), but _if_ there is
  a doc comment, it has to have the recommended form. (Most of the
  changes in this commit are fixing this form.)

- Improve wording/spelling of some comments in .golangci.yml, and
  remove an outdated comment.

- Leave `package-comments` inactive, but add a TODO asking if we
  should change that.

- Add a new sub-linter `comment-spacings` (and fix corresponding
  comments), which avoids missing spaces after the leading `//`.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-08-22 17:36:11 +02:00
Bryan Boreham 9a74d53935
[BUGFIX] TSDB: Fix query overlapping in-order and ooo head (#14693)
* tsdb: Unit test query overlapping in order and ooo head

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* TSDB: Merge overlapping head chunk

The basic idea is that getOOOSeriesChunks can populate Meta.Chunk, but since
it only returns one Meta per overlapping time-slot, that pointer may end up in a
Meta with a head-chunk ID. So we need HeadAndOOOChunkReader.ChunkOrIterable()
to call mergedChunks in that case.

Previously, mergedChunks was checking that meta.Ref was a valid OOO chunk reference,
but it never actually uses that reference; it just finds all chunks overlapping in time.
So we can delete that code.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-21 14:24:20 +01:00
shandongzhejiang b2712ff284 chore: fix some function names
Signed-off-by: shandongzhejiang <shandongzhejiang@icloud.com>
2024-08-21 11:09:37 +08:00
Arve Knudsen 3a78e76282 Upgrade golangci-lint to v1.60.1
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-18 12:13:25 +02:00
Bryan Boreham 87dccb1d1b
Merge pull request #14649 from machine424/ftest
fix(tsdb/db_test.go): close the corrupted chunk after creating it to satisfy Windows FS
2024-08-16 11:57:54 +01:00
Arve Knudsen 66388f706a
Merge pull request #14042 from aknuds1/arve/wlog-histograms
tsdb/wlog: Only treat unknown record types as failure
2024-08-16 12:00:49 +02:00
Björn Rabenstein 1daf7cdd62
Merge pull request #14626 from cuiweiyuan/main
chore: fix some function names
2024-08-15 11:46:21 +02:00
cuiweiyuan 1800af54f0 chore: fix some function names
Signed-off-by: cuiweiyuan <cuiweiyuan@aliyun.com>
2024-08-15 13:57:21 +08:00
Arve Knudsen b5d13a1ab5 Merge remote-tracking branch 'prometheus/main' into arve/wlog-histograms
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-14 19:04:53 +01:00
Bryan Boreham 512c67ec26 TSDB: Never go over maximum number of OOO chunks
In `mmapCurrentOOOHeadChunk`, check if the number is at the maximum and
drop the data with an error log. This is not expected to happen as the
maximum is over 8 million; that's 8 years of 1 sample every second.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:59 +01:00
Bryan Boreham 9135da1e4f TSDB: Review feedback
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Re-enable check in `createHeadWithOOOSamples` which wasn't really broken.
* Move code making `Block` into a `Queryable` into test file.
* Make `getSeriesChunks` return a slice (renamed `appendSeriesChunks`).
* Rename `oooMergedChunks` to `mergedChunks`.
* Improve comment on `ChunkOrIterableWithCopy`.
* Name return values from unpackHeadChunkRef.

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:44 +01:00
Bryan Boreham 7ffd3ca280 TSDB: Cosmetic: move HeadAndOOO implementations where old code was
This makes the diffs easier to follow.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:13 +01:00
Bryan Boreham e95607b276 TSDB: Lock round access to labels, where necessary
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:13 +01:00
Bryan Boreham 26b3de0438 TSDB: Remove OOOHeadIndexReader
Use headIndexReader instead.

OOOCompactionHeadIndexReader needs to be expanded slightly, because it previously delegated to OOOHeadIndexReader.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:13 +01:00
Bryan Boreham a299c7b6d6 TSDB: Remove OOOHeadChunkReader
Use HeadAndOOOChunkReader instead.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:13 +01:00
Bryan Boreham e7e50a3afd TSDB: Remove code for querying OOO-head only
Just query via `HeadAndOOOQuerier`, which will skip series where no
in-order chunks are in range.

Now we don't need `OOORangeHead`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:13 +01:00
Bryan Boreham 0a2ff76881 TSDB tests: Fix up BenchmarkQueries
Was not working even on main.  Some cases still error.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 13:41:04 +01:00
Bryan Boreham f261597944 TSDB: Fix up LabelValues to work for OOO-only head
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
Bryan Boreham 6529d6336c TSDB: NewHeadAndOOOChunkReader takes headChunkReader
So we can pass nil and have it read just OOO chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
Bryan Boreham e04d137649 [PERF] TSDB: Query head and ooo-head together
Add `HeadAndOOOQuerier` which iterates just once over series, then
where necessary merges chunks from in-order and out-of-order lists.

Add a ChunkQuerier for in-order and ooo together

Add copy-last-chunk behaviour to HeadAndOOOChunkReader

Out-of-order chunk IDs are distinguished from in-order by setting bit 23.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
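The last line of the message for e04d137649 above says out-of-order chunk IDs are marked by setting bit 23. A minimal sketch of that kind of tagging follows. The constant and function names are hypothetical, and counting bits from zero is an assumption made for this example.

```go
package main

import "fmt"

// oooBit is an assumed flag position (bit 23, counting from 0) that marks
// a chunk ID as belonging to the out-of-order head.
const oooBit uint32 = 1 << 23

// markOOO returns id with the out-of-order flag set.
func markOOO(id uint32) uint32 { return id | oooBit }

// isOOO reports whether id carries the out-of-order flag.
func isOOO(id uint32) bool { return id&oooBit != 0 }

// clearOOO returns id with the out-of-order flag removed.
func clearOOO(id uint32) uint32 { return id &^ oooBit }

func main() {
	id := markOOO(42)
	fmt.Println(isOOO(42), isOOO(id), clearOOO(id))
}
```

Reserving one bit as a flag halves the range of usable IDs, which fits the limit of roughly 8 million OOO chunks mentioned in 512c67ec26.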
Bryan Boreham da31da3ea6 Refactor: extract selectSeriesSet and selectChunkSeriesSet
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
Bryan Boreham 7e24844d08 Refactor: extract headChunkReader.chunkFromSeries()
For when you have a series locked already.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
Bryan Boreham a32aca0cd7 Refactoring: extract getOOOSeriesChunks
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
Bryan Boreham c75c8f8329 Refactoring: extract getSeriesChunks
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
Bryan Boreham 0c852680bf [Benchmark] TSDB: Add BenchmarkQuerierSelectWithOutOfOrder
Refactor existing BenchmarkQuerierSelect to provide the set-up.

Note that Head queries now run faster because they use a RangeHead.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
György Krajcsovits 41656162fc tsdb: prepare inserting native histograms into OOO head
Rename a variable.
Add parameters to memSeries.insert function.

No effect on how float samples are handled.

Related to #14546

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-14 11:13:47 +02:00
Bryan Boreham aa4b056ad0
Merge pull request #13200 from bboreham/wlog-defer
tsdb/wlog: close segment files sooner
2024-08-13 14:11:38 +01:00
machine424 82f38d3e9a
fix(tsdb/db_test.go): close the corrupted chunk after creating it to satisfy Windows FS
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-08-09 14:53:57 +02:00
George Krajcsovits cf62fb5c44
Merge pull request #14629 from krajorama/fix-to-encoded-chunks
Fix ToEncodedChunks minT for recoded chunks
2024-08-08 20:00:31 +02:00
György Krajcsovits 1ea3781699 Fix ToEncodedChunks minT for recoded chunks
Discovered while working on #14546 OOO native histograms.
Not triggered on main before #14546 as the code path is unused.

There was a bug where the min time of a chunk was adjusted even
if it was only recoded and not completely new.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-08 15:22:46 +02:00
Ben Ye b7a58dcf3d
Add hidden flag to disable overlapping compaction (#14581)
TSDB: add hidden flag to disable overlapping compaction

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-08-08 12:09:39 +02:00
George Krajcsovits 3a673cd0bc
Merge pull request #14598 from krajorama/fix-compaction-panic
Fix: panic: runtime error: index out of range [4] with length 4
2024-08-07 17:14:14 +02:00
machine424 92873d3009 feat: allow to delay head compaction start time helping Prometheus instances to
avoid simultaneous compactions and reduce stress on shared resources.

This is enabled via `--enable-feature=delayed-compaction`.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-08-07 17:10:27 +02:00
Oleg Zaytsev 0833d2a230
Fix appendable: check whether last val was a histogram (#14613)
* Fix appendable: check whether last val was a histogram

When appending a float, we were checking whether lastValue was equal to
current value, but we didn't check whether last value was a float value.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-08-07 15:02:59 +02:00
György Krajcsovits 98ecdf3589 Fix corrupting spans via iterator sharing
Iterator may share spans without copy, so we always have to make a copy
before modification - copy-on-write.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-06 16:51:20 +02:00
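The copy-on-write rule stated in 98ecdf3589 above can be illustrated with a short sketch. The `Span` type here is a simplified stand-in and `extendFirstSpan` is a hypothetical function; neither is taken from the Prometheus code.

```go
package main

import "fmt"

// Span is a simplified stand-in for a histogram bucket span.
type Span struct {
	Offset int32
	Length uint32
}

// extendFirstSpan returns a copy of shared with the first span lengthened.
// The input slice may be referenced elsewhere (for example by an iterator),
// so it is copied before any modification.
func extendFirstSpan(shared []Span, extra uint32) []Span {
	if len(shared) == 0 {
		return nil
	}
	own := make([]Span, len(shared))
	copy(own, shared)
	own[0].Length += extra
	return own
}

func main() {
	shared := []Span{{Offset: 0, Length: 2}, {Offset: 3, Length: 1}}
	modified := extendFirstSpan(shared, 4)
	fmt.Println(shared[0].Length, modified[0].Length)
}
```

Modifying `shared[0]` in place would silently change every histogram that still points at the same backing array, which is the kind of corruption the commit describes.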
György Krajcsovits d2f6fa7289 Fix lint error
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-06 13:24:46 +02:00
György Krajcsovits 1b6d1366d8 Fix re-code histogram and chunk re-code conflict
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-06 13:09:17 +02:00
György Krajcsovits aff089a014 Reproduce recoding bug with new and missing buckets
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-06 10:51:44 +02:00
Bryan Boreham 80adc5baf4 Merge remote-tracking branch 'origin/main' into merge-2.54-to-main
2024-08-06 09:19:55 +01:00
machine424 9e43ad2e37 chore(remote_write): clean up as watcher.go is part of wlog now
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-08-05 13:40:23 +02:00
Bryan Boreham 015638c4b6 [BUGFIX] TSDB: Exclude OOO chunks mapped after compaction starts
Otherwise the writer can end up with invalid chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-05 10:35:34 +01:00
Bryan Boreham bded853035 [Test] TSDB: TestOOOCompaction with samples added after compaction starts
Test fails due to bug.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-05 10:35:34 +01:00
George Krajcsovits 00ab05c3b9
Native histograms: fix spurious counter reset when merging recoded chunk to normal chunk (#14513)
* chunkenc: allow missing empty buckets on histogram append

Allow appending to chunks when the histogram to be added is missing
some buckets, but the missing buckets are empty in the chunk.
For example bucket at index 5 is present in the chunk, but its value
is 0 and the new histogram doesn't have a bucket at index 5.

This fixes an issue of merging chunks where one chunk was recoded to
retroactively have some empty buckets in all the histograms and we are
merging in a histogram that doesn't have the empty bucket (because it
was not recoded yet).

The operation alters the histogram that is being added, however this has
already been the case when appending gauge histograms. Thus the test
TestHistogramSeriesToChunks in storage package is changed to explicitly
test what happened to the appended histogram - Compact(0) call is removed.

The new expandIntSpansAndBuckets and expandFloatSpansAndBuckets functions
are a merge of expandSpansForward and counterResetInAnyBucket and
counterResetInAnyFloatBucket.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-01 09:22:32 +02:00
Bartlomiej Plotka 6816149852
Merge pull request #14525 from colega/merge-mmmaxtime-into-shardhash
Store `mmMaxTime` in same field as `seriesShard`
2024-07-31 08:39:38 +02:00
Max Amin 84b819a69f
feat: add Google cloud roundtripper for remote write (#14346)
* feat: Google Auth for remote write

Signed-off-by: Max Amin <maxamin@google.com>

---------

Signed-off-by: Max Amin <maxamin@google.com>
2024-07-30 16:25:19 +01:00
Oleg Zaytsev 0300ad58a9
Revert the option regardless of error
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 11:31:31 +02:00
Oleg Zaytsev d8e1b6bdfd
Store mmMaxTime in same field as seriesShard
We don't use seriesShard during DB initialization, so we can use the
same 8 bytes to store mmMaxTime, and save those during the rest of the
lifetime of the database.

This doesn't affect CPU performance.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 10:20:29 +02:00
Oleg Zaytsev b7f2f3c3ac
Add BenchmarkLoadRealWLs
This benchmark runs on real WLs rather than fake generated ones.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 10:19:56 +02:00