This partially reverts ae3d392aa9.
ae3d392aa9 added a call to db.mtx.Lock() that lasts for the entire duration of db.reloadBlocks();
previously db.mtx was locked only during the critical part of db.reloadBlocks().
The motivation was to protect against races:
9e0351e161 (r555699794)
The 'reloads' being mentioned are (I think) reloadBlocks() calls, rather than db.reload() or other methods.
TestTombstoneCleanRetentionLimitsRace was added to catch this, but I was never able to get any error out of it, even after disabling all calls to db.mtx in reloadBlocks() and CleanTombstones().
To make things more complicated, CleanTombstones() itself calls reloadBlocks(), so it seems that the real issue is that we might have concurrent calls to reloadBlocks().
The problem with this change is that db.reloadBlocks() can take a very long time, because it might need to load very large blocks from disk, which is slow.
While db.mtx is held a large part of the db is locked, including queries, since the db.mtx read lock is needed for the db.Querier() call.
One way this manifests itself is a gap in all metrics and blocked queries just after a large block compaction happens.
When compaction merges multiple day-or-more blocks into a week-or-more block it creates a single very big block.
After that block is written it needs to be loaded, and that seems to take many seconds (30-45), during which mtx is held and everything is blocked.
It turns out that there is another, more fine-grained lock aimed at this specific use case:
// cmtx ensures that compactions and deletions don't run simultaneously.
cmtx sync.Mutex
All calls to reloadBlocks() are wrapped inside the cmtx lock. The only exception is db.reload(), which this change fixes.
We can't add cmtx lock inside reloadBlocks() itself because it's called by a number of functions, some of which are already holding cmtx.
Looking at the code, I think it is sufficient to hold cmtx and skip the reloadBlocks()-wide mtx lock.
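A rough sketch of the intended locking split (simplified names, no error handling; not the actual Prometheus code):

package sketch

import "sync"

type Block struct{}

type DB struct {
	mtx    sync.RWMutex // guards the block list that queriers read
	cmtx   sync.Mutex   // ensures that compactions and deletions don't run simultaneously
	blocks []*Block
}

func (db *DB) reload() error {
	// Serialize with compactions/deletions instead of holding db.mtx for the
	// whole (potentially slow) block reload.
	db.cmtx.Lock()
	defer db.cmtx.Unlock()
	return db.reloadBlocks()
}

func (db *DB) reloadBlocks() error {
	newBlocks := openBlocksFromDisk() // slow; runs without db.mtx

	// Only swapping the in-memory block list needs db.mtx.
	db.mtx.Lock()
	db.blocks = newBlocks
	db.mtx.Unlock()
	return nil
}

func openBlocksFromDisk() []*Block { return nil }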
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Fix issues raised by staticcheck
We are not enabling staticcheck explicitly, though, because it has too many false positives.
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
When creating dummy data for benchmarks, call `Commit()` periodically to
avoid growing the appender to enormous size.
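For illustration only, a hypothetical helper showing the pattern (populateSeries and its parameters are made up; the Appender API calls are the real ones):

package sketch

import (
	"context"
	"testing"

	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/tsdb"
)

// populateSeries appends numSamples samples and commits every 10000 appends so
// the pending appender state stays small.
func populateSeries(b *testing.B, head *tsdb.Head, lbls labels.Labels, numSamples int) {
	app := head.Appender(context.Background())
	for i := 0; i < numSamples; i++ {
		if _, err := app.Append(0, lbls, int64(i), float64(i)); err != nil {
			b.Fatal(err)
		}
		if (i+1)%10000 == 0 {
			if err := app.Commit(); err != nil {
				b.Fatal(err)
			}
			app = head.Appender(context.Background())
		}
	}
	if err := app.Commit(); err != nil {
		b.Fatal(err)
	}
}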
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Exported the CheckpointPrefix constant to be used in other packages.
Updated references to the constant in db.go and checkpoint.go files.
This change improves code readability and maintainability.
Signed-off-by: johncming <johncming@yahoo.com>
Co-authored-by: johncming <conjohn668@gmail.com>
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.
Remove the `Get` member which was not used for anything else, and fix up
tests.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Now we can call it with more specific types which is more efficient than
making everything go through the `Postings` interface.
Benchmark the concrete type.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We need to create more postings entries so the merger has some work to do.
Not material for the regexp ones as they match so few series.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels
The labels for each series in the Head take up some space in the
Postings index, but far more space in the `memSeries` structure.
Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.
Provide three implementations of this function for the three Labels
versions.
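As a hedged sketch of the shape of the change (the type and function names here are invented, not the actual new API):

package sketch

import "github.com/prometheus/prometheus/model/labels"

// labelSizeFunc reports how many bytes one label costs in the caller's series
// structure; the index no longer guesses this itself.
type labelSizeFunc func(name, value string) uint64

// labelsSizeByName sums the per-label cost for every series, grouped by label
// name, using whatever accounting the caller supplied.
func labelsSizeByName(series []labels.Labels, sizeOf labelSizeFunc) map[string]uint64 {
	sizes := map[string]uint64{}
	for _, ls := range series {
		ls.Range(func(l labels.Label) {
			sizes[l.Name] += sizeOf(l.Name, l.Value)
		})
	}
	return sizes
}

// One plausible implementation for the plain-strings Labels version: the bytes
// of the strings plus a small fixed per-label overhead (numbers illustrative).
func stringLabelsSize(name, value string) uint64 {
	return uint64(len(name)+len(value)) + 16
}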
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Remove the 2 minute timeout, as the default is 2 hours and wouldn't
interfere with the test. Otherwise the extra samples combined with
race detection can push the test over 2 minutes and make it fail.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
The segment size was too low for the additional NHCB data, thus it created
more segments than expected. This meant that fewer records were in the lower
numbered segments, which meant more data was kept.
FAIL: TestCheckpoint (4.05s)
FAIL: TestCheckpoint/compress=none (0.22s)
checkpoint_test.go:361:
Error Trace: /home/krajo/go/github.com/prometheus/prometheus/tsdb/wlog/checkpoint_test.go:361
Error: "0.8586956521739131" is not less than "0.8"
Test: TestCheckpoint/compress=none
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Antoine Pultier <45740+fungiboletus@users.noreply.github.com>
While investigating lock contention on `MemPostings`, we saw that lots
of locking is happening in `LabelValues` and
`PostingsForLabelsMatching`, both copying the label values slices while
holding the mutex.
This adds an extra map that holds an append-only label values slice for
each one of the label names. Since the slice is append-only, it can be
copied without holding the mutex.
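Roughly, with simplified names (the real type is index.MemPostings and carries more state):

package sketch

import "sync"

type memPostings struct {
	mtx sync.RWMutex
	// m maps label name -> label value -> sorted series references.
	m map[string]map[string][]uint64
	// lvs maps label name -> label values; each slice is append-only, so a
	// reader can take the slice header under the lock and copy it afterwards.
	lvs map[string][]string
}

func (p *memPostings) labelValues(name string) []string {
	p.mtx.RLock()
	vals := p.lvs[name] // cheap: just the slice header
	p.mtx.RUnlock()

	// The backing array is never mutated in place, only appended to, so the
	// copy can happen without holding the mutex.
	out := make([]string, len(vals))
	copy(out, vals)
	return out
}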
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Since dot matches newline now, `l=~".+"` means "any non-empty label
value", and #14144 added a specific method in the index for that, so we
don't need to run the matcher on each one of the label values.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Same as #15427 but for the new method added in #14144
Instead of allocating each ListPostings one by one, allocate them all in
one go.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Instead of allocating ListPostings pointers one by one, allocate a slice
and take pointers from that. It's faster, and also generates less
garbage (NewListPostings is one of the top offenders in number of
allocations).
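The pattern looks roughly like this (local stand-in type; the real code works on tsdb/index's ListPostings):

package sketch

type seriesRef uint64

type listPostings struct {
	list []seriesRef
	cur  seriesRef
}

// postingsForValues allocates every listPostings in one backing slice and
// hands out pointers into it, instead of one heap allocation per value.
func postingsForValues(m map[string][]seriesRef, values []string) []*listPostings {
	lps := make([]listPostings, len(values)) // single allocation
	out := make([]*listPostings, len(values))
	for i, v := range values {
		lps[i] = listPostings{list: m[v]}
		out[i] = &lps[i] // pointer into the shared backing array
	}
	return out
}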
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching.
Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.
The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage.
Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
It was crashing due to uninitialized metrics, and not terminating due to
incorrectly reading segment names.
We need to export `SetMetrics` to avoid the first problem.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Always return unknown hint for first sample in non-gauge histogram chunk
---------
Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Move a couple of variables inside the scope of a goroutine, to avoid
data races.
Use `zeropool` to reduce garbage and avoid some lint warnings.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This introduces back some unlocking that was removed in #13286 but in a
more balanced way, as suggested by @pracucci.
For TSDBs with a lot of churn, Delete() can take a couple of seconds,
and while it's holding the mutex, reads and writes are blocked waiting
for that mutex, increasing the number of connections handled and memory
usage.
This implementation pauses every 4K labels processed (note that also
compared to #13286 we're not processing all the label-values anymore,
but only the affected ones, because of #14307), makes sure that it's
possible to get the read lock, and waits for a few milliseconds more.
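A minimal sketch of the pausing pattern (constants and names simplified):

package sketch

import (
	"sync"
	"time"
)

// deleteAffected walks only the affected label names, and every 4K names it
// drops the write lock, lets blocked readers through, and backs off briefly
// before re-acquiring it.
func deleteAffected(mtx *sync.RWMutex, affected []string, process func(name string)) {
	mtx.Lock()
	for i, name := range affected {
		process(name)
		if (i+1)%4096 == 0 && i+1 < len(affected) {
			mtx.Unlock()
			// Taking and releasing a read lock proves readers could get in.
			mtx.RLock()
			mtx.RUnlock()
			time.Sleep(time.Millisecond)
			mtx.Lock()
		}
	}
	mtx.Unlock()
}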
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
* Add hidden flag for the delayed compaction random time window
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
* Update cmd/prometheus/main.go
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>
* Update cmd/prometheus/main.go
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>
* Update tsdb/db.go
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>
* Fix flag name according to review - add test for delay
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
* Fix after main rebase
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
* Implement review comments
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
* Update generatedelaytest to try with limit values
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
---------
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
This reverts commit 50ef0dc954.
Memory allocation goes so high in Prombench that the system is unusable.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* [REFACTOR] simplify appender commit
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
* Remove unused option from HeadOptions
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* Improve docs for appendable() method in head appender
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* Ingest CT (float) samples in Agent DB
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* allow for ingestion of CT native histogram
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* adding some verification for ct ts
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* Validating CT histogram before append and add newly created series to pending series
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* checking the wal for written samples
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* Checking for samples in test
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* adding case for validations
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* fixing comparison when dedupelabels is enabled
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* unite tests, use table testing
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* Implement CT related methods in timestampTracker for write storage
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* adding error case to test
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* removing unused fields
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* Updating lastTs for series when adding CT to invalidate duplicates
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
* making sure that updating the lastTS wont cause OOO later on in Commit();
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
---------
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
memChunk is a linked list; speed up some common operations when there's no need to iterate all elements of the list.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Fix some edge cases when OOO is enabled
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only be updated when the chunk is completely new.
Otherwise the ongoing chunk's meta min time will be later than the previously
written samples in it.
Same bug as https://github.com/prometheus/prometheus/pull/14629
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely
new.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Tests for MemPostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race
We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.
* Modify BenchmarkHeadStripeSeriesCreate to have common labels
If there are no common labels on the series, we don't exercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.
---------
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Move writing memSeries lastHistogramValue and lastFloatHistogramValue
after series creation under lock.
The resulting code isn't totally correct in the sense that we're setting
these values before Commit(), so they might be overwritten/rolled back
later.
Also, Append of a stale sample checks the values without the lock, so there's
still a potential race.
The correct solution would be to set these only in Commit() which we
actually do, but then Commit() would also need to process samples in
order and not floats first, then histograms, then float histograms - which
leads to not knowing what stale marker to write for histograms.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Two Appenders race when creating a series with a native histogram
as the memSeries will be common and the lastHistogram field is written
without lock.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
- `float histogram` → `floathistogram`, as it is used in the code.
- Actually link encodings to the code (to find the actual numerical values).
- `<bytes>` → `<data>` for consistency.
Signed-off-by: beorn7 <beorn@grafana.com>
For: #14355
This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Because we are reimplementing the `IndexReader` to fetch in-order and
out-of-order chunks together, we must reproduce the behaviour of
`Head.indexRange()`, which floors the minimum time queried at `head.MinTime()`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The specialized versions of adding a sample to the ring:
func addH(s hSample, buf []hSample, r *sampleRing) []hSample
func addFH(s fhSample, buf []fhSample, r *sampleRing) []fhSample
already correctly copy histogram samples from the reused hReader, fhReader
buffers, but the generic version does not. This means that the
data is overwritten on the next read if the sample ring has seen histogram
and float samples at the same time and switched to generic mode.
The `genericAdd` function (which was commented out anyway) is by now quite
different from the specialized functions, so this commit deletes it.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.
This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
If the query overlaps the range currently undergoing compaction, we
should only fetch chunks up to that time. Need to store that min time
in `HeadAndOOOIndexReader`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Instead of a 2-bit write followed by a 14-bit write, do two 8-bit
writes, which goes much faster since it avoids looping.
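Schematically (hypothetical bit-writer interface, not the real bstream API):

package sketch

// bitWriter is a stand-in for the chunk encoder's bit stream.
type bitWriter interface {
	writeBits(v uint64, nbits int) // loops over the bits; slow for odd widths
	writeByte(b byte)              // handles an unaligned byte in one step
}

// writePrefixed14 emits a 2-bit control prefix followed by a 14-bit value.
// Packing them into 16 bits and issuing two byte writes avoids the looping
// that a 14-bit write would need.
func writePrefixed14(w bitWriter, prefix uint8, v uint16) {
	packed := uint16(prefix&0b11)<<14 | (v & 0x3FFF)
	w.writeByte(byte(packed >> 8))
	w.writeByte(byte(packed))
}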
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Benchmarks must do the same work N times.
Run 3 cases, where the values are constant, vary a bit, and vary a lot.
Also aim for 120 samples, the same as the TSDB default.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Shortcut for `.*` matches newlines as well.
Add preamble change `^(?s:`.
Add test.
Set the dotAll flag for the regex.
Add and fix regex tests.
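For reference, the anchoring with the dotAll flag looks like this (illustrative snippet, not the FastRegexMatcher internals):

package sketch

import "regexp"

// compileAnchored anchors the expression and sets the s flag so that `.`
// matches newlines; the `.*` / `.+` shortcuts must then treat values
// containing newlines as matches too.
func compileAnchored(expr string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?s:" + expr + ")$")
}

// Shortcut behaviour consistent with the dotAll semantics above:
func matchesAnyValue(v string) bool      { return true }       // `.*`
func matchesNonEmptyValue(v string) bool { return len(v) > 0 } // `.+`, newlines included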
Signed-off-by: Mario Fernandez <mariofer@redhat.com>
Several regexps were coded like `"^.*$"`, which is an unnatural
formulation nobody is likely to use. Inside `NewMatcher`, `^` and `$`
are added anyway, which makes the form in the benchmark redundant.
It even printed it out in the expected way.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* tsdb: mmapCurrentOOOHeadChunk prepare for multiple ooo chunks
Currently float samples can only create a single ooo head chunk, but
native histograms can result in multiple due to counter resets, etc.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* tsdb: getOOOSeriesChunks prepare for multiple ooo chunks
Currently float samples can only create a single ooo head chunk, but
native histograms can result in multiple due to counter resets, etc.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
---------
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Check if headQuerier is nil before trying to use it.
* TestQueryOOOHeadDuringTruncate: unit test to check query during truncate
Regression test for #14822
* Simulate race between query and Compact()
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
promql, tsdb (histograms): Do not re-use spans between histograms
When multiple points exist with the same native histogram schemas, they
share their spans.
This causes a problem when a native histogram (NH) schema is modified (for example, during
a Sum): the other NHs with the same spans are also modified. As such,
we should create a new Span for each NH. This will ensure an NH's interfaces
are safe to use without considering the effect on other histograms.
At the moment this doesn't present itself as a problem because in all
aggregations and functions operating on native histograms they are copied
by the promql query engine first.
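The defensive copy amounts to something like this (helper name is made up):

package sketch

import "github.com/prometheus/prometheus/model/histogram"

// copySpans gives the histogram its own span slices before any mutation, so
// other histograms sharing the same spans keep their original values.
func copySpans(h *histogram.FloatHistogram) {
	h.PositiveSpans = append([]histogram.Span(nil), h.PositiveSpans...)
	h.NegativeSpans = append([]histogram.Span(nil), h.NegativeSpans...)
}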
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
---------
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
This was part of #14525 which was reverted.
I still think that having this benchmark committed in to the repo is
useful.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Starts its index from 0, but users call Next() before the first sample,
so it needs to start from -1.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
`getOOOSeriesChunks` was already finding sets of overlapping chunks; we
store those in a `multiMeta` struct so that `ChunkOrIterable` can
reconstruct an `Iterable` easily and predictably.
We no longer need a `MergeOOO` flag to indicate that this Meta should
be merged with other ones; this is explicit in the `multiMeta` structure.
We also no longer need `chunkMetaAndChunkDiskMapperRef`.
Add `wrapOOOHeadChunk` to defeat `chunkenc.Pool` - chunks are reset
during compaction, but if we wrap them (like `safeHeadChunk` was doing)
then this is skipped.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Go's built-in append() grows larger slices with factor 1.3, which means we do a lot more allocating and copying for larger postings.
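One way to sidestep that regrowth cost when the final length is known up front (illustrative only, not necessarily the exact change made here):

package sketch

// appendKnownSize pre-sizes the destination so append() never has to regrow
// and copy the slice while merging postings lists.
func appendKnownSize(lists [][]uint64) []uint64 {
	total := 0
	for _, l := range lists {
		total += len(l)
	}
	out := make([]uint64, 0, total) // one allocation, no regrow/copy cycles
	for _, l := range lists {
		out = append(out, l...)
	}
	return out
}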
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Case 1: OOO in-memory head chunk overlaps with first mmaped in-order chunk.
Query: |----------------------------------------------------------------|
InO: |------mmap---------------||---------mem----------------------|
OOO: |-----mem-----------|
This triggers ChunkOrIterableWithCopy not including OOO head chunks bug.
Similar to #14693, however testing the end of the interval doesn't
trigger the problem, because there the in-order head chunk will be
trimmed with a tombstone, causing the code to switch to ChunkOrIterable,
which was fixed.
See a36d1a8a92/tsdb/querier.go (L646)
where len(p.bufIter.Intervals) will be non zero, because it includes the
tombstone to trim the result to the query max time.
Thus a new test is added to check the overlap at the beginning of the
interval that has a separate chunk, which does not need trimming.
Note: the same test doesn't fail for the sample querier in Test_Querier_OOOQuery,
as that doesn't use copy, that is, copyHeadChunk is false in the if
condition above.
Case 2:
OOO mmaped head chunk overlaps with first mmaped in-order chunk.
Query: |----------------------------------------------------------------|
InO: |------mmap---------------||---------mem----------------------|
OOO: |-----mmap-----------| |--mem--|
In this case the meta contains the reference of the in-order chunk and
no indication that a merge is needed with the OOO mmaped chunk.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Several things done here:
- Set `max-issues-per-linter` to 0 so that we actually see all linter
warnings and not just 50 per linter. (As we also set
`max-same-issues` to 0, I assume this was the intention from the
beginning.)
- Stop using the golangci-lint default excludes (by setting
`exclude-use-default: false`). Those are too generous and don't match
our style conventions. (I have re-added some of the excludes
explicitly in this commit. See below.)
- Re-add the `errcheck` exclusion we have used so far via the
defaults.
- Exclude the signature requirement `govet` has for `Seek` methods
because we use non-standard `Seek` methods a lot. (But we keep other
requirements, while the default excludes completely disabled the
check for common method signatures.)
- Exclude warnings about missing doc comments on exported symbols. (We
used to be pretty adamant about doc comments, but stopped that at
some point in the past. By now, we have about 500 missing doc
comments. We may consider reintroducing this check, but that's
outside of the scope of this commit. The default excludes of
golangci-lint essentially ignore doc comments completely.)
- By no longer using the default excludes, we now get warnings back on
malformed doc comments. That's the most impactful change in this
commit. It does not enforce doc comments (again), but _if_ there is
a doc comment, it has to have the recommended form. (Most of the
changes in this commit are fixing this form.)
- Improve wording/spelling of some comments in .golangci.yml, and
remove an outdated comment.
- Leave `package-comments` inactive, but add a TODO asking if we
should change that.
- Add a new sub-linter `comment-spacings` (and fix corresponding
comments), which flags missing spaces after the leading `//`.
Signed-off-by: beorn7 <beorn@grafana.com>
* tsdb: Unit test query overlapping in order and ooo head
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* TSDB: Merge overlapping head chunk
The basic idea is that getOOOSeriesChunks can populate Meta.Chunk, but since
it only returns one Meta per overlapping time-slot, that pointer may end up in a
Meta with a head-chunk ID. So we need HeadAndOOOChunkReader.ChunkOrIterable()
to call mergedChunks in that case.
Previously, mergedChunks was checking that meta.Ref was a valid OOO chunk reference,
but it never actually uses that reference; it just finds all chunks overlapping in time.
So we can delete that code.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
In `mmapCurrentOOOHeadChunk`, check if the number of OOO chunks is at the maximum and
drop the data with an error log. This is not expected to happen as the
maximum is over 8 million; that's 8 years of 1 sample every second.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Re-enable check in `createHeadWithOOOSamples` which wasn't really broken.
* Move code making `Block` into a `Queryable` into test file.
* Make `getSeriesChunks` return a slice (renamed `appendSeriesChunks`).
* Rename `oooMergedChunks` to `mergedChunks`.
* Improve comment on `ChunkOrIterableWithCopy`.
* Name return values from unpackHeadChunkRef.
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Use headIndexReader instead.
OOOCompactionHeadIndexReader needs to be expanded slightly, because it previously delegated to OOOHeadIndexReader.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Just query via `HeadAndOOOQuerier`, which will skip series where no
in-order chunks are in range.
Now we don't need `OOORangeHead`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Add `HeadAndOOOQuerier` which iterates just once over series, then
where necessary merges chunks from in-order and out-of-order lists.
Add a ChunkQuerier for in-order and ooo together
Add copy-last-chunk behaviour to HeadAndOOOChunkReader
Out-of-order chunk IDs are distinguished from in-order by setting bit 23.
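Sketch of the ID scheme (constant and function names are assumptions, not the actual code):

package sketch

// Head chunk IDs stay well below 2^23, so bit 23 is free to mark a chunk as
// coming from the out-of-order head.
const oooChunkIDBit = 1 << 23

func markOOO(id uint32) uint32  { return id | oooChunkIDBit }
func isOOO(id uint32) bool      { return id&oooChunkIDBit != 0 }
func stripOOO(id uint32) uint32 { return id &^ oooChunkIDBit }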
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Refactor existing BenchmarkQuerierSelect to provide the set-up.
Note that Head queries now run faster because they use a RangeHead.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Rename a variable.
Add parameters to memSeries.insert function.
No effect on how float samples are handled.
Related to #14546
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Discovered while working on #14546 OOO native histograms.
Not triggered on main before #14546 as the code path is unused.
There was a bug where the min time of a chunk was adjusted even
if it was only recoded and not completely new.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Introduce delayed compaction to avoid simultaneous compactions and reduce stress on shared resources.
This is enabled via `--enable-feature=delayed-compaction`.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
* Fix appendable: check whether last val was a histogram
When appending a float, we were checking whether lastValue was equal to the
current value, but we didn't check whether the last value was a float value.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Iterators may share spans without copying, so we always have to make a copy
before modification - copy-on-write.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* chunkenc: allow missing empty buckets on histogram append
Allow appending to chunks when the histogram to be added is missing
some buckets, but the missing buckets are empty in the chunk.
For example bucket at index 5 is present in the chunk, but its value
is 0 and the new histogram doesn't have a bucket at index 5.
This fixes an issue of merging chunks where one chunk was recoded to
retroactively have some empty buckets in all the histograms and we are
merging in a histogram that doesn't have the empty bucket (because it
was not recoded yet).
The operation alters the histogram that is being added; however, this has
already been the case when appending gauge histograms. Thus the test
TestHistogramSeriesToChunks in storage package is changed to explicitly
test what happened to the appended histogram - Compact(0) call is removed.
The new expandIntSpansAndBuckets and expandFloatSpansAndBuckets functions
are a merge of expandSpansForward and counterResetInAnyBucket and
counterResetInAnyFloatBucket.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
We don't use seriesShard during DB initialization, so we can use the
same 8 bytes to store mmMaxTime, and save those 8 bytes during the rest of the
lifetime of the database.
This doesn't affect CPU performance.
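The trick looks roughly like this (simplified sketch; the real field lives in memSeries and access is synchronized elsewhere):

package sketch

type series struct {
	// Holds the max time of the last memory-mapped chunk during DB
	// initialization (when the shard hash isn't needed yet), and the shard
	// hash afterwards, so both uses share the same 8 bytes.
	shardHashOrMemoryMappedMaxTime uint64
}

func (s *series) shardHash() uint64 { return s.shardHashOrMemoryMappedMaxTime }

func (s *series) mmMaxTime() int64 { return int64(s.shardHashOrMemoryMappedMaxTime) }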
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>