Adds a test framework that validates that instruments are registered before they are called and are not double-registered.
Also records all invocations of instruments and allows test authors to add validation to instruments.
The anti-contention delay in the S3 repository's compare-and-exchange
operation is hard-coded at 1 second today, but sometimes we encounter a
repository that needs much longer to perform a compare-and-exchange
operation when under contention. With this commit we make the
anti-contention delay configurable.
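A minimal sketch of the idea, in plain Java with hypothetical names (the real change uses Elasticsearch's repository `Setting` infrastructure, not this stand-in parser):

```java
import java.time.Duration;

// Hypothetical sketch: a configurable anti-contention delay, keeping the
// previously hard-coded value (1 second) as the default.
public class CompareAndExchangeConfig {
    // Previously hard-coded; now only the default.
    static final Duration DEFAULT_ANTI_CONTENTION_DELAY = Duration.ofSeconds(1);

    static Duration antiContentionDelay(String settingValue) {
        // Fall back to the old behaviour when the setting is absent.
        if (settingValue == null || settingValue.isEmpty()) {
            return DEFAULT_ANTI_CONTENTION_DELAY;
        }
        // Accept simple "<n>s" values, e.g. "5s" (parsing is illustrative only).
        return Duration.ofSeconds(Long.parseLong(settingValue.replace("s", "")));
    }
}
```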
This splits out the registry and the service, which makes testing easier and removes much of the delegation from the old `APMMeter` to `Instruments` (now renamed `APMMeterRegistry`).
APMMeterService takes care of the lifecycle and APMMeterRegistry holds the instruments.
Similar to the TransportVersions holder class, IndexVersions is the new
place to contain all constants for IndexVersion. This commit moves all
existing constants to the new class. It is purely mechanical.
We'd like to make `SearchResponse` reference counted and pooled but there are around 6k
instances of tests that create a `SearchResponse` local variable that would need to be
released manually to avoid leaks in the tests.
This does away with about 10% of these spots by adding an override for `assertHitCount`
that handles the actual execution of the search request and its release automatically,
and by making use of it in all spots where the `.get()` on the request builder could be inlined
semi-automatically, in a straightforward fashion, without other code changes.
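The shape of such an override can be sketched as follows; the types here are simplified stand-ins, not the real Elasticsearch classes:

```java
// Illustrative sketch only: shows how an assertHitCount overload can own the
// lifecycle of a pooled response so call sites cannot leak it.
public class HitCountHelper {
    // Stand-in for a pooled, releasable SearchResponse.
    static class SearchResponse implements AutoCloseable {
        final long totalHits;
        boolean released;
        SearchResponse(long totalHits) { this.totalHits = totalHits; }
        @Override public void close() { released = true; }
    }

    // Stand-in for a request builder whose get() returns a response that the
    // caller would otherwise have to release manually.
    interface SearchRequestBuilder { SearchResponse get(); }

    // The overload: executes the request, asserts, and always releases the
    // response, so test call sites no longer need explicit cleanup.
    static void assertHitCount(SearchRequestBuilder builder, long expected) {
        try (SearchResponse response = builder.get()) {
            if (response.totalHits != expected) {
                throw new AssertionError("expected " + expected + " hits, got " + response.totalHits);
            }
        }
    }
}
```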
Emitted metrics could not be indexed because the elasticsearch.metrics.s3.exceptions field
is both a long counter and the parent object of a histogram. This change renames the
histogram to avoid the conflict.
This PR builds on top of #100464 to publish s3 request count via the metrics API.
The metric takes the name of `repositories.requests.count` with
attributes/dimensions of
`{"repo_type": "s3", "repo_name": "xxx", "operation": "xxx", "purpose": "xxx"}`.
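The metric name and attributes described above could be assembled roughly like this (the helper itself is hypothetical; keys and values follow the description):

```java
import java.util.Map;

// Sketch of the metric attributes for the s3 request count; the constant and
// helper are illustrative, not the real Elasticsearch API.
public class S3RequestMetrics {
    static final String METRIC_NAME = "repositories.requests.count";

    static Map<String, Object> attributes(String repoName, String operation, String purpose) {
        return Map.of(
            "repo_type", "s3",
            "repo_name", repoName,
            "operation", operation,
            "purpose", purpose
        );
    }
}
```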
Closes: ES-6801
Today we rely on the caller computing the appropriate repository format
version based on the nodes in the cluster and the snapshots in (some
recent copy of) the `RepositoryData`. This commit moves that computation
into `createSnapshotsDeletion` so that (a) we can be sure to use the
same `RepositoryData` used for the rest of the process, and (b) we avoid
dispatching work to the SNAPSHOT pool twice.
Relates a comment on #100657
Reorders the methods involved in snapshot deletion to be closer together
and better match the flow of execution, and harmonises the names of many
parameters and local variables to make it easier to follow them through
the process.
This PR wires the new Meter interface into S3BlobStore. The new meter
field remains unused in this PR. Actual metric collection will be
addressed in follow-ups.
Relates: ES-6801
A new no-op OperationPurpose parameter was added in #99615 to all blob
store/container operation methods. This PR updates the s3 stats
collection code to actually use this parameter for finer-grained stats
collection and reporting. The differentiation between purposes is kept
internal for now: the stats are still aggregated over operations for
existing stats reporting, so responses from both the
GetRepositoriesMetering API and the GetBlobStoreStats API are unchanged.
We will have follow-ups to expose the finer-grained stats separately.
Relates: #99615
Relates: ES-6800
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
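The lambda effect mentioned above can be illustrated like this (class and methods are illustrative):

```java
import java.util.function.Supplier;

// Why marking things static helps lambdas: a lambda that refers to instance
// state captures `this`, so a fresh lambda object may be allocated per call.
// Once the referenced state is static, the lambda captures nothing and the
// runtime can cache a single instance.
public class StaticLambdas {
    private final String field = "instance";

    // Capturing: refers to this.field, so the lambda holds a reference to `this`.
    Supplier<String> capturing() {
        return () -> field;
    }

    // Non-capturing: refers only to a constant; eligible for instance reuse.
    static Supplier<String> nonCapturing() {
        return () -> "static";
    }
}
```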
Today blobstore stats are collected against each HTTP operation, e.g.
Get, List. This is not granular enough because the same HTTP operation
can be performed for different purposes, e.g. cluster state, indices or
translog. This PR adds a new Purpose enum to provide further breakdown
for the same HTTP operation.
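The breakdown can be sketched as follows; the real enum is the OperationPurpose type in the blob store code, and its constants may differ from these illustrative ones:

```java
// Sketch: stats keyed by (purpose, operation) instead of operation alone.
public class BlobStoreStatsSketch {
    enum Purpose { CLUSTER_STATE, INDICES, TRANSLOG, SNAPSHOT }
    enum HttpOperation { GET, LIST, PUT }

    // Each (purpose, operation) pair gets its own stats bucket.
    static String key(Purpose purpose, HttpOperation op) {
        return purpose.name() + "_" + op.name();
    }
}
```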
Relates: ES-6800
In order to avoid adding yet another parameter to createComponents,
the Tracer interface is replaced with TelemetryProvider.
This allows retrieving both the Tracer and (in the future) Metric interfaces.
This commit renames tracing to telemetry.tracing in both xpack/APM and elasticsearch's org.elasticsearch.tracing.Tracer (the api)
the xpack/APM is renamed as follows:
org.elasticsearch.telemetry.apm - the only exported package
org.elasticsearch.telemetry.apm.settings - APMSettings
org.elasticsearch.telemetry.apm.tracing - APMTracer
org.elasticsearch.tracing.Tracer is moved to org.elasticsearch.telemetry.tracing.Tracer (responsible for the majority of the changes in this PR)
Compare-and-swap operations on a S3 repository are implemented using
multipart uploads. Today, to try to avoid collisions, we refuse to
perform a compare-and-swap if there are other concurrent uploads in
progress. However, this means that a node which crashes partway through a
compare-and-swap will block all future register operations.
With this commit we introduce a time-to-live on S3 multipart uploads,
such that uploads older than the TTL now do not block future
compare-and-swap attempts.
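The rule described above reduces to a simple age check; this is illustrative only, with hypothetical names:

```java
import java.time.Duration;
import java.time.Instant;

// Sketch: a concurrent multipart upload only blocks compare-and-swap while it
// is younger than the TTL; an upload left behind by a crashed node eventually
// ages out and stops blocking register operations.
public class MultipartUploadTtl {
    static boolean blocksCompareAndSwap(Instant uploadStarted, Instant now, Duration ttl) {
        return Duration.between(uploadStarted, now).compareTo(ttl) < 0;
    }
}
```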
Drying this up further and adding the same short-cut for single node
tests. Dealing with most of the spots that I could grab via automatic
refactorings.
Currently we have a number of input streams that specifically override
the skip() method, disabling the ability to skip bytes. In each case
skipping would work, as we have properly implemented the read(byte[])
methods used to discard bytes. However, we appear to have disabled it
as it would be possible to retry from the end of a skip if there is a
failure in the middle. At this time, that optimization is not really
necessary; however, we sporadically use skip, so it would be nice for
these input streams to support the method. This commit enables
super.skip() and adds a comment about future optimizations.
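A simplified sketch of the change (the wrapper class here is illustrative, not one of the actual streams):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: the stream used to override skip() to disable it; now it defers to
// super.skip(). For plain InputStream subclasses, the default skip
// implementation discards bytes via the same read(byte[]) path these streams
// already implement correctly.
public class SkippableStream extends FilterInputStream {
    public SkippableStream(InputStream in) {
        super(in);
    }

    @Override
    public long skip(long n) throws IOException {
        // Previously: throw new UnsupportedOperationException("skip disabled");
        // Future optimization: resume from the end of a partial skip on
        // failure instead of restarting the whole read.
        return super.skip(n);
    }

    static int readAfterSkip(byte[] data, long toSkip) {
        try (InputStream s = new SkippableStream(new ByteArrayInputStream(data))) {
            s.skip(toSkip);
            return s.read();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```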
Today the blob store register supports recording only a `long`,
represented as an 8-byte blob. We need to store a little more data in
the register, so this commit generalises things to work with a
`BytesReference` directly.
Further work towards the S3 compare-and-exchange implementation showed
that we would like this API to permit async operations. This commit
moves to an async API.
Also, this change made it fairly awkward to use an exception to deliver
to the caller the indication that the current value could not be read,
so this commit adjusts things to use `OptionalLong` throughout as
suggested in the discussion on #93955.
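The resulting API shape can be sketched like this, with simplified stand-in types (the real interface differs):

```java
import java.util.OptionalLong;
import java.util.concurrent.CompletableFuture;

// Sketch of the async register read described above: an unreadable current
// value is reported as OptionalLong.empty() rather than via an exception.
public class AsyncRegisterSketch {
    private volatile Long value; // null models an unreadable/contended register

    AsyncRegisterSketch(Long initial) {
        this.value = initial;
    }

    CompletableFuture<OptionalLong> getRegister() {
        Long v = value;
        return CompletableFuture.completedFuture(
            v == null ? OptionalLong.empty() : OptionalLong.of(v));
    }
}
```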
Only for testing purposes through the `FsRepository` for now and rather simple,
but should get the job done and technically be correct for a compliant NFS implementation.
Co-authored-by: David Turner <david.turner@elastic.co>
Some optimisations that I found when reusing searchable snapshot code elsewhere:
* Add an efficient input stream -> byte buffer path that avoids allocations + copies for heap buffers, this is non-trivial in its effects IMO
* Also at least avoid allocations and use existing thread-local buffer when doing input stream -> direct bb
* move `readFully` to lower level streams class to enable this
* Use same thread local direct byte buffer for frozen and caching index input instead of constantly allocating new heap buffers and writing those to disk inefficiently
Use the locale-independent `Strings.format` method instead of `String.format(Locale.ROOT, ...)`.
Inline `ESTestCase.forbidden` calls with `Strings.format` for consistency's sake.
Add a `Strings.format` alias in `common.Strings`.
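The alias amounts to a thin wrapper along these lines (the real method lives in `org.elasticsearch.common.Strings`; this stand-in just shows the idea):

```java
import java.util.Locale;

// Sketch of a locale-independent format helper, so call sites don't have to
// pass Locale.ROOT themselves.
public class StringsSketch {
    public static String format(String format, Object... args) {
        // Always Locale.ROOT: output must not depend on the JVM's default
        // locale (e.g. decimal separators, digit shaping).
        return String.format(Locale.ROOT, format, args);
    }
}
```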
This commit adds a new test framework for configuring and orchestrating
test clusters for both Java and YAML REST testing. This will eventually
replace the existing "test-clusters" Gradle plugin and the build-time
cluster orchestration.
This comes out of a user heap dump investigation. In some snapshot
corner cases we ran into about 100M of duplicate 0b instances.
-> even though it's a little heavy-handed, let's make it so the common
constants that we already have are used whenever possible.
As discussed, we can be up to twice as fast without increasing CPU use
much on high-latency blob stores, so this increases the pool size to 10
to better utilize larger data nodes.
Currently, we only verify that the local environment for web identity tokens is correctly set up, but we don't verify whether it's
possible to exchange the token for credentials from the STS. If we can't get credentials from the STS, we silently fall back
to the EC2 credentials provider. Let's log the web identity token auth errors, so that users get a clear message in the logs in case the STS is unavailable to the ES server.
With this change we add the allocation deciders
in createComponents, so that we can simplify their use in the
Autoscaling plugin and implement a reserved state handler
in the future.