Part of https://github.com/elastic/elasticsearch/issues/99815
## Steps
1. Migrate TDigest classes to use a custom Array implementation. Temporarily use a simple array wrapper (https://github.com/elastic/elasticsearch/pull/112810)
2. Implement CircuitBreaking in the `MemoryTrackingTDigestArrays` class. Add `Releasable` and ensure it's always closed within TDigest (This PR)
3. Pass the CircuitBreaker as a parameter to TDigestState from wherever it's being used
4. Account for the size of the remaining TDigest classes ("SHALLOW_SIZE")
Every step should be safely mergeable to main:
- The first and second steps should have no impact.
- The third and fourth ones will start increasing the CB count partially.
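
The idea behind step 2 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `SimpleBreaker` and `TrackedDoubleArray` are hypothetical stand-ins for the real circuit breaker and for the arrays created by `MemoryTrackingTDigestArrays`, and `AutoCloseable` stands in for `Releasable`.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a circuit breaker: tracks reserved bytes against a limit. */
final class SimpleBreaker {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    SimpleBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserves bytes, or throws (and rolls back) if the limit would be exceeded. */
    void reserve(long bytes) {
        long newUsed = usedBytes.addAndGet(bytes);
        if (newUsed > limitBytes) {
            usedBytes.addAndGet(-bytes);
            throw new IllegalStateException("breaker tripped: " + newUsed + " > " + limitBytes);
        }
    }

    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long getUsed() {
        return usedBytes.get();
    }
}

/** Hypothetical tracked array: reserves memory on allocation and gives it back on close. */
final class TrackedDoubleArray implements AutoCloseable {
    private final SimpleBreaker breaker;
    private final long reservedBytes;
    private final double[] values;
    private boolean closed;

    TrackedDoubleArray(SimpleBreaker breaker, int size) {
        this.breaker = breaker;
        this.reservedBytes = (long) size * Double.BYTES;
        breaker.reserve(reservedBytes); // reserve before allocating, so a trip allocates nothing
        this.values = new double[size];
    }

    double get(int index) {
        return values[index];
    }

    void set(int index, double value) {
        values[index] = value;
    }

    @Override
    public void close() {
        if (closed == false) { // closing twice must not release twice
            closed = true;
            breaker.release(reservedBytes);
        }
    }
}
```

Used with try-with-resources, the breaker's used byte count returns to zero once the array is closed, which is the kind of invariant a test-level `@After` check can verify.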
## Remarks
To simplify testing the CircuitBreaker, added a helper method + `@After` to ESTestCase.
Right now CBs are usually tested through MockBigArrays, e.g.:
f7a0196b45/x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/AbstractFunctionTestCase.java (L1263-L1265)
So I guess there was no need for this yet. But I may have missed something somewhere.
Also, I'm separating this PR from "step 3", as integrating the CB into the current usages may require some refactoring of external code, which may be somewhat more _dangerous_
* Adding inference endpoint creation validation for MistralService, GoogleAiStudioService, and HuggingFaceService
* Moving invalid model type exception to shared ServiceUtils function
* Fixing naming inconsistency
* Updating HuggingFaceIT ELSER tests for inference endpoint validation
Collapse will dynamically add values to the DocumentField values array.
There are a few scenarios where this is immutable and most of these are
OK. However, we get in trouble when we create an immutable set for
StoredValues which collapse later tries to update.
The other option for this fix was to make an array copy for `values` in
every `DocumentField` ctor, but this seemed very expensive and could get out
of hand. So, I decided to fix this one bug instead.
closes https://github.com/elastic/elasticsearch/issues/112646
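
The failure mode can be illustrated with a small standalone sketch. The `Field` class below is a hypothetical stand-in for `DocumentField` that, like the case described above, keeps the collection it is given without copying it. An immutable `List` is used here purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldValuesSketch {
    /** Hypothetical stand-in for a field that stores the given values without copying. */
    static final class Field {
        final List<Object> values;

        Field(List<Object> values) {
            this.values = values;
        }
    }

    /** Mimics a later step appending to the field's values; returns false if the list is immutable. */
    static boolean tryAppend(Field field, Object value) {
        try {
            field.values.add(value);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    /** Builds a field backed by an immutable list. */
    static Field immutableField() {
        return new Field(List.of("a"));
    }

    /** Builds a field backed by a mutable copy. */
    static Field mutableField() {
        return new Field(new ArrayList<>(List.of("a")));
    }
}
```

Copying in every constructor would make `tryAppend` always succeed, at the cost of an extra allocation per field, which is the trade-off mentioned above.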
Here we introduce a new implementation of `IndexSettingProvider` whose goal is to "inject" the
`index.mode` setting with value `logsdb` when a cluster setting `cluster.logsdb.enabled` is `true`.
We also make sure that:
* the existing `index.mode` is not set
* the datastream name matches the `logs-*-*` pattern
* `logs@settings` component template is used
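
The decision logic described above can be sketched as a pure function. This is an illustration under assumptions, not the actual `IndexSettingProvider` implementation: the class name, method signature, and the regex translation of the `logs-*-*` pattern are all hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

final class LogsdbModeSketch {
    // Assumed regex equivalent of the simple glob "logs-*-*".
    private static final Pattern LOGS_PATTERN = Pattern.compile("logs-.*-.*");

    /**
     * Returns the extra settings to inject for a new index, or an empty map
     * when any of the conditions is not met.
     */
    static Map<String, String> additionalSettings(
        boolean clusterLogsdbEnabled,
        String dataStreamName,
        Map<String, String> existingSettings,
        Set<String> componentTemplates
    ) {
        boolean inject = clusterLogsdbEnabled
            && existingSettings.containsKey("index.mode") == false // never override an explicit mode
            && dataStreamName != null
            && LOGS_PATTERN.matcher(dataStreamName).matches()
            && componentTemplates.contains("logs@settings");
        return inject ? Map.of("index.mode", "logsdb") : Map.of();
    }
}
```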
This PR fixes the following race conditions in `onIndexAvailableForSearch`
introduced in https://github.com/elastic/elasticsearch/pull/112813:
1. If the method is called when the index is already available, cancellation is still scheduled and may execute before successful completion (manifested in test failures https://github.com/elastic/elasticsearch/issues/113336)
2. If the cancel task runs _before_ `addStateListener`, it may fail to remove the listener (noticed while fixing the first issue)
These race conditions only manifest for small timeout windows, and are
completely bypassed for 0 timeout windows based on other checks in prod
code, so the practical impact is fortunately limited.
Resolves: https://github.com/elastic/elasticsearch/issues/113336
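
A simplified way to avoid both races is sketched below. It is not the code from this PR: `AvailabilityWaiterSketch` and its methods are hypothetical, and scheduling of the returned cancel task after the timeout is left to the caller. The sketch completes immediately when the index is already available (so nothing is left to cancel), registers the listener before the cancel task exists, and uses a single compare-and-set so that only one of "available" or "timed out" is ever reported.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class AvailabilityWaiterSketch {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();
    private volatile boolean available;

    /** Marks the resource available and notifies every registered listener. */
    void markAvailable() {
        available = true;
        for (Runnable listener : listeners) {
            listener.run();
        }
    }

    int listenerCount() {
        return listeners.size();
    }

    /**
     * Invokes the callback with true once available, or with false if the
     * returned cancel task runs first. The callback fires exactly once.
     */
    Runnable onAvailable(Consumer<Boolean> callback) {
        if (available) {
            callback.accept(true);
            return () -> {}; // already done: the cancel task is a no-op
        }
        AtomicBoolean done = new AtomicBoolean();
        Runnable listener = new Runnable() {
            @Override
            public void run() {
                if (done.compareAndSet(false, true)) {
                    listeners.remove(this);
                    callback.accept(true);
                }
            }
        };
        listeners.add(listener); // register before the cancel task can run
        if (available) {
            listener.run(); // re-check in case availability changed during registration
        }
        return () -> {
            if (done.compareAndSet(false, true)) {
                listeners.remove(listener);
                callback.accept(false);
            }
        };
    }
}
```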
* Add more missing wolfi references to fix tests
* packaging tests require access to docker registry
* Fix symlink for es distributions jdk cacerts in wolfi docker
* Fix native support on wolfi images
* Fix provided keystore packaging tests for wolfi
* Add utils used for testing to wolfi image
* Explicitly set default shell to bash in docker images
* Fix docker config issues
* Apply review feedback around docker login
---------
Co-authored-by: Rene Groeschke <rene@elastic.co>
Because the finally clause assertions did not finally print any
exceptions that might have occurred.
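
As a general Java illustration of why this matters (not the actual test code; all names here are hypothetical): when a check in a `finally` block throws, the exception already in flight from the `try` block is discarded unless it is recorded explicitly, for example as a suppressed exception.

```java
final class FinallyMaskingSketch {
    static void failingOperation() {
        throw new IllegalStateException("real failure");
    }

    static void check(boolean condition, String message) {
        if (condition == false) {
            throw new AssertionError(message);
        }
    }

    /** The assertion thrown in finally replaces the original exception entirely. */
    static Throwable masked() {
        try {
            try {
                failingOperation();
            } finally {
                check(false, "post-condition failed");
            }
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    /** Same flow, but the original exception is attached so it still shows up in the output. */
    static Throwable preserved() {
        try {
            Throwable primary = null;
            try {
                failingOperation();
            } catch (Throwable t) {
                primary = t;
                throw t;
            } finally {
                try {
                    check(false, "post-condition failed");
                } catch (AssertionError e) {
                    if (primary != null) {
                        e.addSuppressed(primary);
                    }
                    throw e;
                }
            }
            return null;
        } catch (Throwable t) {
            return t;
        }
    }
}
```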
This happened in build scan qdorbubrxbqh6 and can be easily reproduced,
e.g., by using a custom metadata:

    metadata = IndexMetadata.builder(metadata)
        .settings(Settings.builder()
            .put(metadata.getSettings())
            .put(IndexMetadata.INDEX_DATA_PATH_SETTING.getKey(), "/invalid/path"))
        .build();
Sometimes we might need to invoke different requests on a remote cluster
depending on the version of the transport protocol it understands, but
today we cannot make that distinction (without starting to execute an
action on the remote cluster and failing while serializing the request
at least). This commit allows callers access to the underlying
`Transport.Connection` instance so that we can implement better BwC
logic.
Deprecate to, from, include_lower, include_upper range query params.
These params were removed from our documentation in v0.90.4 (d6ecdecc19),
but did not go through a deprecation cycle.
These params are to be removed in v9.0.
Related to #81276
Closes #48538
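
For anyone migrating, the deprecated params map onto `gt`/`gte`/`lt`/`lte`. The helper below is a hypothetical sketch of that mapping for illustration only; it is not part of the range query builder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RangeParamsSketch {
    /**
     * Translates the deprecated from/to/include_lower/include_upper params
     * into the equivalent gt/gte/lt/lte bounds. A null bound is omitted.
     */
    static Map<String, Object> toCurrentParams(Object from, Object to, boolean includeLower, boolean includeUpper) {
        Map<String, Object> bounds = new LinkedHashMap<>();
        if (from != null) {
            bounds.put(includeLower ? "gte" : "gt", from);
        }
        if (to != null) {
            bounds.put(includeUpper ? "lte" : "lt", to);
        }
        return bounds;
    }
}
```

For example, `from: 10, to: 20, include_lower: true, include_upper: false` corresponds to `gte: 10, lt: 20`.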
* Adding ChunkingSettings logic and enabling ChunkingSettings for OpenAI embedding endpoints
* Cleaning up naming in ChunkingSettings logic
* Incrementing InferenceIndex version
* Removing DefaultChunkingSettings, cleaning up chunking settings class and related tests, add chunking strategy to inference index
* Adding check for up to date index mappings when creating an inference endpoint
* Fixing transport version conflict
* Adding validation for invalid chunking settings inputs and improving error messaging
* Reverting SystemIndexMappingUpdateService changes and adding error messaging on mixed cluster exception
Today in the ML and Transform plugins we use `null` for timeouts related
to persistent tasks, which means to use the implicit default timeout of
30s. As per #107984 we want to eliminate all such uses of the implicit
default timeout. This commit either moves to using the timeout from the
associated transport request, if available, or else makes it explicit
that we're using a hard-coded 30s timeout.
* Renaming - code mentioned modelId but was actually deploymentId
* Documenting
* add a test case and more renaming
* Renaming & remove TODOs
* Update MlAutoscalingStats javadoc to match autoscaler comments
* precommit
The parsing logic is test-only at this point, so let's move it to tests
accordingly to keep the prod codebase a little smaller.
Also fixed a missing `static`.
Some `getClientWrapper()` implementations return a wrapper that only
wraps `NodeClient` instances. In practice we _only_ wrap `NodeClient`
instances so this check is redundant, and in a recent investigation it
was confusing to readers. With this commit we assert that we're always
wrapping a `NodeClient`.
Add initial code required to fall back from synthetic source mode to stored
source mode using an index settings provider.
Note that the final version relies on a new index setting that
determines source mode, which is currently controlled by the `mode` mapping
attribute in the `_source` meta field mapper. Additionally, index modes
should not enforce synthetic source mode.
Restores the changes from #111684, which use multiple streams to improve the
time to download and install the built-in ML models. The first iteration had a problem
where the number of in-flight requests was not properly limited, which is fixed here.
Additionally, there are now circuit breaker checks on allocating the buffer used to
store the model definition.
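
One common way to bound in-flight requests is a counting semaphore. The sketch below illustrates that technique only and is an assumption about the general approach, not the mechanism used in this change. The class name, pool size, and simulated download are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedDownloadSketch {
    /** Runs totalParts simulated downloads and returns the highest concurrency observed. */
    static int maxObservedInFlight(int totalParts, int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            for (int part = 0; part < totalParts; part++) {
                permits.acquire(); // blocks while maxInFlight requests are outstanding
                pool.execute(() -> {
                    try {
                        int now = inFlight.incrementAndGet();
                        maxSeen.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // simulated download of one part
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release(); // release only after the request has finished
                    }
                });
            }
            pool.shutdown();
            if (pool.awaitTermination(30, TimeUnit.SECONDS) == false) {
                throw new IllegalStateException("downloads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while downloading", e);
        } finally {
            pool.shutdownNow(); // no-op when the pool has already terminated
        }
        return maxSeen.get();
    }
}
```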
Since we are enriching the component templates with more entries such as
the data stream lifecycle and in the future the data stream options, we
add a template builder to help with the code, especially tests.
To highlight the value and prepare for the PRs that will add the data
stream options to the template, we replace calls to the all-arguments
constructor with the builder:
- when there are arguments with null values, or
- when we copy another template and change only a few fields.
This prepares the ground, so when we add data stream options, we will
not need to edit all these places.
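
The two use cases can be sketched with a simplified builder. `TemplateSketch` and its three string fields are hypothetical simplifications for illustration. The real template class has different field types.

```java
final class TemplateSketch {
    final String settings;
    final String mappings;
    final String lifecycle;

    private TemplateSketch(Builder builder) {
        this.settings = builder.settings;
        this.mappings = builder.mappings;
        this.lifecycle = builder.lifecycle;
    }

    /** Starts from an empty template: unset fields simply stay null. */
    static Builder builder() {
        return new Builder();
    }

    /** Starts from an existing template so that only the changed fields need to be named. */
    static Builder builder(TemplateSketch template) {
        return new Builder().settings(template.settings).mappings(template.mappings).lifecycle(template.lifecycle);
    }

    static final class Builder {
        private String settings;
        private String mappings;
        private String lifecycle;

        Builder settings(String settings) {
            this.settings = settings;
            return this;
        }

        Builder mappings(String mappings) {
            this.mappings = mappings;
            return this;
        }

        Builder lifecycle(String lifecycle) {
            this.lifecycle = lifecycle;
            return this;
        }

        TemplateSketch build() {
            return new TemplateSketch(this);
        }
    }
}
```

With this shape, adding a new field means adding one builder method, and call sites such as `TemplateSketch.builder(existing).lifecycle("7d").build()` stay unchanged.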