* Add timeout to SynonymsManagementAPIService put synonyms
* Remove replicas 0, as that may impact serverless
* Add timeout to put synonyms action, fix tests
* Fix number of replicas
* Remove cluster.health checks for synonyms index
* Revert debugging
* Add integration test for timeouts
* Use TimeValue instead of an int
* Add YAML tests and REST API specs
* Fix a validation bug in put synonym rule
* Spotless
* Update docs/changelog/126314.yaml
* Remove unnecessary checks for null
* Fix equals / hashCode
* Check that the timeout is passed correctly to the check health method
* Correctly use the default timeout
* spotless
* Add monitor cluster privilege to internal synonyms user
* [CI] Auto commit changes from spotless
* Add capabilities to avoid failing on bwc tests
* Replace timeout with refresh param
* Add param to specs
* Add YAML tests
* Fix changelog
* [CI] Auto commit changes from spotless
* Use BWC serialization tests
* Fix bug in test parser
* Spotless
* Delete doesn't need reloading 🤦 removing it
* Revert "Delete doesn't need reloading 🤦 removing it"
This reverts commit 9c8e0b62be.
* [CI] Auto commit changes from spotless
* Fix refresh for delete synonym rule
* Fix tests
* Update docs/changelog/126935.yaml
* Add reload analyzers test
* reload_analyzers is not available on serverless
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Although scrolling is not recommended for knn queries, it is effective. But I found
a bug: when using scroll with a knn query, knn_score_doc is lost in the
query phase, which means the knn query does not work. In addition, the
operations for directly querying the node where the shard is located and
querying the node over transport are different, so the bug cannot be reproduced on
the local node: the query phase uses the previous
ShardSearchRequest object stored before the dfs phase, but when it runs
on the local node it does not go through the encode and decode process, so the
operation is correct there. I wrote an IT to reproduce it and fixed it by
adding the new source to the LegacyReaderContext.
While this change appears subtle at this point, I am using it in a later PR that adds a lot more spatial functions, where nesting them in related groups like this looks much better.
The main impact of this is that the "On this page" navigator in the right panel of the docs will show the nesting.
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
With this PR we restrict the paths we allow access to, forbidding plugins from specifying or requesting entitlements for reading or writing to specific protected directories.
I added this validation to EntitlementInitialization, as I wanted to fail fast and this is the earliest point where we have everything we need: the PathLookup to resolve relative paths, the policies (for plugins, server, agents) and the Paths for the specific directories we want to protect.
Relates to ES-10918
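A minimal sketch of the kind of fail-fast check described above, using hypothetical names (`ProtectedPathValidator`, `validateRequestedPath`) rather than the actual EntitlementInitialization code; it assumes relative paths have already been resolved (in the real code that is PathLookup's job):
```java
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration: reject entitlement paths that fall under protected directories.
final class ProtectedPathValidator {
    private final List<Path> protectedDirs;

    ProtectedPathValidator(List<Path> protectedDirs) {
        this.protectedDirs = protectedDirs;
    }

    void validateRequestedPath(Path requested) {
        Path normalized = requested.toAbsolutePath().normalize();
        for (Path dir : protectedDirs) {
            if (normalized.startsWith(dir.toAbsolutePath().normalize())) {
                // Fail fast: plugins may not request read/write entitlements here.
                throw new IllegalArgumentException(
                    "entitlement path [" + requested + "] is under protected directory [" + dir + "]"
                );
            }
        }
    }
}
```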
* Revert "Release buffers in netty test (#126744)"
This reverts commit f9f3defe92.
* Revert "Add flow-control and remove auto-read in netty4 HTTP pipeline (#126441)"
This reverts commit c8805b85d2.
The docs about the queue in a `fixed` pool are a little awkwardly
worded, and there is no mention of the queue in a `scaling` pool at all.
This commit cleans this area up.
* updating documentation to remove duplicate and redundant wording from 9.x
* Update links to rerank model landing page
---------
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
We addressed the empty top docs issue with #126385 specifically for scenarios where
empty top docs don't go through the wire. Yet they may be serialized from data node
back to the coord node, in which case they will no longer be equal to Lucene#EMPTY_TOP_DOCS.
This commit expands the existing filtering of empty top docs to also include those that
went through serialization.
Closes #126742
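A minimal sketch of the idea, using a hypothetical helper rather than the actual filtering code: treat a TopDocs as empty based on its contents, so instances deserialized on the way back from a data node still match even though they are no longer the Lucene#EMPTY_TOP_DOCS singleton.
```java
import org.apache.lucene.search.TopDocs;

final class TopDocsFiltering {
    // Hypothetical helper: content-based emptiness check instead of reference equality,
    // so top docs that went through serialization are still recognized as empty.
    static boolean isEmpty(TopDocs topDocs) {
        return topDocs == null || topDocs.scoreDocs == null || topDocs.scoreDocs.length == 0;
    }
}
```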
* Default new semantic_text fields to use BBQ when models are compatible
* Update docs/changelog/126629.yaml
* Gate default BBQ by IndexVersion
* Cleanup from PR feedback
* PR feedback
* Fix test
* Fix test
* PR feedback
* Update test to test correct options
* Hack alert: Fix issue where mapper service was always being created with current index version
There are existing metrics for the active number of threads, but it seems tricky to go from those to a "utilisation" number because all the pools have different sizes.
This commit adds `es.thread_pool.{name}.threads.utilization.current` which will be published by all `TaskExecutionTimeTrackingEsThreadPoolExecutor` thread pools (where `EsExecutors.TaskTrackingConfig#trackExecutionTime` is true).
The metric is a double gauge indicating what fraction (in [0.0, 1.0]) of the maximum possible execution time was utilised over the polling interval.
It's calculated as actualTaskExecutionTime / maximumTaskExecutionTime, so effectively a "mean" value. The metric interval is 60s so brief spikes won't be apparent in the measure, but the initial goal is to use it to detect hot-spotting so the 60s average will probably suffice.
Relates ES-10530
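A small sketch of the calculation described above, with assumed names and the assumption that the maximum possible execution time over an interval is pool size times interval length:
```java
import java.util.concurrent.TimeUnit;

// Illustrative only: fraction of the maximum possible execution time used in one polling interval.
final class ThreadPoolUtilisation {

    static double utilisation(long actualTaskExecutionNanos, int maxThreads, long pollIntervalNanos) {
        long maximumExecutionNanos = maxThreads * pollIntervalNanos; // assumed definition of the maximum
        return maximumExecutionNanos == 0 ? 0.0 : (double) actualTaskExecutionNanos / maximumExecutionNanos;
    }

    public static void main(String[] args) {
        // e.g. 8 threads over a 60s interval with 120s of summed task execution time -> 0.25
        System.out.println(utilisation(TimeUnit.SECONDS.toNanos(120), 8, TimeUnit.SECONDS.toNanos(60)));
    }
}
```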
* Temporarily bypass competitive iteration for filters aggregation (#126956)
* Bump versions after 9.0.0 release
* fix merge conflict
* Remove 8.16 from branches.json
* Bring version-bump related changes from main
* [bwc] Add bugfix3 project (#126880)
* Sync version bump changes from main again
---------
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: elasticsearchmachine <58790826+elasticsearchmachine@users.noreply.github.com>
Co-authored-by: Brian Seeders <brian.seeders@elastic.co>
* Updating text_similarity_reranker documentation
* Updating docs to include urls
* remove extra THE from the text
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
CollectionUtils.uniquify is based on C++ std::unique. However, C++
iterators are not quite the same as Java iterators. In particular,
advancing them only allows grabbing the value once. This commit reworks
uniquify to be based on list indices instead of iterators.
closes #126883
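A sketch of the index-based approach (not the actual CollectionUtils code): with indices a position can be re-read as often as needed, unlike a Java iterator where advancing consumes the value.
```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class UniquifySketch {
    // Remove adjacent duplicates from a sorted list in place, in the spirit of C++ std::unique.
    static <T> void uniquify(List<T> list, Comparator<? super T> comparator) {
        if (list.size() <= 1) {
            return;
        }
        int writeIndex = 0;
        for (int readIndex = 1; readIndex < list.size(); readIndex++) {
            if (comparator.compare(list.get(writeIndex), list.get(readIndex)) != 0) {
                writeIndex++;
                list.set(writeIndex, list.get(readIndex));
            }
        }
        // Trim the tail that now only holds leftovers.
        list.subList(writeIndex + 1, list.size()).clear();
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(List.of(1, 1, 2, 2, 2, 3));
        uniquify(values, Comparator.naturalOrder());
        System.out.println(values); // [1, 2, 3]
    }
}
```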
This adds `documents_found` and `values_loaded` to the ESQL response:
```json
{
  "took" : 194,
  "is_partial" : false,
  "documents_found" : 100000,
  "values_loaded" : 200000,
  "columns" : [
    { "name" : "a", "type" : "long" },
    { "name" : "b", "type" : "long" }
  ],
  "values" : [[10, 1]]
}
```
These are cheap enough to collect that we can do it for every query and
return it with every response. It's small, but it still gives you a
reasonable sense of how much work Elasticsearch had to go through to
perform the query.
I've also added these two fields to the driver profile and task status:
```json
"drivers" : [
{
"description" : "data",
"cluster_name" : "runTask",
"node_name" : "runTask-0",
"start_millis" : 1742923173077,
"stop_millis" : 1742923173087,
"took_nanos" : 9557014,
"cpu_nanos" : 9091340,
"documents_found" : 5, <---- THESE
"values_loaded" : 15, <---- THESE
"iterations" : 6,
...
```
These are at a high level and should be easy to reason about. We'd like to
extract this into a "show me how difficult this running query is" API one
day. But today, just plumbing it into the debugging output is good.
Any `Operator` can claim to "find documents" or "load values" by overriding
a method on its `Operator.Status` implementation:
```java
/**
 * The number of documents found by this operator. Most operators
 * don't find documents and will return {@code 0} here.
 */
default long documentsFound() {
    return 0;
}

/**
 * The number of values loaded by this operator. Most operators
 * don't load values and will return {@code 0} here.
 */
default long valuesLoaded() {
    return 0;
}
```
In this PR all of the `LuceneOperator`s declare that each `position` they
emit is a "document found" and the `ValuesSourceReaderOperator`
says each value it makes is a "value loaded". That's pretty much
true. The `LuceneCountOperator` and `LuceneMinMaxOperator` sort of pretend
that the count/min/max that they emit is a "document" - but that's good
enough to give you a sense of what's going on. It's *like* a document.
On x64, we are testing if we support vector capabilities (1 = "basic" = AVX2, 2 = "advanced" = AVX-512) in order to enable and choose a native implementation for some vector functions, using CPUID.
However, under some circumstances this is not sufficient: the OS on which we are running also needs to support AVX/AVX2 etc; basically, it needs to acknowledge that it knows about the additional registers and that it is able to handle them e.g. in context switches. To do that we need to a) test if the CPU has the xsave feature and b) use xgetbv to test if the OS has set it (declaring that it supports AVX/AVX2/etc).
In most cases this is not needed, as all modern OSes do that, but in some virtualized situations (hypervisors, emulators, etc.) all the components along the chain must support it, and in some cases this is not a given.
This PR introduces a change to the x64 version of vec_caps to check for OS support too, and a warning on the Java side in case the CPU supports vector capabilities but those are not enabled at OS level.
Tested by passing noxsave to my linux box kernel boot options and ensuring that the avx flags "disappear" from /proc/cpuinfo and that we fall back to the "no native vector" case.
Fixes #126809
A while ago we enabled using ccs_minimize_roundtrips in async search.
This makes it possible for users of async search to send a single search
request per remote cluster, and minimize the impact of network latency.
With non-minimized roundtrips, we have fairly frequent cancellation checks:
as part of the execution, we detect that a task has expired whenever a shard comes
back with its results.
In a scenario where the coord node does not hold data, or only remote data is
targeted by an async search, we have much less chance of detecting cancellation
if roundtrips are minimized. The local coordinator would do nothing other than
wait for the minimized results from each remote cluster.
One scenario where we can check for cancellation is when each cluster comes
back with its full set of results. This commit adds such a check, plus some testing
for async search cancellation with minimized roundtrips.
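A generic sketch of the idea (plain Java, not the actual async search listener): check whether the task has been cancelled before handling each per-cluster response, since with minimized roundtrips no per-shard results reach the coordinator.
```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

final class CancellationCheckingListener<Response> {
    private final BooleanSupplier taskCancelled;        // assumed hook into the task's cancellation state
    private final Consumer<Response> onClusterResponse; // handles one remote cluster's full set of results
    private final Consumer<Exception> onFailure;

    CancellationCheckingListener(
        BooleanSupplier taskCancelled,
        Consumer<Response> onClusterResponse,
        Consumer<Exception> onFailure
    ) {
        this.taskCancelled = taskCancelled;
        this.onClusterResponse = onClusterResponse;
        this.onFailure = onFailure;
    }

    void onResponse(Response clusterResponse) {
        if (taskCancelled.getAsBoolean()) {
            // Cancellation is detected as each cluster's minimized results arrive.
            onFailure.accept(new RuntimeException("task was cancelled"));
            return;
        }
        onClusterResponse.accept(clusterResponse);
    }
}
```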
The following order of events was possible:
- An ILM policy update cleared `cachedSteps`
- ILM retrieves the step definition for an index, which populates `cachedSteps` with the outdated policy
- The updated policy is put in `lifecyclePolicyMap`
Any subsequent cache retrievals will see the old step definition.
By clearing `cachedSteps` _after_ we update `lifecyclePolicyMap`, we
ensure eventual consistency between the policy and the cache.
Fixes #118406
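A hypothetical sketch of the ordering fix (illustrative names and simplified types, not the actual ILM classes): publish the updated policy before invalidating the step cache, so a concurrent cache fill can at worst re-cache steps for the new policy.
```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PolicyStepsRegistrySketch {
    private final Map<String, String> lifecyclePolicyMap = new ConcurrentHashMap<>();
    private final Map<String, String> cachedSteps = new ConcurrentHashMap<>();

    void applyPolicyUpdate(String policyName, String updatedPolicy) {
        // 1. Make the new policy visible to readers first.
        lifecyclePolicyMap.put(policyName, updatedPolicy);
        // 2. Only now drop the cached steps; any refill after this point sees the new policy.
        cachedSteps.clear();
    }
}
```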
Adds read privilege to the kibana_system role for indexes associated with the Microsoft Defender integrations.
These changes are necessary in order to support Security Solution bi-directional response actions.
In Jackson 2.15 a maximum string length of 50k characters was
introduced. We worked around that by overriding the length to max int on
all parsers created by xcontent. Jackson 2.17 introduced a similar limit
on field names. This commit mimics the workaround for string length by
overriding the max name length to be unlimited.
relates #58952
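A sketch of the workaround, assuming Jackson's StreamReadConstraints builder exposes maxNameLength alongside maxStringLength (names and wiring here are illustrative, not the actual xcontent code):
```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.StreamReadConstraints;

public class UnlimitedJsonFactory {
    // Build a factory whose parsers have no string-length or field-name-length limit.
    public static JsonFactory create() {
        StreamReadConstraints constraints = StreamReadConstraints.builder()
            .maxStringLength(Integer.MAX_VALUE) // lifts the limit introduced in Jackson 2.15
            .maxNameLength(Integer.MAX_VALUE)   // lifts the newer field-name limit
            .build();
        return JsonFactory.builder().streamReadConstraints(constraints).build();
    }
}
```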
* permit at+jwt typ header value in jwt access tokens
* Update docs/changelog/126687.yaml
* address review comments
* [CI] Auto commit changes from spotless
* update Type Validator tests for parser ignoring case
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
We currently have to hold a big list of Double objects in heap, which can take large amounts of memory. With this change
we can reduce the heap usage by 7x.
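The change itself is not shown here, but as an illustration of the general technique: storing values in a primitive double[] instead of a List<Double> avoids one boxed Double object (object header plus reference) per element, which is where this kind of heap saving comes from.
```java
import java.util.Arrays;

final class DoubleBuffer {
    private double[] values = new double[16];
    private int size = 0;

    void add(double value) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2); // grow geometrically
        }
        values[size++] = value;
    }

    double get(int index) {
        return values[index];
    }
}
```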
* Revert endpoint creation validation for ELSER and E5
* Update docs/changelog/126792.yaml
* Revert start model deployment being in TransportPutInferenceModelAction
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>