Keep better track of shard contexts using RefCounted, so they can be released more aggressively during operator processing. For example, during TopN, we can potentially release some contexts if they don't pass the limit filter.
This is done in preparation of TopN fetch optimization, which will delay the fetching of additional columns to the data node coordinator, instead of doing it in each individual worker, thereby reducing IO. Since the node coordinator would need to maintain the shard contexts for a potentially longer duration, it is important we try to release what we can eariler.
An even more advanced optimization is to delay fetching to the main cluster coordinator, but that would be more involved, since we need to first figure out how to transport the shard contexts between nodes.
Summary of main changes:
DocVector now maintains a RefCounted instance per shard.
Things which can build or release DocVectors (e.g., LuceneSourceOperator, TopNOperator), can also hold RefCounted instances, so they can pass them to DocVector and also ensure contexts aren't released if they can still be potentially used later.
Driver's main loop iteration (runSingleLoopIteration), now closes its operators even between different operator processing. This is extra aggressive, and was mostly done to improve testability.
Added a couple of tests to TopNOperator and a new integration test EsqlTopNShardManagementIT, which uses the pausable plugin framework to check that TopNOperator releases things as early as possible..
The local plan optimizer should not change the layout, as it has already
been agreed upon. However, CombineProjections can violate this when some
grouping elements refer to the same attribute. This occurs when
ReplaceFieldWithConstantOrNull replaces missing fields with the same
reference for a given data type.
Closes#128054Closes#129811
The test might produce over-budget tasks that cannot run even if all the
other tasks that were blocked (and hold up budget) while running
complete. Rather than prevent submitting such over-budget tasks, this
fix simply sets the merge task's queue available budget to
`Long.MAX_VALUE`, in order to ensure that all merge tasks run before the
test ends.
Fixes https://github.com/elastic/elasticsearch/issues/129148
ES|QL index patterns validation: Ensure that the patterns in the query are syntactically and semantically valid
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
Before a `KeywordFieldType` was created that didn't set the isSyntheticSource field, causing to use the wrong block loader that would synthesize the complete _source instead of just loading values from ignored source. This PR addresses this.
Some features are unavailable in serverless and are thus not worth the
investment to make fully project-aware. This new annotation can be used
to clearly mark blocks of code that are intentionally not made properly
project-aware, in case we need to revisit them in the future.
* [DOCS] Update ESQL metadata fields page
**esql-metadata-fields.md:**
- restructured from bullet list to table format for metadata fields
- added `_index_mode` and `_source` fields to available metadata
- improved field descriptions (more detailed)
- added "usage and limitations" section
- reorganized examples into subsections with headers
- added `_score` sorting example
- added tip box linking to search documentation
* 🚙Drive by updates to search functions ref page
moved tutorial link into tip box at top
added cross-reference to search overview documentation
minor text flow improvements and punctuation fixes
* Fix id typo
* Apply suggestions from review
Co-authored-by: Bogdan Pintea <sig11@mailbox.org>
When there are multiple FS with the same available disk space but different total sizes, it's unpredictable (and irrelevant) which one the checker uses.
Fixes#129823
With the introduction of entitlements (#120243) and exclusive file
access (#123087) it is no longer safe to watch a whole directory.
In a lot of deployments, the parent directory for SSL config files
will be the main config directory, which also contains exclusive files
such as SAML realm metadata or File realm users. Watching that
directory will cause entitlement warnings because it is not
permissible for core/ssl-config to read files that are exclusively
owned by the security module (or other modules)
* Declare LU JOIN GA in 9.1
* Align applies_to tags for sample, change_point
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
- loads the downloaded GeoIP databases from system index to ingest node file system for each project
- each project's databases are loaded to directory `tmp/geoip-databases/{nodeId}/{projectId}`
This PR makes RepositoriesService project aware so that the basic Put,
Get, Delete and Verify repository actions are now project scoped.
It intentionally leaves the following aspects out of scope for the
current changes: * Repository stats reporting * Repository clean-up,
analysis and integrity verification * Repository usages for searchable
snapshots and CCR
They will be worked on separately. One main reason for leaving them out
is that they are not needed by OBS which is currently blocked by
repository/snapshot changes. They may also have their own complexities,
e.g. stats reporting.
Resolves: ES-10478
Introduces a new `RemoveBlock` API that complements the existing `AddBlock` API by allowing users to remove index blocks using `DELETE /{index}/_block/{block}`.
Resolves#128966
---------
Co-authored-by: Niels Bauman <nielsbauman@gmail.com>
The comment in `TransportHandshaker` indicates (correctly) that we emit
a warning when talking to a chronologically-newer-yet-numerically-older
version, but the wording of the warning message is inverted and says
that the remote is chronologically-older-yet-numerically-newer. This
commit straightens out the message to match the situation it is
describing.
Relates #123397
To better support project restoration after deletion, this change moves project Settings from ProjectMetadata to the new custom in the ClusterState. It also introduces a new transport version for cluster state serialization. Reserved cluster state for project settings remains within ProjectMetadata.
Note: In mixed-version multiproject clusters, this may cause existing settings for projects to temporarily disappear until all nodes have been upgraded and restarted.