When parsing queries on the coordinating node, there is currently no way to share state between the different parsing methods (`fromXContent`). The only query that supports a parse context is the bool query, which uses the context to track the nested depth of queries, added with #66204. This nested depth tracking mechanism is not 100% accurate as it tracks bool queries only, while there are many more query types that can hold other queries and hence potentially cause a stack overflow when deeply nested.
This change removes the parsing context that's specific to bool query, introduced with #66204, in favour of generalizing the nested depth tracking to all query types.
The generic tracking is introduced by wrapping the parser and overriding the method that parses named objects through the xcontent registry. Another way would have been to require a context argument when parsing queries, which would mean adding a context argument to all the QueryBuilder#fromXContent static methods. That would be a breaking change for plugins that provide custom queries, hence I went for a different approach.
One aspect that this change requires and introduces is the distinction between parsing a top-level query (which wraps the parser, or would create the context if we had one) and parsing an inner query, which proceeds with the given parser and context. We already have this distinction in the form of two different static methods in `AbstractQueryBuilder`, but in practice only the bool query made use of it, being the only context-aware query.
In addition to generalizing nested depth tracking when parsing queries, we should be able to adopt this same strategy to track query usage as part of #90176.
Given that the depth check is now more restrictive, as it counts all compound queries and not only bool, we have decided to raise the default limit to `30` to ensure that users are not going to hit the limit due to this change.
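A hedged sketch of what the raised limit could look like in `elasticsearch.yml`, assuming the nested depth setting introduced with #66204 keeps its `indices.query.bool.max_nested_depth` name while now applying to all query types:

```yaml
# Assumed setting name (carried over from the bool-specific check in #66204).
# With this change the depth check covers every query type, default 30.
indices.query.bool.max_nested_depth: 30
```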
Adds to the docs a note that the `100mb` default for
`http.max_content_length` is the recommended maximum, along with
suggestions for what to do when hitting this limit.
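For illustration, the recommended maximum expressed as an `elasticsearch.yml` snippet (100mb is also the default, so this line is normally unnecessary):

```yaml
# Default and recommended maximum; raising it further is discouraged.
http.max_content_length: 100mb
```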
Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, add addition and subtraction operations for ByteSizeValue, as well as min.
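A hedged sketch of the new settings in `elasticsearch.yml`; the names below are assumed to mirror the existing frozen flood stage max headroom setting, and the values are purely illustrative:

```yaml
# Assumed setting names, following the frozen-tier max headroom pattern;
# values are illustrative, not defaults.
cluster.routing.allocation.disk.watermark.low.max_headroom: 200gb
cluster.routing.allocation.disk.watermark.high.max_headroom: 150gb
cluster.routing.allocation.disk.watermark.flood_stage.max_headroom: 100gb
```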
This commit adds support for floating point node.processors setting.
This is useful when the nodes run in an environment where the CPU
time assigned to the ES node process is limited (e.g. using cgroups).
With this change, the system is able to size the thread pools
accordingly; in this case it rounds up the provided setting
to the nearest integer.
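For example, a node whose cgroup grants it two and a half CPUs could now be configured as follows; thread pools would be sized by rounding up to 3:

```yaml
# Fractional values are now accepted; thread pool sizing rounds up
# to the nearest integer (here: 3).
node.processors: 2.5
```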
The docs for `transport.ping_schedule` note that the transport client
defaults to a 5s ping schedule, but this is no longer relevant. This
commit drops this from the docs, and also moves the docs for this
setting further down the page to reflect its relative unimportance.
Today we say that voting-only nodes require a "low-latency" network.
This term has a specific meaning in some operating environments which is
different from our intended meaning. To avoid this confusion this commit
removes the absolute term "low-latency" in favour of describing the
requirements relative to the user's own performance goals.
Clean up network setting docs
- Add types for all params
- Remove mention of JDKs before 11
- Clarify some wording
Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
This change ensures that existing read_only_allow_delete blocks that
are placed on indices when the flood_stage watermark threshold is
exceeded, are removed when the disk threshold monitoring is disabled.
This is done by changing how InternalClusterInfoService behaves when
disabled. With this change, it will keep calling the registered
listeners periodically, but with an empty ClusterInfo.
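For context, a sketch of disabling disk threshold monitoring, assuming the standard `cluster.routing.allocation.disk.threshold_enabled` setting; with this change, turning it off also releases any existing read_only_allow_delete blocks:

```yaml
# Dynamic cluster setting; disabling it now also causes flood-stage
# index blocks that are already in place to be removed.
cluster.routing.allocation.disk.threshold_enabled: false
```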
Closes #86383
Our current default for the http.max_header_size setting is 8kb. This
is lower than the current default for Kibana (16kb in 8.x), and the ESS
proxy (1mb based on the Go http library default). To align with the
current convention of other Elastic components, this PR increases the
ES header size setting default to 16kb.
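For illustration, the new default expressed as an `elasticsearch.yml` snippet:

```yaml
# New default (previously 8kb), aligned with Kibana's 16kb default.
http.max_header_size: 16kb
```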
Closes #88501
* Convert disk watermarks to RelativeByteSizeValues
Similar to the existing watermark setting for the frozen tier.
Prerequisite for PR 88639 that plans to introduce max headroom
settings for the disk watermarks, similar to the frozen tier max
headroom setting (see the example after this list).
* Add changelog
* Revert 20gb to 20GB
* Make formatNoTrailingZerosPercent non static
* ByteSizeValue.MINUS_ONE
* Remove getMinimumTotalSizeForBelowWatermark
* Remove comment
* Fix minor stuff
* Make parsing of RelativeByteSizeValue faster
Mimics older definitelyNotPercentage function
* Remove Locale from Strings.format
* More MINUS_ONE
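As referenced in the first item above, a sketch of the two forms a RelativeByteSizeValue watermark can take; the values are illustrative:

```yaml
# Either a ratio/percentage of total disk space...
cluster.routing.allocation.disk.watermark.low: 85%
# ...or an absolute amount of free space.
cluster.routing.allocation.disk.watermark.high: 50gb
```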
* Adding discovery troubleshooting link
* Add tags to pull in discovery troubleshooting content
* Move discovery troubleshooting to separate page and add redirects
Co-authored-by: Adam Locke <adam.locke@elastic.co>
In #85074 we added docs on discovery troubleshooting that really only
talked about troubleshooting master elections. There's also the case
where the master is elected fine but some other node can't join it. This
commit adds troubleshooting docs about that too.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Fixes a few scalability issues around join validation:
- compresses the cluster state sent over the wire
- shares the serialized cluster state across multiple nodes
- forks the decompression/deserialization work off the transport thread
Relates #77466
Closes #83204
Ensures that every page of the docs that mentions
`cluster.initial_master_nodes` also mentions that this setting must be
removed after bootstrapping completes.
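For example, a typical bootstrap configuration that must be deleted from `elasticsearch.yml` on every node once the cluster has formed (node names are illustrative):

```yaml
# Remove this setting after the cluster has bootstrapped for the first time.
cluster.initial_master_nodes: ["master-node-1", "master-node-2", "master-node-3"]
```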
Today it's no longer true that by default nodes will auto-discover other
nodes on the same host and bootstrap them all into a cluster. This
commit fixes the docs on auto-bootstrapping to recognise this.
Today we don't really say anything about the requirements for the data
path in terms of correctness, and we specifically say to avoid NFS for
performance reasons. This isn't wholly accurate: some NFS
implementations work just fine. This commit documents a more balanced
position on local vs remote storage.
This moves the bulk of the upgrade information into the consolidated upgrade guide, but leaves the primary upgrade topic in place as a cross reference.
Relates to: https://github.com/elastic/stack-docs/pull/1970
Co-authored-by: gchaps <33642766+gchaps@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
(cherry picked from commit f6473d71f9)
Co-authored-by: debadair <debadair@elastic.co>
This commit updates the Operator-only functionality doc to
mention the operator only settings introduced in #82819.
It also adds an integration test for those operator only
settings that would have caught #83359.
As of 8.0, the compatibility window for cross-cluster search (CCS) to an earlier release will be one minor release. This updates the CCS docs and adds a related 8.0 breaking change.
Closes https://github.com/elastic/elasticsearch/issues/80782
* Adds a prerequisites section covering remote cluster config, node roles, and security.
* Moves existing content about remote cluster config to the prereqs.
* Updates the remote cluster docs to include information about eligible gateway nodes and tagging for gateway nodes.
Closes https://github.com/elastic/elasticsearch/issues/72001
Updates the remote clusters version compatibility table to include 7.17 and 8.x versions.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the same-shard allocation decider falls back to checking the
hostname if the node has no host address. In practice nodes will always
have an address so the fallback is dead code. This commit removes that
dead code.
Relates #80702 which will add the ability to distinguish nodes by
hostname regardless of whether they have an address or not, and #80767
which optimizes this area of code - this refactoring should make the
optimization simpler.
Today we increase the verbosity of discovery failures after 5 minutes
without a master. Unfortunately 5 minutes is a common orchestration
timeout, so if discovery is broken then we see nodes being shut down
just before they start to emit useful logs. This commit reduces the
default timeout to 3 minutes to address that.
We have a few leftover mentions of `zen` discovery, mostly for
historical/BwC reasons, which this commit removes.
Prior to this commit the default value for `discovery.type` was `zen`
but this was not written down anywhere or officially supported: the two
options were to set it to `single-node` or to omit it entirely. This
commit changes the default to `multi-node` and documents this.
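For illustration, the now-documented default and the single-node alternative in `elasticsearch.yml`:

```yaml
# Now the documented default: discover other nodes and form a multi-node cluster.
discovery.type: multi-node
# Alternative for single-node deployments:
# discovery.type: single-node
```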
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Today we have a short note in one place in the docs saying not to touch
the contents of the data path. This commit expands the warning to
describe more precisely what is forbidden, and to give some more detail
of the consequences, and also duplicates the warning to the other
location that documents the `path.data` setting.
Deprecate the script context cache in favor of the general cache.
Users should use the following settings:
`script.max_compilations_rate` to set the max compilation rate
for user scripts such as filter scripts. Certain script contexts
that submit scripts outside of the control of the user are
exempted from this rate limit. Examples include runtime fields,
ingest and watcher.
`script.cache.max_size` to set the max size of the cache.
`script.cache.expire` to set the expiration time for entries in
the cache.
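A sketch of a general cache configuration using the settings above; the first two values match the new defaults described below, while the expiration is illustrative:

```yaml
# General (non context-specific) script cache configuration.
script.max_compilations_rate: 150/5m   # new default rate limit
script.cache.max_size: 3000            # new default cache size
script.cache.expire: 10m               # illustrative expiration
```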
What's deprecated?
`script.max_compilations_rate: use-context`. This special
setting value was used to turn on the script context-specific caches.
`script.context.$CONTEXT.cache_max_size`, use `script.cache.max_size`
instead.
`script.context.$CONTEXT.cache_expire`, use `script.cache.expire`
instead.
`script.context.$CONTEXT.max_compilations_rate`, use
`script.max_compilations_rate` instead.
The default cache size was increased from `100` to `3000`, which
was approximately the max cache size when using context-specific caches.
The default compilation rate limit was increased from `75/5m` to
`150/5m` to account for increasing uses of scripts.
System script contexts can now opt-out of compilation rate limiting
using a flag rather than a sentinel rate limit value.
7.16: Script: Deprecate script context cache #79508
Refs: #62899
7.16: Script: Opt-out system contexts from script compilation rate limit #79459
Refs: #62899
Today we limit the max number of concurrent snapshot file restores
per recovery. This works well when the default
node_concurrent_recoveries is used (which is 2). When this limit is
increased, it is possible to exhaust the underlying repository
connection pool, affecting other workloads.
This commit adds a new setting
`indices.recovery.max_concurrent_snapshot_file_downloads_per_node` that
allows limiting the max number of snapshot file downloads per node
during recoveries. When a recovery starts on the target node it tries
to acquire a permit that, when granted, allows it to download snapshot
files. This is communicated to the source node in the
StartRecoveryRequest. This is a rather conservative approach, since it is
possible that a recovery that gets a permit to use snapshot files
doesn't end up recovering any snapshot file, while a concurrent recovery
that doesn't get a permit could have taken advantage of recovering from a
snapshot.
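A hedged example of capping snapshot-based recovery downloads at the node level in `elasticsearch.yml`; the value is illustrative:

```yaml
# Node-wide cap on concurrent snapshot file downloads across all recoveries.
indices.recovery.max_concurrent_snapshot_file_downloads_per_node: 25
```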
Closes #79044
Changes can-match from a shard-level to a node-level action, which helps avoid an explosion of shard-level can-match
subrequests in clusters with many shards, which can cause stability issues. Also introduces a new search_coordination
thread pool to handle the sending and handling of node-level can-match requests.
This PR changes uses of transient cluster settings to
persistent cluster settings.
The PR also deprecates the transient settings usage.
Relates to #49540
* Fix a typo: a space between 'E' and 'cluster...'
* Update example, fix headings, change notes
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Co-authored-by: Marwane Chahoud <marwane.chahoud@gmail.com>
* [DOCS] Fix default value for closed indices
#57953 introduced changes that added ESS icons to many Elasticsearch settings. As part of those changes, the default value for `cluster.indices.close.enable` was indicated as `false`, when it should be `true`. This PR updates the default value to `true`.
Closes #78877
* Update description
* Update note to remove outdated claims
The documentation indicates that `stack.templates.enabled` can be used in Elasticsearch Service, but it is not part of the settings allowlist in ESS. This PR makes the documentation match the state of the allowlist.
* Improve docs for pre-release version compatibility
Follow-up to #78317 clarifying a couple of points:
- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local or remote cluster
is a released build
* Remote cluster build date nit
The reference manual includes docs on version compatibility in various
places, but it's not clear that these docs only apply to released
versions and that the rules for pre-release versions are stricter than
folks expect. This commit adds some words to the docs for unreleased
versions which explains this subtlety.
* [DOCS] Update remote cluster docs
* Add files, rename files, write new stuff
* Plethora of changes
* Add test and update snippets
* Redirects, moved files, and test updates
* Moved file to x-pack for tests
* Remove older CCS page and add redirects
* Cleanup, link updates, and some rewrites
* Update image
* Incorporating user feedback and rewriting much of the remote clusters page
* More changes from review feedback
* Numerous updates, including request examples for CCS and Kibana
* More changes from review feedback
* Minor clarifications on security for remote clusters
* Incorporate review feedback
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Some review feedback and some editorial changes
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
We currently use the plaintext body of a shard request as the key to the
request cache. This has the disadvantage that very large requests can
quickly fill up the cache due to the size of their keys. With this commit,
we instead use a sha-256 hash of the shard request as the cache key,
which will use a constant (and much smaller) number of bytes.
Today we expire the client stats for HTTP channels 5 minutes after they
close. It's possible to open a very large number of HTTP channels in 5
minutes, possibly inadvertently, and the stats for those channels can be
overwhelming.
This commit introduces a limit on the number of channels tracked by each
node which applies in addition to the age limit, and makes these limits
configurable via static settings. It drops the pruning of old stats when
starting to track a new channel and instead uses a queue to expire the
oldest stats when each channel closes if necessary to respect the count
limit; it only performs age-based expiry when retrieving the stats,
since the count limit now bounds the memory needed. Finally, it
tightens up some missing synchronization and makes sure that we expose
only immutable objects to the stats subsystem.
This is related to #73497. Currently, we only use the configured
transport.compression_scheme setting when compressing a request or a
response. Additionally, the cluster.remote.*.compression_scheme
setting is ignored. This commit fixes this behavior by respecting the
per-cluster setting. Additionally, it resolves confusion around inbound
and outbound connections by always responding with the same scheme that
was received. This allows remote connections to have different schemes
than local connections.
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary.
Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard.
Relates #73496
In 7.15, we intend for the indexing_data compression level and the
compression scheme lz4 to no longer be experimental. This commit
updates the documentation to reflect this. Additionally, it adds
missing docs for the cluster.remote.*.transport.compression_scheme
setting.
Relates to #73497.
The special values `_global_`, `_site_`, `0.0.0.0` and so on may resolve
to multiple addresses, of which one is chosen to be the publish address.
This commit generalises the warning about reachability as applied to
DNS-resolved hostnames to also apply to these special values.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This commit adds a new set of classes that would compute a peer
recovery plan, based on source files + target files + available
snapshots. When possible it would try to maximize the number of
files used from a snapshot. It uses repositories with the `use_for_peer_recovery`
setting set to true.
It adds a new recovery setting `indices.recovery.use_snapshots`
Relates #73496
In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the limit has been bumped from 1024 to 4096.
The semantics will effectively change when we upgrade to Lucene 9, this PR
is only about agreeing on a migration strategy and documenting this change.
To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the docs for remote cluster connections use `ping_schedule` fairly
liberally, and don't mention that you should prefer TCP keepalives
wherever possible. This commit reduces the use of this setting in the
examples and adjusts the description of the setting to include a note
about TCP keepalives instead.
This commit is related to #73497. It adds two new settings. The first setting
is transport.compression_scheme. This setting allows the user to
configure LZ4 or DEFLATE as the transport compression. Additionally, it
modifies transport.compress to support the value indexing_data. When
this setting is set to indexing_data only messages which are primarily
composed of raw source data will be compressed. These are bulk,
operations-based recovery, and shard changes messages.
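A sketch of the two settings in `elasticsearch.yml`, using the names and values described above:

```yaml
# Compress only messages composed primarily of raw source data
# (bulk, operations-based recovery, and shard changes messages).
transport.compress: indexing_data
# Compression algorithm for compressed transport messages; DEFLATE is the other option.
transport.compression_scheme: lz4
```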
Today if sending file chunks is CPU-bound (e.g. when using compression)
then we tend to concentrate all that work onto relatively few threads,
even if `indices.recovery.max_concurrent_file_chunks` is increased. With
this commit we fork the transmission of each chunk onto its own thread
so that the CPU-bound work can happen in parallel.
In #55805, we added a setting to allow single data node clusters to
respect the high watermark. In #73733 we added the related deprecations.
This commit ensures the only valid value for the setting is true and
adds deprecations if the setting is set. The setting will be removed
in a future release.
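A hedged sketch, assuming the setting introduced in #55805 is named `cluster.routing.allocation.disk.watermark.enable_for_single_data_node`; after this change `true` is the only accepted value and setting it at all is deprecated:

```yaml
# Assumed setting name (from #55805); true is now the only valid value,
# and specifying the setting emits a deprecation warning.
cluster.routing.allocation.disk.watermark.enable_for_single_data_node: true
```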
Co-authored-by: David Turner <david.turner@elastic.co>
* Add new thread pool for critical operations
* Split critical thread pool into read and write
* Add POJO to hold thread pool names
* Add tests for critical thread pools
* Add thread pools to data streams
* Update settings for security plugin
* Retrieve ExecutorSelector from SystemIndices where possible
* Use a singleton ExecutorSelector
Adds a new snapshot meta pool that is used to speed up the get snapshots API
by making `SnapshotInfo` load in parallel. Also uses this pool to load
`RepositoryData`.
A follow-up to this would expand the use of this pool to the snapshot status
API and make it run in parallel as well.
If a node is partitioned away from the rest of the cluster then the
`ClusterFormationFailureHelper` periodically reports that it cannot
discover the expected collection of nodes, but does not indicate why. To
prove it's a connectivity problem, users must today restart the node
with `DEBUG` logging on `org.elasticsearch.discovery.PeerFinder` to see
further details.
With this commit we log messages at `WARN` level if the node remains
disconnected for longer than a configurable timeout, which defaults to 5
minutes.
Relates #72968
Dedicated frozen nodes can survive less headroom than other data nodes.
This commit introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.
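A hedged sketch of the frozen-tier thresholds, assuming setting names that parallel the existing flood stage watermark; values are illustrative:

```yaml
# Assumed setting names paralleling the existing flood stage watermark.
cluster.routing.allocation.disk.watermark.flood_stage.frozen: 95%
cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom: 20GB
```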
Relates #71844
Frozen indices (partial searchable snapshots) require less heap per
shard and the limit can therefore be raised for those. We pick 3000
frozen shards per frozen data node, since we think 2000 is reasonable
to use in production.
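A hedged sketch, assuming a frozen-specific variant of the existing `cluster.max_shards_per_node` setting:

```yaml
# Assumed setting name; caps shards of frozen indices per dedicated frozen node.
cluster.max_shards_per_node.frozen: 3000
```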
Relates #71042 and #34021
* Removing security overview and condensing.
* Adding new security file.
* Minor changes.
* Removing link to pass build.
* Adding minimal security page.
* Adding minimal security page.
* Changes to intro.
* Add basic and basic + http configurations.
* Lots of changes, removed files, and redirects.
* Moving some AD and LDAP sections, plus more redirects.
* Redirects for SAML.
* Updating snippet languages and redirects.
* Adding another SAML redirect.
* Hopefully fixing the ci/2 error.
* Fixing another broken link for SAML.
* Adding what's next sections and some cleanup.
* Removes both security tutorials from the TOC.
* Adding redirect for removed tutorial.
* Add graphic for Elastic Security layers.
* Incorporating reviewer feedback.
* Update x-pack/docs/en/security/securing-communications/security-basic-setup.asciidoc
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* Update x-pack/docs/en/security/securing-communications/security-minimal-setup.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Update x-pack/docs/en/security/securing-communications/security-basic-setup.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Update x-pack/docs/en/security/index.asciidoc
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* Update x-pack/docs/en/security/securing-communications/security-basic-setup-https.asciidoc
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* Apply suggestions from code review
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Additional changes from review feedback.
* Incorporating reviewer feedback.
* Incorporating more reviewer feedback.
* Clarify that TLS is for authenticating nodes
Co-authored-by: Tim Vernum <tim@adjective.org>
* Clarify security between nodes
Co-authored-by: Tim Vernum <tim@adjective.org>
* Clarify that TLS is between nodes
Co-authored-by: Tim Vernum <tim@adjective.org>
* Update title for configuring Kibana with a password
Co-authored-by: Tim Vernum <tim@adjective.org>
* Move section for enabling passwords between Kibana and ES to minimal security.
* Add section for transport description, plus incorporate more reviewer feedback.
* Moving operator privileges lower in the navigation.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
We document that master nodes should have a persistent data path but
it's a bit hard to understand that this is what the docs are saying and
we don't really say why it's important. This commit clarifies this
paragraph.
Relates 49d0f3406c
Today the docs on node roles say that you shouldn't use dedicated
masters for heavy requests such as indexing and searching, but as per
the "designing for resilience" docs this guidance applies to all client
requests. This commit generalises the node roles docs slightly to
clarify this.
Relates #70435
This commit addresses two aspects of the description in the docs of
configuring a local node to be a remote cluster client. First, the
documentation was referring to the legacy setting for configuring a
remote cluster client. Secondly, we clarify that additional features,
not only cross-cluster search, have requirements around the usage of the
remote_cluster_client role.
Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today if an index is set to `auto_expand_replicas: N-all` then we will
try and create a shard copy on every node that matches the applicable
allocation filters. This conflicts with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some unassigned
shards.
The point of these two deciders is to improve resilience given a limited
number of shard copies but there is no need for this behaviour when the
number of shard copies is not limited, so this commit suppresses them in
that case.
Closes #54151
Closes #2869