With the introduction of BKD-based geo shape indexing in #32039, the prefix tree indexing method has
been deprecated. From 8.0.0, creating new mappings using the deprecated parameters will no longer be allowed.
* [DOCS] Document how to migrate to node roles from node attrs. Closes #65855
* [DOCS] Incorporated review comments
* Update docs/reference/data-management/migrate-index-allocation-filters.asciidoc
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.
I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.
Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
We rely on the repository implementation correctly handling the case where a
write is aborted before it completes. This is not guaranteed for third-party
repositories.
This commit adds a rare action during analysis which aborts the write
just before it completes and verifies that the target blob is not found
by any node.
Add support for the stats and top_metrics aggregations in transform. This change makes it
easier to add more multi-value aggregations to transform.
Limitations:
- only the 1st element of top_metrics gets consumed by transform[*].
- all values of stats will be mapped to double if mapping deduction is used, including count,
sum, min, max
fixes #52236
relates #51925
When doing a rolling restart we recommend disabling shard allocation to
avoid unnecessary recoveries. However, this advice is unnecessary or
even harmful when restarting nodes that do not carry any data, such as a
pure ML node.
Today the only example of calling the cluster allocation explain API above the
fold is the bare `GET /_cluster/allocation/explain` which kind of works but is
not usually what the user wants. This commit changes the docs so that we open
with an example showing how we usually expect it to be called. This will make
it clearer that you should normally specify exactly for which shard you want an
explanation. It also tidies up a few other wrinkles in these docs.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* Updates a cat test snippet to always return by index name in asc order
* Removes several leading slashes
* Reduces length of several snippet delimiters
Closes https://github.com/elastic/elasticsearch/issues/71683
This adds a new `match_only_text` field, which indexes the same data as a `text`
field that has `index_options: docs` and `norms: false` and uses the `_source`
for positional queries like `match_phrase`. Unlike `text`, this field doesn't
support scoring.
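A minimal sketch of a mapping that uses the new field type (index and field names here are illustrative):
```json
PUT my-logs
{
  "mappings": {
    "properties": {
      "message": { "type": "match_only_text" }
    }
  }
}
```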
This commit adds support for system data streams and also the first use
of a system data stream with the fleet action results data stream. A
system data stream is one that is used to store system data that users
should not interact with directly. Elasticsearch will manage these data
streams. REST API access is available for external system data streams
so that other stack components can store system data within a system
data stream. System data streams will not use the system index read and
write threadpools.
* [DOCS] Focus retrieving selected fields on fields parameter
* Incorporating changes from reviews
* Adding clarifications from review feedback
* Slight wording revisions.
* Clarify language around format parameter and move text out of callout.
Currently the fleet global checkpoints API returns immediately if
the index or its shards are not ready. This commit modifies the
API to wait for the index and primary shards to become active, up until the timeout
period.
Related to #71449.
This commit revives the documentation of the "Clear Cache" and
"Shard Stats" APIs of Searchable Snapshots that was removed
in #62217. This is a partial revert of the commit b545c55 with
some light wording changes.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Dedicated frozen nodes can survive less headroom than other data nodes.
This commit introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.
Relates #71844
Removes the experimental status for the frozen tier / shared_cache searchable snapshots for the 7.13 release.
Also adapts the docs to note that URL repositories are now supported for searchable snapshots in 7.13.
Changes:
* Refactors the "Getting Started" content down to one page.
* Refactors the README to reduce duplicated content and better mirror
Kibana's.
* Focuses the quick start on time series data, including data streams
and runtime fields.
* Streamlines self-managed install instructions to Docker.
Co-authored-by: debadair <debadair@elastic.co>
In update by query requests where max_docs < size and conflicts=proceed
we weren't using the remaining documents from the scroll response in
cases where there were conflicts and in the first bulk request the
successful updates < max_docs. This commit addresses that problem and
uses the remaining documents from the scroll response instead of
requesting a new page.
Closes #63671
The frozen tier partially downloads shards only. This commit
introduces an autoscaling decider that scales the total storage
on the tier according to a configurable percentage relative to
the total data set size.
This commit adds the ability to define an index-time geo_point field
with a script parameter, allowing you to calculate points from other
values within the indexed document.
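A hedged sketch of such a mapping, assuming the documents carry numeric `lat`/`lon` fields (names are illustrative):
```json
PUT my-index
{
  "mappings": {
    "properties": {
      "lat": { "type": "double" },
      "lon": { "type": "double" },
      "location": {
        "type": "geo_point",
        "script": "emit(doc['lat'].value, doc['lon'].value)"
      }
    }
  }
}
```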
As we started thinking about applying on_script_error to runtime fields, to handle script errors at search time, we would like to use the same parameter that was recently introduced for indexed fields. We decided that `continue` or `fail` gives a better indication of the behaviour compared to the current `ignore` or `reject`, which are too specific to indexing documents.
This commit applies such rename.
This change exposes the newly introduced parameter `dynamic_templates`
in ingest. This parameter can be set by a set processor or a script processor.
Relates #69948
This commit adds some per-index statistics to the `SnapshotInfo` blob:
- number of shards
- total size in bytes
- maximum number of segments per shard
It also exposes these statistics in the get snapshot API.
Allow direct access to a dense_vector's values in scripts
through the following functions:
- getVectorValue – returns a vector's value as an array of floats
- getMagnitude – returns a vector's magnitude
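A hedged sketch of using both accessors inside a `script_score` query (the field name `my_vector` is illustrative):
```json
GET my-index/_search
{
  "query": {
    "script_score": {
      "query": { "match_all": {} },
      "script": {
        "source": "float[] v = doc['my_vector'].vectorValue; return v.length + doc['my_vector'].magnitude;"
      }
    }
  }
}
```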
Closes #51964
Frozen indices (partial searchable snapshots) require less heap per
shard and the limit can therefore be raised for those. We pick 3000
frozen shards per frozen data node, since we think 2000 is reasonable
to use in production.
Relates #71042 and #34021
This PR adds documentation for the GeoIPv2 auto-update feature.
It also renames the related settings from geoip.downloader.* to ingest.geoip.downloader.* to follow the same convention as existing settings.
Relates to #68920
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Currently the `fields` API fetches the root flattened field and returns it in a
structured way in the response. In addition this change makes it possible to
directly query subfields. However, requesting flattened subfields via wildcard
patterns is not possible.
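For example, a hedged sketch of retrieving a single key of a `flattened` field named `labels` (field names are illustrative):
```json
POST my-index/_search
{
  "query": { "match_all": {} },
  "fields": [ "labels.release" ],
  "_source": false
}
```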
Closes #70605
This PR introduces a new query called `combined_fields` for searching multiple
text fields. It takes a term-centric view, first analyzing the query string
into individual terms, then searching for each term in any of the fields as though
they were one combined field. It is based on Lucene's `CombinedFieldQuery`,
which takes a principled approach to scoring based on the BM25F formula.
This query provides an alternative to the `cross_fields` `multi_match` mode. It
has simpler behavior and a more robust approach to scoring.
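A minimal sketch of the new query (index and field names are illustrative):
```json
GET my-index/_search
{
  "query": {
    "combined_fields": {
      "query": "distributed consensus",
      "fields": [ "title", "abstract", "body" ]
    }
  }
}
```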
Addresses #41106.
Fleet server needs an API to access up to date global checkpoints for
indices. Additionally, it requires a mode of operation where fleet can
provide its current knowledge about the global checkpoints and poll for
advancements. This commit introduces this API in the fleet plugin.
This commit allows you to set 'script' and 'on_script_error' parameters
on date field mappers, meaning that runtime date fields can be made indexed
simply by moving their definitions from the runtime section of the mappings
to the properties section.
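A hedged sketch of such a mapping, assuming a `timestamp_millis` long field exists to derive the date from (names are illustrative):
```json
PUT my-index
{
  "mappings": {
    "properties": {
      "timestamp_millis": { "type": "long" },
      "timestamp": {
        "type": "date",
        "script": "emit(doc['timestamp_millis'].value)",
        "on_script_error": "continue"
      }
    }
  }
}
```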
We accept dates with a decimal point like `2113413.13241324` and parse
them *somehow*. But there are cases where we'll lose precision on those
dates, see #70085. This advises folks not to use that format. We'll
continue to accept those dates for backwards compatibility but you
should avoid using them.
Co-authored-by: Adrien Grand <jpountz@gmail.com>
* Warn users if security is implicitly disabled
Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it can also lead to unintended insecure
clusters.
This change introduces clear warnings when security features are
implicitly disabled.
- a warning header in each REST response if security is implicitly
disabled;
- a log message during cluster boot.
Runtime fields are much more flexible than script_fields because you
can filter and aggregate on them so we hope folks use them! This
converts the example of using a `parent_join` field in a script to a
runtime field so folks get used to seeing them and hopefully using them.
While I was editing this I took the opportunity to replace the script
with a real-ish example. Scripts that just load the field value are nice
and short but I hope no one uses them in real life because they just add
overhead when compared to accessing the field directly. So I made the
script do something.
Relates to #69291
This commit allows you to set 'script' and 'on_script_error' parameters
on IP field mappers, meaning that runtime IP fields can be made indexed
simply by moving their definitions from the runtime section of the mappings
to the properties section.
If enabled, the `delete_searchable_snapshot` option will attempt to delete the
index snapshot generated in any previous phase, for the purpose of mounting the
index as a searchable snapshot.
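For example, a hedged sketch of a delete phase that also removes the snapshot backing a previously mounted searchable snapshot (policy name and timing are illustrative):
```json
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": { "delete_searchable_snapshot": true }
        }
      }
    }
  }
}
```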
This shrinks a runtime field definition so that it fits on the screen
without scrolling. It also converts the doc into a test so we can be
sure it continues to work.
Relates to #69291
Runtime fields are much more flexible than script_fields because you
can filter and aggregate on them so we hope folks use them! This
converts the example of using a `date_nanos` field in a script to a
runtime field so folks get used to seeing them and hopefully using them.
While I was editing this I took the opportunity to replace the script
with a real-ish example. Scripts that just load the field value are nice
and short but I hope no one uses them in real life because they just add
overhead when compared to accessing the field directly. So I made the
script do something.
Relates to #69291
Co-authored-by: Adam Locke <adam.locke@elastic.co>
We have recently introduced the ability to associate an indexed field with a script. This commit updates the existing mappings stats to output stats about the script, similar to what we already do for runtime fields.
Today the response to `GET _cluster/state` does not include the roles of
the nodes in the cluster. In the past this made sense, roles were
relatively unchanging things that could be determined from elsewhere.
These days we have an increasingly rich collection of roles, with
nontrivial BWC implications, so it is important for debugging to be able
to see the specific roles as viewed by the master. This commit adds the
role names to the cluster state API output.
Relates #71385
Runtime fields are much more flexible than `script_fields` because you
can filter and aggregate on them so we hope folks use them! This
converts the example of using a `boolean` field in a script to a runtime
field so folks get used to seeing them and hopefully using them.
While I was editing this I took the opportunity to replace the script
with a real-ish example. Scripts that just load the field value are nice
and short but I hope no one uses them in real life because they just add
overhead when compared to accessing the field directly. So I made the
script do *something*.
Relates to #69291
This commit adds a deprecation note to the multiple data paths doc. It also removes mention of multiple paths support in the setup settings table.
relates #71205
This adds a "note" on the docs for the script query pointing folks to
runtime fields because they are more flexible. It also translates the
request example into runtime fields.
Relates to #69291
Co-authored-by: Adam Locke <adam.locke@elastic.co>
This replaces the `script` docs for bucket aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because they don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)
Relates to #69291
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
* Add warning admonition for removing runtime fields.
* Add cross-link to runtime fields.
* Expanding examples for runtime fields in a search request.
* Clarifying language and simplifying response tests.
We previously allowed but deprecated the ability for the shared cache to
be positively sized on nodes without the frozen role. This is because we
only allocate shared_cache searchable snapshots to nodes with the frozen
role. This commit completes our intention to deprecate/remove this
ability.
This PR sets the default value of `action.destructive_requires_name`
to `true`. Fixes #61074. Additionally, we set this value explicitly in
test classes that rely on wildcard deletions to clear test state.
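Since the setting is a dynamic cluster setting, clusters that still rely on wildcard deletions can opt back in; a hedged sketch:
```json
PUT _cluster/settings
{
  "persistent": {
    "action.destructive_requires_name": false
  }
}
```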
This commit adds a script parameter to long and double fields that makes
it possible to calculate a value for these fields at index time. It uses the same
script context as the equivalent runtime fields, and allows for multiple index-time
scripted fields to cross-refer while still checking for indirection loops.
With shared cache searchable snapshots we have shards that have a size
in S3 that differs from the locally occupied disk space. This commit
introduces `store.total_data_set_size` to node and indices stats, making it
possible to distinguish between the two.
Relates #69820
This commit allows for composite aggregations in datafeeds.
Composite aggs provide a much better solution for having influencers, partitions, etc. on high volume data. Instead of worrying about long scrolls in the datafeed, the calculation is distributed across cluster via the aggregations.
The restrictions for this support are as follows:
- The composite aggregation must have EXACTLY one `date_histogram` source
- The sub-aggs of the composite aggregation must have a `max` aggregation on the SAME timefield as the aforementioned `date_histogram` source
- The composite agg must be the ONLY top level agg and it cannot have a `composite` or `date_histogram` sub-agg
- If using a `date_histogram` to bucket time, it cannot have a `composite` sub-agg.
- The top-level `composite` agg cannot have a sibling pipeline agg. Pipeline aggregations are supported as a sub-agg (thus a pipeline agg INSIDE the bucket).
Some key user interaction differences:
- Speed + resources used by the cluster should be controlled by the `size` parameter in the `composite` aggregation. Previously, we said if you are using aggs, use a specific `chunking_config`. But, with composite, that is not necessary.
- Users really shouldn't use nested `terms` aggs any longer. While this is still a "valid" configuration and MAY be desirable for some users (only wanting the top 10 of certain terms), typically when users want influencers, partition fields, etc. they want the ENTIRE population. Previously, this really wasn't possible with aggs; with `composite` it is.
- I cannot really think of a typical use case that SHOULD ever use a multi-bucket aggregation that is NOT supported by composite.
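A hedged sketch of a datafeed configuration that satisfies the restrictions above (index, job, and field names are illustrative, and the anomaly detection job is assumed to exist already):
```json
PUT _ml/datafeeds/datafeed-my-job
{
  "job_id": "my-job",
  "indices": [ "my-metrics" ],
  "aggregations": {
    "buckets": {
      "composite": {
        "size": 1000,
        "sources": [
          { "time_bucket": { "date_histogram": { "field": "@timestamp", "fixed_interval": "30m" } } },
          { "host": { "terms": { "field": "host.name" } } }
        ]
      },
      "aggregations": {
        "@timestamp": { "max": { "field": "@timestamp" } },
        "max_cpu": { "max": { "field": "system.cpu.total.pct" } }
      }
    }
  }
}
```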
This change exposes, for each field in the _field_caps response, whether the field is a metadata field.
This is needed for consumers of this API that want to filter these fields. Currently ML keeps a static list
and QL checks whether the family type starts with `_`. In order to ease the addition of new metadata fields, this
change reworks that strategy, and both now only check for the new flag.
Note that the new flag is also applied at the coordinator level in a best-effort to apply the logic on older nodes
in a mixed-version cluster.
* Make wildcard field use constant scoring queries for wildcard queries. Add a note about ignoring rewrite parameters on wildcard queries.
Also fixes a caching issue where case-sensitive and case-insensitive results were cached as the same entry.
Closes #69604
Previously, a datafeed and job had to already exist for the `_preview` API to work.
With this change, users can get an accurate preview of the data that will be sent to the anomaly detection job
without creating either of them.
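A hedged sketch of previewing against in-body configurations (the `datafeed_config`/`job_config` body fields and all names here are assumptions for illustration):
```json
POST _ml/datafeeds/_preview
{
  "datafeed_config": {
    "indices": [ "my-metrics" ]
  },
  "job_config": {
    "analysis_config": {
      "bucket_span": "15m",
      "detectors": [ { "function": "mean", "field_name": "cpu" } ]
    },
    "data_description": { "time_field": "@timestamp" }
  }
}
```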
closes https://github.com/elastic/elasticsearch/issues/70264
* [DOCS] Clarify supported features for CCS.
* Clarify text and add subsection with title.
* Moving APIs to supported API section and paring down text.
* Removing security overview and condensing.
* Adding new security file.
* Minor changes.
* Removing link to pass build.
* Adding minimal security page.
* Adding minimal security page.
* Changes to intro.
* Add basic and basic + http configurations.
* Lots of changes, removed files, and redirects.
* Moving some AD and LDAP sections, plus more redirects.
* Redirects for SAML.
* Updating snippet languages and redirects.
* Adding another SAML redirect.
* Hopefully fixing the ci/2 error.
* Fixing another broken link for SAML.
* Adding what's next sections and some cleanup.
* Removes both security tutorials from the TOC.
* Adding redirect for removed tutorial.
* Add graphic for Elastic Security layers.
* Incorporating reviewer feedback.
* Update x-pack/docs/en/security/securing-communications/security-basic-setup.asciidoc
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* Update x-pack/docs/en/security/securing-communications/security-minimal-setup.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Update x-pack/docs/en/security/securing-communications/security-basic-setup.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Update x-pack/docs/en/security/index.asciidoc
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* Update x-pack/docs/en/security/securing-communications/security-basic-setup-https.asciidoc
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* Apply suggestions from code review
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Additional changes from review feedback.
* Incorporating reviewer feedback.
* Incorporating more reviewer feedback.
* Clarify that TLS is for authenticating nodes
Co-authored-by: Tim Vernum <tim@adjective.org>
* Clarify security between nodes
Co-authored-by: Tim Vernum <tim@adjective.org>
* Clarify that TLS is between nodes
Co-authored-by: Tim Vernum <tim@adjective.org>
* Update title for configuring Kibana with a password
Co-authored-by: Tim Vernum <tim@adjective.org>
* Move section for enabling passwords between Kibana and ES to minimal security.
* Add section for transport description, plus incorporate more reviewer feedback.
* Moving operator privileges lower in the navigation.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
This adds named `teardown` support for doc tests similar to its support
for named `setup` section. This is useful when many doc files want to
share a similar `setup` AND `teardown`. I've introduced an example of
this in the CCR docs just to prove its works. We expect we'll use it for
datastreams as well.
Closes #70830
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This adds a heading for `shard_min_doc_count` and merges the paragraphs
for it. I wanted to link to this section earlier today but couldn't because it wasn't a
"real" section.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This adds additional documentation for shared_cache searchable snapshots that are targeting the frozen tier:
- it generalizes the introduction section on searchable snapshots, mentioning that they come in two flavors now
as well as the relation to cold and frozen tiers,
- it expands the shared_cache section and
- it adds Cloud-specific instructions for getting started with the frozen tier
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Co-authored-by: debadair <debadair@elastic.co>
Co-authored-by: David Turner <david.turner@elastic.co>
* Implement dedicated client version compatibility
Add further dedicated client (xDBC, CLI) compatibility rules and
document these. A client is version-compatible with the server if:
- it supports version compatibility (on or past 7.7.0); and
- it's not on a version newer than the server's; and
- its major version is at most one unit behind the server's.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* Initial changes for scripting.
* Shorten script examples.
* Expanding types docs.
* Updating types.
* Fixing broken cross-link.
* Fixing map error.
* Incorporating review feedback.
* Fixing broken table.
* Adding more info about reference types.
* Fixing broken path.
* Adding more info and examples for the def type.
* Adding more info on operators.
* Incorporating review feedback.
* Adding notconsole for example.
* Removing comments in example.
* More review feedback.
* Editorial changes.
* Incorporating more reviewer feedback.
* Rewrites based on review feedback.
* Adding new sections for storing scripts and shortening scripts.
* Adding redirect for stored scripts.
* Adding DELETE for stored script plus link.
* Adding section for updating docs with scripts.
* Incorporating final feedback from reviews.
* Tightening up a few areas.
* Minor change around other languages.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This aims at making the shrink action retryable. Every step is
retryable, but in order to provide an experience where ILM tries
to achieve a successful shrink even when the target node goes
missing permanently or the shrunk index cannot recover, this also
introduces a retryable shrink cycle within the shrink action.
The shrink action will generate a unique index name that'll be the
shrunk index name. The generated index name is stored in the lifecycle
state.
If the shrink action ends up waiting for the source shards to
colocate or for the shrunk index to recover for more than the configured
`LIFECYCLE_STEP_WAIT_TIME_THRESHOLD` setting, it will move back
to clean up the attempted (and failed) shrunk index and will retry
generating a new index name and attempting to shrink the source
to the newly generated index name.
We document that master nodes should have a persistent data path but
it's a bit hard to understand that this is what the docs are saying and
we don't really say why it's important. This commit clarifies this
paragraph.
Relates 49d0f3406c
Today the docs on node roles say that you shouldn't use dedicated
masters for heavy requests such as indexing and searching, but as per
the "designing for resilience" docs this guidance applies to all client
requests. This commit generalises the node roles docs slightly to
clarify this.
Relates #70435
This commit allows documents seen within the same time bucket to be out of order.
This is already supported within the native process.
Additionally, when recording the "latest" record timestamp, we were assuming that the latest seen document was truly the "latest". This is not really the case if latency is utilized or if documents come out of order within the same bucket.
This commit updates the default format of date_nanos field
on existing and new indices to use `strict_date_optional_time_nanos` instead of
`strict_date_optional_time`.
Using `strict_date_optional_time` as the default format for date_nanos doesn't
make sense because it accepts and parses dates with nanosecond precision,
but drops the nanoseconds when formatting.
The change should be transparent for users; these formats accept the same input.
Relates #69192
Closes #67063
If a search_after request targets multiple indices and one of its sort
fields has type `date` in one index but `date_nanos` in other indices,
then Elasticsearch won't interpret the search_after parameter correctly
in every target index. The sort value of a date field by default is a
long of milliseconds since the epoch while a date_nanos field is a long
of nanoseconds.
This commit introduces the `format` parameter in the sort field so a
sort value of a date or date_nanos will be formatted using a date format
in a search response.
The below example illustrates how to use this new parameter.
```js
{
"query": {
"match_all": {}
},
"sort": [
{
"timestamp": {
"order": "asc",
"format": "strict_date_optional_time_nanos"
}
}
]
}
```
```js
{
"query": {
"match_all": {}
},
"sort": [
{
"timestamp": {
"order": "asc",
"format": "strict_date_optional_time_nanos"
}
}
],
"search_after": [
"2015-01-01T12:10:30.123456789Z" // in `strict_date_optional_time_nanos` format
]
}
```
Closes #69192
Add support to the delete component template API for specifying multiple template
names separated by a comma.
Change the cleanup template logic for REST tests to remove all component templates via a single delete component template request. This optimizes the cleanup logic: after each REST test we delete all templates, so deleting them via a single API call (and thus a single cluster state update) saves a lot of time considering the number of REST tests.
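For example, a hedged sketch of deleting two component templates in one request (template names are illustrative):
```
DELETE _component_template/logs-settings,logs-mappings
```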
Older versions don't support component / composable index templates
and/or data streams. Yet the test base class tries to remove objects
after each test, which adds a significant number of lines to the
log files (which slows the tests down). The ESRestTestCase will
now check whether all nodes have a specific version and then decide
whether data streams and component / composable index templates will
be deleted.
Also ensured that the logstash-index-template and security-index-template
aren't deleted between tests; these are built-in templates that
ES will install if missing. So if tests remove these templates,
ES will add them back almost immediately. This causes
many log lines and a lot of cluster state updates, which slow tests down.
Relates to #69973
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
This commit addresses two aspects of the description in the docs of
configuring a local node to be a remote cluster client. First, the
documentation was referring to the legacy setting for configuring a
remote cluster client. Secondly, we clarify that additional features,
not only cross-cluster search, have requirements around the usage of the
remote_cluster_client role.
Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Add support to the delete composable index template API for specifying multiple template
names separated by a comma.
Change the cleanup template logic for REST tests to remove all composable index templates via a single delete composable index template request. This optimizes the cleanup logic: after each REST test we delete all templates, so deleting them via a single API call (and thus a single cluster state update) saves a lot of time considering the number of REST tests.
If this PR is accepted, I will do the same change for the delete component template API.
Relates to #69973
Support for additional client authentication methods was added to
the OIDC realm in #58708. This change adds the `rp.client_auth_method`
and `rp.client_auth_signature_algorithm` settings to the realm settings
reference doc.
The type configuration parameter was removed in 7.0. This change cleans
up some sentences where references to it remained even after
the parameter itself was removed.
This commit changes the frozen phase within ILM in the following ways:
- The `searchable_snapshot` action now no longer takes a `storage` parameter. The storage type is
determined by the phase within which it is invoked (shared cache for frozen and full copy for
everything else).
- The frozen phase in ILM now no longer allows *any* actions other than `searchable_snapshot`
- If a frozen phase is provided, it *must* include a `searchable_snapshot` action.
These changes may seem breaking, but since they are intended to go back to 7.12 which has not been
released yet, they are not truly breaking changes.
This field mapper only lived in its own module so it could be licensed as x-pack
basic. Now it can be moved to core, which matches its status as a core type.
When performing a multi_match in cross_fields mode, we group fields based on
their analyzer and create a blended query per group. Our docs claimed that the
group scores were combined through a boolean query, but they are actually
combined through a dismax that incorporates the tiebreaker parameter.
This commit updates the docs and adds a test verifying the behavior.
It can be confusing to configure policies with phase timings that get smaller, because phase timings
are absolute. To make things a little clearer, this commit now rejects policies where a configured
min_age is less than a previous phase's min_age.
This validation is added only to the PutLifecycleAction.Request instead of the
TimeseriesLifecycleType class because we cannot do this validation every time a lifecycle is
created or else we will block cluster state from being recoverable for existing clusters that may
have invalid policies.
Resolves #70032
- adds a bit more overview on the process, including noting that it
works in terms of files
- notes that the snapshot is a point-in-time view of each shard, and not
necessarily exactly at the start of the snapshot process
- documents the `snapshot.max_concurrent_operations` setting
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Remove a not completely correct statement about the size of dense_vectors.
We do store a dense_vector as a binary doc value of size `4*dims+4`,
but this is the size before compression. As the compressed size depends on
the data itself, it is better to remove any statement
about the size entirely.
Runtime fields usage is currently reported as part of the xpack feature usage API. Now that runtime fields are part of server, their corresponding stats can be moved to be part of the ordinary mapping stats exposed by the cluster stats API.
This test was sorting by store.size, but these indices could end up with the same store size and
then the sorting would occasionally be wrong for the test.
Resolves #51619
The tip about updating a `search_analyzer` currently does not mention that most
of the time (when the current analyzer is not "default"), users need to repeat
the currently set "analyzer" parameter in the field definition. Adding this as a
short note.
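A hedged sketch of such an update, assuming `title` was originally mapped with `analyzer: standard`; the existing `analyzer` value is repeated alongside the new `search_analyzer` (names are illustrative):
```json
PUT my-index/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "standard",
      "search_analyzer": "my_synonym_analyzer"
    }
  }
}
```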
You can't update the `analyzer` parameter in the PUT mappings API even if
the index is closed. This adds a TIP to call that out. And adds a TIP
for `search_quote_analyzer` which you *can* update.
The endpoint `_snapshottable_features` is long and implies incorrect
things about this API - it is used not just for snapshots, but also for
the upcoming reset API. Following discussions on the team, this commit
changes the endpoint to `_features` and removes the connection between
this API and snapshots, as snapshots are not the only use for the output
of this API.
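For example, after this change the feature listing is retrieved via the shorter endpoint:
```
GET /_features
```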
We expect runtime fields to perform a little better than our "native"
aggregation script so we should point folks to them instead of the
"native" aggregation script.
* Support audit ignore policy by index privileges
Adding new audit ignore policy - privileges
For example, the following policy will filter out all events whose actions'
minimal required privilege is either "read" or "delete":
xpack.security.audit.logfile.events.ignore_filters:
  example:
    privileges: ["read", "delete"]
Resolve: #60877
Related: #10836
Related: #37148
* Support audit ignore policy by index privileges
Adding new audit ignore policy - privileges
For example, the following policy will filter out all events whose actions'
required privilege is either "read" or "delete":
xpack.security.audit.logfile.events.ignore_filters:
  example:
    privileges: ["read", "delete"]
Resolve: #60877
Related: #10836
Related: #37148
* To avoid ambiguity (as cluster and index policies may have the same
name), changing the implementation to have two separate policies for
`index_privileges` and `cluster_privileges`.
If both are set for the same policy, throw an IllegalArgumentException.
* To avoid ambiguity (as cluster and index policies may have the same
name), changing the implementation to have two separate policies for
`index_privileges` and `cluster_privileges`.
If both are set for the same policy, throw an IllegalArgumentException.
* Fixing an API key related privilege check which expects a request and
authentication, by introducing an overloaded
version of findPrivilegesThatGrant
that just checks whether there are privileges that can grant the action, regardless of the
request and authentication context.
* Fixing a test; adding a caching mechanism to avoid calling
findPrivilegesThatGrant each
time.
* Support audit ignore policy by index privileges
Addressing review feedback
* Support audit ignore policy by index privileges
Addressing review comments + changing approach:
- use permission check instead of simple "checkIfGrants"
- adding more testing
* Support audit ignore policy by index privileges
Addressing review comments + changing approach:
- use permission check instead of simple "checkIfGrants"
- adding more testing
* Support audit ignore policy by index privileges
Addressing review comments + changing approach:
- use permission check instead of simple "checkIfGrants"
- adding more testing
* Support audit ignore policy by index privileges
Addressing review comments + changing approach:
- use permission check instead of simple "checkIfGrants"
- adding more testing
* Revert "Support audit ignore policy by index privileges"
This reverts commit 152821e7
* Revert "Support audit ignore policy by index privileges"
This reverts commit 79649e9a
* Revert "Support audit ignore policy by index privileges"
This reverts commit 96d22a42
* Revert "Support audit ignore policy by index privileges"
This reverts commit 67574b2f
* Revert "Support audit ignore policy by index privileges"
This reverts commit 35573c8b
* Revert "Fixing a test; adding a caching mechanism to avoid calling findPrivilegesThatGrant each time."
This reverts commit 7faa52f3
* Revert "Fixing Api key related privilege check which expects request and authentication by introducing overloaded version of findPrivilegesThatGrant just checking if privileges which can grant the action regardless of the request and authentication context."
This reverts commit 72b9aefe
* Revert "To avoid ambiguity (as cluster and index policies may have the same name) changing implementation to have to separate policies for `index_privileges` and `cluster_privileges`. If both are set for the same policy, throw the IllegalArgumentException."
This reverts commit 7dd8fe7d
* Revert "To avoid ambiguity (as cluster and index policies may have the same name) changing implementation to have to separate policies for `index_privileges` and `cluster_privileges`. If both are set for the same policy, throw the IllegalArgumentException."
This reverts commit cb5bc09c
* Revert "Support audit ignore policy by index privileges"
This reverts commit a918da10
* Support audit ignore policy by actions
Getting back to action filtering
* Support audit ignore policy by actions
Cleaning up some tests
* Support audit ignore policy by actions
Cleaning up some tests
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit adds a new `_preview` endpoint for data frame analytics.
This allows users to see the data on which their model will be trained. This is especially useful
with the arrival of custom feature processors.
The API design is similar to datafeed `_preview` and data frame analytics `_explain`.
Adds support for the include_unloaded_segments flag in node stats, which helps with understanding resource usage of
shared_cache-style searchable snapshots on a per-node basis.
This commit moves away from the static `rollup-{indexName}` rollup index
naming strategy and moves towards a randomized rollup index name scheme.
This will reduce the complications that exist if the RollupStep fails and retries
in any way. A separate cleanup will still be required for failed temporary indices,
but at least there will not be a conflict.
This commit generates the new rollup index name in the LifecycleExecutionState so
that it can be used in RollupStep and UpdateRollupIndexPolicyStep on a per-index
basis.
This adds additional statistics into the usage API for data frame analytics
and trained models.
For data frame analytics the added stats are:
- count of jobs by analysis type
- stats for peak_usage_bytes
For trained models the added stats are:
- counts of: total, prepackaged, other (not created by data frame analytics)
- counts by analysis type based on the inference config
- stats for estimated heap usage
- stats for estimated number of operations
This commit adds support for the Gold+ licensed `geo_line` aggregation.
This aggregation takes a collection of `geo_point` values and constructs a line
according to some sort value. Adding it to transforms allows users to create these
potentially expensive lines out of band of visualizations and then run additional aggs/queries
against the pivoted data.
Examples would be:
"Do these daily user paths ever intersect?"
"Does this path enter and leave this area?"
* [DOCS] Adding grok support for runtime fields.
* Update response.
* Adding testresponse replacements.
* Update runtime field context and add dissect.
* Fixing backslash in the response.
* Fixing testresponse.
* Incorporating review feedback.
* Updates emit and adds cross link from ES runtime fields page.
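A hedged sketch of a grok-based runtime field, assuming a `message` field whose value is accessible via doc values (e.g. mapped as `keyword` or `wildcard`):
```json
PUT my-index/_mapping
{
  "runtime": {
    "http.clientip": {
      "type": "ip",
      "script": "String ip = grok('%{COMMONAPACHELOG}').extract(doc['message'].value)?.clientip; if (ip != null) emit(ip);"
    }
  }
}
```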
Replace the `YYYY` and `uuuu` year
patterns in the examples of `DATETIME_FORMAT/PARSE` with the most common
`yyyy`, to avoid any confusion for users that might just copy-paste those
queries for their own use case.
Relates to #68030
Today if an index is set to `auto_expand_replicas: N-all` then we will
try and create a shard copy on every node that matches the applicable
allocation filters. This conflicts with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some unassigned
shards.
The point of these two deciders is to improve resilience given a limited
number of shard copies but there is no need for this behaviour when the
number of shard copies is not limited, so this commit suppresses them in
that case.
Closes #54151
Closes #2869
Autoscaling expects data tiers to be used exclusively both for node
roles and in ILM policies. This commit adds a test demonstrating that
as well as documentation for the behavior.
Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.
Closes #65056
* [DOCS] Add runtime field to glossary
* Update links with external refs
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
A `model_alias` allows trained models to be referred to by a user-defined moniker.
This not only improves the readability and simplicity of numerous API calls, but it also allows for simpler deployment and upgrade procedures for trained models.
Previously, if you referenced a model ID directly within an ingest pipeline and a new model performed better than the earlier referenced model, you had to update the pipeline itself. If that model was used in numerous pipelines, ALL of those pipelines had to be updated.
When using a `model_alias` in an ingest pipeline, only that `model_alias` needs to be updated. The underlying referenced model then changes in place for all ingest pipelines automatically.
An additional benefit is that the referenced model is not changed until it is fully loaded into cache; this way throughput is not hampered by changing models.
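A hedged sketch of pointing an alias at a trained model (the model ID and alias are illustrative):
```
PUT _ml/trained_models/flight-delay-model-v2/model_aliases/flight-delay-model
```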
This commit removes support for JAVA_HOME. As we previously deprecated
usage of JAVA_HOME to override the path for the JDK, this commit follows
up by removing support for JAVA_HOME. Note that we do not treat
JAVA_HOME being set as a failure, as it is perfectly reasonable for a
user to have JAVA_HOME configured at the system level.
This commit introduces a dedicated environment variable ES_JAVA_HOME to
determine the JDK used to start (if not using the bundled JDK). This
environment variable will replace JAVA_HOME. The reason that we are
making this change is because JAVA_HOME is a common environment variable
and sometimes users have it set in their environment from other JDK
applications that they have installed on their system. In this case,
they would accidentally end up not using the bundled JDK despite their
intentions. By using a dedicated environment variable specific to
Elasticsearch, we avoid this potential for conflict. With this commit,
we introduce the new environment variable, and deprecate the use of
JAVA_HOME. We will remove support for JAVA_HOME in a future commit.
* Reallocate runtime document
Reallocate document `runtime-fields-scriptless` from `runtime-search-request` to `runtime-mapping-fields`
* Move runtime without script section
Move runtime without script section to under the dynamic runtime mapping section
* Fix snippet formatting and remove discrete heading.
* Update test snippet.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
This PR adds the special `_shard_doc` sort tiebreaker automatically to any
search requests that use a PIT. Adding the tiebreaker ensures that any
sorted query can be paginated consistently within a PIT.
Closes #56828
Today we imply that CCR will automatically fall back to a full index
copy if it cannot replay any missing history. This was true for earlier
versions of the design but we ultimately decided not to do this without
adjusting the docs to match. This commit fixes the docs.
Currently, existing runtime fields can be updated, but they cannot be removed. That allows correcting potential mistakes, but once a runtime field is added to the index mappings, it is not possible to remove it.
With this commit we introduce the ability to remove an existing runtime field by providing a null value for it through the put mapping API. If a field with that name does not exist, the instruction has no effect on other existing runtime fields.
Note that the removal of runtime fields makes the recently introduced assertRefreshItNotNeeded assertion trip, because when each local node merges mappings back in, the runtime fields that were previously removed by the master node get added back again locally. This is only a problem for the assertion that verifies that the removed refresh operation is never needed. We worked around this by tweaking the assertion to ignore runtime fields completely, for simplicity, by asserting on the serialized merged mappings and incoming mappings without the corresponding runtime section.
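A minimal sketch of removing a runtime field this way (the field name is illustrative):
```json
PUT my-index/_mapping
{
  "runtime": {
    "day_of_week": null
  }
}
```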
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Today we rely on blob stores behaving in a certain way so that they can be used
as a snapshot repository. There are an increasing number of third-party blob
stores that claim to be S3-compatible, but which may not offer a suitably
correct or performant implementation of the S3 API. We rely on some subtle
semantics with concurrent readers and writers, but some blob stores may not
implement them correctly. Hitting a corner case in the implementation may be rare
in normal use, and may be hard to reproduce or to distinguish from an
Elasticsearch bug.
This commit introduces a new `POST /_snapshot/.../_analyse` API which exercises
the more problematic corners of the repository implementation looking for
correctness bugs and measures the details of the performance of the repository
under concurrent load.
This change adds a new cluster privilege, cancel_task, that allows users to:
Cancel running tasks (_tasks/_cancel).
Cancel and delete async searches.
Today the 'manage' cluster privilege is required to cancel tasks and
to delete async searches when security features are enabled.
This new focused privilege allows handling tasks and searches only.
The change also adds the privilege to the internal 'kibana_system'
and '_async_search' roles. They both need to be able to cancel tasks
and delete async searches.
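For example, a hedged sketch of a role that grants only this new privilege (the role name is illustrative):
```json
POST /_security/role/search_canceller
{
  "cluster": [ "cancel_task" ]
}
```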
Relates #67965
Add a `max_analyzed_offset` query parameter to allow users
to limit the highlighting of text fields to a value less than or equal to the
`index.highlight.max_analyzed_offset`, thus avoiding an exception when
the length of the text field exceeds the limit. The highlighting still takes place,
but stops at the length defined by the new parameter.
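A hedged sketch of limiting highlighting this way, assuming the parameter is supplied within the request's `highlight` section (index and field names are illustrative):
```json
GET my-index/_search
{
  "query": { "match": { "body": "quick brown fox" } },
  "highlight": {
    "max_analyzed_offset": 1000000,
    "fields": { "body": {} }
  }
}
```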
Closes: #52155
This moves the execution of the `searchable_snapshot` action before the
`migrate` action in the `cold` and `frozen` phases for more efficient
data migration (ie. mounting it as a searchable snapshot directly on the
target tier)
Now that searchable_snapshot can precede other actions in the same phase
(eg. in frozen it is followed by `migrate`) we need to allow the mounted
index to resume executing the ILM policy starting with a step that's part
of a new action (ie. migrate).
This adds support to resume the execution of the mounted index from another
action.
With older versions, the execution would resume from the PhaseCompleteStep
as it was the last action in a phase, which was handled as a special case
in the `CopyExecutionStateStep`. This generalises the `CopyExecutionStateStep`
to be able to resume from any `StepKey`.
This PR expands the meaning of `include_global_state` for snapshots to include system indices. If `include_global_state` is `true` on creation, system indices will be included in the snapshot regardless of the contents of the `indices` field. If `include_global_state` is `true` on restoration, system indices will be restored (if included in the snapshot), regardless of the contents of the `indices` field. Index renaming is not applied to system indices, as system indices rely on their names matching certain patterns. If restored system indices are already present, they are automatically deleted prior to restoration from the snapshot to avoid conflicts.
This behavior can be overridden to an extent by including a new field in the snapshot creation or restoration call, `feature_states`, which contains an array of strings indicating the "feature" for which system indices should be snapshotted or restored. For example, this call will only restore the `watcher` and `security` system indices (in addition to `index_1`):
```
POST /_snapshot/my_repository/snapshot_2/_restore
{
"indices": "index_1",
"include_global_state": true,
"feature_states": ["watcher", "security"]
}
```
If `feature_states` is present, the system indices associated with those features will be snapshotted or restored regardless of the value of `include_global_state`. All system indices can be omitted by providing a special value of `none` (`"feature_states": ["none"]`), or included by omitting the field or explicitly providing an empty array (`"feature_states": []`), similar to the `indices` field.
The list of currently available features can be retrieved via a new "Get Snapshottable Features" API:
```
GET /_snapshottable_features
```
which returns a response of the form:
```
{
"features": [
{
"name": "tasks",
"description": "Manages task results"
},
{
"name": "kibana",
"description": "Manages Kibana configuration and reports"
}
]
}
```
Features currently map one-to-one with `SystemIndexPlugin`s, but this should be considered an implementation detail. The Get Snapshottable Features API and snapshot creation rely upon all relevant plugins being installed on the master node.
Further, the list of feature states included in a given snapshot is exposed by the Get Snapshot API, which now includes a new field, `feature_states`, which contains a list of the feature states and their associated system indices which are included in the snapshot. All system indices in feature states are also included in the `indices` array for backwards compatibility, although explicitly requesting system indices included in a feature state is deprecated. For example, an excerpt from the Get Snapshot API showing `feature_states`:
```
"feature_states": [
{
"feature_name": "tasks",
"indices": [
".tasks"
]
}
],
"indices": [
".tasks",
"test1",
"test2"
]
```
Co-authored-by: William Brafford <william.brafford@elastic.co>
Currently runtime fields from search requests don't appear in the output of the
field capabilities API, but some consumers of runtime fields would like to see the
runtime section, just as it is defined in search requests, reflected and
merged into the field capabilities output.
This change adds parsing of a "runtime_mappings" section, equivalent to the one
on search requests, to the `_field_caps` endpoint, and passes this section down to
the shard level where any runtime fields defined here overwrite the mapping of
the targeted indices.
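A hedged sketch of passing a runtime field to `_field_caps`, assuming the index has a `@timestamp` date field (names are illustrative):
```json
POST my-index/_field_caps?fields=day_of_week
{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}
```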
Closes #68117
Introduce the EQL search status API,
which reports the status of an EQL stored or async search.
GET _eql/search/status/<id>
The API is restricted to the monitoring_user role.
For a running eql search, a response has the following format:
{
  "id" : <id>,
  "is_running" : true,
  "is_partial" : true,
  "start_time_in_millis" : 1611690235000,
  "expiration_time_in_millis" : 1611690295000
}
For a completed eql search, a response has the following format:
{
  "id" : <id>,
  "is_running" : false,
  "is_partial" : false,
  "expiration_time_in_millis" : 1611690295000,
  "completion_status" : 200
}
Closes #66955
This partially reverts #64016 and adds #67839, as well as
additional tests that would have caught issues with the changes
in #64016. It's mostly Nik's code; I am just cleaning things up
a bit.
Co-authored-by: Nik Everett <nik9000@gmail.com>
Moving towards grouping of data types in the field caps API,
the internal data type `DATETIME_NANOS` introduced for `date_nanos`
support is eliminated.
Relates: #67722
Follows: #67666
* Integrate "fields" API into QL (#68467)
* QL: retry SQL and EQL requests in a mixed-node (rolling upgrade) cluster (#68602)
* Adapt nested fields extraction from "fields" API output to the new un-flattened structure (#68745)
* Fixing Painless tests.
* Update runtime field context to fix test cases.
* Remove watcher logging from usage API and replace test.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit adds support for the recently introduced partial searchable snapshot (#68509) to ILM.
Searchable snapshot ILM actions may now be specified with a `storage` option, specifying either
`full_copy` or `shared_cache` (similar to the "mount" API) to mount either a full or partial
searchable snapshot:
```json
PUT _ilm/policy/my_policy
{
"policy": {
"phases": {
"cold": {
"actions": {
"searchable_snapshot" : {
"snapshot_repository" : "backing_repo",
"storage": "shared_cache"
}
}
}
}
}
}
```
Internally, if more than one searchable snapshot action is specified (for example, a full searchable
snapshot in the "cold" phase and a partial searchable snapshot in the "frozen" phase) ILM will
re-use the existing snapshot when doing the second mount since a second snapshot is not required.
Currently this is allowed for actions that use the same repository, however, multiple
`searchable_snapshot` actions for the same index that use different repositories is not allowed (the
ERROR state is entered). We plan to allow this in the future in subsequent work.
If the `storage` option is not specified in the `searchable_snapshot` action, the mount type
defaults to "shared_cache" in the frozen phase and "full_copy" in all other phases.
Relates to #68605
This commit spells out how important repository reliability is to
searchable snapshots, and also documents a procedure for taking a backup
of a snapshot repository.
Relates #54944
The doc is misleading: "The following intervals search returns documents containing `my favorite food` **immediately** followed by `hot water` or `cold porridge`".
`max_gaps` applies only to the match query and is not used for checking proximity with the other match; the example given would actually match a `my_text` value of `my favorite food is cold`.
Co-authored-by: Julien Guay <guay_j@yahoo.fr>
This commit adds the `data_frozen` node role as part of the formalization of data tiers. It also
adds the `"frozen"` phase to ILM, currently allowing the same actions as the existing cold phase.
The frozen phase is intended to be used for data even less frequently searched than the cold phase,
and will eventually be loosely tied to data using partial searchable snapshots (as opposed to full
searchable snapshots in the cold phase).
Relates to #60848
Fixed inconsistencies regarding NULL argument handling.
A NULL literal vs a NULL field value as function arguments in some cases
resulted in different function return values.
Functions should return the same value no matter whether the argument(s)
came from a field or from a literal.
The introduced integration test checks whether function calls with the same
argument values (regardless of literal/field) return the
same output (it also checks that newly added functions are added to the
test cases).
Fixed the following functions:
* Insert: NULL start, length and replacement arguments (as fields) also
result in NULL return value instead of returning the input.
* Locate: NULL pattern results in NULL return value, NULL optional start
argument handled the same as missing start argument
* Replace: NULL pattern and replacement results in NULL instead of
returning the input
* Substring: NULL start or length results in NULL instead of returning
the input
Fixes #58907
Changes:
- Reworks the ILM tutorial to focus on the Elastic Agent and a built-in ILM policy
- Updates several screenshots in the docs for the new ILM UI
Co-authored-by: debadair <debadair@elastic.co>
A frozen tier is backed by an external object store (like S3) and caches only a
small portion of data on local disks. In this way, users can reduce hardware
costs substantially for infrequently accessed data. For the frozen tier we only
pull in the parts of the files that are actually needed to run a given search.
Further, we don't require the node to have enough space to host all the files.
We therefore have a cache that manages which file parts are available, and which
ones not. This node-level shared cache is bounded in size (typically in relation
to the disk size), and will evict items based on an LFU policy, as we expect some
parts of the Lucene files to be used more frequently than other parts. The level
of granularity for evictions is at the level of regions of a file, and does not
require evicting full files. The on-disk representation that was chosen for the
cold tier is not a good fit here, as it won't allow evicting parts of a file.
Instead we are using fixed-size pre-allocated files and have implemented our own
memory management logic to map regions of the shard's original Lucene files onto
regions in these node-level shared files that are representing the on-disk
cache.
This PR adds the core functionality to searchable snapshots to power such a
frozen tier:
- It adds the node-level shared cache that evicts file regions based on an LFU
policy
- It adds the machinery to dynamically download file regions into this cache and
serve their contents when searches execute.
- It extends the mount API with a new parameter, `storage`, which selects the
kind of local storage used to accelerate searches of the mounted index. If set
to `full_copy` (default, used for cold tier), each node holding a shard of the
searchable snapshot index makes a full copy of the shard to its local storage.
If set to `shared_cache`, the shard uses the newly introduced shared cache,
only holding a partial copy of the index on disk (used for frozen tier).
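A minimal sketch of the extended mount API call (repository, snapshot and index names are hypothetical):
```
# Mount a snapshotted index using the new shared cache storage (frozen tier);
# repository, snapshot and index names are hypothetical
POST /_snapshot/my_repository/my_snapshot/_mount?storage=shared_cache
{
  "index": "my_index"
}
```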
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: David Turner <david.turner@elastic.co>
This commit sets the recovery rate for dedicated cold nodes. The goal
here is to enhance the performance of recovery in a dedicated cold tier, where
we expect such nodes to be predominantly using searchable snapshots to
back the indices located on them. This commit follows a simple approach
where we increase the recovery rate as a function of the node size, for
nodes that appear to be dedicated cold nodes.
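For operators who need a rate other than the size-based default, the usual recovery rate setting can still be set explicitly, for example via the cluster settings API (the value below is purely illustrative):
```
# Explicit override of the recovery rate; the 250mb value is only an example
PUT _cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "250mb"
  }
}
```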
Adds support for a multi_terms aggregation. The multi_terms aggregation works
very similarly to the terms aggregation but supports multiple terms. The goal
of this PR is to add the basic functionality, so it is not optimized at the
moment; optimization will be done in follow-up PRs.
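A minimal sketch of the new aggregation (index and field names are hypothetical):
```
# Bucket on the combination of two fields; index and field names are hypothetical
GET /products/_search
{
  "size": 0,
  "aggs": {
    "genres_and_products": {
      "multi_terms": {
        "terms": [
          { "field": "genre" },
          { "field": "product" }
        ]
      }
    }
  }
}
```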
Closes#65623
This commit removes the following deprecated settings in v8:
- `gateway.expected_nodes`
- `gateway.expected_master_nodes`
- `gateway.recover_after_nodes`
- `gateway.recover_after_master_nodes`
Co-authored-by: ShawnLi1014 <shawnli1014@gmail.com>
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:
- Updating LICENSE and NOTICE files throughout the code base, as well
as those packaged in our published artifacts
- Update IDE integration to now use the new license header on newly
created source files
- Remove references to the "OSS" distribution from our documentation
- Update build time verification checks to no longer allow Apache 2.0
license header in Elasticsearch source code
- Replace all existing Apache 2.0 license headers for non-xpack code
with updated header (vendored code with Apache 2.0 headers obviously
remains the same).
- Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
This change adds a new "architectures" section to the
cluster stats, containing a summary of how many nodes
in the cluster are on each processor architecture.
The intention is to make it easier to see whether
clusters are running on aarch64, or mixed x86_64/aarch64,
which may aid support as aarch64 becomes more commonly
used.
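A quick way to inspect the new section, assuming it is exposed under `nodes.architectures` in the cluster stats response (the exact field path is an assumption here):
```
# Assumes the new summary lives under nodes.architectures; path is illustrative
GET /_cluster/stats?filter_path=nodes.architectures
```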
Refactoring of cat transform to show more relevant information. The current cat transform shows a
lot of configuration details; however, cat should show operationally useful information. This PR
changes the defaults and also adds the time at which the transform last performed a search.
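For reference, the refactored output can be inspected with the cat transforms API (a minimal sketch; specific columns can still be selected with the `h=` parameter):
```
# Show the (new) default columns with headers
GET _cat/transforms?v=true
```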
Taking a snapshot of a cluster containing searchable snapshot indices is
kind of mindbending. This commit adds docs to indicate that this does
work.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* [DOCS] Minor rewording for HTTP settings.
* Revert "[DOCS] Minor rewording for HTTP settings."
This reverts commit 9a831adca6.
* Adds advanced wording to HTTP & transport settings.
Today's network config docs are split into "Network", "HTTP" and
"Transport" pages, with unclear relationships between them. We often
encounter users with weird configs that indicate they don't really
understand how these settings all relate. In fact these pages are all
very interrelated, and the HTTP and Transport pages are almost all only
for advanced users. This commit brings these docs into a single page and
rewords some things to try and guide users away from the advanced
settings unless their configuration needs all the extra complexity.
It also adds a section entitled "Binding and publishing" which clarifies
the meanings of the `bind_host` and `publish_host` parameters. This is
also a common source of confusion amongst users.
It also clarifies that many of these settings accept a list of
addresses, and warns that this may not be what you want. Closes#67956.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
The PR adds the optional `early_stopping_enabled` data frame analysis configuration parameter. The enhancement was already described in elastic/ml-cpp#1676, so I mark it here as a non-issue.
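A hedged sketch of how the new parameter might be supplied when creating a data frame analytics job; the job id, indices and `dependent_variable` below are hypothetical, and the placement alongside the other analysis hyperparameters is an assumption:
```
# Hypothetical job; early_stopping_enabled is the new optional flag
PUT _ml/data_frame/analytics/my_job
{
  "source": { "index": "my_source_index" },
  "dest": { "index": "my_dest_index" },
  "analysis": {
    "regression": {
      "dependent_variable": "price",
      "early_stopping_enabled": false
    }
  }
}
```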
Use a new internal DataType, DATETIME_NANOS, which is not exposed
and therefore cannot be used for CASTing. DATETIME is used instead,
and the precision of both DATETIME and TIME has been promoted from
3 to 9, providing transparency for all datetime functionality regardless
of millis or nanos precision.
Moreover, CURRENT_TIMESTAMP/CURRENT_TIME can now return precision up
to 6 fractional digits of a second with the use of Clock.
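For example, the current timestamp and time can now be requested with microsecond precision (a sketch):
```
# Request up to 6 fractional digits of a second
POST _sql?format=txt
{
  "query": "SELECT CURRENT_TIMESTAMP(6), CURRENT_TIME(6)"
}
```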
Closes: #38562
Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>
This commit introduces a new Rollup ILM Action that allows indices
to be rolled up according to a specific rollup config. The
action also allows for the new rolled up index to be associated with
a different policy than the original/source index.
Relates #42720.
Closes#48003.
This commit adds statistics about the index creation versions to the `/_cluster/stats` endpoint. The
stats look like:
```
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "indices" : {
    "count" : 3,
    ...
    "versions" : [
      {
        "version" : "8.0.0",
        "index_count" : 1,
        "primary_shard_count" : 2,
        "total_primary_size" : "8.6kb",
        "total_primary_bytes" : 8831
      },
      {
        "version" : "7.11.0",
        "index_count" : 1,
        "primary_shard_count" : 1,
        "total_primary_size" : "4.6kb",
        "total_primary_bytes" : 4230
      }
    ]
  },
  ...
}
```
(`total_primary_size` is only shown with the `?human` flag)
This is useful for telemetry as it allows us to see if/when a cluster has indices created on a
previous version that would need to be either upgraded or supported during an upgrade.
* Adds datetime as a date, which is necessary in setup.
* Updating field context example.
* Fixing sample data, updating context example, and updating runtime example.
* Updating field context and changing runtime field to use seats data.
* Update filter context to use the seats data.
* Updating min-should-match context to use seats data.
* Replacing last mentions of TEST[skip].
* Update usage with watcher response for build error.
* Updating usage API again for watcher.
* Third time's a charm for fixing test cases.
* Adding specific test replacement for watcher logging total.
* Change actors to keyword based on review feedback.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Today the discovery phase has a short 1-second timeout for handshaking
with a remote node after connecting, which allows it to quickly move on
and retry in the case of connecting to something that doesn't respond
straight away (e.g. it isn't an Elasticsearch node).
This short timeout was necessary when the component was first developed
because each connection attempt would block a thread. Since #42636 the
connection attempt is now nonblocking so we can apply a more relaxed
timeout.
If transport security is enabled then our handshake timeout applies to
the TLS handshake followed by the Elasticsearch handshake. If the TLS
handshake alone takes over a second then the whole handshake times out
with a `ConnectTransportException`, but this does not tell us which of
the two individual handshakes took so long.
TLS handshakes have their own 10-second timeout, which if reached yields
a `SslHandshakeTimeoutException` that allows us to distinguish a problem
at the TLS level from one at the Elasticsearch level. Therefore this
commit extends the discovery probe timeouts.