Fixes the autoscaling docs so that they no longer refer to partially mounted
indices or shards as frozen indices/shards. The docs now use "partially mounted
indices or shards".
Closes #73132
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This change adds support for using `search_after` with field collapsing. When
using these in conjunction, the same field must be used for both sorting and
field collapsing. This helps keep the behavior simple and predictable.
Otherwise it would be possible for a group to appear on multiple pages of
results.
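A minimal sketch of a request body that follows this rule, built in Python
for illustration. The field name `user.id`, the sort value and the helper
name are made up; only `sort`, `collapse` and `search_after` come from the
search API itself.

```python
import json


def build_collapsed_page(field, search_after=None, size=10):
    """Build a search body that sorts and collapses on the same field."""
    body = {
        "size": size,
        "sort": [{field: "asc"}],       # sort field ...
        "collapse": {"field": field},   # ... must match the collapse field
    }
    if search_after is not None:
        # search_after carries the sort value of the last hit on the previous page.
        body["search_after"] = [search_after]
    return body


first_page = build_collapsed_page("user.id")
next_page = build_collapsed_page("user.id", search_after="kimchy")
print(json.dumps(next_page, indent=2))
```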
Currently search after is handled directly in `CollapsingTopDocsCollector`. As
a follow-up, we could generalize the logic and move support to the Lucene
grouping framework.
Closes #53115.
Today the docs indicate that restoring a snapshot with
`include_global_state` set will merge the ingest pipelines, ILM
policies, settings, etc. in the snapshot with those already in the
cluster. This isn't the case; we simply replace all of them. This
commit corrects the docs.
The get alias API should take the aliases parameter into account when
returning aliases that refer to data streams, and should not return entries
for data streams that don't have any aliases pointing to them.
Relates to #66163
Adds a new snapshot meta pool that is used to speed up the get snapshots API
by making `SnapshotInfo` load in parallel. Also uses this pool to load
`RepositoryData`.
A follow-up to this would expand the use of this pool to the snapshot status
API and make it run in parallel as well.
The current search API documentation doesn't include any examples of query
parameter usage.
This updates the docs to include a simple syntax example using the `from` and
`size` query parameters.
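For illustration, a small sketch of how `from` and `size` end up in the
request URL. The host, index name and helper name are placeholders, not
taken from the docs change.

```python
from urllib.parse import urlencode


def search_url(base, index, from_=0, size=10):
    """Return a search URL with `from` and `size` as query parameters."""
    query = urlencode({"from": from_, "size": size})
    return f"{base}/{index}/_search?{query}"


# Skip the first 40 hits and return the next 20.
url = search_url("http://localhost:9200", "my-index", from_=40, size=20)
print(url)  # http://localhost:9200/my-index/_search?from=40&size=20
```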
Changes:
* Removes an error in the create SLM policy API's `schedule` parameter
def. `schedule` is not used to delete expired snapshots.
* Updates the `expire_after` parameter def to mention the
`slm.retention_schedule` cluster setting.
Upgrades to Lucene-8.9 snapshot which includes:
- LUCENE-9507: Custom order for leaves (/cc @mayya-sharipova)
- LUCENE-9935: Enable bulk merge for stored fields with index sort
This commit adds a `cancelled` flag to each cancellable task in the
response to the list tasks API, allowing users to see that a task has
been properly cancelled and will complete as soon as possible.
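A rough sketch of how a client might use the new flag. The sample response
below is hand-written and trimmed; its layout (`nodes` -> `tasks` -> task
entries) and the task IDs are assumptions made for the example.

```python
def cancelled_task_ids(response):
    """Return IDs of cancellable tasks that report `cancelled: true`."""
    ids = []
    for node in response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            if task.get("cancellable") and task.get("cancelled"):
                ids.append(task_id)
    return sorted(ids)


# Hand-written sample, not real output.
sample = {"nodes": {"node-1": {"tasks": {
    "node-1:10": {"cancellable": True, "cancelled": True},
    "node-1:11": {"cancellable": True, "cancelled": False},
    "node-1:12": {"cancellable": False},
}}}}

print(cancelled_task_ids(sample))  # ['node-1:10']
```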
Closes #72907
If a node is partitioned away from the rest of the cluster then the
`ClusterFormationFailureHelper` periodically reports that it cannot
discover the expected collection of nodes, but does not indicate why. To
prove it's a connectivity problem, users must today restart the node
with `DEBUG` logging on `org.elasticsearch.discovery.PeerFinder` to see
further details.
With this commit we log messages at `WARN` level if the node remains
disconnected for longer than a configurable timeout, which defaults to 5
minutes.
Relates #72968
Adds an optional parameter to the `_terms_enum` request designed to allow paging.
The last term from a previous result can be passed as the `search_after` parameter to a subsequent request, meaning only terms after the given term (but still matching the provided string prefix) are returned.
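The paging pattern can be sketched as below. `fake_terms_enum` is a local
stand-in written for this example, not the real implementation or client
call; it only mimics "terms matching the prefix, strictly after
`search_after`".

```python
def fake_terms_enum(all_terms, string, size=2, search_after=None):
    """Local stand-in for one request: prefix match, sorted, paged."""
    matches = sorted(t for t in all_terms if t.startswith(string))
    if search_after is not None:
        matches = [t for t in matches if t > search_after]
    return matches[:size]


def all_pages(all_terms, string, size=2):
    """Collect every page by feeding the last term back as search_after."""
    collected, last = [], None
    while True:
        page = fake_terms_enum(all_terms, string, size, last)
        if not page:
            return collected
        collected.extend(page)
        last = page[-1]


terms = ["kibana", "kimchy", "kiwi", "logstash", "kite"]
print(all_pages(terms, "ki"))  # ['kibana', 'kimchy', 'kite', 'kiwi']
```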
Relates to #72910
* add runtime fields contexts to execute docs
* Changes for formatting throughout
* Add missing context and context_setup
* Updating runtime field context
* Moving parameters and adopting a more standard API layout
* Update several examples
* Update more examples for runtime context
* Fix links
* Add boolean_field example and remove extraneous headings
* Add example for date_time context
* Remove extra space in TEST
* Updating date_time example
* Incorporating review feedback
* Adding cross links
* Tweaking some language based on feedback
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
* wip
* Service Accounts - add beta documentation
* consistent names
* fix test
* Update service accounts overview and token creation files.
* Rename get service tokens to get service credentials
* fix tests
* Changes for create and get service tokens.
* Changes for get token creds, delete token, clear token cache, and token auth.
* add manage_service_account privilege to list
* List service accounts APIs
* Move xpack setting to Security API page, plus other cleanup.
* Shorten secret tokens in examples, add cross links, plus other cleanup.
* Clarifying parameter descriptions.
* Clarify language for authenticating with a token.
* Tweaks
* Typo fix
* Adding redirects to work around CI build checks
* Revert "Adding redirects to work around CI build checks"
This reverts commit 20a1b53591.
* Remove redirects that were implemented to satisfy CI checks in master branch
* Move note about not supporting basic auth
* Clarify what service accounts are specifically for
* Apply suggestions from code review
Co-authored-by: Tim Vernum <tim@adjective.org>
* Addressing review feedback
* tweak
* Improve doc tests
* fix test
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Azure Storage accounts offer several storage services, including Blob Storage, Table Storage, File Storage, and Storage Queues. The intro page for this plugin should specify which type is used for Elasticsearch snapshots. This info is necessary for pricing, at the very least.
Co-authored-by: joshschmitter <45405518+joshschmitter@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The enroll node API can be used by new nodes in order to join an
existing cluster that has security features enabled. The response
of a call to this API contains all the necessary information that
the new node requires in order to configure itself and bootstrap
trust with the existing cluster.
* Deprecate single-tier allocation filtering settings
`(index|cluster).routing.allocation.(include|exclude|require)._tier` settings are now deprecated in
favor of using `index.routing.allocation.include._tier_preference`.
* Update deprecation message
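As an illustration only, the setting pattern above can be turned into a
small check for deprecated keys. The regex is derived from the pattern in
this message; the helper name is made up.

```python
import re

# (index|cluster).routing.allocation.(include|exclude|require)._tier
DEPRECATED_TIER_SETTING = re.compile(
    r"^(index|cluster)\.routing\.allocation\.(include|exclude|require)\._tier$"
)


def is_deprecated(setting):
    """True if the setting key is one of the single-tier filtering settings."""
    return DEPRECATED_TIER_SETTING.match(setting) is not None


print(is_deprecated("index.routing.allocation.require._tier"))             # True
print(is_deprecated("index.routing.allocation.include._tier_preference"))  # False
```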
We currently don't support `copy_to` for fields that take the form of objects
(e.g. `date_range` or certain kinds of `geo_point` variants). The current
problem with objects is that when DocumentParser parses anything other than
single values, it potentially advances the underlying parser past the value that
we would need to stay on for parsing the value again. While we might want to
support this in the future, for now this PR enhances the otherwise confusing
MapperParsingException with something more helpful and adds a short note in the
documentation about this restriction.
Closes #49344
Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.
Aliases for data streams are stored separately from data streams and
refer to data streams by name, not to the backing indices of
a data stream. This means that when backing indices are added to or removed
from a data stream, the data stream alias doesn't need to be
updated.
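A toy model of why the alias stays untouched. This is not how the cluster
state is implemented; the data stream, alias and backing index names are
invented for the example.

```python
# Data stream name -> backing indices (names are made up).
data_streams = {"logs-app": [".ds-logs-app-000001"]}
# The alias stores data stream names only, never backing indices.
aliases = {"logs": ["logs-app"]}


def resolve(alias):
    """Resolve an alias to the current backing indices of its data streams."""
    return [idx for ds in aliases[alias] for idx in data_streams[ds]]


# Adding a backing index (for example after a rollover) changes what the
# alias resolves to, but the alias entry itself is not modified.
data_streams["logs-app"].append(".ds-logs-app-000002")
print(aliases)           # {'logs': ['logs-app']}
print(resolve("logs"))   # ['.ds-logs-app-000001', '.ds-logs-app-000002']
```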
The authorization model for aliases that refer to data streams is the
same as for aliases that refer to indices. In security, privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias, access is also granted on the indices that
the alias refers to (regardless of whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. For more details, see:
https://github.com/elastic/elasticsearch/issues/66163#issuecomment-824709767
Relates to #66163
Adds some extra debugging information to make it clear that you are
running `significant_text`. Also adds some useful timing information
around the `_source` fetch and the `terms` accumulation. This lets you
calculate a third useful timing number: the analysis time. It is
`collect_ns - fetch_ns - accumulation_ns`.
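A worked example of that subtraction. The numbers are invented; only the
three field names come from this change.

```python
def analysis_ns(collect_ns, fetch_ns, accumulation_ns):
    """Analysis time is what remains of collect time after fetch and accumulation."""
    return collect_ns - fetch_ns - accumulation_ns


# Invented timings, in nanoseconds.
debug = {"collect_ns": 1_500_000, "fetch_ns": 900_000, "accumulation_ns": 200_000}
print(analysis_ns(**debug))  # 400000
```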
This also adds a half dozen extra REST tests to get a *fairly*
comprehensive set of the operations this supports. It doesn't cover all
of the significance heuristic parsing, but it's certainly much better
than what we had.
This commit adds a new pipeline aggregation that allows correlation within the aggregation framework on bucketed values.
The initial function is a `count_correlation` function, whose purpose is to correlate the count in a consistent number of buckets with a pre-calculated indicator. The indicator and the aggregated buckets should relate to the same metrics within documents.
Example for correlating terms within a `service.version.keyword` with latency percentiles. The percentiles and the provided correlation indicator both refer to the same source data, from which the indicator was previously calculated:
```
GET apm-7.12.0-transaction-generated/_search
{
  "size": 0,
  "aggs": {
    "field_terms": {
      "terms": {
        "field": "service.version.keyword",
        "size": 20
      },
      "aggs": {
        "latency_range": {
          "range": {
            "field": "transaction.duration.us",
            "ranges": [<snip>],
            "keyed": true
          }
        },
        "correlation": {
          "bucket_correlation": {
            "buckets_path": "latency_range>_count",
            "count_correlation": {
              "indicator": {
                "expectations": [<snip>],
                "doc_count": 20000
              }
            }
          }
        }
      }
    }
  }
}
```
Today we mention Metricbeat's `scope` parameter but offer no guidance
about how it should be used. This commit adds guidance to use `scope:
cluster`, especially on clusters with dedicated master-eligible nodes.
Due to problems discovered in #72572, we have to disable the geoip downloader for now. We use `ingest.geoip.downloader.enabled.default` as a feature flag.
This change also reverts changes to docs.
This commit removes the bootstrap.system_call_filter setting, as
starting in Elasticsearch 8.0.0 we are going to require that system call
filters be installed and that this is not user configurable. Note that
while we force bootstrap to attempt to install system call filters, we
only enforce that they are installed via a bootstrap check in production
environments. We can consider changing this behavior, but leave that for
future consideration and thus a potential follow-up change.
We are going to require system call filters. This commit is the first
step in that journey, which is to deprecate the setting that allows
disabling system call filters.
The docs for the `filter` agg seemed to suggest that it was the
preferred way to filter results for aggs, but it's really mostly for when
you need to filter things under another bucketing agg.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Adds a new API designed for use by apps like Kibana for auto-complete use cases.
A search string is supplied, which is used as a prefix for matching terms found in a given field in the index.
Supported field types are `keyword`, `constant_keyword` and `flattened`.
A timeout can limit the amount of time spent looking for matches (default 1s), and an `index_filter` query can limit indices, e.g. to those in the hot or warm tier by querying the `_tier` field.
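A sketch of what a request body might look like, assembled in Python. The
body keys `field`, `string`, `timeout` and `index_filter` are assumed from
the description above; the field name, the tier values and the helper name
are placeholders.

```python
import json


def terms_enum_body(field, string, timeout="1s", tiers=None):
    """Assemble a request body for prefix matching on one field."""
    body = {"field": field, "string": string, "timeout": timeout}
    if tiers:
        # Restrict matching indices by querying the `_tier` field.
        body["index_filter"] = {"terms": {"_tier": list(tiers)}}
    return body


body = terms_enum_body("tags", "kiba", tiers=["data_hot", "data_warm"])
print(json.dumps(body, indent=2))
```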
Closes #59137
Watcher uses a connection pool for outgoing HTTP traffic, which means
that some HTTP connections may live for a long time, possibly in an idle
state. Such connections may be silently torn down by a remote device, so
that when we re-use them we encounter a `Connection reset` or similar
error.
This commit introduces a setting allowing users to set a finite expiry
time on these connections, and also enables TCP keepalives on them by
default so that a remote teardown will be actively detected sooner.
Closes #52997
Changes:
* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
* Remove frozen tier restriction for ESS
* Remove section from 'Use ES for time series data'
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This documents the different signatures of the `emit` method for runtime
fields. For fields like `long` the signature is fairly obvious -
`emit(long)`. But for `date`, `ip`, and `geo_point` it's not obvious from
the name what the signature of the method will be.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
* [DOCS] Add field extraction use cases to scripting docs
* Adding file
* Remove extra space
* Add dissect pattern to split and retrieve data
* Fix list spacing
* Incorporating review feedback
This commit increases `xpack.ml.max_open_jobs` from 20 to 512. Additionally, it ignores nodes that cannot provide an accurate view into their native memory.
If a node does not have a view into its native memory, we ignore it for assignment.
This effectively fixes a bug with autoscaling. Autoscaling relies on jobs with adequate memory to assign jobs to nodes. If that is hampered by `xpack.ml.max_open_jobs`, scaling decisions are hampered too.
With the introduction of BKD-based geo shape indexing in #32039, the prefix tree indexing method has
been deprecated. From 8.0.0, creating new mappings that use the deprecated parameters will not be allowed.
* Improve indentation of code for discovery-gce
Improve the indentation by using an indentation level of two spaces to
improve readability and enable a better copy-and-paste experience.
* Improve docs for GCP web-console and permissions
Match the description for the GCP web-console to the current state
and change the API-permission.
There is (no longer) a permission `compute.full_control`.
* Apply suggestions from code review
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* [DOCS] Document how to migrate to node roles from node attrs. Closes #65855
* [DOCS] Incorporated review comments
* Update docs/reference/data-management/migrate-index-allocation-filters.asciidoc
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.
I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.
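The "some but not all" condition can be sketched as follows. The Python enum
and helper are illustrative only and do not mirror the Java classes in this
change.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"


def is_partial_failure(statuses):
    """True when at least one, but not every, reset action failed (the 207 case)."""
    failures = sum(1 for s in statuses if s is Status.FAILURE)
    return 0 < failures < len(statuses)


print(is_partial_failure([Status.SUCCESS, Status.FAILURE]))  # True
print(is_partial_failure([Status.SUCCESS, Status.SUCCESS]))  # False
```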
Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
We rely on the repository implementation correctly handling the case where a
write is aborted before it completes. This is not guaranteed for third-party
repositories.
This commit adds a rare action during analysis which aborts the write
just before it completes and verifies that the target blob is not found
by any node.
Adds support for the `stats` and `top_metrics` aggregations in transform. This change also makes it
easier to add more multi-value aggregations to transform.
Limitations:
- only the 1st element of top_metrics gets consumed by transform[*].
- all values of stats will be mapped to double if mapping deduction is used, including count,
sum, min, max
Fixes #52236
Relates #51925
When doing a rolling restart we recommend disabling shard allocation to
avoid unnecessary recoveries. However, this advice is unnecessary or
even harmful when restarting nodes that do not carry any data, such as a
pure ML node.