Fleet server needs an API to access up to date global checkpoints for
indices. Additionally, it requires a mode of operation when fleet can
provide its current knowledge about the global checkpoints and poll for
advancements. This commit introduces this API in the fleet plugin.
* Warn users if security is implicitly disabled
Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it also lead to unintended insecure
clusters.
This change introduces clear warnings when security features are
implicitly disabled.
- a warning header in each REST response if security is implicitly
disabled;
- a log message during cluster boot.
Runtime fields usage is currently reported as part of the xpack feature usage API. Now that runtime fields are part of server, their corresponding stats can be moved to be part of the ordinary mapping stats exposed by the cluster stats API.
The endpoint `_snapshottable_features` is long and implies incorrect
things about this API - it is used not just for snapshots, but also for
the upcoming reset API. Following discussions on the team, this commit
changes the endpoint to `_features` and removes the connection between
this API and snapshots, as snapshots are not the only use for the output
of this API.
This adds additional statistics into the usage API for data frame analytics
and trained models.
For data frame analytics the added stats are:
- count of jobs by analysis type
- stats for peak_usage_bytes
For trained models the added stats are:
- counts of: total, prepackaged, other (not created by data frame analytics)
- counts by analysis type based on the inference config
- stats for estimated heap usage
- stats for estimated number of operations
This commit adds support for the Gold+ licensed `geo_line` aggregation.
This aggregation takes a collection of `geo_point` values and constructs a line
according to some sort value. Adding to transforms allows users to create these
potentially expensive lines out of band of visualizations and then do additional aggs/queries
against the pivoted data.
Examples would be:
"Do these daily user paths ever intersect?"
"Does this path enter and leave this area?"
* Fixing Painless tests.
* Update runtime field context to fix test cases.
* Remove watcher logging from usage API and replace test.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit adds the `data_frozen` node role as part of the formalization of data tiers. It also
adds the `"frozen"` phase to ILM, currently allowing the same actions as the existing cold phase.
The frozen phase is intended to be used for data even less frequently searched than the cold phase,
and will eventually be loosely tied to data using partial searchable snapshots (as oppposed to full
searchable snapshots in the cold phase).
Relates to #60848
Adds a multi_terms aggregation support. The multi terms aggregation works
very similarly to the terms aggregation but supports multiple terms. The goal
of this PR is to add the basic functionality so it is not optimized at the
moment. It will be done in follow up PRs.
Closes#65623
Refactoring of cat transform to show more relevant information. The current cat transform shows a
lot of configuration details, however cat should show operationally useful information. This PR
changes the defaults and also adds when transform did a search last.
* Adds datetime as a date, which is necessary in setup.
* Updating field context example.
* Fixing sample data, updating context example, and updating runtime example.
* Updating field context and changing runtime field to use seats data.
* Update filter context to use the seats data.
* Updating min-should-match context to use seats data.
* Replacing last mentions of TEST[skip].
* Update usage with watcher response for build error.
* Updating usage API again for watcher.
* Third time's a charm for fixing test cases.
* Adding specific test replacement for watcher logging total.
* Change actors to keyword based on review feedback.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Removed the autoscaling feature flags, autoscaling is now on by default
(though it requires an external system to handle the autoscaling
events). Added experimental notice to all autoscaling related
documentation pages.
Relates #51191
Transform writes dates as epoch millis, this does not work for historic data in some cases or is
unsupported. Dates should be written as such. With this PR transform starts writing dates in ISO
format, but as existing transform might rely on the format it provides backwards compatibility for
old jobs as well as a setting to write dates as epoch millis.
fixes#63787
This is a follow-up PR for #65256 to fix the xpack info and usage reports for
operator privilegs. In summary, this PR ensures:
* _xpack does not report operator privileges because it is categorised under
security
* _xpack/usage reports operator privileges status under the security
section
* _license/feature_usage reports last used time of operator privileges.
It is up to the downstream to filter out this report if necessary.
In some Elastic Stack environments, there is a distinction between the operator
of the cluster infrastructure and the administrator of the cluster. This
distinction cannot be supported currently because the "administrator" often has
the superuser role which grants each and every privilege of the cluster.
This PR adds a new feature to protect a fixed set of APIs from the
"administrator" even when it is a highly privileged user such as superuser. It
enhances the Elasticsearch security model to have an additional layer of
restriction in addition to the RBAC.
Co-authored-by: Tim Vernum <tim@adjective.org>
In the process of developing a new implementation for the Elasticsearch Rollups functionality we came up with the concept of the aggregate metric field type.
The aggregate_metric_double field type can store the results of aggregations (currently min, max, sum, value_count and avg are supported - more to come).
This field allows us to run (min, max, sum, value_count, avg) aggregations on the container field and the field will return the correct metric depending on the aggregation that is computed.
add support for the missing (bucket) aggregation (counts docs with a configured missing field value)
in transform. The output is mapped to name:count, the mapping type is long.
The current _update_by_query documentation mentions a scroll_size default of 100 and later another default of 1000.
We use the default of 1000 defined in AbstractBulkByScrollRequest and this PR changes the documentation accordingly.
Closes#63637
This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns.
Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail a few specific APIs which will continue to allow access to system indices by default:
- `GET _cluster/health`
- `GET {index}/_recovery`
- `GET _cluster/allocation/explain`
- `GET _cluster/state`
- `POST _cluster/reroute`
- `GET {index}/_stats`
- `GET {index}/_segments`
- `GET {index}/_shard_stores`
- `GET _cat/[indices,aliases,health,recovery,shards,segments]`
Deprecation warnings for accessing system indices take the form:
```
this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default
```
This commit adds telemetry for our data tier formalization. This telemetry helps determine the
topology of the cluster with regard to the content, hot, warm, & cold tiers/roles.
An example of the telemetry looks like:
```
GET /_xpack/usage?human
{
...
"data_tiers" : {
"available" : true,
"enabled" : true,
"data_warm" : {
...
},
"data_cold" : {
...
},
"data_content" : {
"node_count" : 1,
"index_count" : 6,
"total_shard_count" : 6,
"primary_shard_count" : 6,
"doc_count" : 71,
"total_size" : "59.6kb",
"total_size_bytes" : 61110,
"primary_size" : "59.6kb",
"primary_size_bytes" : 61110,
"primary_shard_size_avg" : "9.9kb",
"primary_shard_size_avg_bytes" : 10185,
"primary_shard_size_median" : "8kb",
"primary_shard_size_median_bytes" : 8254,
"primary_shard_size_mad" : "7.2kb",
"primary_shard_size_mad_bytes" : 7391
},
"data_hot" : {
...
}
}
}
```
The fields are as follows:
- node_count :: number of nodes with this tier/role
- index_count :: number of indices on this tier
- total_shard_count :: total number of shards for all nodes in this tier
- primary_shard_count :: number of primary shards for all nodes in this tier
- doc_count :: number of documents for all nodes in this tier
- total_size_bytes :: total number of bytes for all shards for all nodes in this tier
- primary_size_bytes :: number of bytes for all primary shards on all nodes in this tier
- primary_shard_size_avg_bytes :: average shard size for primary shard in this tier
- primary_shard_size_median_bytes :: median shard size for primary shard in this tier
- primary_shard_size_mad_bytes :: [median absolute deviation](https://en.wikipedia.org/wiki/Median_absolute_deviation) of shard size for primary shard in this tier
Relates to #60848
This commit removes the documentation for some specific Searchable Snapshot REST APIs:
- clear cache
- searchable snapshot stats
- repository stats
These APIs are low-level and are useful to investigate the behavior of snapshot
backed indices but we expect them to be removed in the future or to appear in
a different form.
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.
In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an oracle JDK, both of which is no
longer valid.
While noticing that the instructions used cleartext HTTP to install
packages, this commit replaces HTTPs links instead of HTTP where possible.
In addition a few community links have been removed, as they do not seem
to exist anymore.
Corrects the `requests_per_second` query parameter used in the reindex,
delete by query, and update by query API docs.
The parameter defaults to `-1` (no throttle). `0` is not an allowed value.
This commit adds the `require_alias` flag to requests that create new documents.
This flag, when `true` prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias.
When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings.
This is useful when an alias is required instead of a concrete index.
closes https://github.com/elastic/elasticsearch/issues/55267
This commit adds data stream info to the `/_xpack` and `/_xpack/usage` APIs. Currently the usage is
pretty minimal, returning only the number of data streams and the number of indices currently
abstracted by a data stream:
```
...
"data_streams" : {
"available" : true,
"enabled" : true,
"data_streams" : 3,
"indices_count" : 17
}
...
```
This commit adds conditional logic to the docs to avoid including any
docs on searchable snapshots in released versions.
Rework of #58556 which was reverted.
Changes:
* Updates 'Data streams' intro page to focus on problem solution and
benefits.
* Adds 'Data streams overview' page to cover conceptual information,
based on existing content in the 'Data streams' intro.
* Adds diagrams for data streams and search/indexing request examples.
* Moves API jump list and API docs to a new 'Data streams APIs' section.
Links to these APIs will be available through tutorials.
* Add xrefs to existing docs for concepts like generation, write index,
and append-only.
Cleans up the reference documentation for the following
search API parameters:
* `_source` query parameter
* `_source_excludes` query parameter
* `_source_includes` query parameter
* `_source` request body parameter
* `hits._source` response property
Today we already disallow negative values for the "from" parameter in the search
API when it is set as a request parameter and setting it on the
SearchSourceBuilder, but it is still parsed without complaint from a search
body, leading to differing exceptions later. This PR changes this behavior to be
the same regardless of setting the value directly, as url parameter or in the
search body. While we silently accepted "-1" as meaning "unset" and used the
default value of 0 so far, any negative from-value is now disallowed.
Closes#54897
This commit adds the `expand_wildcards` parameter documentation to the
`_cat/indices` and `_cat/aliases` docs, as those APIs now support
`expand_wildcards`. Additionally, clarifies the `expand_wildcards` docs with
respect to hidden indices.
This adds support for `terms` and `rare_terms` aggs in transforms.
The default behavior is that the results are collapsed in the following manner:
`<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...`
Or if no sub aggs exist
`<AGG_NAME>.<BUCKET_NAME>.<_doc_count>`
The mapping is also defined as `flattened` by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.
This aggregation will perform normalizations of metrics
for a given series of data in the form of bucket values.
The aggregations supports the following normalizations
- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization
To specify which normalization is to be used, it can be specified
in the normalize agg's `normalizer` field.
For example:
```
{
"normalize": {
"buckets_path": <>,
"normalizer": "percent"
}
}
```
Closes#51005.
Today we report some statistics in terms of Lucene-level documents, which
differ from Elasticsearch-level documents in a number of ways and include
things like document tombstones which users cannot directly observe. This
commit clarifies the internal nature of these statistics.
Closes#56497
* [DOCS] Promote cron expressions info from Watcher to a separate topic.
* Fix table error
* Fixed xref
* Apply suggestions from code review
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
* Incorporated review feedback
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Similar to what the moving function aggregation does, except merging windows of percentiles
sketches together instead of cumulatively merging final metrics
This has no practical impact on users since frozen indices are the only
throttled indices today. However this has an impact on upcoming features
that would use search throttling.
Filtering out throttled indices made sense a couple years ago, but as
we're now improving support for slow requests with `_async_search` and
exploring ways to reduce storage costs, this feature has most likely
become a trap, that we'd like to not have with upcoming features that
would use search throttling.
Relates #54058
This commit merges the searchable-snapshots feature branch into master.
See #54803 for the complete list of squashed commits.
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
This commit adds a top-level link to the autoscaling API reference page
to the API docs. Additionally, we add a conditional guard on the API
pages to only include them in development builds of the docs.
Fixing the naming of the HLRC values to match the ToXContent field names (i.e. the field names returned from an API call).
Also fixes the names in the _cat API as well.
closes#53946