* Adds datetime as a date, which is necessary in setup.
* Updating field context example.
* Fixing sample data, updating context example, and updating runtime example.
* Updating field context and changing runtime field to use seats data.
* Update filter context to use the seats data.
* Updating min-should-match context to use seats data.
* Replacing last mentions of TEST[skip].
* Update usage with watcher response for build error.
* Updating usage API again for watcher.
* Third time's a charm for fixing test cases.
* Adding specific test replacement for watcher logging total.
* Change actors to keyword based on review feedback.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Removed the autoscaling feature flags; autoscaling is now on by default
(though it requires an external system to handle the autoscaling
events). Added an experimental notice to all autoscaling-related
documentation pages.
Relates #51191
Transform writes dates as epoch millis. This does not work for historic data in some cases, or is
unsupported. Dates should be written as such. With this PR, transform starts writing dates in ISO
format. Because existing transforms might rely on the old format, the PR provides backwards
compatibility for old jobs as well as a setting to write dates as epoch millis.
fixes #63787
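To illustrate the difference between the two representations, here is a minimal sketch in Python. The helper name `epoch_millis_to_iso` is hypothetical, and the exact ISO variant that transform emits is not stated above, so the UTC millisecond format used here is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch-based timestamps.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_iso(millis: int) -> str:
    """Render epoch milliseconds as an ISO 8601 UTC timestamp (assumed format)."""
    # Adding a timedelta to the epoch also handles negative values,
    # i.e. historic dates before 1970.
    moment = EPOCH + timedelta(milliseconds=millis)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


if __name__ == "__main__":
    print(epoch_millis_to_iso(0))
    print(epoch_millis_to_iso(-1000))
```

The negative input shows why a plain epoch number can be awkward for historic data, while the ISO string stays readable.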
This is a follow-up PR for #65256 to fix the xpack info and usage reports for
operator privileges. In summary, this PR ensures:
* _xpack does not report operator privileges because it is categorised under
security
* _xpack/usage reports operator privileges status under the security
section
* _license/feature_usage reports last used time of operator privileges.
It is up to the downstream to filter out this report if necessary.
In some Elastic Stack environments, there is a distinction between the operator
of the cluster infrastructure and the administrator of the cluster. This
distinction cannot be supported currently because the "administrator" often has
the superuser role which grants each and every privilege of the cluster.
This PR adds a new feature to protect a fixed set of APIs from the
"administrator" even when it is a highly privileged user such as superuser. It
enhances the Elasticsearch security model to have an additional layer of
restriction in addition to the RBAC.
Co-authored-by: Tim Vernum <tim@adjective.org>
In the process of developing a new implementation for the Elasticsearch Rollups functionality we came up with the concept of the aggregate metric field type.
The aggregate_metric_double field type can store the results of aggregations (currently min, max, sum, value_count and avg are supported - more to come).
This field allows us to run (min, max, sum, value_count, avg) aggregations on the container field and the field will return the correct metric depending on the aggregation that is computed.
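The idea of aggregating over pre-aggregated values can be sketched in Python as follows. This is only an illustration of the arithmetic, not the field's implementation; the function name `combine_aggregate_metrics` and the dictionary layout are assumptions.

```python
def combine_aggregate_metrics(docs):
    """Combine pre-aggregated metric documents into overall metrics.

    Each doc is assumed to be a dict with the keys
    "min", "max", "sum" and "value_count".
    """
    total = sum(d["sum"] for d in docs)
    count = sum(d["value_count"] for d in docs)
    return {
        "min": min(d["min"] for d in docs),  # min of the stored mins
        "max": max(d["max"] for d in docs),  # max of the stored maxes
        "sum": total,                         # sum of the stored sums
        "value_count": count,                 # sum of the stored counts
        "avg": total / count,                 # derived from sum and value_count
    }


if __name__ == "__main__":
    docs = [
        {"min": 1, "max": 5, "sum": 9, "value_count": 3},
        {"min": 0, "max": 7, "sum": 7, "value_count": 1},
    ]
    print(combine_aggregate_metrics(docs))
```

Note that the average is derived from the sum and the value count rather than by averaging per-document averages, which would give the wrong result when counts differ.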
Add support for the missing (bucket) aggregation (counts docs with a configured missing field value)
in transform. The output is mapped to name:count; the mapping type is long.
The current _update_by_query documentation mentions a scroll_size default of 100 and later another default of 1000.
We use the default of 1000 defined in AbstractBulkByScrollRequest and this PR changes the documentation accordingly.
Closes #63637
This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns.
Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail and a few specific APIs which will continue to allow access to system indices by default:
- `GET _cluster/health`
- `GET {index}/_recovery`
- `GET _cluster/allocation/explain`
- `GET _cluster/state`
- `POST _cluster/reroute`
- `GET {index}/_stats`
- `GET {index}/_segments`
- `GET {index}/_shard_stores`
- `GET _cat/[indices,aliases,health,recovery,shards,segments]`
Deprecation warnings for accessing system indices take the form:
```
this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default
```
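A client that wants to act on this warning could extract the index names from the message text. The following Python sketch does that with a regular expression built from the message form shown above. The function name is hypothetical, and the assumption that several indices would be comma-separated inside the brackets is not confirmed by the text.

```python
import re

# Pattern derived from the warning text quoted above.
SYSTEM_INDEX_WARNING = re.compile(
    r"this request accesses system indices: \[([^\]]*)\]"
)


def system_indices_in_warning(message: str) -> list:
    """Return the system index names mentioned in a deprecation warning."""
    match = SYSTEM_INDEX_WARNING.search(message)
    if match is None:
        return []
    # Assumption: multiple names are separated by commas.
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


if __name__ == "__main__":
    warning = (
        "this request accesses system indices: [.some_system_index], "
        "but in a future major version, direct access to system indices "
        "will be prevented by default"
    )
    print(system_indices_in_warning(warning))
```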
This commit adds telemetry for our data tier formalization. This telemetry helps determine the
topology of the cluster with regard to the content, hot, warm, & cold tiers/roles.
An example of the telemetry looks like:
```
GET /_xpack/usage?human
{
...
"data_tiers" : {
"available" : true,
"enabled" : true,
"data_warm" : {
...
},
"data_cold" : {
...
},
"data_content" : {
"node_count" : 1,
"index_count" : 6,
"total_shard_count" : 6,
"primary_shard_count" : 6,
"doc_count" : 71,
"total_size" : "59.6kb",
"total_size_bytes" : 61110,
"primary_size" : "59.6kb",
"primary_size_bytes" : 61110,
"primary_shard_size_avg" : "9.9kb",
"primary_shard_size_avg_bytes" : 10185,
"primary_shard_size_median" : "8kb",
"primary_shard_size_median_bytes" : 8254,
"primary_shard_size_mad" : "7.2kb",
"primary_shard_size_mad_bytes" : 7391
},
"data_hot" : {
...
}
}
}
```
The fields are as follows:
- node_count :: number of nodes with this tier/role
- index_count :: number of indices on this tier
- total_shard_count :: total number of shards for all nodes in this tier
- primary_shard_count :: number of primary shards for all nodes in this tier
- doc_count :: number of documents for all nodes in this tier
- total_size_bytes :: total number of bytes for all shards for all nodes in this tier
- primary_size_bytes :: number of bytes for all primary shards on all nodes in this tier
- primary_shard_size_avg_bytes :: average shard size for primary shard in this tier
- primary_shard_size_median_bytes :: median shard size for primary shard in this tier
- primary_shard_size_mad_bytes :: [median absolute deviation](https://en.wikipedia.org/wiki/Median_absolute_deviation) of shard size for primary shard in this tier
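As a rough illustration of the three shard-size statistics, the Python sketch below computes an average, a median, and a median absolute deviation from a list of primary shard sizes. The function name and the input shape are assumptions; how Elasticsearch gathers and rounds these values is not covered here.

```python
from statistics import mean, median


def primary_shard_size_stats(sizes_bytes):
    """Compute average, median and median absolute deviation of shard sizes."""
    med = median(sizes_bytes)
    # MAD: the median of the absolute distances from the median.
    mad = median([abs(size - med) for size in sizes_bytes])
    return {
        "primary_shard_size_avg_bytes": mean(sizes_bytes),
        "primary_shard_size_median_bytes": med,
        "primary_shard_size_mad_bytes": mad,
    }


if __name__ == "__main__":
    print(primary_shard_size_stats([1, 2, 3, 4, 100]))
```

With one very large shard in the sample, the average is pulled upward while the median and MAD stay small, which is why the robust statistics are useful alongside the average.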
Relates to #60848
This commit removes the documentation for some specific Searchable Snapshot REST APIs:
- clear cache
- searchable snapshot stats
- repository stats
These APIs are low-level and are useful to investigate the behavior of snapshot
backed indices but we expect them to be removed in the future or to appear in
a different form.
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.
In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.
The instructions also used cleartext HTTP to install packages, so this
commit uses HTTPS links instead of HTTP where possible.
In addition a few community links have been removed, as they do not seem
to exist anymore.
Corrects the `requests_per_second` query parameter used in the reindex,
delete by query, and update by query API docs.
The parameter defaults to `-1` (no throttle). `0` is not an allowed value.
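The corrected semantics can be sketched as a small client-side check in Python. The function name is hypothetical, and rejecting negative values other than `-1` is an assumption; the text above only states that `-1` means no throttle and that `0` is not allowed.

```python
import math


def normalize_requests_per_second(value=-1):
    """Validate a requests_per_second value as described above."""
    if value == -1:
        # Default: no throttling.
        return math.inf
    if value <= 0:
        # 0 is documented as not allowed; other negatives are rejected
        # here as an assumption.
        raise ValueError(
            "requests_per_second must be positive, or -1 to disable throttling"
        )
    return float(value)


if __name__ == "__main__":
    print(normalize_requests_per_second())
    print(normalize_requests_per_second(500))
```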
This commit adds the `require_alias` flag to requests that create new documents.
This flag, when `true`, prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias.
When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings.
This is useful when an alias is required instead of a concrete index.
closes https://github.com/elastic/elasticsearch/issues/55267
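The decision described above can be modelled with a short Python sketch. This is a simplified model rather than the server logic: the function name and return strings are hypothetical, and `action.auto_create_index` is treated as a plain boolean here.

```python
def plan_write(target, aliases, indices, require_alias=False,
               auto_create_index=True):
    """Decide what happens to a document write (simplified model)."""
    if require_alias:
        # With the flag set, the destination must be an alias and
        # no index is created automatically.
        if target not in aliases:
            raise ValueError(f"[{target}] is not an alias")
        return "write"
    if target in aliases or target in indices:
        return "write"
    # Flag unset or false: fall back to the auto-create setting
    # (assumed boolean in this sketch).
    if auto_create_index:
        return "create index, then write"
    raise ValueError(f"no such index [{target}]")


if __name__ == "__main__":
    aliases, indices = {"logs"}, {"logs-000001"}
    print(plan_write("logs", aliases, indices, require_alias=True))
    print(plan_write("new-index", aliases, indices))
```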
This commit adds data stream info to the `/_xpack` and `/_xpack/usage` APIs. Currently the usage is
pretty minimal, returning only the number of data streams and the number of indices currently
abstracted by a data stream:
```
...
"data_streams" : {
"available" : true,
"enabled" : true,
"data_streams" : 3,
"indices_count" : 17
}
...
```
This commit adds conditional logic to the docs to avoid including any
docs on searchable snapshots in released versions.
Rework of #58556 which was reverted.
Changes:
* Updates 'Data streams' intro page to focus on problem solution and
benefits.
* Adds 'Data streams overview' page to cover conceptual information,
based on existing content in the 'Data streams' intro.
* Adds diagrams for data streams and search/indexing request examples.
* Moves API jump list and API docs to a new 'Data streams APIs' section.
Links to these APIs will be available through tutorials.
* Adds xrefs to existing docs for concepts like generation, write index,
and append-only.
Cleans up the reference documentation for the following
search API parameters:
* `_source` query parameter
* `_source_excludes` query parameter
* `_source_includes` query parameter
* `_source` request body parameter
* `hits._source` response property
Today we already disallow negative values for the "from" parameter in the search
API when it is set as a request parameter and setting it on the
SearchSourceBuilder, but it is still parsed without complaint from a search
body, leading to differing exceptions later. This PR changes this behavior to be
the same regardless of setting the value directly, as url parameter or in the
search body. While we silently accepted "-1" as meaning "unset" and used the
default value of 0 so far, any negative from-value is now disallowed.
Closes #54897
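The unified behaviour can be sketched as a single validation step that every entry point shares. The Python below is an illustration only; the function name is hypothetical and the error text is not the one Elasticsearch produces.

```python
def validate_from(value=None):
    """Validate the search `from` value regardless of where it was set."""
    if value is None:
        # Unset: use the default of 0.
        return 0
    if value < 0:
        # Any negative value, including the formerly tolerated -1,
        # is rejected.
        raise ValueError("[from] parameter cannot be negative")
    return value


if __name__ == "__main__":
    print(validate_from())
    print(validate_from(10))
```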
This commit adds the `expand_wildcards` parameter documentation to the
`_cat/indices` and `_cat/aliases` docs, as those APIs now support
`expand_wildcards`. Additionally, clarifies the `expand_wildcards` docs with
respect to hidden indices.
This adds support for `terms` and `rare_terms` aggs in transforms.
The default behavior is that the results are collapsed in the following manner:
`<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...`
Or if no sub aggs exist
`<AGG_NAME>.<BUCKET_NAME>.<_doc_count>`
The mapping is also defined as `flattened` by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.
This aggregation will perform normalizations of metrics
for a given series of data in the form of bucket values.
The aggregation supports the following normalizations
- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization
The normalization to use is specified in the normalize agg's
`normalizer` field.
For example:
```
{
"normalize": {
"buckets_path": <>,
"normalizer": "percent"
}
}
```
Closes #51005.
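For reference, the listed normalizations can be sketched in Python using their common textbook definitions. The function names are mine, and details such as whether the aggregation uses a population standard deviation or reports percent of sum as a fraction are assumptions. Inputs are assumed to be non-constant, since several formulas divide by the value range or the deviation.

```python
import math
from statistics import mean, pstdev


def rescale_0_1(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def rescale_0_100(values):
    return [100 * v for v in rescale_0_1(values)]


def percent_of_sum(values):
    total = sum(values)
    return [v / total for v in values]  # expressed as a fraction (assumption)


def mean_normalization(values):
    mu, lo, hi = mean(values), min(values), max(values)
    return [(v - mu) / (hi - lo) for v in values]


def z_score(values):
    # Population standard deviation is an assumption.
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def softmax(values):
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    buckets = [1, 2, 3]
    print(rescale_0_1(buckets))
    print(mean_normalization(buckets))
```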
Today we report some statistics in terms of Lucene-level documents, which
differ from Elasticsearch-level documents in a number of ways and include
things like document tombstones which users cannot directly observe. This
commit clarifies the internal nature of these statistics.
Closes #56497
* [DOCS] Promote cron expressions info from Watcher to a separate topic.
* Fix table error
* Fixed xref
* Apply suggestions from code review
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
* Incorporated review feedback
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Similar to what the moving function aggregation does, except that it merges windows of percentile
sketches together instead of cumulatively merging final metrics.
This has no practical impact on users since frozen indices are the only
throttled indices today. However this has an impact on upcoming features
that would use search throttling.
Filtering out throttled indices made sense a couple of years ago, but as
we're now improving support for slow requests with `_async_search` and
exploring ways to reduce storage costs, this feature has most likely
become a trap that we'd like to avoid in upcoming features that
would use search throttling.
Relates #54058
This commit merges the searchable-snapshots feature branch into master.
See #54803 for the complete list of squashed commits.
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
This commit adds a top-level link to the autoscaling API reference page
to the API docs. Additionally, we add a conditional guard on the API
pages to only include them in development builds of the docs.
Fixes the naming of the HLRC values to match the ToXContent field names (i.e. the field names returned from an API call).
Also fixes the names in the _cat API.
closes #53946
The current consensus is that we don't need info actions for smaller items like
field mappers. We can also remove the usage action since the cluster stats API
now tracks information about mappings, like what field types are defined.
I discussed with @rjernst what kind of functionality should be
reported in the info API, since it doesn't sound sensible to list every
single feature there. As a guideline, Ryan suggested that functionality
that needs to maintain state should definitely be in the info API, but
probably not field mappers like `constant_keyword`.
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.
The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?
For this field there is a choice between
1. accepting values in `_source` when they are equal to the value configured
in mappings, but rejecting mapping updates
2. rejecting values in `_source` but then allowing updates to the value that
is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.
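Option 1 can be sketched as a check applied to each incoming document. The Python below is an illustration, not the mapper's code; the function name is hypothetical, and treating a document that omits the field as acceptable is an assumption not stated above.

```python
def check_constant_keyword(configured_value, source_value=None):
    """Accept a `_source` value only if it equals the mapped constant."""
    if source_value is not None and source_value != configured_value:
        raise ValueError(
            f"value [{source_value}] differs from the configured value "
            f"[{configured_value}]"
        )
    # Assumption: a missing value is treated as the configured constant.
    return configured_value


if __name__ == "__main__":
    print(check_constant_keyword("prod", "prod"))
    print(check_constant_keyword("prod"))
```

Accepting equal values is what lets documents be reindexed from or to a plain keyword mapping without touching the source.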
The main purpose of this commit is to add a single autoscaling REST
endpoint skeleton, for the purpose of starting to build out the build
and testing infrastructure that will surround it. For example, rather
than committing a fully-functioning autoscaling API, we introduce here
the skeleton so that we can start wiring up the build and testing
infrastructure, establish security roles/permissions, and so on. This
way, in a forthcoming PR that introduces actual functionality, that PR
will be smaller and have fewer distractions around that sort of
infrastructure.
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.
Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
the wildcard pattern begins with a `.`. This behavior is similar to
shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
indices with the `expand_wildcards` parameter. To expand wildcards
to hidden indices, use the value `hidden` in conjunction with `open`
and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
global index template.
* Accessing a hidden index directly requires no additional parameters.
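The expansion rules listed above can be modelled with a small Python sketch. It is a simplified model under stated assumptions: the function name and the `expand_hidden` flag (standing in for `expand_wildcards` including `hidden`) are hypothetical, open/closed state and templates are ignored, and `fnmatchcase` is used as a stand-in for index pattern matching.

```python
from fnmatch import fnmatchcase


def resolve_indices(expression, indices, expand_hidden=False):
    """Resolve an index expression against a {name: is_hidden} mapping."""
    is_wildcard = "*" in expression or expression == "_all"
    if not is_wildcard:
        # Direct access to a hidden index needs no additional parameters.
        return [expression] if expression in indices else []
    pattern = "*" if expression == "_all" else expression
    # Hidden indices are included when explicitly requested, or when
    # the pattern itself begins with a dot.
    include_hidden = expand_hidden or pattern.startswith(".")
    return sorted(
        name
        for name, hidden in indices.items()
        if fnmatchcase(name, pattern) and (include_hidden or not hidden)
    )


if __name__ == "__main__":
    cluster = {"logs-1": False, ".internal": True, "secret": True}
    print(resolve_indices("*", cluster))
    print(resolve_indices(".*", cluster))
    print(resolve_indices("*", cluster, expand_hidden=True))
```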
Relates #50251
Co-Authored-By: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-Authored-By: David Roberts <dave.roberts@elastic.co>
Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>