- Creates a new StackTemplateRegistry that uses the new names
- The new registry only respects stack.templates.enabled for index templates
- Renames the old registry to LegacyStackTemplateRegistry
- Component templates are not duplicated but registered under two different names
- Documents the new naming convention
- Index templates are not renamed, at least for now, as there are some challenges with it
See 7fd0423 for more details.
This releases the Data stream lifecycle feature as a
Technical Preview feature.
Data stream lifecycle, albeit in technical preview, will allow data streams
to take advantage of a native simplified and resilient lifecycle implementation.
This adds support to the `GET _data_stream` API for displaying:
- the value of the `index.lifecycle.prefer_ilm` setting, both at the backing
index level and at the top level (top level meaning, as with the existing
`ilm_policy` field, the value in the index template that backs the data stream)
- an `ilm_policy` field for each backing index, displaying the actual ILM
policy configured for the index itself
- a `managed_by` field for each backing index, indicating who manages this
index (the possible values are `Index Lifecycle Management`, `Data stream
lifecycle`, and `Unmanaged`)
This also adds a top-level field to indicate which system would manage
the next generation index for this data stream based on the current
configuration. This field is called `next_generation_managed_by` and takes
the same values as the index-level `managed_by` field.
An example output for a data stream that has 2 backing indices managed
by ILM and the write index by DSL:
```
{
  "data_streams": [{
    "name": "datastream-psnyudmbitp",
    "timestamp_field": {
      "name": "@timestamp"
    },
    "indices": [{
      "index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000001",
      "index_uuid": "kyw0WEXvS8-ahchYS10NRQ",
      "prefer_ilm": true,
      "ilm_policy": "policy-uVBEI",
      "managed_by": "Index Lifecycle Management"
    }, {
      "index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000002",
      "index_uuid": "pDLdc4DERwO54GRzDr4krw",
      "prefer_ilm": true,
      "ilm_policy": "policy-uVBEI",
      "managed_by": "Index Lifecycle Management"
    }, {
      "index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000003",
      "index_uuid": "gYZirLKcS3mlc1c3oHRpYw",
      "prefer_ilm": false,
      "ilm_policy": "policy-uVBEI",
      "managed_by": "Data stream lifecycle"
    }],
    "generation": 3,
    "status": "YELLOW",
    "template": "indextemplate-obcvkbjqand",
    "lifecycle": {
      "enabled": true,
      "data_retention": "90d"
    },
    "ilm_policy": "policy-uVBEI",
    "next_generation_managed_by": "Data stream lifecycle",
    "prefer_ilm": false,
    "hidden": false,
    "system": false,
    "allow_custom_routing": false,
    "replicated": false
  }]
}
```
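As a rough illustration of how a client might consume these fields, the sketch below counts backing indices per `managed_by` value and reads `next_generation_managed_by`. The helper name `summarize_management` and the trimmed `RESPONSE` payload are assumptions made for this example; they are not part of the change itself.

```python
import json
from collections import Counter

# Trimmed copy of the example response: only the fields used below are kept.
RESPONSE = json.loads("""
{
  "data_streams": [{
    "name": "datastream-psnyudmbitp",
    "indices": [
      {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000001",
       "prefer_ilm": true, "managed_by": "Index Lifecycle Management"},
      {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000002",
       "prefer_ilm": true, "managed_by": "Index Lifecycle Management"},
      {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000003",
       "prefer_ilm": false, "managed_by": "Data stream lifecycle"}
    ],
    "next_generation_managed_by": "Data stream lifecycle"
  }]
}
""")


def summarize_management(response):
    """Hypothetical helper: map each data stream name to a tuple of
    (Counter of managed_by values, next_generation_managed_by)."""
    summary = {}
    for ds in response["data_streams"]:
        counts = Counter(idx["managed_by"] for idx in ds["indices"])
        summary[ds["name"]] = (counts, ds["next_generation_managed_by"])
    return summary


if __name__ == "__main__":
    for name, (counts, next_gen) in summarize_management(RESPONSE).items():
        print(name, dict(counts), "next generation:", next_gen)
```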
In #92820 we adjusted the indices resolve API to use the
`IndexNameExpressionResolver` to align its behaviour with other similar
APIs, but this was a subtle breaking change in its behaviour when there
were no matching indices. This adds a note in the docs to record this
change in behaviour.
In this PR we enable all new data streams to be managed by the data
stream lifecycle by default. This is implemented by adding an empty
`lifecycle: {}` upon new data stream creation.
Opting out is represented by the `enabled` flag:
```
{
  "lifecycle": {
    "enabled": false
  }
}
```
This change has the following implications for when an index is managed
and by which feature:
| Parent data stream lifecycle | ILM | `prefer_ilm` | Managed by |
|------------------------------|-----|--------------|------------|
| default | yes | true | ILM |
| default | yes | false | data stream lifecycle |
| default | no | true/false | data stream lifecycle |
| opt-out or missing | yes | true/false | ILM |
| opt-out or missing | no | true/false | unmanaged |
Data streams that have been created before the data stream lifecycle is
enabled will not have the default lifecycle.
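The decision table can be read as a small function. The following is only a sketch of that reading, not the actual implementation: the function name `managed_by` and its boolean parameters are hypothetical, with `lifecycle_enabled=False` standing for the "opt-out or missing" rows.

```python
def managed_by(lifecycle_enabled: bool, has_ilm_policy: bool, prefer_ilm: bool) -> str:
    """Hypothetical helper mirroring the table.

    lifecycle_enabled: parent data stream has a lifecycle that is not opted out
    has_ilm_policy:    an ILM policy is configured for the index
    prefer_ilm:        value of the `prefer_ilm` setting
    """
    if lifecycle_enabled:
        # ILM only wins when it is both configured and preferred.
        if has_ilm_policy and prefer_ilm:
            return "ILM"
        return "data stream lifecycle"
    # Lifecycle opted out or missing: prefer_ilm no longer matters.
    return "ILM" if has_ilm_policy else "unmanaged"


if __name__ == "__main__":
    for row in [(True, True, True), (True, True, False), (True, False, True),
                (False, True, False), (False, False, True)]:
        print(row, "->", managed_by(*row))
```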
Next steps:
- We need to document this when the feature becomes GA
(https://github.com/elastic/elasticsearch/issues/97973).
Fleet is currently hard-coded to set `index.codec` to `best_compression` (deflate compression). This is good for most data streams, except for data streams where tsdb is enabled. Ideally, Fleet would not need to set this setting at all and Elasticsearch's default would be good enough. Unfortunately, this isn't the case: it defaults to `default` (lz4, optimised for speed), which would mean much higher disk space usage. Ideally, the default would be `default` when synthetic source is enabled and `best_compression` otherwise. Changing this now would be a breaking change.
Instead, Fleet would like to depend on Elasticsearch's internal component templates, to at least abstract some of the internal details away. The `metrics-settings` component template is fine for non-tsdb data streams, but there is no component template for tsdb metrics. This PR adds one.
Relates to elastic/kibana#160288
* Index Management now has a link to Discover in the UI.
* updating screenshot for data streams section
* Update docs/reference/indices/index-mgmt.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
* Update docs/reference/indices/index-mgmt.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
This adds IndexVersion to cluster state, alongside node version. This is needed so IndexVersion can be tracked across the cluster, allowing min/max supported index versions to be determined.
* [docs] Clarify that index template settings take precedence over component templates.
* Update docs/reference/indices/index-templates.asciidoc
Co-authored-by: Adam Locke <adam.locke@chronosphere.io>
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Co-authored-by: Adam Locke <adam.locke@chronosphere.io>
* Adds the profiling index pattern `profiling-*` to the Fleet Server service privileges.
* Adds `profiling-*` to the Kibana system role privileges.
---------
Co-authored-by: Daniel Mitterdorfer <daniel.mitterdorfer@elastic.co>
- No need to use an `AsyncShardFetch` here, there is no caching
- Response may be very large, introduce chunking
- Fan-out may be very large, introduce throttling
- Processing time may be nontrivial, introduce cancellability
- Eliminate many unnecessary intermediate data structures
- Do shard-level response processing more eagerly
- Determine allocation from `RoutingTable` not `RoutingNodes`
- Add tests
Relates #81081
For managing data streams with DLM, we chose to have one cluster setting that determines the rollover conditions for all data streams. This PR introduces this cluster setting, exposes it via the 3 existing APIs under the flag `include_defaults`, and adjusts DLM to use it. The feature remains behind a feature flag.
The documentation specifies the possible values for the status of the response. This endpoint is inconsistent with most others that expose the health status, as it returns the values as uppercase strings rather than lowercase.
This PR fixes the letter case in the documentation to align with the actual values returned in the response body.
This change introduces the configuration option `ignore_missing_component_templates` as discussed in https://github.com/elastic/elasticsearch/issues/92426. Implementation [option 6](https://github.com/elastic/elasticsearch/issues/92426#issuecomment-1372675683) was picked, with a slight adjustment: no patterns are allowed.
## Implementation
During the creation of an index template, the list of component templates is checked to ensure that they all exist. This check is extended to skip any component templates listed under `ignore_missing_component_templates`. An index template that skips the check for the component template `logs-foo@custom` looks as follows:
```
PUT _index_template/logs-foo
{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```
The component template `logs-foo@package` has to exist before creation. It can be created with:
```
PUT _component_template/logs-foo@package
{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}
```
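The extended existence check can be sketched roughly as follows. This is an illustration under assumptions, not the Elasticsearch code: the function name `missing_component_templates` and its parameters are hypothetical. Names are compared exactly, since no patterns are allowed.

```python
def missing_component_templates(composed_of, existing, ignore_missing=()):
    """Hypothetical helper: return the component templates that would make
    index template creation fail.

    composed_of:    names listed under `composed_of`
    existing:       names of component templates already in the cluster
    ignore_missing: names under `ignore_missing_component_templates`
    """
    ignorable = set(ignore_missing)  # exact names only, no wildcard matching
    return [
        name for name in composed_of
        if name not in existing and name not in ignorable
    ]


if __name__ == "__main__":
    composed_of = ["logs-foo@package", "logs-foo@custom"]
    # `logs-foo@custom` is missing but ignorable, so nothing blocks creation.
    print(missing_component_templates(
        composed_of, {"logs-foo@package"}, ["logs-foo@custom"]))
    # `logs-foo@package` is missing and not ignorable, so creation would fail.
    print(missing_component_templates(composed_of, set(), ["logs-foo@custom"]))
```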
## Testing
Different scenarios can be tested manually. To simplify testing, the commands from an `.http` file are included. A clean cluster is expected before each test run.
### New behaviour, missing component template
With the new config option, it must be possible to create an index template with a missing component template without getting an error:
```
### Add logs-foo@package component template
PUT http://localhost:9200/_component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream
PUT http://localhost:9200/_data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist
GET http://localhost:9200/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```
Check that all templates can be created and that the data stream mappings are correct.
### Old behaviour, with all component templates
In the following, a component template is made optional but already exists. Check that it shows up in the mappings:
```
### Add logs-foo@package component template
PUT http://localhost:9200/_component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo@custom component template
PUT http://localhost:9200/_component_template/logs-foo@custom
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}

### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream
PUT http://localhost:9200/_data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist
GET http://localhost:9200/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```
### Check old behaviour
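Ensure that the old behaviour still applies when a component template that is not part of `ignore_missing_component_templates` is used. On a clean cluster, `logs-foo@package` does not exist and is not listed as ignorable, so the request is expected to fail: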
```
### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Mentions that we only report on recoveries for shard copies that
actually exist in the cluster, so you don't see all historical data if,
e.g., the shard copy relocates elsewhere.
We tell users not to force merge unless their index is read-only. This PR
proposes to soften the warning and make it more precise. This way, more users
can consider force merging for their use case, like those with append-only
indices, or those with a small number of updates that can regularly perform a
force merge.
This change adds support for kNN vector fields to the `_disk_usage` API. The
strategy:
* Iterate the vector values (using the same strategy as for doc values) to
estimate the vector data size
* Run some random vector searches to estimate the vector index size
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Closes #84801
* Revert "Revert "[DOCS] Add TSDS docs (#86905)" (#87702)"
This reverts commit 0c86d7b9b2.
* First fix to tests
* Add data_stream object to index template
* small rewording
* Add enable data stream object in gradle example setup
* Add bullet about data stream must be enabled in template
* [DOCS] Add TSDB docs
* Update docs/build.gradle
Co-authored-by: Adam Locke <adam.locke@elastic.co>
* Address Nik's comments, part 1
* Address Nik's comments, part deux
* Reword write index
* Add feature flags
* Wrap one more section in feature flag
* Small fixes
* set index.routing_path to optional
* Update storage reduction value
* Update create index template code example
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
The get settings API accepts the `expand_wildcards` option. The docs
state the default value is `all`, but it is actually now `open` (which
does not include hidden or closed indices by default). This commit
changes the docs to match the existing behavior.
"Add" was outside the hyperlink text; this commit fixes that.
Previously, line 71 was: `* *Add* <<set-up-lifecycle-policy,*lifecycle policy*>>`
After the fix, line 71 is: `* <<set-up-lifecycle-policy,*Add lifecycle policy*>>`
(cherry picked from commit 3b8d51c696)
Co-authored-by: Tapomoy Bhowmik <99604828+TapomoyBhowmik@users.noreply.github.com>
This commit adds tracking for desired nodes cluster membership.
When desired nodes are updated they are matched against the current
cluster members. Additionally when a node joins the cluster the
desired nodes cluster membership is updated.
* Resolve indices api: add 'system' attribute
* Update docs/changelog/85042.yaml
* Remove magic strings for attribute values
* Update API docs to provide possible resolved index attributes
Add a new rollover condition with the name `max_primary_shard_docs`.
Triggers rollover when the largest primary shard in the index reaches a certain number of documents.
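A minimal sketch of how such a condition could be evaluated, assuming the document counts of the primary shards are already known. The function `max_primary_shard_docs_met` and its arguments are hypothetical and only illustrate the "largest primary shard" rule, not the actual implementation.

```python
def max_primary_shard_docs_met(primary_shard_doc_counts, max_primary_shard_docs):
    """Hypothetical check: true once the largest primary shard has reached
    the configured number of documents."""
    # default=0 covers an index with no reported primary shards.
    largest = max(primary_shard_doc_counts, default=0)
    return largest >= max_primary_shard_docs


if __name__ == "__main__":
    # Only the largest shard matters, not the total across shards.
    print(max_primary_shard_docs_met([900, 1_000, 400], 1_000))
    print(max_primary_shard_docs_met([600, 600, 600], 1_000))
```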