Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.
This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.
Addresses #49028 and #55363.
Adds a full list of supported aggregations to the node info API. This list
will be used in transform tests and telemetry mapping tests that will be added
as follow-up PRs.
Fixes#59774
In #54716 I removed pipeline aggregators from the aggregation result
tree and caused us to read them from the request. This saves a bunch of
round trip bytes, which is neat. But there was a bug in the backwards
compatibility logic. You see, we still have to give the pipeline
aggregations to nodes older than 7.8 over the wire because that is how
they know what pipelines to run. They have the pipelines in the request
but they don't read them. They use the ones in the response tree.
Anyway, we had a bug where we were never sending pipelines defined two
levels down. So while you are upgrading the pipeline wouldn't run.
Sometimes. If the data node of the "first" result was post-7.8 and the
coordinating node was pre-7.8.
This fixes the bug.
* Fix retrieving data stream stats for a DS with multiple backing indices
This API incorrectly had `allowAliasesToMultipleIndices` set to false in the default options for the
request. This changes it from `false` to `true` and enhances a test to exercise the functionality.
Resolves#59802
* Fix test for wording change
This change allows simulating replacing a composable template with a different version, for example:
```
POST /_index_template/_simulate/my-template
{
"index_patterns": ["idx*"],
"composed_of": ["ct1"],
"priority": 10,
"template": {
"settings": {
"index.lifecycle.name": "policy"
}
}
}
```
Should simulate as if `my-template` were replaced with the template specified in the body.
Resolves#59152
This commit adds the `require_alias` flag to requests that create new documents.
This flag, when `true` prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias.
When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings.
This is useful when an alias is required instead of a concrete index.
closes https://github.com/elastic/elasticsearch/issues/55267
Histograms are used in many tests, so it is not practical to hunt them all down
and disable one by one. This commit temporary disables all bwc tests until
#59656 is merged.
This PR removes the expand_wildcards and forbid_closed_indices parameters from the Data
Streams Stats REST endpoint. These options are required for broadcast requests, but are not
needed for anything in terms of resolving data streams. Instead, we just set a default set of
IndicesOptions on the transport request.
This API reports on statistics important for data streams, including the number of data
streams, the number of backing indices for those streams, the disk usage for each data
stream, and the maximum timestamp for each data stream
Currently we combine coordinating and primary bytes into a single bucket
for indexing pressure stats. This makes sense for rejection logic.
However, for metrics it would be useful to separate them.
This makes the data_stream timestamp field specification optional when
defining a composable template.
When there isn't one specified it will default to `@timestamp`.
This adds a low precendece mapping for the `@timestamp` field with
type `date`.
This will aid with the bootstrapping of data streams as a timestamp
mapping can be omitted when nanos precision is not needed.
This commit updates the node stats version constants to reflect the fact
that index pressure stats were backported to 7.9. It also reenables BWC
tests.
* Adds hard_bounds to histogram aggregations
Adds a hard_bounds parameter to explicitly limit the buckets that a histogram
can generate. This is especially useful in case of open ended ranges that can
produce a very large number of buckets.
Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
We have recently added internal metrics to monitor the amount of
indexing occurring on a node. These metrics introduce back pressure to
indexing when memory utilization is too high. This commit exposes these
stats through the node stats API.
* Create new data-stream xpack module.
* Move TimestampFieldMapper to the new module,
this results in storing a composable index template
with data stream definition only to work with default
distribution. This way data streams can only be used
with default distribution, since a data stream can
currently only be created if a matching composable index
template exists with a data stream definition.
* Renamed `_timestamp` meta field mapper
to `_data_stream_timestamp` meta field mapper.
* Add logic to put composable index template api
to fail if `_data_stream_timestamp` meta field mapper
isn't registered. So that a more understandable
error is returned when attempting to store a template
with data stream definition via the oss distribution.
In a follow up the data stream transport and
rest actions can be moved to the xpack data-stream module.
The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
index template can only be set to '@timestamp'.
* Removed custom data stream timestamp field validation and reuse the validation from `TimestampFieldMapper` and
instead only check that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method
to `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (#58956) that cases timestamp field validation to be performed
for each template and instead of the final mappings that is created.
* only apply _timestamp meta field if index is created as part of a data stream or data stream rollover,
this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition.
Relates to #58642
Relates to #53100Closes#58956Closes#58583
This adds the data stream's index template, the configured ILM policy
(if any) and the health status of the data stream to the GET _data_stream
response.
Restoring a data stream from a snapshot could install a data stream that
doesn't match any composable templates. This also makes the `template`
field in the `GET _data_stream` response optional.
The build plugin is still necessary to generate the POM file.
This commit adds the build plugin plugin back to the rest-api-spec
project. It was recently removed as it was thought to be
unnecessary.
This request:
```
POST /_search
{
"aggs": {
"a": {
"adjacency_matrix": {
"filters": {
"1": {
"terms": { "t": { "index": "lookup", "id": "1", "path": "t" } }
}
}
}
}
}
}
```
Would fail with a 500 error and a message like:
```
{
"error": {
"root_cause": [
{
"type": "illegal_state_exception",
"reason":"async actions are left after rewrite"
}
]
}
}
```
This fixes that by moving the query rewrite phase from a synchronous
call on the data nodes into the standard aggregation rewrite phase which
can properly handle the asynchronous actions.
This commit creates a new Gradle plugin to provide a separate task name
and source set for running YAML based REST tests. The only project
converted to use the new plugin in this PR is distribution/archives/integ-test-zip.
For which the testing has been moved to :rest-api-spec since it makes the most
sense and it avoids a small but awkward change to the distribution plugin.
The remaining cases in modules, plugins, and x-pack will be handled in followups.
This plugin is distinctly different from the plugin introduced in #55896 since
the YAML REST tests are intended to be black box tests over HTTP. As such they
should not (by default) have access to the classpath for that which they are testing.
The YAML based REST tests will be moved to separate source sets (yamlRestTest).
The which source is the target for the test resources is dependent on if this
new plugin is applied. If it is not applied, it will default to the test source
set.
Further, this introduces a breaking change for plugin developers that
use the YAML testing framework. They will now need to either use the new source set
and matching task, or configure the rest resources to use the old "test" source set that
matches the old integTest task. (The former should be preferred).
As part of this change (which is also breaking for plugin developers) the
rest resources plugin has been removed from the build plugin and now requires
either explicit application or application via the new YAML REST test plugin.
Plugin developers should be able to fix the breaking changes to the YAML tests
by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests
under a yamlRestTest folder (instead of test)
This commit adds a new metadata field mapper that validates,
that a document has exactly a single timestamp value in the data stream timestamp field and
that the timestamp field mapping only has `type`, `meta` or `format` attributes configured.
Other attributes can affect the guarantee that an index with this meta field mapper has a
useable timestamp field.
The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever
a new backing index of a data stream is created.
Relates to #53100
* Forbid read-only-allow-delete block in blocks API
The read-only-allow-delete block is not really under the user's control
since Elasticsearch adds/removes it automatically. This commit removes
support for it from the new API for adding blocks to indices that was
introduced in #58094.
* Missing xref
* Reword paragraph on read-only-allow-delete block
Today the disk-based shard allocator accounts for incoming shards by
subtracting the estimated size of the incoming shard from the free space on the
node. This is an overly conservative estimate if the incoming shard has almost
finished its recovery since in that case it is already consuming most of the
disk space it needs.
This change adds to the shard stats a measure of how much larger each store is
expected to grow, computed from the ongoing recovery, and uses this to account
for the disk usage of incoming shards more accurately.
This commit adds the component and composable templates, as well as ILM policies, for the new
default indexing strategy. It installs:
- logs-default-mappings (component)
- logs-default-settings (component)
- logs-default-policy (ilm policy)
- logs-default-template (composable template)
- metrics-default-mappings (component)
- metrics-default-settings (component)
- metrics-default-policy (ilm policy)
- metrics-default-template (composable template)
These templates and policies are managed by a new x-pack module, `stack`, and can be disabled by
setting `stack.templates.enabled` to `false`.
These ensure that patterns for the `logs-*-*` and `metrics-*-*` indices are set up to create data
streams with the proper mappings and settings.
This also makes changes to the `IndexTemplateRegistry` to support installing component and
composable templates (previously it supported only legacy templates).
Resolves#56709
Adds an API for putting an index block in place, which also ensures for write blocks that, once successfully returning to
the user, all shards of the index are properly accounting for the block, for example that all in-flight writes to an index have
been completed after adding the write block.
This API allows coordinating more complex workflows, where it is crucial that an index is no longer receiving writes after
the API completes, useful for example when marking an index as read-only during an upgrade in order to reindex its
documents.
This PR implements recursive mapping merging for composable index templates.
When creating an index, we perform the following:
* Add each component template mapping in order, merging each one in after the
last.
* Merge in the index template mappings (if present).
* Merge in the mappings on the index request itself (if present).
Some principles:
* All 'structural' changes are disallowed (but everything else is fine). An
object mapper can never be changed between `type: object` and `type: nested`. A
field mapper can never be changed to an object mapper, and vice versa.
* Generally, each section is merged recursively. This includes `object`
mappings, as well as root options like `dynamic_templates` and `meta`. Once we
reach 'leaf components' like field definitions, they always overwrite an
existing one instead of being merged.
Relates to #53101.
Follow-up to 35aecf4c9a. Somehow I missed the fact that there's an ILM
API named `retry`, which is a keyword in Ruby. I've removed it from the
keywords list.
If an API name (or components of a name) overlaps with a reserved word in
the programming language for an ES client, then it's possible that the code
that is generated from the API will not compile. This PR adds validation to
check for such overlaps.
Implements a new histogram aggregation called `variable_width_histogram` which
dynamically determines bucket intervals based on document groupings. These
groups are determined by running a one-pass clustering algorithm on each shard
and then reducing each shard's clusters using an agglomerative
clustering algorithm.
This PR addresses #9572.
The shard-level clustering is done in one pass to minimize memory overhead. The
algorithm was lightly inspired by
[this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches
a small number of documents to sample the data and determine initial clusters.
Subsequent documents are then placed into one of these clusters, or a new one
if they are an outlier. This algorithm is described in more details in the
aggregation's docs.
At reduce time, a
[hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering)
algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304)
continually merges the closest buckets from all shards (based on their
centroids) until the target number of buckets is reached.
The final values produced by this aggregation are approximate. Each bucket's
min value is used as its key in the histogram. Furthermore, buckets are merged
based on their centroids and not their bounds. So it is possible that adjacent
buckets will overlap after reduction. Because each bucket's key is its min,
this overlap is not shown in the final histogram. However, when such overlap
occurs, we set the key of the bucket with the larger centroid to the midpoint
between its minimum and the smaller bucket’s maximum:
`min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to
increases the accuracy of the clustering.
Nodes are unable to share centroids during the shard-level clustering phase. In
the future, resolving https://github.com/elastic/elasticsearch/issues/50863
would let us solve this issue.
It doesn’t make sense for this aggregation to support the `min_doc_count`
parameter, since clusters are determined dynamically. The `order` parameter is
not supported here to keep this large PR from becoming too complex.
Relates to #53100
* use mapping source direcly instead of using mapper service to extract the relevant mapping details
* moved assertion to TimestampField class and added helper method for tests
* Improved logic that inserts timestamp field mapping into an mapping.
If the timestamp field path consisted out of object fields and
if the final mapping did not contain the parent field then an error
occurred, because the prior logic assumed that the object field existed.
This commit adapts the bwc version in preparation of the backport
to 7.x. The bwc tests are disabled in order to allow the merge of
#58299.
Relates #58299
The dangling_indices.import API name could cause issues in the client
libs because import is a reserved word in many languages. Rename the
API to avoid this, and rename the other APIs for consistency.
Related to #48366.
* Add index filtering in field capabilities API
This change allows to use an `index_filter` in the
field capabilities API. Indices are filtered from
the response if the provided query rewrites to `match_none`
on every shard:
````
GET metrics-*
{
"index_filter": {
"bool": {
"must": [
"range": {
"@timestamp": {
"gt": "2019"
}
}
}
}
}
````
The filtering is done on a best-effort basis, it uses the can match phase
to rewrite queries to `match_none` instead of fully executing the request.
The first shard that can match the filter is used to create the field
capabilities response for the entire index.
Closes#56195
This builds an `auto_date_histogram` aggregator that natively aggregates
from many buckets and uses it when the `auto_date_histogram` used to use
`asMultiBucketAggregator` which should save a significant amount of
memory in those cases. In particular, this happens when
`auto_date_histogram` is a sub-aggregator of a multi-bucketing aggregator
like `terms` or `histogram` or `filters`. For the most part we preserve
the original implementation when `auto_date_histogram` only collects from
a single bucket.
It isn't possible to "just port the aggregator" without taking a pretty
significant performance hit because we used to rewrite all of the
buckets every time we switched to a coarser and coarser rounding
configuration. Without some major surgery to how to delay sub-aggs
we'd end up rewriting the delay list zillions of time if there are many
buckets.
The multi-bucket version of the aggregator has a "budget" of "wasted"
buckets and only rewrites all of the buckets when we exceed that budget.
Now that we don't rebucket every time we increase the rounding we can no
longer get an accurate count of the number of buckets! So instead the
aggregator uses an estimate of the number of buckets to trigger switching
to a coarser rounding. This estimate is likely to be *terrible* when
buckets are far apart compared to the rounding. So it also uses the
difference between the first and last bucket to trigger switching to a
coarser rounding. Which covers for the shortcomings of the bucket
estimation technique pretty well. It also causes the aggregator to emit
fewer buckets in cases where they'd be reduced together on the
coordinating node. This is wonderful! But probably fairly rare.
All of that does buy us some speed improvements when the aggregator is
a child of multi-bucket aggregator:
Without metrics or time zone: 25% faster
With metrics: 15% faster
With time zone: 22% faster
Relates to #56487
* Disallow merging existing mapping field definitions in templates
This commit changes the merge strategy introduced in #55607 and #55982. Instead of overwriting these
fields, we now prevent them from being merged with an exception when a user attempts to
overwrite a field.
As part of this, a more robust validation has been added. The existing validation checked whether
templates (composable and component) were valid on their own, this new validation now checks that
the composite template (mappings/settings/aliases) is valid. This means that when a composable
template is added or updated, we confirm that it is valid with its component pieces. When a
component template is updated we ensure that all composable templates that make use of the component
template continue to be valid before allowing the component template to be updated.
This change also necessitated changes in the tests, however, I have left tests that exercise mapping
merging with nested object fields as `@AwaitsFix`, as we intend to change the behavior soon to allow
merging in a recursive-with-replacement fashion (see: #57393). I have added tests that check the new
disallowing behavior in the meantime.
* Use functional instead of imperative prefix collection
* Use IndexService.withTempIndexService
* Rename tests
* Fix tests
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Merges the remaining implementation of `significant_terms` into `terms`
so that we can more easilly make them work properly without
`asMultiBucketAggregator` which *should* save memory and speed them up.
Relates #56487
Relates: elastic/elasticsearch#55474
This commit updates the snapshot.delete.json REST API spec
to make snapshot a list type, now that it can accept a
list of comma-separated snapshot names
When the `terms` agg runs against strings and uses global ordinals it
has an optimization when it collects segments that only ever have a
single value for the particular string. This is *very* common. But I
broke it in #57241. This fixes that optimization and adds `debug`
information that you can use to see how often we collect segments of
each type. And adds a test to make sure that I don't break the
optimization again.
We also had a specialiation for when there isn't a filter on the terms
to aggregate. I had removed that specialization in #57241 which resulted
in some slow down as well. This adds it back but in a more clear way.
And, hopefully, a way that is marginally faster when there *is* a
filter.
Closes#57407
This saves some memory when the `histogram` aggregation is not a top
level aggregation by dropping `asMultiBucketAggregator` in favor of
natively implementing multi-bucket storage in the aggregator. For the
most part this just uses the `LongKeyedBucketOrds` that we built the
first time we did this.
Relates: elastic/elasticsearch#55014
This commit deprecates the local param in get_mapping.json.
This parameter is a no-op and field mappings are always retrieved locally.
When the `terms` enum operates on non-numeric data it can collect it via
global ordinals. It actually has two separate collection strategies for,
one "dense" and one "remapping". Each of *those* strategies has two
"iteration" strategies that it uses to build buckets, depending on
whether or not we need buckets with `0` docs in them. Previously this
was done with several `null` checks and never really explained. This
change replaces those checks with two `CollectionStrategy` classes which
have good stuff like documentation.
Today we already disallow negative values for the "from" parameter in the search
API when it is set as a request parameter and setting it on the
SearchSourceBuilder, but it is still parsed without complaint from a search
body, leading to differing exceptions later. This PR changes this behavior to be
the same regardless of setting the value directly, as url parameter or in the
search body. While we silently accepted "-1" as meaning "unset" and used the
default value of 0 so far, any negative from-value is now disallowed.
Closes#54897
Limit the creation of data streams only for namespaces that have a composable template with a data stream definition.
This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover.
Also remove `timestamp_field` parameter from create data stream request and
let the create data stream api resolve the timestamp field
from the data stream definition snippet inside a composable template.
Relates to #53100
This PR changes the name of the Index Template V2 classes to "Composable Templates", it also ensures there are no mentions of "V2" in the documentation or error/warning messages. V1 templates are referred to as "legacy" templates.
Resolves#56609
This saves memory when running numeric significant terms which are not
at the top level by merging its collection into numeric terms and relying
on the optimization that we made in #55873.
* Update track_total_hits to union type
This commit updates track_total_hits parameter type to a union
of boolean and number, to reflect the possible values that can
be passed.
* Update rest-api-spec/src/main/resources/rest-api-spec/api/search.json
Co-Authored-By: Karel Minarik <karel.minarik@gmail.com>
* Update rest-api-spec/src/main/resources/rest-api-spec/api/search.json
Co-authored-by: Karel Minarik <karel.minarik@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Changes:
* Adds API reference docs for the delete snapshot repo API.
* Corrects an error in the delete snapshot repo API spec. Comma-separated
repository names are not supported.
* Relocates the existing delete snapshot repo API example docs.
When `date_histogram` is a sub-aggregator it used to allocate a bunch of
objects for every one of it's parent's buckets. This uses the data
structures that we built in #55873 rework the `date_histogram`
aggregator instead of all of the allocation.
Part of #56487
In KeystoreWrapper class we determine if the error to decrypt a
given keystore is caused by a wrong password based on the exception
that the SunJCE implementation of AES is
throwing(AEADBadTagException). Other implementations from other
Security Providers fail with a different exception and as such we
cannot differentiate between a corrupted file and a wrong password
in a foolproof way.
As in other tests such as in
KeyStoreWrapperTests#testDecryptKeyStoreWithWrongPassword
we handle this by matching both possible exception messages.
This adds an API for simulating template composition with or without an index template.
It looks like:
```
POST /_index_template/_simulate/my-template
```
To simulate a template named `my-template` that already exists, or, to simulate a template that does
not already exist:
```
POST /_index_template/_simulate
{
"index_patterns": ["my-index"]
"composed_of": ["ct1", "ct2"],
}
```
This is related to #55686, which adds an API to simulate composition based on an index name (hence
the `_simulate_index` vs `_simulate`).
This commit also adds reference documentation for both simulation APIs.
Relates to #53101Resolves#56390Resolves#56255
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
This commit removes the ability to put V2 index templates that reference missing component templates.
It also prevents removing component templates that are being referenced by an existing V2 index
template.
Relates to #53101Resolves#56314
This adds a few things to the `breakdown` of the profiler:
* `histogram` aggregations now contain `total_buckets` which is the
count of buckets that they collected. This could be useful when
debugging a histogram inside of another bucketing agg that is fairly
selective.
* All bucketing aggs that can delay their sub-aggregations will now add
a list of delayed sub-aggregations. This is useful because we
sometimes have fairly involved logic around which sub-aggregations get
delayed and this will save you from having to guess.
* Aggregtations wrapped in the `MultiBucketAggregatorWrapper` can't
accurately add anything to the breakdown. Instead they the wrapper
adds a marker entry `"multi_bucket_aggregator_wrapper": true` so we
can be quickly pick out such aggregations when debugging.
It also fixes a bug where `_count` breakdown entries were contributing
to the overall `time_in_nanos`. They didn't add a large amount of time
so it is unlikely that this caused a big problem, but I was there.
To support the arbitrary breakdown data this reworks the profiler so
that the `breakdown` can contain any data that is supported by
`StreamOutput#writeGenericValue(Object)` and
`XContentBuilder#value(Object)`.
This commit allows the JSON schema's documentation.url property to have a null value.
This can useful for cases where a feature is under development, and does not have
documentation published yet.
This commit also adds a documentation.url for two ml resources.
This commit adds the ability to auto create data streams using index templates v2.
Index templates (v2) now have a data_steam field that includes a timestamp field,
if provided and index name matches with that template then a data stream
(plus first backing index) is auto created.
Relates to #53100
This commit removes the `prefer_v2_templates` flag and setting. This was a brief setting that
allowed specifying whether V1 or V2 template should be used when an index is created. It has been
removed in favor of V2 templates always having priority.
Relates to #53101Resolves#56528
This is not a breaking change because this flag was never in a released version.
`auto_date_histogram` was returning the incorrect `interval` because
of a combination of two things:
1. When pipeline aggregations rewrote `auto_date_histogram` we reset the
interval to 1. Oops. Fixed that.
2. *Every* bucket aggregation was rewriting its buckets as though there
was a pipeline aggregation even if there aren't any. This is a bit
silly so we skip that too.
Closes#56116
The rest spec and documentation for _cat/threadpool supports a "size" parameter.
However, the "size" parameter will have no impact since there are no values
of type "SizeValue" of the return value of this _cat api.
This commit removes the "size" param from the spec and documentation.
This commit also adds support for the "time" param since and support to
format the time param for the "keep_alive" column. By default, the output
should not change since the "TimeValue" rendered default (via RestTable)
is toString(), and the code prior to this also called toString().
closes#54478
This removed the specification of `order` as it is not a parameter of the
v2 put template api (the priority is the equivalent of `order` and is
defined in the body) and add a bit of description for the `cause` parameter
(which is currently used as a cluster update task tracking)
This adds a new api to simulate matching the given index name against the
index templates in the system.
The syntax for the new API takes the following form:
POST _index_template/_simulate_index/{index_name}
{
"index_patterns": ["logs-*"],
"priority": 15,
"template": {
"settings": {
"number_of_shards": 3
}
...
}
}
Where the body is optional, but we support the entire body used by the
PUT _index_template/{name} api. When the body is specified we'll simulate
matching the given index against a system that'd have the given index
template together with the index templates that exist in the system.
The response, in both cases, will return the matching template's resolved
settings, mappings and aliases, together with a special field that'll print any
overlapping templates and their corresponding index patterns.
This adds support for V2 index templates to the cat templates API. It uses the `order` field as
priority in order not to break compatibility, while adding the `composed_of` field to show component
templates that are used from an index template.
Relates to #53101
* Aggs must specify a `field` or `script` (or both)
This adds a validation to VSParserHelper to ensure that a field or
script or both are specified by the user. This is technically
required today already, but throws an exception much deeper
in the agg framework and has a very unintuitive error for the user
(as well as eating more resources instead of failing early)
* Fix StringStats test
* Add yaml test
* Skip test on older versions
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
A JSON schema was recently introduced for the REST API specification. #54252
This PR introduces a 3rd party validation tool to ensure that the
REST specification conforms to the schema.
The task is applied to the 3 projects that contain REST API specifications.
The plugin wires this task into the precommit commit task, and should be
considered as part of the public API for the build tools for any plugin
developer to contribute their plugin's specification.
An ignore parameter has been introduced for the task to allow specific
file to be ignored from the validation. The ignored files in this PR
will soon get issues logged and a link so they can be fixed.
Closes#54314
This commit adds a new querystring parameter on the following APIs:
- Index
- Update
- Bulk
- Create Index
- Rollover
These APIs now support a `?prefer_v2_templates=true|false` flag. This flag changes the preference
creation to use either V2 index templates or V1 templates. This flag defaults to `false` and will be
changed to `true` for 8.0+ in subsequent work.
Additionally, setting this flag internally sets the `index.prefer_v2_templates` index-level setting.
This setting is used so that actions that automatically create a new index (things like rollover
initiated by ILM) will inherit the preference from the original index. This setting is dynamic so
that a transition from v1 to v2 templates can occur for long-running indices grouped by an alias
performing periodic rollover.
This also adds support for sending this parameter to the High Level Rest Client.
Relates to #53101
The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used.
In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs.
Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex().
If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api).
Relates to #53100
Modify the value of nowInMillis in queryShardContext to current timestamp, because the
value will be used lately when validating the filtered alias which uses now in a date_nanos
range query.
Closes#54315