These statements come off a little too strongly towards "don't use data streams if you *ever* have updates", but they do support updates when necessary, as long as the backing indices are used.
Adds support for the aggs request body parameter to the Query API Key Information API.
This parameter works identically to the well known eponymous parameter of the _search endpoint,
but the set of allowed aggregation types as well as the field names allowed is restricted.
If you start up a freshly-unpacked Elasticsearch tarball, security
auto-configuration will set `http.host: 0.0.0.0` in `elasticsearch.yml`,
overriding the documented default behaviour which is to fall back to
`network.host` which itself defaults to `localhost`. This commit adds a
note to the docs about this.
We are adding a query parameter to the field_caps api in order to filter out
fields with no values. The parameter is called `include_empty_fields` and
defaults to true, and if set to false it will filter out from the field_caps
response all the fields that has no value in the index.
We keep track of FieldInfos during refresh in order to know which field has
value in an index. We added also a system property
`es.field_caps_empty_fields_filter` in order to disable this feature if needed.
---------
Co-authored-by: Matthias Wilhelm <ankertal@gmail.com>
To improve cross-cluster search user experience, Kibana needs an endpoint that is accessible
by arbitrary Kibana dashboard search users and provides:
1. a listing of clusters in scope for a CCS query (based on the index expression and whether
there are any indices on each cluster that the Kibana user has access to query).
2. whether that cluster is currently connected to the querying cluster (will it come back as
skipped or failed in a CCS search)
3. showing the skip_unavailable setting for those clusters (so you can know whether it will
return skipped or failed in a CCS search)
4. the ES version of the cluster
Since no single Elasticsearch endpoint provides all of these features, this PR creates a new endpoint `_resolve/cluster` that works along side the existing `_resolve/index` endpoint
(and leverages some of its features).
Example usage against a cluster with 2 remote clusters configured:
GET /_resolve/cluster/*,remote*:bl*
Response:
{
"(local)": {
"connected": true,
"skip_unavailable": false,
"matching_indices": true,
"version": {
"number": "8.12.0-SNAPSHOT",
"build_flavor": "default",
"minimum_wire_compatibility_version": "7.17.0",
"minimum_index_compatibility_version": "7.0.0"
}
},
"remote2": {
"connected": true,
"skip_unavailable": true,
"matching_indices": true,
"version": {
"number": "8.12.0-SNAPSHOT",
"build_flavor": "default",
"minimum_wire_compatibility_version": "7.17.0",
"minimum_index_compatibility_version": "7.0.0"
}
},
"remote1": {
"connected": true,
"skip_unavailable": false,
"matching_indices": false,
"version": {
"number": "8.12.0-SNAPSHOT",
"build_flavor": "default",
"minimum_wire_compatibility_version": "7.17.0",
"minimum_index_compatibility_version": "7.0.0"
}
}
}
Almost all errors show up as "error" entries in the response.
Only the local SecurityException returns a 403 since that happens before the ResolveCluster
Transport code kicks in.
This PR extends the repository integrity health indicator to cover also unknown and invalid repositories. Because these errors are local to a node, we extend the `LocalHealthMonitor` to monitor the repositories and report the changes in their health regarding the unknown or invalid status.
To simplify this extension in the future, we introduce the `HealthTracker` abstract class that can be used to create new local health checks.
Furthermore, we change the severity of the health status when the repository integrity indicator reports unhealthy from `RED` to `YELLOW` because even though this is a serious issue, there is no user impact yet.
This adds some docs to the top of `docs.csv-spec` and
`docs-IT_tests_only.csv-spec` telling folks not to add more stuff there
and instead put new examples into whatever files they line up with. It
also shifts some things out of the file to "prime the pump" on cleaning
it up.
Since these docs were originally written there have been a couple
of changes:
1. We now support aarch64 as well as x86_64, so the SSE4.2 guidance
needed clarification.
2. ML is more deeply embedded into Elasticsearch functionality
across nodes that are not ML nodes. For example, ingest pipelines
now routinely use ML, and, in the near future, index mappings
will too in the form of semantic text. Although we cannot mandate
that xpack.ml.enabled is set uniformly across the cluster, as
that would be a breaking change, we should say ever more strongly
that ML must be enabled on all nodes if all ML functionality is to
work correctly. The primary reason for wanting to disable ML is
hardware incompatibility, and if ML is disabled for that reason
then it should not be used at all.
Makes the task_type element of the _inference API optional so that
it is possible to GET, DELETE or POST to an inference entity without
providing the task type
This adds two new vector index types: - flat - int8_flat
Both store the vectors in a flat space and search is brute-force over
the vectors in the index. For the regular `flat` index, this can be
considered syntactic sugar that allows `knn` queries without having to
put indices within HNSW.
For `int8_flat`, this allows float vectors to be stored in a flat
manner, but also automatically quantized.
* [DOCS] Add link to on-prem install tutorial
* Move link to bottom of packages section
* Rearrange things according to suggestions
* Add another link on the 'Install Elasticsearch with RPM' page
Today the docs on balancing settings describe what the settings all do
but offer little guidance about how to configure them. This commit adds
some extra detail to avoid some common misunderstandings and reorders
the docs a little so that more commonly-adjusted settings are mentioned
earlier.
A number of aggregations that rely on deferred collection don't work
with time series index searcher and will produce incorrect result. These
aggregation usages should fail. The documentation has been updated to
describe these limitations.
In case of multi terms aggregation, the depth first collection is
forcefully used when time series aggregation is used. This behaviour is
inline with the terms aggregation.
* Document nested expressions for stats
* More docs
* Apply suggestions from review
- count-distinct.asciidoc
- Content restructured, moving the section about approximate counts to end of doc.
- count.asciidoc
- Clarified that omitting the `expression` parameter in `COUNT` is equivalent to `COUNT(*)`, which counts the number of rows.
- percentile.asciidoc
- Moved the note about `PERCENTILE` being approximate and non-deterministic to end of doc.
- stats.asciidoc
- Clarified the `STATS` command
- Added a note indicating that individual `null` values are skipped during aggregation
* Comment out mentioning a buggy behavior
* Update sum with inline function example, update test file
* Fix typo
* Delete line
* Simplify wording
* Fix conflict fix typo
---------
Co-authored-by: Liam Thompson <leemthompo@gmail.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
This adds support for the `match` query type to the Query API key Information API.
Note that since string values associated to API Keys are mapped as `keywords`,
a `match` query with no analyzer parameter is effectively equivalent to a `term` query
for such fields (e.g. `name`, `username`, `realm_name`).
Relates: #101691
Deprecated node_version field, made it optional(unused) in new parser
Added deprecation warning handler for mixed cluster
Split tests for old vs. current format
The docs for forced awareness indicate that no replicas will be assigned
until all zones are available, which is definitely undesirable and also
not the actual behaviour. This commit fixes the wording to match what
really happens.
Closes#104777
The ILM `_explain` API displays the user-configured rollover action with the user-defined rollover conditions. However, the implicit conditions were not visible anywhere (except in the documentation and some debug-level logging).
The ILM `_explain` API example responses included human-readable (time) fields, but the example URL didn't actually return those human-readable fields (only the raw fields).