Currently Lucene limits the max number of vector dimensions to 1024.
This commit overrides KnnFloatVectorField and KnnByteVectorField
classes to increase the limit to 2048 for indexed vectors in ES.
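A minimal sketch of the override pattern, not the actual Java change: the class names below are hypothetical stand-ins, and only the two limits (1024 and 2048) come from this message.

```python
LUCENE_MAX_DIMS = 1024  # limit stated in this commit message
ES_MAX_DIMS = 2048      # raised limit for indexed vectors in ES


class VectorField:
    """Hypothetical stand-in for a vector field class with a fixed limit."""

    max_dims = LUCENE_MAX_DIMS

    def __init__(self, name, vector):
        # Reject vectors that exceed the limit of the concrete class.
        if len(vector) > self.max_dims:
            raise ValueError(
                f"field [{name}] has {len(vector)} dimensions, max is {self.max_dims}"
            )
        self.name = name
        self.vector = list(vector)


class EsVectorField(VectorField):
    """Hypothetical override that only lifts the dimension limit."""

    max_dims = ES_MAX_DIMS
```

With this shape the base class still rejects a 1025-dimension vector, while the override accepts anything up to 2048 dimensions.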
This changes the `GET _data_stream/ds_name/_lifecycle` endpoint to
return the data stream name even if it doesn't have a lifecycle
configured.
e.g.
```
{
  "data_streams": [
    {
      "name": "logs-nginx"
    }
  ]
}
```
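The behaviour can be modelled roughly as below. This is a sketch only: the function name is hypothetical, and the `lifecycle` key used for data streams that do have a lifecycle is an assumption, since the example above only shows the case without one.

```python
def lifecycle_response(data_streams):
    """Build a response body; `data_streams` maps name -> lifecycle dict or None."""
    entries = []
    for name, lifecycle in data_streams.items():
        entry = {"name": name}  # the name is always returned
        if lifecycle is not None:
            entry["lifecycle"] = lifecycle  # assumed key, not shown in the example
        entries.append(entry)
    return {"data_streams": entries}
```

Calling `lifecycle_response({"logs-nginx": None})` yields the body shown in the example above.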
* Fix xcontent and tests
* Update docs
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This removes a redundant `lifecycle` field in the `PUT _lifecycle`
request.
Before we had
```
PUT _data_stream/logs-nginx/_lifecycle
{
  "lifecycle": {
    "data_retention": "7d"
  }
}
```
This changes the request to
```
PUT _data_stream/logs-nginx/_lifecycle
{
  "data_retention": "7d"
}
```
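For callers that still build the old body, a small client-side helper along these lines could unwrap it. The helper name is hypothetical and it is not part of this change.

```python
def to_new_lifecycle_body(body):
    """Unwrap the old `{"lifecycle": {...}}` body into the new flat form."""
    if set(body) == {"lifecycle"} and isinstance(body["lifecycle"], dict):
        return dict(body["lifecycle"])
    # Already in the new form: return a copy unchanged.
    return dict(body)
```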
Here we add synthetic source support for fields whose type is flattened.
Note that flattened fields and synthetic source have the following limitations,
all arising from the fact that in synthetic source we just see key/value pairs
when reconstructing the original object and have no type information in mappings:
* flattened fields use sorted set doc values of keywords, which means two things:
first we do not allow duplicate values, second we treat all values as keywords
* reconstructing an array of objects results in nested objects (no array)
* reconstructing arrays with just one element results in a single-valued field since we
have no way to distinguish single-valued from multi-valued fields other than looking
at the count of values
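The limitations above can be illustrated with a small model. This is a sketch under simplifying assumptions, not the actual implementation: a Python set of (path, keyword) pairs stands in for sorted set doc values, and both function names are hypothetical.

```python
def flatten(value, prefix=""):
    """Collect (path, keyword) pairs as a set, mimicking sorted set doc values."""
    pairs = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            pairs |= flatten(child, path)
    elif isinstance(value, list):
        for item in value:
            pairs |= flatten(item, prefix)  # array structure is not recorded
    else:
        pairs.add((prefix, str(value)))  # every value becomes a keyword
    return pairs


def reconstruct(pairs):
    """Rebuild an object from key/value pairs only, with no type information."""
    grouped = {}
    for path, keyword in sorted(pairs):
        grouped.setdefault(path, []).append(keyword)
    root = {}
    for path, values in grouped.items():
        *parents, leaf = path.split(".")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        # Only the value count tells single-valued from multi-valued fields.
        node[leaf] = values[0] if len(values) == 1 else values
    return root


original = {
    "tags": ["b", "a", "b"],                  # duplicate value
    "status": [200],                          # one-element array, numeric
    "hosts": [{"name": "x"}, {"name": "y"}],  # array of objects
}
rebuilt = reconstruct(flatten(original))
```

Here `rebuilt` is `{"hosts": {"name": ["x", "y"]}, "status": "200", "tags": ["a", "b"]}`: the duplicate tag is gone, the number came back as a string, the one-element array collapsed to a single value, and the array of objects became one nested object.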
* Update release notes to include 8.7.0
Ports the release notes and migration guide from the 8.7.0 release into main
and regenerates the 8.8.0 release notes. The regenerated 8.8.0 notes will
be overwritten anyway, multiple times, by more up-to-date regenerations
during the release process.
* Remove coming 8.7.0 line
* Update docs/reference/migration/migrate_8_7.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
* Make same change to 8.8
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
With this PR we introduce CRUD endpoints that update or delete the data lifecycle at the data stream level. When the lifecycle is updated, the change applies at the next DLM run to all the backing indices that are managed by DLM.
Explains why you should remove `cluster.initial_master_nodes`, and
rewords some of the other sections a little for (subjectively) improved
readability.
This adds a new parameter to `knn` that allows filtering nearest neighbor results that are outside a given similarity.
`num_candidates` and `k` are still required because they control the accuracy and exploration of the nearest-neighbor vector search. For each shard, the query searches `num_candidates` vectors, keeps only those within the provided `similarity` boundary, and then reduces the results to the global top `k` as normal.
For example, when using the `l2_norm` indexed similarity value, this could be considered a `radius` post-filter on `knn`.
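The flow described above can be sketched with a brute-force model. This is illustrative only: it assumes `l2_norm`, where `similarity` acts as a maximum distance, and the function names and the shard layout (a dict of document id to vector) are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_search(shards, query, k, num_candidates, similarity=None):
    """Per-shard candidates, similarity post-filter, then global top k."""
    hits = []
    for shard in shards:
        scored = sorted(
            (l2_distance(vec, query), doc_id) for doc_id, vec in shard.items()
        )
        candidates = scored[:num_candidates]  # explore num_candidates per shard
        if similarity is not None:
            # Keep only candidates inside the radius.
            candidates = [(d, doc_id) for d, doc_id in candidates if d <= similarity]
        hits.extend(candidates)
    # Reduce to the global top k, nearest first.
    return [doc_id for _, doc_id in sorted(hits)[:k]]


shards = [
    {"a": [0.0, 0.0], "b": [3.0, 4.0]},
    {"c": [1.0, 0.0], "d": [10.0, 10.0]},
]
within_radius = knn_search(shards, [0.0, 0.0], k=3, num_candidates=2, similarity=2.0)
```

With a radius of 2.0 only `a` and `c` survive, so fewer than `k` hits are returned; without `similarity` the same call returns `a`, `c`, and `b`.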
relates to: https://github.com/elastic/elasticsearch/issues/84929 && https://github.com/elastic/elasticsearch/pull/93574
* Add note about ENV for systemd installs
The ES_TMPDIR env variable is not referenced in the /guide/en/elasticsearch/reference/current/setting-system-settings.html#sysconfig section, so someone searching for the error mentioned on this page may not find the fix easily.
* Restructure
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
- No need to use an `AsyncShardFetch` here, there is no caching
- Response may be very large, introduce chunking
- Fan-out may be very large, introduce throttling
- Processing time may be nontrivial, introduce cancellability
- Eliminate many unnecessary intermediate data structures
- Do shard-level response processing more eagerly
- Determine allocation from `RoutingTable` not `RoutingNodes`
- Add tests
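
A rough sketch of how chunking, throttling and cancellability fit together, using a thread pool as a stand-in. The function and parameter names are hypothetical and this is not the transport action's actual implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_all(nodes, fetch, max_in_flight, chunk_size, cancelled):
    """Yield per-node results in chunks, with bounded fan-out and a cancel check."""
    chunk = []
    # At most max_in_flight fetches run concurrently; the rest wait in a queue.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        for result in pool.map(fetch, nodes):
            if cancelled.is_set():
                raise RuntimeError("task cancelled")
            chunk.append(result)  # process each result as soon as it arrives
            if len(chunk) == chunk_size:
                yield chunk  # emit a chunk instead of one large response
                chunk = []
    if chunk:
        yield chunk


chunks = list(fetch_all(range(5), lambda n: n * 2, 2, 2, threading.Event()))
```

In this simplified model, cancellation stops response processing but queued fetches still finish before the pool shuts down.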
Relates #81081
* [DOCS] Describe how to use Elastic Agent to monitor Elasticsearch
* Temporarily fix doc build
* Add question about showing Elastic Agent metrics in the monitoring UI
* Apply changes from review
* Activate link to Kibana docs
* Fix broken link
* Update docs/reference/monitoring/indices.asciidoc
This change adds a new REST parameter called `rest_include_named_queries_score` that, when set, includes the score of each named query that matched the document.
Note that with this change, the score of named queries is always returned when using the transport client. The REST layer can set the format of
the matched_queries section for BWC (kept as is by default).
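The two formats can be modelled as below. This is a sketch: the function name is hypothetical, and rendering the scored form as a name-to-score map is an assumption that this message does not spell out.

```python
def render_matched_queries(matched, include_named_queries_score=False):
    """`matched` maps each matching named query to its score."""
    if include_named_queries_score:
        return dict(matched)  # assumed scored form: name -> score
    return list(matched)      # default form: names only, kept for BWC
```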
Closes #65563
`runtime_mappings` is the name of the param in the search request. In the
document `put` statement, it's called `runtime`
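
To make the naming difference concrete, here are the two request bodies as Python dicts. The field name and script are illustrative only, and the second body assumes the `put` in question is a mapping update.

```python
# Illustrative runtime field definition (hypothetical field name and script).
runtime_field = {
    "host_copy": {
        "type": "keyword",
        "script": {"source": "emit(doc['host.name'].value)"},
    }
}

# Search request body: runtime fields go under `runtime_mappings`.
search_body = {"runtime_mappings": runtime_field, "query": {"match_all": {}}}

# Mapping update body: the same definition goes under `runtime`.
mapping_body = {"runtime": runtime_field}
```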
Co-authored-by: Matthew Hinea <matthew.hinea@gmail.com>