Commit Graph

3214 Commits

Author SHA1 Message Date
elasticsearchmachine 28b4d081cb Merge pull request ESQL-1050 from elastic/main
🤖 ESQL: Merge upstream
2023-04-22 01:11:31 -04:00
Nhat Nguyen 55049fd10a
Wait for events in sort segments YAML test (#95478)
This PR is similar to #46586.

When waiting for no initializing shards we also have to wait for events
when we have more than one node in the cluster. When the primary is
started, there is a short period of time where neither the primary nor
any of the replicas are initializing.
2023-04-22 00:31:59 -04:00
elasticsearchmachine e05d9a53d5 Merge pull request ESQL-1038 from elastic/main
🤖 ESQL: Merge upstream
2023-04-20 13:15:29 -04:00
Salvatore Campagna a721fcf80f
Fix error message and unmute test (#95411) 2023-04-20 13:41:20 +02:00
Salvatore Campagna e560b81754
mute: yaml test with incorrect error message (#95407)
Mute test.

See #95406
2023-04-20 06:54:18 -04:00
Mary Gouseti a432aa8df2
Revert "mute multiple tests (#95145)" (#95402) 2023-04-20 12:40:59 +02:00
Salvatore Campagna 1c609ac0f7
Support flattened fields as time series dimension fields (#95273)
Normally dimension fields are identified by means of a boolean parameter
at mapping time, time_series_dimension. Flattened fields do not have mappings,
other than identifying the top level field as a flattened field type. Moreover a boolean
is not enough to identify the top-level field as a dimension since we would like
users to be able to specify a subset of the fields in the flattened field to be dimensions
(not necessarily all of them). For this reason we introduce a new mapping parameter,
time_series_dimensions, which lists the fields, in any order, in the flattened field
that the user wants as dimensions. Field names must not include the root field name
and their name is the relative path from the root down to the leaf field name.

We require flattened fields to be indexed, to have doc values and disallow usage
of the ignore_above parameter together with time_series_dimensions.
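
The rules above can be sketched as a small check over a mapping. This is only an illustration: the field name `attributes`, the helper `validate_flattened_dimensions`, and the error wording are hypothetical, and the code mirrors the description rather than the actual mapper implementation.

```python
def validate_flattened_dimensions(root_name, field_mapping):
    """Check a flattened field mapping against the constraints described above.

    Hypothetical helper: it follows the prose rules, not the real mapper code.
    """
    dimensions = field_mapping.get("time_series_dimensions", [])
    if not dimensions:
        return []
    errors = []
    # Flattened fields used as dimensions must be indexed and have doc values.
    if field_mapping.get("index", True) is False:
        errors.append("field must be indexed")
    if field_mapping.get("doc_values", True) is False:
        errors.append("field must have doc values")
    # ignore_above cannot be combined with time_series_dimensions.
    if "ignore_above" in field_mapping:
        errors.append("ignore_above is not allowed with time_series_dimensions")
    # Paths are relative to the root field, so they must not start with its name.
    for path in dimensions:
        if path == root_name or path.startswith(root_name + "."):
            errors.append(f"dimension '{path}' must not include the root field name")
    return errors


mapping = {
    "properties": {
        "attributes": {  # hypothetical field name
            "type": "flattened",
            "time_series_dimensions": ["host.name", "region"],
        }
    }
}

print(validate_flattened_dimensions("attributes", mapping["properties"]["attributes"]))
```

Here `host.name` and `region` are relative paths inside `attributes`, so the check reports no errors.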
2023-04-20 10:57:45 +02:00
Ievgen Degtiarenko c2c0ced9b1
Reset desired balance (#94525)
This introduces an endpoint to reset the desired balance.
It can be used when the computed balance has diverged significantly from the actual one,
to start a new computation from the current state.
2023-04-20 08:03:48 +02:00
elasticsearchmachine 240a526d8d Merge pull request ESQL-1034 from elastic/main
🤖 ESQL: Merge upstream
2023-04-19 13:16:54 -04:00
Carlos Delgado cf284036f5
Set Search Application API stability as "experimental" (#95379) 2023-04-19 18:10:12 +02:00
elasticsearchmachine 26ff9657b8 Merge pull request ESQL-1032 from elastic/main
🤖 ESQL: Merge upstream
2023-04-19 01:14:43 -04:00
Lee Hinman 947279445b
Add Watcher APIs for updating/retrieving settings (#95342)
The `.watches` index is a system index, which means that its settings
cannot be modified by the user. This commit adds APIs (`PUT
/_watcher/settings` and `GET /_watcher/settings`) that allow modifying
and retrieving a subset of index settings for the `.watches` index.

The settings that are currently allowed are `index.number_of_replicas`
and `index.auto_expand_replicas`, though more may be added in the
future.
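
A minimal sketch of how a client might prepare such a request, assuming the body is a flat JSON object of setting names to values. The helper `build_watcher_settings_request` and the client-side pre-check are hypothetical; the server does its own validation.

```python
import json

# The two settings named above; more may be allowed in the future.
ALLOWED_WATCHER_SETTINGS = {"index.number_of_replicas", "index.auto_expand_replicas"}


def build_watcher_settings_request(settings):
    """Return (method, path, body) for updating `.watches` index settings.

    Hypothetical client-side helper: it rejects settings that are not in the
    allowed subset before any request is sent.
    """
    rejected = sorted(set(settings) - ALLOWED_WATCHER_SETTINGS)
    if rejected:
        raise ValueError(f"settings not allowed for .watches: {rejected}")
    return "PUT", "/_watcher/settings", json.dumps(settings)


method, path, body = build_watcher_settings_request({"index.auto_expand_replicas": "0-4"})
print(method, path, body)
```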

Resolves https://github.com/elastic/elasticsearch/issues/92991
2023-04-18 17:22:46 -04:00
elasticsearchmachine 3d26f6193b Merge pull request ESQL-1027 from elastic/main
🤖 ESQL: Merge upstream
2023-04-18 13:18:14 -04:00
Hendrik Muhs 3524836311
[ML] integrate model download into put trained model API (#95281)
This PR enables downloading packaged models from `ml-models.elastic.co`,
an endpoint provided by Elastic. Elastic provided models begin with a
`.`, which is a private namespace that does not interfere with user
models (the `.` prefix is disallowed for them). If a user puts a packaged
model, the model gets automatically downloaded. For air-gapped
environments it is possible to load models from a file.

earlier changes: #95175, #95207
2023-04-18 03:29:45 -04:00
David Kyle 267e74ff25
[ML] Start, stop and infer with deployment ID (#95168)
A trained model deployment can be started with an optional deployment ID.
Deployment IDs and model IDs are considered to be in the same namespace
and must be unique: a deployment ID cannot be the same as any other deployment
or model ID unless it is the same as the ID of the model being deployed. When
creating a new model, the ID cannot match any existing model or deployment.
Salvatore Campagna 726f2c4c37
Test time series index routing path corner cases (#95233) 2023-04-18 08:45:30 +02:00
elasticsearchmachine 6cd8f2a22d Merge pull request ESQL-1013 from elastic/main
🤖 ESQL: Merge upstream
2023-04-13 13:16:47 -04:00
Kathleen DeRusso 28e6d64ec3
Initial Search Application Search API with templates (#95026)
---------

Co-authored-by: Ioana Tagirta <ioana.tagirta@elastic.co>
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
Co-authored-by: cdelgado <carlos.delgado@elastic.co>
Co-authored-by: Sloane Perrault <sloane.perrault@gmail.com>
2023-04-13 11:56:27 +02:00
elasticsearchmachine 03885e527c Merge pull request ESQL-1008 from elastic/main
🤖 ESQL: Merge upstream
2023-04-12 13:19:03 -04:00
Aurélien FOUCRET 46f5ee4fd7
Behavioral Analytics - Fix failing YAML tests on main (#95193)
* Fix failing tests on main
2023-04-12 17:11:32 +02:00
Aurélien FOUCRET 675163b155
[Enterprise Search][Behavioral Analytics] Events ingest API (#95027) 2023-04-12 15:13:19 +02:00
elasticsearchmachine 46d2d55c81 Merge pull request ESQL-1001 from elastic/main
🤖 ESQL: Merge upstream
2023-04-11 13:17:10 -04:00
elasticsearchmachine ea0cc7abfc Merge pull request ESQL-1000 from elastic/main
🤖 ESQL: Merge upstream
2023-04-11 11:38:36 -04:00
Seth Michael Larson 5e4d703009
Use 'indices' namespace for the Explain Data Lifecycle API 2023-04-11 10:17:38 -05:00
Salvatore Campagna be5406a956
mute multiple tests (#95145) 2023-04-11 15:05:35 +02:00
Salvatore Campagna 0eeef45ea2
Synthetic source support for flattened fields (#94842)
Here we add synthetic source support for fields whose type is flattened.
Note that flattened fields and synthetic source have the following limitations,
all arising from the fact that in synthetic source we just see key/value pairs
when reconstructing the original object and have no type information in mappings:

* flattened fields use sorted set doc values of keywords, which means two things: 
   first we do not allow duplicate values, second we treat all values as keywords
* reconstructing array of objects results in nested objects (no array)
* reconstructing arrays with just one element results in a single-value field since we
   have no way to distinguish single-valued from multi-valued fields other than looking
   at the count of values
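
The first and last limitations can be approximated in a few lines. This is a toy model, not the actual reconstruction code: `synthetic_flattened_values` is a hypothetical name and `str()` only stands in for keyword conversion.

```python
def synthetic_flattened_values(values):
    """Approximate what comes back for one flattened key under synthetic source."""
    # Sorted set doc values: every value becomes a keyword (string), duplicates
    # are dropped, and the order follows the sorted terms.
    stored = sorted({str(v) for v in values})
    # With a single stored value there is no way to tell it came from an array.
    return stored[0] if len(stored) == 1 else stored


print(synthetic_flattened_values(["b", "a", "b"]))
print(synthetic_flattened_values([42]))
```

The first call loses the duplicate `"b"` and the original order; the second returns a single string rather than a one-element array of numbers.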
2023-04-11 10:54:28 +02:00
elasticsearchmachine edf9932b28 Merge pull request ESQL-992 from elastic/main
🤖 ESQL: Merge upstream
2023-04-10 13:14:09 -04:00
Seth Michael Larson 68bcbb50b7
Align search application and behavioral analytics APIs with current guidelines 2023-04-10 11:02:39 -05:00
elasticsearchmachine c691cf8933 Merge pull request ESQL-983 from elastic/main
🤖 ESQL: Merge upstream
2023-04-07 01:19:06 -04:00
Josh Mock 282eb771e3
Remove broken URL for cat.component_templates API (#94993)
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-04-06 13:26:13 -05:00
elasticsearchmachine 0c752ffa9b Merge pull request ESQL-979 from elastic/main
🤖 ESQL: Merge upstream
2023-04-05 13:17:12 -04:00
Jim Ferenczi ea14f15633
Update rest api spec for ent-search module (#95020)
This change sets the stability of ent-search APIs to beta and visibility to public.
It also removes the feature flag link since enabling the module is not considered as a feature flag
and the module is enabled by default.
2023-04-05 13:56:33 +01:00
elasticsearchmachine 0d2c6fd45c Merge pull request ESQL-968 from elastic/main
🤖 ESQL: Merge upstream
2023-04-04 13:28:26 -04:00
Mary Gouseti 99145bbe9c
Add new endpoints to configure data lifecycle on a data stream level. (#94590)
With this PR we introduce CRUD endpoints which update/delete the data lifecycle on the data stream level. When this is updated it will apply at the next DLM run to all the backing indices that are managed by DLM.
2023-04-04 18:37:38 +02:00
elasticsearchmachine 960a923819 Merge pull request ESQL-965 from elastic/main
🤖 ESQL: Merge upstream
2023-04-04 11:28:33 -04:00
Carlos Delgado 7a782031c4
Add Enterprise Search Module (#94381)
Create base module for ent-search with CRUD APIs to manage behavioral analytics and Search Application.

---------

Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
Co-authored-by: Aurélien FOUCRET <aurelien.foucret@gmail.com>
Co-authored-by: Kathleen DeRusso <63422879+kderusso@users.noreply.github.com>
Co-authored-by: Ioana Tagirta <ioanatia@users.noreply.github.com>
Co-authored-by: Joseph McElroy <joseph.mcelroy@elastic.co>
Co-authored-by: Aurelien FOUCRET <aurelien.foucret@elastic.co>
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2023-04-04 10:36:13 +01:00
Mary Gouseti 1af2f438a3
[DLM] Extend the simulate template API to support include defaults (#94861) 2023-04-04 11:25:35 +02:00
elasticsearchmachine 118f164e2e Merge pull request ESQL-957 from elastic/main
🤖 ESQL: Merge upstream
2023-03-31 13:20:05 -04:00
Alan Woodward 093e36c875
Introduce DocumentParsingException (#92646)
Document parsing methods currently throw MapperParsingException. This
isn't very helpful, as it doesn't contain any information about where the parse
error happened - it is designed for parsing mappings, which are realised into
java maps before being examined. This commit introduces a new exception
specifically for document parsing that extends XContentException, so that
it reports the current position of the parser as part of its error message.

Fixes #85083
2023-03-31 12:14:19 +01:00
elasticsearchmachine 91cac568cf Merge pull request ESQL-956 from elastic/main
🤖 ESQL: Merge upstream
2023-03-30 23:29:15 -04:00
Seth Michael Larson 79b0b834cf
Make id parameter optional in logstash.get_pipeline API 2023-03-30 15:29:24 -05:00
elasticsearchmachine 21fec823e5 Merge pull request ESQL-942 from elastic/main
🤖 ESQL: Merge upstream
2023-03-29 01:22:00 -04:00
Benjamin Trent f23b906891
Add new `similarity` field to `knn` clause in `_search` (#94828)
This adds a new parameter to `knn` that allows filtering nearest neighbor results that are outside a given similarity.

`num_candidates` and `k` are still required as this controls the nearest-neighbor vector search accuracy and exploration. For each shard the query will search `num_candidates` and only keep those that are within the provided `similarity` boundary, and then finally reduce to only the global top `k` as normal.

For example, when using the `l2_norm` indexed similarity value, this could be considered a `radius` post-filter on `knn`.
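
The flow described above can be sketched with a toy model that uses exact distances in place of the approximate vector search. The function names and data layout are hypothetical; with `l2_norm` the `similarity` value acts as a maximum distance.

```python
import math


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_with_similarity(shards, query, k, num_candidates, similarity):
    """Toy model: `shards` is a list of lists of (doc_id, vector) pairs."""
    merged = []
    for docs in shards:
        # Each shard explores its `num_candidates` nearest vectors ...
        scored = sorted((l2_distance(vec, query), doc_id) for doc_id, vec in docs)
        candidates = scored[:num_candidates]
        # ... and keeps only those inside the similarity boundary (a radius here).
        merged.extend(c for c in candidates if c[0] <= similarity)
    # Finally reduce to the global top k.
    return [doc_id for _, doc_id in sorted(merged)[:k]]


shards = [
    [("a", [0.0, 0.0]), ("b", [3.0, 4.0])],
    [("c", [1.0, 0.0]), ("d", [0.0, 2.0])],
]
print(knn_with_similarity(shards, [0.0, 0.0], k=2, num_candidates=2, similarity=2.0))
```

Document `b` is dropped by the radius filter on its shard, and `d` is dropped by the final top-`k` reduction.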

relates to: https://github.com/elastic/elasticsearch/issues/84929 && https://github.com/elastic/elasticsearch/pull/93574
2023-03-28 15:29:01 -04:00
elasticsearchmachine 02930e4cc9 Merge pull request ESQL-937 from elastic/main
🤖 ESQL: Merge upstream
2023-03-28 13:23:49 -04:00
Ievgen Degtiarenko 05847ce813
Add computed shard movement metric (#94662)
This should help us ensure that the desired balance computation is not producing too many shard movements, which could be a sign of an unusual configuration or a bug and could eventually cause the actual cluster balance to diverge far from the desired balance. A separate change is still required to warn or reset if we are in fact far from the desired balance during the reconciliation step.
2023-03-28 11:25:00 +02:00
elasticsearchmachine 581a95ca05 Merge pull request ESQL-930 from elastic/main
🤖 ESQL: Merge upstream
2023-03-28 01:18:44 -04:00
Andrei Dan 223385f887
Introduce a _lifecycle/explain API for data stream backing indices (#94621)
This adds an {index}/_lifecycle/explain API to retrieve information
about an index's status within its lifecycle.

The response looks like so:
```
{
  "indices" : {
    ".ds-metrics-foo-2023.03.22-000001" : {
      "index" : ".ds-metrics-foo-2023.03.22-000001",
      "managed_by_dlm" : true,
      "index_creation_date_millis" : 1679475563571,
      "time_since_index_creation" : "843ms",
      "rollover_date_millis" : 1679475564293,
      "time_since_rollover" : "121ms",
      "lifecycle" : { },
      "generation_time" : "121ms"
    },
    ".ds-metrics-foo-2023.03.22-000002" : {
      "index" : ".ds-metrics-foo-2023.03.22-000002",
      "managed_by_dlm" : true,
      "index_creation_date_millis" : 1679475564351,
      "time_since_index_creation" : "63ms",
      "lifecycle" : { }
    }
  }
}
```
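
As a sketch of how a client might read such a response, the snippet below parses a trimmed copy of the sample and lists the managed indices that report a rollover date. The helper name `rolled_over_indices` is hypothetical.

```python
import json

# Trimmed copy of the sample response shown above.
response = json.loads("""
{
  "indices": {
    ".ds-metrics-foo-2023.03.22-000001": {
      "index": ".ds-metrics-foo-2023.03.22-000001",
      "managed_by_dlm": true,
      "rollover_date_millis": 1679475564293
    },
    ".ds-metrics-foo-2023.03.22-000002": {
      "index": ".ds-metrics-foo-2023.03.22-000002",
      "managed_by_dlm": true
    }
  }
}
""")


def rolled_over_indices(explain_response):
    """Names of DLM-managed indices that report a rollover date."""
    return sorted(
        name
        for name, info in explain_response["indices"].items()
        if info.get("managed_by_dlm") and "rollover_date_millis" in info
    )


print(rolled_over_indices(response))
```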
2023-03-27 08:44:40 +01:00
elasticsearchmachine c895597b2b Merge pull request ESQL-919 from elastic/main
🤖 ESQL: Merge upstream
2023-03-24 01:18:28 -04:00
Jim Ferenczi 2f29830cd3
Add the ability to return the score of the named queries (#94564)
This change adds a new rest parameter called `rest_include_named_queries_score` that when set, includes the score of the named queries that matched the document.
Note that with this change, the score of named queries is always returned when using the transport client. The rest level has the ability to set the format of
the matched_queries section for BWC (kept as is by default).

Closes #65563
2023-03-23 13:17:26 +00:00
elasticsearchmachine 853f134e5e Merge pull request ESQL-903 from elastic/main
🤖 ESQL: Merge upstream
2023-03-20 13:21:04 -04:00
Ievgen Degtiarenko 21adad38a3
Add data tier to /_internal/desired-balance (#94496) 2023-03-20 10:44:55 +01:00
elasticsearchmachine 8dc5155b00 Merge pull request ESQL-898 from elastic/main
🤖 ESQL: Merge upstream
2023-03-18 01:19:15 -04:00
David Turner e43e7c2f4a
Improve transport stats histogram (#93598)
- omits empty buckets at the start and end of the histogram
- includes human-readable representation of the bucket boundaries if `?human` specified
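
The first point can be illustrated with a short trimming function. The bucket field names (`ge_millis`, `lt_millis`, `count`) and the helper name are assumptions made for this sketch.

```python
def trim_empty_buckets(buckets):
    """Drop empty buckets at both ends; keep empty ones in the middle."""
    non_empty = [i for i, b in enumerate(buckets) if b["count"] > 0]
    if not non_empty:
        return []
    return buckets[non_empty[0]: non_empty[-1] + 1]


histogram = [
    {"ge_millis": 0, "lt_millis": 1, "count": 0},
    {"ge_millis": 1, "lt_millis": 2, "count": 5},
    {"ge_millis": 2, "lt_millis": 4, "count": 0},
    {"ge_millis": 4, "lt_millis": 8, "count": 3},
    {"ge_millis": 8, "lt_millis": 16, "count": 0},
]
print([b["count"] for b in trim_empty_buckets(histogram)])
```

The empty bucket between the two populated ones is kept so that the remaining boundaries stay contiguous.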
2023-03-17 18:01:58 -04:00
elasticsearchmachine d121e3ad57 Merge pull request ESQL-895 from elastic/main
🤖 ESQL: Merge upstream
2023-03-15 13:24:29 -04:00
Salvatore Campagna a03a7796f9
Downsampling unmapped text fields (#94387)
* fix: downsampling unmapped text fields

When a field is unmapped usually dynamic mapping maps it using
a multi field which has the original field name as a text field
and a keyword sub-field. At downsampling time we skip text fields
and we only index the corresponding keyword field in the target index.
As a result, when indexing data into the target index we need to
use the name of the parent (text) field instead of the (keyword)
sub-field in order for indexing to succeed.

Here we derive the name of the parent field by stripping away the
name of the sub-field (whatever appears after the last '.' in the name).
The name of the subfield is still available through `MappedFieldType#name`.
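
The name derivation can be sketched as follows; `parent_field_name` is a hypothetical helper, not the method used in the change.

```python
def parent_field_name(sub_field_name):
    """Strip whatever follows the last '.' to get the parent field name."""
    parent, dot, _ = sub_field_name.rpartition(".")
    # Without a dot there is no sub-field part, so the name is returned as is.
    return parent if dot else sub_field_name


print(parent_field_name("message.keyword"))
print(parent_field_name("labels.env.keyword"))
```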
2023-03-15 15:04:06 +01:00
elasticsearchmachine dcfa6b4aa6 Merge pull request ESQL-888 from elastic/main
🤖 ESQL: Merge upstream
2023-03-11 00:17:44 -05:00
elasticsearchmachine 05b219a683 Merge pull request ESQL-886 from elastic/main
🤖 ESQL: Merge upstream
2023-03-10 12:20:13 -05:00
Nhat Nguyen 07441d9784
Mute 380_sort_segments (#94473)
Tracked at #94357
2023-03-10 11:58:07 -05:00
Martijn van Groningen ef85f47c75
Added more debugging for investigating 'tsdb/25_id_generation/delete over _bulk' test failure (#94456)
Added mget call to verify the documents being deleted actually got indexed.
And added an assertion to PerThreadIDVersionAndSeqNoLookup to get more information
about the reader if there is no timestamp point values field.

Relates to #93852
2023-03-10 15:06:39 +01:00
Przemysław Witek a3f34a39c8
[Transform] Add `delete_dest_index` parameter to the `Delete Transform API` (#94162) 2023-03-10 13:02:19 +01:00
elasticsearchmachine 74eacc674f Merge pull request ESQL-877 from elastic/main
🤖 ESQL: Merge upstream
2023-03-08 12:21:43 -05:00
elasticsearchmachine 21e40b850f Merge pull request ESQL-874 from elastic/main
🤖 ESQL: Merge upstream
2023-03-08 11:26:46 -05:00
Pooya Salehi 5010402057
Revert to 1 node cluster for YAML tests and avoid wait for green (#94385)
I have reviewed the tests that motivated the use of a two node cluster
for the YAML tests. It seems there is no reason anymore to use
`wait_for_status: green` and `number_of_replicas: 0` since the [default
values](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-wait-for-active-shards)
make sure the write will succeed. There are many newer YAML tests where
an index creation is followed by an index operation, w/o waiting for
green. With this PR, I'm reverting the changes in
https://github.com/elastic/elasticsearch/pull/94304 and instead modifying
the tests.
2023-03-08 07:25:43 -05:00
Luca Cavanna 4712a78d5f Deprecate _knn_search in the REST spec (#94103)
We deprecated the _knn_search endpoint with #88828 but we missed deprecating it in the REST spec.

Note that the REST spec parser was not aligned with its json schema in that the deprecated section caused an exception to be thrown. The parser is now updated to accept the deprecated section at the endpoint level.
2023-03-07 16:55:56 -08:00
William Brafford 8b6ce3465e
Reset feature states before deleting indices in EsRestTestCase (#94191)
* Call feature cleanup in ES Test Case
* Don't reset features in certain integration tests
* Cleanup and handle warnings
2023-03-07 10:53:48 -05:00
Mary Gouseti fe20d923ae
[DLM] Introduce default rollover cluster setting & expose it via APIs (#94240)
For managing data streams with DLM we chose to have one cluster setting that will determine the rollover conditions for all data streams. This PR introduces this cluster setting, it exposes it via the 3 existing APIs under the flag `include_defaults` and adjusts DLM to use it. The feature remains behind a feature flag.
2023-03-07 16:41:32 +01:00
Craig Taverner 1bd5ad97c8
Mute test for issue 94239 (#94365)
* Allow skip-all to work with other ranges

When muting tests that already have version ranges set in the skip
section, it is convenient to remember the previous ranges for when
we later un-mute.

* Mute test that fails 1% of the time (#94239)

* Remember previous versions in test skip.version
2023-03-07 15:23:09 +01:00
Pooya Salehi 97c2812e58
Use 1 replica in YAML tests that issue get/exist calls (#94303) 2023-03-07 09:38:21 +01:00
Nhat Nguyen 3bfbcf1b61
Remove replicas settings in indices.stats YAML tests (#94309)
These YAML tests can work with any number of replicas.
Removing the replicas setting so these tests can run with stateless.
2023-03-06 11:50:26 -08:00
Nhat Nguyen efe2bb61c1
Remove replicas settings in search YAML tests (#94307)
We don't need to specify the number of replicas in search YAML tests.
These tests should work with any number of replicas.
2023-03-06 11:49:45 -08:00
Nhat Nguyen 7824413869 Revert "Remove replicas settings in indices.sort YAML tests (#94308)"
This reverts commit 739cb9776f.
2023-03-06 09:36:43 -08:00
Pooya Salehi def4426e02
Use 2 nodes in YAML test cluster (#94304)
We have some YAML tests that would require at least one
replica (search shard) to run with Stateless and since they
wait for green, they explicitly set replicas to 0 (see e.g.
realtime_refresh). Using 2 nodes
by default makes sure we could run those tests w/o any
changes. IMO, they are pretty important/essential tests.

Relates #94303
2023-03-06 17:37:09 +01:00
Nhat Nguyen 739cb9776f
Remove replicas settings in indices.sort YAML tests (#94308) 2023-03-06 06:35:44 -08:00
Nhat Nguyen 5c3683eed1
Remove replicas settings in field-caps YAML tests (#94292) 2023-03-06 06:34:51 -08:00
Ievgen Degtiarenko 90d017ae09
Add cluster_info to the GET /_internal/desired-balance endpoint output (#94272) 2023-03-06 13:57:55 +01:00
Luca Cavanna 767b410cfb
Deprecate _knn_search in the REST spec (#94103)
We deprecated the _knn_search endpoint with #88828 but we missed deprecating it in the REST spec.

Note that the REST spec parser was not aligned with its json schema in that the deprecated section caused an exception to be thrown. The parser is now updated to accept the deprecated section at the endpoint level.
2023-03-02 20:04:31 +01:00
elasticsearchmachine d10f8743b8 Merge pull request ESQL-847 from elastic/main
🤖 ESQL: Merge upstream
2023-03-01 12:29:04 -05:00
Craig Taverner e7a2c44bbf
Support position time_series_metric on geo_point fields (#93946)
Added position time_series_metric:

* start creating position time_series_metric
* Add yaml tests for queries and aggs
* Disallow multi-values for geo_point as ts-metric
* Limit running on older versions, since some parts of the time-series syntax were not supported on all versions
* ScaledFloatFieldMapper does not support POSITION, so we should only test it against COUNTER and GAUGE, the only two metric types it supports
* Expand unit tests and allow parsing of dimension. We expand the tests to cover all cases tested in DoubleFieldMapperTests, which also tests the behaviour of setting the dimension to true or false, so we enable parsing that for symmetry, but reject `true` as illegal for geo_point.
* Add unit tests for position metric multi-values
2023-03-01 12:57:06 +01:00
elasticsearchmachine 75c5fb02d7 Merge pull request ESQL-841 from elastic/main
🤖 ESQL: Merge upstream
2023-03-01 00:26:10 -05:00
Alan Woodward 085146838c
Teach 'contains' yaml test directive to match substrings (#94183)
In many cases, matching a simple substring is sufficient for testing and is
easier than building a regex match.
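
To illustrate why a substring match is often simpler, here is the same check written both ways in plain Python. The error message is a made-up example, and this is not the YAML test runner's own code.

```python
import re

# Hypothetical error message, used only for illustration.
error_message = "failed to parse field [host.ip] of type [ip] in document with id '1'"

# Substring check: brackets and dots need no escaping.
substring_matches = "field [host.ip] of type [ip]" in error_message

# Equivalent regex check: every special character has to be escaped.
regex_matches = re.search(r"field \[host\.ip\] of type \[ip\]", error_message) is not None

print(substring_matches, regex_matches)
```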
2023-02-28 15:48:13 +00:00
elasticsearchmachine 1d175b1b5a Merge pull request ESQL-836 from elastic/main
🤖 ESQL: Merge upstream
2023-02-27 12:28:52 -05:00
Andrei Dan c63d0afb21
Fix data lifecycle tests for non-snapshot builds (#94146)
This fixes yaml tests when -Dbuild.snapshot is false.

The data lifecycle functionality is not enabled unless the feature flag
is configured.
This makes the yaml tests enable the feature flag for non-snapshot builds
(it's always enabled for snapshot builds).
2023-02-27 13:08:13 +00:00
elasticsearchmachine e29966a4e3 Merge pull request ESQL-826 from elastic/main
🤖 ESQL: Merge upstream
2023-02-23 12:28:01 -05:00
Craig Taverner 336928ab09
Update yamlRestTest docs skip.version (#94069)
* Update yamlRestTest docs skip.version

The skip.version field supports multiple versions,
and the setup/teardown areas combine with test skip versions
in undocumented ways, so we document them.

* Update rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/README.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

---------

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-02-23 15:53:35 +01:00
Nhat Nguyen f7a38a888a
Remove replicas setting in search_shards YAML tests (#93997)
Remove the number of replicas setting to enable these tests to run with stateless.
2023-02-23 06:45:48 -08:00
Mary Gouseti da7b1ede54
Change the settings of the index template test for DLM (#94078) 2023-02-23 15:32:04 +01:00
elasticsearchmachine 309d38b9e1 Merge pull request ESQL-822 from elastic/main
🤖 ESQL: Merge upstream
2023-02-22 12:28:59 -05:00
Mary Gouseti 0f5d1eb6ea
Introduce Lifecycle model in existing data stream and template configuration (#93652)
In this PR we introduce the DLM feature flag and the data lifecycle model. The model is added to the composable templates, the index templates (v2 only) and the data streams.

Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
2023-02-22 12:01:54 +01:00
elasticsearchmachine 4e4819ad6b Merge pull request ESQL-807 from elastic/main
🤖 ESQL: Merge upstream
2023-02-20 12:26:31 -05:00
Martijn van Groningen cadcb8f3cd
Adjust skip version of yaml test (#93897)
after #93800 has been backported to 8.7 branch.
2023-02-20 11:05:54 +01:00
Martijn van Groningen df4a8f72c8
Don't treat counter fields outside of tsdb as counters. (#93800)
Fields that have the time_series_metric attribute set to counter in non tsdb indices should use number value source type instead of counter value source type. Essentially, these fields are not handled as counters at search time.

Relates to #93539
2023-02-20 08:03:21 +01:00
elasticsearchmachine a76af0c321 Merge pull request ESQL-790 from elastic/main
🤖 ESQL: Merge upstream
2023-02-16 00:24:18 -05:00
Yang Wang 26d01b71bd
Ensure search features work with the new RCS model (#93720)
This PR ensures most search features (scroll, async search, pit, field
caps, msearch, vector tile etc) work with the new RCS model. The main
code change is tested by adapting the common yaml CCS tests to use the
new RCS model to provide a broad test coverage. The tests ensure the new
RCS model works from search's perspective. We could still use more tests
from security's perspective, e.g. DLS/FLS, in separate PRs.

Note:
* Eql yaml test files are not located under `x-pack/plugin`, which makes them
  hard to reuse. It should be possible to relocate them, but I'll address that
  separately.
* Sql yaml requires special transformation to work. I'll also handle it separately.
2023-02-15 21:10:56 -05:00
elasticsearchmachine bf84c56b1d Merge pull request ESQL-788 from elastic/main
🤖 ESQL: Merge upstream
2023-02-15 12:28:46 -05:00
David Turner a7e2430b79
Include node ID in balance API (#93823)
Today we report node stats by name, but the desired nodes work in terms
of node IDs. This commit adds a mapping between node name and ID to make
the output easier to interpret.
2023-02-15 08:39:59 -05:00
elasticsearchmachine 3a0ebf345b Merge pull request ESQL-784 from elastic/main
🤖 ESQL: Merge upstream
2023-02-14 12:30:03 -05:00
Tanguy Leroux 7ce308039d
Revert YAML tests changes related to index sorting (#93753)
In #93386 we adjusted some YAML tests to allow their execution 
on a 2 nodes cluster where every index has at least 1 replica, but 
this caused test failures for single or multi node clusters.

This pull request reverts the changes that were made. Those tests
will be muted for the 2 nodes cluster.

Closes #93572
Closes #93599
2023-02-14 10:52:39 +01:00
elasticsearchmachine 26d7244cc0 Merge pull request ESQL-779 from elastic/main
🤖 ESQL: Merge upstream
2023-02-13 19:42:53 -05:00
David Turner 9c8c9528ad
Add cluster stats re. snapshot activity (#93680)
Shows how many ongoing snapshots/clones/deletions/etc. there are, and
summarises the shard-level status too for progress tracking.
2023-02-13 07:48:14 +00:00
elasticsearchmachine a2850897a0 Merge pull request ESQL-760 from elastic/main
🤖 ESQL: Merge upstream
2023-02-09 12:24:49 -05:00
Iraklis Psaroudakis eef62fe2b9
Mute test (#93636)
Relates #93572
2023-02-09 05:54:48 -05:00
Ievgen Degtiarenko e887c97282
Fix desired balance spec (#93617)
The spec relied on the presence of a node named `test-cluster-0`.
This change reads the name from the cluster state instead.
2023-02-09 11:37:04 +01:00
Iraklis Psaroudakis b29399e2da
Mute Index Sort (#93623)
Relates #93599
2023-02-09 04:14:31 -05:00
elasticsearchmachine 993905ed62 Merge pull request ESQL-759 from elastic/main
🤖 ESQL: Merge upstream
2023-02-08 19:42:53 -05:00
Mark Vieira 48ba3269d7 Mute MixedClusterClientYamlTestSuiteIT Test cluster_balance_stats 2023-02-08 13:36:39 -08:00
elasticsearchmachine 038333c3c3 Merge pull request ESQL-742 from elastic/main
🤖 ESQL: Merge upstream
2023-02-07 12:27:14 -05:00
Tanguy Leroux a1730019e5
Adjust number of replicas in YAML REST tests (#93386)
Some core yaml rest tests use an explicit number of replicas when creating indices. I suspect that this is often not needed and it prevents those tests from running in a 2 nodes (index & search) cluster.

Most of the impacted tests are search related so I'll use the :Search/Search label.

Relates ES-5253
2023-02-07 14:35:10 +01:00
elasticsearchmachine b8b0fe4d39 Merge pull request ESQL-725 from elastic/main
🤖 ESQL: Merge upstream
2023-02-06 12:27:25 -05:00
Ievgen Degtiarenko ab5ae88919
Expose forecasted and actual disk usage per tier and node (#93497) 2023-02-06 13:34:18 +01:00
David Roberts 927e165068
[ML] Remove semantic_search endpoint (#93492)
Instead of a separate endpoint the functionality will be built
into _search.
2023-02-06 09:59:38 +00:00
elasticsearchmachine 4ab1628967 Merge pull request ESQL-713 from elastic/main
🤖 ESQL: Merge upstream
2023-02-02 15:09:50 -05:00
Przemysław Witek f60401a61c
[Transform] Transform `_schedule_now` API (#92948) 2023-02-02 19:03:16 +01:00
Ievgen Degtiarenko 513dc2f24f
Expose per node counts (#93439) 2023-02-02 16:13:01 +01:00
Martijn van Groningen 9babcc9bb9
Enforce synthetic source for time series indices (#93380)
Support for synthetic source is also added to `unsigned_long` field as part of this change.
This is required because `unsigned_long` field types can be used in tsdb indices and
this change would otherwise prohibit the usage of this field type.

Closes #92319
2023-02-02 08:00:52 +01:00
elasticsearchmachine f7cb660fd4 Merge pull request ESQL-698 from elastic/main
🤖 ESQL: Merge upstream
2023-02-01 12:27:03 -05:00
Enrico Zimuel 56340cedf5
Fixed the doc URL for rest API update trained model deployment (#93072)
As titled, the previous URL was 404.
2023-02-01 14:30:30 +00:00
elasticsearchmachine 0ef50a4e0a Merge pull request ESQL-691 from elastic/main
🤖 ESQL: Merge upstream
2023-01-31 12:26:41 -05:00
Nicolas Ruflin 9f4d7fafad
Add `ignore_missing_component_templates` config option (#92436)
This change introduces the configuration option `ignore_missing_component_templates` as discussed in https://github.com/elastic/elasticsearch/issues/92426. The implementation [option 6](https://github.com/elastic/elasticsearch/issues/92426#issuecomment-1372675683) was picked with a slight adjustment, meaning no patterns are allowed.

## Implementation

During the creation of an index template, the list of component templates is checked to verify that all component templates exist. This check is extended to skip any component templates which are listed under `ignore_missing_component_templates`. An index template that skips the check for the component template `logs-foo@custom` looks as follows:


```
PUT _index_template/logs-foo
{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```
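
The extended check can be sketched as a small function over the template body. This mirrors the description only; `missing_component_templates` is a hypothetical name and the set of existing templates is supplied by the caller.

```python
def missing_component_templates(index_template, existing_component_templates):
    """Component templates that would still make index template creation fail."""
    ignored = set(index_template.get("ignore_missing_component_templates", []))
    return [
        name
        for name in index_template.get("composed_of", [])
        if name not in existing_component_templates and name not in ignored
    ]


template = {
    "index_patterns": ["logs-foo-*"],
    "data_stream": {},
    "composed_of": ["logs-foo@package", "logs-foo@custom"],
    "ignore_missing_component_templates": ["logs-foo@custom"],
    "priority": 500,
}

# Only logs-foo@package exists: nothing blocks creation.
print(missing_component_templates(template, {"logs-foo@package"}))
# Nothing exists yet: logs-foo@package is still required.
print(missing_component_templates(template, set()))
```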

The component template `logs-foo@package` has to exist before creation. It can be created with:

```
PUT _component_template/logs-foo@custom
{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}
```

## Testing

For manual testing, different scenarios can be tested. To simplify testing, the commands from a `.http` file are included. Before each test run, a clean cluster is expected.

### New behaviour, missing component template

With the new config option, it must be possible to create an index template with a missing component template without getting an error:

```
### Add logs-foo@package component template

PUT http://localhost:9200/
    _component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo index template

PUT http://localhost:9200/
    _index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream

PUT http://localhost:9200/
    _data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist

GET http://localhost:9200/
    logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```

It is checked if all templates could be created and data stream mappings are correct.

### Old behaviour, with all component templates

In the following, a component template is made optional but already exists. The check is that it shows up in the mappings:

```
### Add logs-foo@package component template

PUT http://localhost:9200/
    _component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo@custom component template

PUT http://localhost:9200/
    _component_template/logs-foo@custom
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}

### Add logs-foo index template

PUT http://localhost:9200/
    _index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream

PUT http://localhost:9200/
    _data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist

GET http://localhost:9200/
    logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```

### Check old behaviour

Ensure that the old behaviour still exists when a component template is used that is not part of `ignore_missing_component_templates`:

```
### Add logs-foo index template

PUT http://localhost:9200/
    _index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
2023-01-31 08:40:29 -07:00
Bogdan Pintea 867dd8576c Add YAML testing support (ESQL-672)
This adds support for YAML testing.
It also migrates part of the EsqlActionIT tests into yml ones.

Closes ESQL-549
2023-01-30 21:24:19 +01:00
Pooya Salehi 57e9e40235
Add stateless roles to cat.nodes basic yaml test (#93274) 2023-01-26 17:15:23 +01:00
Przemysław Witek 40d32205db
[Transform] Add `from` parameter to Transform Start API (#91116) 2023-01-17 10:36:21 +01:00
Mary Gouseti a7fdd3c036
GA the Health API under the url /_health_report (#92879) 2023-01-13 10:42:38 +01:00
Benjamin Trent a46e532cda
Allow more than one KNN search clause (#92118)
It makes sense to allow more than one KNN search clause per individual search request. It may be that different documents have separate vector spaces or that a single doc is indexed with more than one vector space. In both of these scenarios, users may want to retrieve a resulting set that takes into account all their indexed vector spaces.

A prime example here would be searching a semantic text embedding along with searching an image embedding. 
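As a rough illustration of that example, the sketch below builds a search body in which `knn` is a list of clauses rather than a single object. The field names `text_embedding` and `image_embedding`, the vectors, and the helper `build_multi_knn_search` are made up for this sketch.

```python
import json


def build_multi_knn_search(clauses):
    """Build a search body whose `knn` section is a list of clauses."""
    return {"knn": [dict(clause) for clause in clauses]}


body = build_multi_knn_search([
    # Hypothetical text embedding field and query vector.
    {"field": "text_embedding", "query_vector": [0.1, 0.2, 0.3],
     "k": 5, "num_candidates": 50},
    # Hypothetical image embedding field living in a different vector space.
    {"field": "image_embedding", "query_vector": [0.9, 0.8],
     "k": 5, "num_candidates": 50},
])

print(json.dumps(body, indent=2))
```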


closes https://github.com/elastic/elasticsearch/issues/91187
2023-01-12 11:35:50 -05:00
Ievgen Degtiarenko 22a1ba7b43
Expose tier balancing stats via internal endpoint (#92199) 2023-01-09 10:57:10 +01:00
Joe Gallo 0d59d3d00a
Fix version range after backport (#92707) 2023-01-05 12:04:43 -05:00
Seth Michael Larson c74877c488
Fix URL of Semantic Search API 2023-01-05 10:56:09 -06:00
Ievgen Degtiarenko 42464200fe
Add forecasted_write_load and forecasted_shard_size_in_bytes to the endpoint (#92303) 2023-01-05 12:08:52 +01:00
Joe Gallo 6850057676
Fix _bulk api dynamic_templates and explicit op_type (#92687) 2023-01-04 18:09:21 -05:00
Mark Vieira c2eda511de
Add JUnit rule based integration test cluster orchestration framework (#92379)
This commit adds a new test framework for configuring and orchestrating
test clusters for both Java and YAML REST testing. This will eventually
replace the existing "test-clusters" Gradle plugin and the build-time
cluster orchestration.
2022-12-21 15:33:46 -08:00
Andrei Dan c5ed72e70d
[Tests] Timeout 1s when creating a red index (#92486) 2022-12-21 19:12:21 +00:00
Przemysław Witek 26eabc67fd
Make transform _stats request cancellable (#92389) 2022-12-20 11:59:17 +01:00
Andrei Dan 0993c31eb7
GA the health API (#92420)
This marks the Health API as generally available.
2022-12-20 10:26:53 +00:00
Andrei Dan 3723af3ccd
[HealthAPI] Add size parameter that controls the number of affected resources returned (#92399)
This adds a `size` parameter that controls the maximum number of
returned affected resources. The parameter defaults to `1000`, must be
positive, and less than `10_000`
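A minimal sketch of these bounds, read literally from the description above. The helper `resolve_size` is hypothetical, and the real validation may treat the boundary values differently.

```python
DEFAULT_SIZE = 1000
MAX_SIZE_EXCLUSIVE = 10_000  # "less than 10_000", taken literally


def resolve_size(raw=None):
    """Return the effective `size`, falling back to the default."""
    if raw is None:
        return DEFAULT_SIZE
    size = int(raw)
    if size <= 0 or size >= MAX_SIZE_EXCLUSIVE:
        raise ValueError(
            f"size must be positive and less than {MAX_SIZE_EXCLUSIVE}, got {size}"
        )
    return size


print(resolve_size())      # default
print(resolve_size("25"))  # explicit value from a query string
```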
2022-12-16 16:15:01 +00:00
David Roberts 6fa3d73fd5
[ML] Make native inference generally available (#92213)
Previously this functionality was beta. This PR changes it to GA.
2022-12-12 15:43:30 +00:00
Pooya Salehi 3a223d933a
Prevalidate node removal API (pt. 2) (#91256)
This PR extends the basic Prevalidation API so that in case there are 
red non-searchable-snapshot indices in the cluster, we reach out to 
the nodes (whose removal is being prevalidated) to find out if they 
have a local copy of any red indices.

Closes #87776
2022-11-28 11:51:51 +01:00
Nik Everett dcfe6a3253
fix synthetic _source for sparse _doc_count field (#91769)
If the `_doc_count` field is sparse we were using Lucene incorrectly to
read its values. This fixes how we interact with the iterator to load
the values.

Closes #91731
2022-11-22 14:32:27 -05:00
Ed Savage e0e32caf28
[ML] Option to delete user-added annotations for the reset/delete job APIs (#91698)
Currently there is no way to remove user-added annotations when a job is deleted or reset.
This change adds an option - delete_user_annotations - to both the delete and reset job APIs.
The default value is false, to keep the behaviour of these calls as it is currently.
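For illustration, a small sketch of how a client might build the two requests with the new option. The helper `job_request` and the job id are hypothetical; the option is only added to the query string when it is set, which matches the default of `false`.

```python
from urllib.parse import urlencode


def job_request(action, job_id, delete_user_annotations=False):
    """Return (method, path) for deleting or resetting an anomaly detection job."""
    if action == "delete":
        method, path = "DELETE", f"_ml/anomaly_detectors/{job_id}"
    elif action == "reset":
        method, path = "POST", f"_ml/anomaly_detectors/{job_id}/_reset"
    else:
        raise ValueError(f"unknown action: {action}")
    if delete_user_annotations:
        # Omitted by default, so existing callers keep the current behaviour.
        path += "?" + urlencode({"delete_user_annotations": "true"})
    return method, path


print(job_request("reset", "my-job", delete_user_annotations=True))
```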
2022-11-18 17:17:33 +00:00
Pooya Salehi 327f50ba46
Prevalidate node removal API (pt. 1) (#88952)
This PR adds the first part of the Prevalidate Node Removal API. This
API allows checking whether attempting to remove some node(s) from the
cluster is likely to succeed or not. This check is useful when a node
needs to be removed from a RED cluster, without risking losing the last
copy of some RED shards.

In this PR, we only check whether a RED index is a Searchable Snapshot
index or not, in which case the removal of any node is safe as the RED
index is backed by a snapshot.

Relates #87776
2022-11-16 13:44:00 +01:00
Artem Prigoda 79ca59bc96
DesiredBalance: expose it via _internal/desired_balance (#91038)
Add an internal endpoint that exposes the desired balance and computation stats at a master node as GET _internal/desired_balance. It returns:
```
{
  "stats": {
    "computation_active": false,
    "computation_submitted": 5,
    "computation_executed": 5,
    "computation_converged": 5,
    "computation_iterations": 4,
    "computation_converged_index": 4,
    "computation_time_in_millis": 0,
    "reconciliation_time_in_millis": 0
  },
  "routing_table": {
    "test": {
      "0": {
        "current": [
          {
            "state": "STARTED",
            "primary": true,
            "node": "UPYt8VwWTt-IADAEbqpLxA",
            "node_is_desired": true,
            "relocating_node": null,
            "relocating_node_is_desired": false,
            "shard_id": 0,
            "index": "test"
          }
        ],
        "desired": {
          "node_ids": [
            "UPYt8VwWTt-IADAEbqpLxA"
          ],
          "total": 1,
          "unassigned": 0,
          "ignored": 0
        }
      },
      "1": {
        "current": [
          {
            "state": "STARTED",
            "primary": true,
            "node": "2x1VTuSOQdeguXPdN73yRw",
            "node_is_desired": true,
            "relocating_node": null,
            "relocating_node_is_desired": false,
            "shard_id": 1,
            "index": "test"
          }
        ],
        "desired": {
          "node_ids": [
            "2x1VTuSOQdeguXPdN73yRw"
          ],
          "total": 1,
          "unassigned": 0,
          "ignored": 0
        }
      }
    }
  }
}
```

Fixes #90583
2022-11-16 02:54:10 +01:00
Nik Everett 9d0b0bad86
Support synthetic _source for _doc_count field (#91465)
This adds synthetic `_source` support for the `_doc_count` field so
downsampling should play nicely with synthetic `_source`.
2022-11-10 13:43:33 -05:00
Alan Woodward 547c8327b2
Allow FetchSubPhaseProcessors to report their required stored fields (#91269)
Loading of stored fields is currently handled directly in FetchPhase, with
some fairly complex logic examining various bits of the FetchContext to work
out what fields need to be loaded. This is further complicated by synthetic
source, which may have its own stored field requirements.

This commit tries to separate out these concerns a little by adding a new
StoredFieldsSpec record that holds information about which stored fields
need to be loaded. Each FetchSubPhaseProcessor can now report a
StoredFieldsSpec detailing what its requirements are, and these specs can
be merged together, along with requirements from a SourceLoader, to
determine up-front what fields should be loaded by the StoredFieldLoader.
The stored fields themselves are added into the SearchHit by a new
StoredFieldsPhase, which handles alias resolution and value post-
processing. The logic to determine when source should be loaded and
when not, based on the presence of script fields or stored fields, is
moved into FetchContext, which highlights some inconsistencies that
can be fixed in follow-up commits.
2022-11-10 08:40:22 +00:00
Andrei Dan c8e08fd512
[HealthAPI] Rename explain to verbose (#91417)
This renames the explain Health API parameter to verbose.

We decided to rename explain because verbose is a more established
term in the industry for "opt-in to get more information" and allows for more
flexibility to control what exactly that extra information is (explain is already
pushing the limits of what it semantically represents as it's controlling both
the diagnosis insights and the raw details information)
2022-11-09 08:47:53 +00:00
Albert Zaharovits af537cc4a3
Fix index expression options for requests with a single expression (#91231)
This PR affects requests that contain a single index name
or a single pattern (wildcard/datemath).
It aims to systematize the handling of the `allow_no_indices`
and `ignore_unavailable` indices options:
 * the allow_no_indices option is to be concerned with
wildcards that expand to nothing (or the entire request
expands to nothing)
 * the ignore_unavailable option is to be concerned with
explicit names only (not wildcards)

In addition, the behavior of the above options will now be
independent of the number of expressions in a request.
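The intended split between the two options can be sketched as a toy resolver. This is a simplified model of the rules above, not the actual resolution code: it only treats `*` as a wildcard, ignores date math, and the function name and error messages are hypothetical.

```python
from fnmatch import fnmatchcase


def resolve(expressions, existing, allow_no_indices=True, ignore_unavailable=False):
    """Resolve index expressions against the set of existing index names."""
    resolved = []
    for expr in expressions:
        if "*" in expr:
            # Wildcards are governed by allow_no_indices only.
            matches = sorted(name for name in existing if fnmatchcase(name, expr))
            if not matches and not allow_no_indices:
                raise LookupError(f"no indices match wildcard [{expr}]")
            resolved.extend(matches)
        elif expr in existing:
            resolved.append(expr)
        elif not ignore_unavailable:
            # Explicit names are governed by ignore_unavailable only.
            raise LookupError(f"no such index [{expr}]")
    if not resolved and not allow_no_indices:
        raise LookupError("request resolves to no indices")
    return resolved


print(resolve(["logs-*", "metrics-1"], {"logs-a", "logs-b", "metrics-1"}))
```

The same checks run whether the request carries one expression or several, which is the independence this PR aims for.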
2022-11-04 18:11:12 +02:00
Dimitris Athanasiou 4e67df8b05
[ML] Low priority trained model deployments (#91234)
This adds a new parameter to the start trained model deployment API,
namely `priority`. The available settings are `normal` and `low`.

For normal priority deployments the allocations get distributed so that
node processors are never oversubscribed.

Low priority deployments allow users to test model functionality even if there
are no node processors available. They are limited to 1 allocation with a single thread.
In addition, the process is executed in low priority which limits the amount of
CPU that can be used when the CPU is under pressure. The intention of this is to
limit the impact of low priority deployments on normal priority deployments.

When we rebalance model assignments we now:

  1. compute a plan just for normal priority deployments
  2. fix the resources used by normal deployments
  3. compute a plan just for low priority deployments
  4. merge the two plans

Closes #91024
2022-11-04 14:22:30 +02:00
Jack Conradson f28ae4b288
Add support for indexing byte-sized knn vectors (#90774)
This change adds an element_type as an optional mapping parameter for dense vector fields as 
described in #89784. This also adds a byte element_type for dense vector fields that supports storing 
dense vectors using only 8-bits per dimension. This is only supported when the mapping parameter 
index is set to true.

The code follows a similar pattern to our NumberFieldMapper where we have an enum for 
ElementType, and it has methods that DenseVectorFieldType and DenseVectorMapper can delegate to 
to support each available type (just float and byte for now).
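A minimal sketch of what such a mapping and a client-side range check could look like. The field name `embedding`, the `cosine` similarity, and both helper functions are assumptions made for this example; the signed 8-bit range follows from storing 8 bits per dimension.

```python
import json


def byte_vector_mapping(field, dims):
    """Mapping for an indexed dense_vector field with byte elements."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "element_type": "byte",
                    "dims": dims,
                    "index": True,           # byte vectors require index: true
                    "similarity": "cosine",  # assumed choice for this sketch
                }
            }
        }
    }


def check_byte_vector(vector, dims):
    """Validate that a vector fits the signed 8-bit range per dimension."""
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    for value in vector:
        if not isinstance(value, int) or not -128 <= value <= 127:
            raise ValueError(f"value {value!r} does not fit in a signed byte")
    return vector


print(json.dumps(byte_vector_mapping("embedding", 3), indent=2))
print(check_byte_vector([-128, 0, 127], 3))
```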
2022-10-20 14:45:58 -07:00
Nik Everett 71b5cad4eb
Move aggregations tests to module (#90953)
We're going to move all aggregations to the module soon and this saves a
little time in the build by only running the tests one time - in the
aggregations module.
2022-10-18 07:38:10 -04:00
Nik Everett 9660e8b1c2
Move mov_fn agg to module (#90836)
This continues to populate the `aggregations` module with its first
pipeline aggregation and its first custom script context.

Relates to #90283
2022-10-17 13:55:56 -04:00
Nik Everett cd4116cc07
Run aggs tests in aggs module (#90851)
Run the aggregations v7 compat tests against the aggregations
module and *not* the `rest-api-spec` module. This allows us to drop
`rest-api-spec`'s dependency on the aggregations module and keep it
"just the server" which is nice.

There are a few side effects here that are ok:
1. We run all aggregations REST tests in the aggregations module.
   Even the ones in `rest-api-spec`. This means we run them twice. We
   plan to move all of the aggregations REST tests into the aggregations
   module anyway.
2. We now bundle the REST tests in the aggregations module into the
   tests that the clients run for their verification step. This should
   keep our clients from losing coverage.
2022-10-17 09:36:28 -04:00
Salvatore Campagna bb73711b4b
Fail downsampling if DLS or FLS is defined on the source index (#90593)
We fail downsampling if field level security or document level security
restrict access to fields and/or documents in the source index.
This is done mainly to prevent situations where a user who is not allowed to
read documents and/or fields in the source index is (by mistake) given
access to documents and/or fields in the target index that they would normally
not be allowed to access.

We also add YAML tests for the following four scenarios:
1. Downsample operation executed by a non-admin user
2. Downsample operation executed by an admin with field level security
3. Downsample operation executed by an admin with document level security
4. Downsample operation executed by an admin without field or document level security
2022-10-14 12:39:09 +02:00
Francisco Fernández Castaño 1a3032beb6
Keep track of average shard write load (#90768)
This commit adds a new field, write_load, into the shard stats. This new stat exposes the average number of write threads used while indexing documents.

Closes #90102
2022-10-13 16:34:45 +02:00
David Kyle 9e6a784aa5
[ML] Semantic search endpoint (#90450)
Adds a {index}_semantic_search endpoint which first converts the query text into a dense vector
using an NLP text embedding model, then performs a knn search against an index containing
dense vectors created with the same embedding model.
2022-10-13 13:17:30 +01:00
Martijn van Groningen 03054d066e
Move auto date histogram to aggregations module (#90746)
This change also moves adjacency_matrix aggregation to its own package.

Note that this PR also moves test code not related to auto date
histogram. I think this is cleaner than leaving some tests in an
undesired state between PRs. Also, the test code that has been moved is
slated to be moved to the aggregations module anyway. I suspect that
future changes, for example moving the `terms` agg, will require other
aggregations to be moved as well (e.g. `significant_terms`), since a
lot of code is shared.

Relates to #90283
2022-10-12 04:15:57 -04:00
Dimitris Athanasiou 16bfc550ea
[ML] Add api to update trained model deployment number_of_allocations (#90728)
This commit adds a new API that users can call with:

```
POST _ml/trained_models/{model_id}/deployment/_update
{
  "number_of_allocations": 4
}
```

This allows a user to update the number of allocations for a deployment
that is `started`.

If the allocations are increased we rebalance and let the assignment
planner find how to allocate the additional allocations.

If the allocations are decreased we cannot use the assignment planner.
Instead, we implement the reduction in a new class `AllocationReducer`
that tries to reduce the allocations so that:

  1. availability zone balance is maintained
  2. assignments that can be completely stopped are preferred to release memory
2022-10-12 10:04:23 +03:00
Andrei Dan f641d1c6d7
Health API symptom copy (#90761)
Update the copy for a few symptoms and impacts in the health API.
2022-10-11 09:16:21 +01:00
Andrei Dan c6ee8242af
Update min version for the diagnosis yaml test (#90731) 2022-10-06 19:48:02 +01:00
Przemyslaw Gomulka 3eaaffb488
[Testing] Enable bwc and fix sorting for 500_date_range (#90681)
#90458 has been backported to all branches, so bwc testing can be enabled for this test. The tests were incorrectly relying on sort order; a sort was added to make them deterministic.

closes #90668
2022-10-05 18:56:42 +02:00
Jack Conradson 8b0d0716d1
Add profiling and documentation for dfs phase (#90536)
Adds profiling statistics for the dfs phase, and adds documentation for both the dfs phase profiling 
and kNN profiling.

Closes #89713
2022-10-05 09:54:36 -07:00
Andrei Dan b78e71d600
Update test versions after backport (#90679) 2022-10-05 16:06:33 +01:00
Andrei Dan 5d97f0e09f
[HealthAPI] Diagnosis: report typed affected resources (#90653)
The health API reports the affected resources in case of an unhealthy
deployment. Until now all indicators reported one type of resource per
diagnosis (index, ILM policy, snapshot repository)

With the introduction of the disk indicator we now have an indicator
that reports multiple types of resources under the same diagnosis (ie.
nodes and indices).

This changes the structure of the `affected_resources` field to
accommodate multiple types of resources:
```
"affected_resources": {
  "nodes": [
    {
      "id": "e1af6F5rTcmgpExkdOMzCg",
      "name": "hot"
    },
    {
      "id": "u_wBVl4ZRne4uZq_ziLsuw",
      "name": "warm"
    }
  ],
  "indices": [
    ".geoip_databases",
    "test_index"
  ]
}
```
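A client reading this structure has to handle both shapes: nodes are objects with `id` and `name`, while indices are plain strings. The helper below is a hypothetical sketch of one way to flatten it for display.

```python
def summarize_affected_resources(affected):
    """Map each resource type to a flat list of display names."""
    summary = {}
    for resource_type, items in affected.items():
        summary[resource_type] = [
            # Nodes are objects with a name; indices are plain strings.
            item["name"] if isinstance(item, dict) else item
            for item in items
        ]
    return summary


affected_resources = {
    "nodes": [
        {"id": "e1af6F5rTcmgpExkdOMzCg", "name": "hot"},
        {"id": "u_wBVl4ZRne4uZq_ziLsuw", "name": "warm"},
    ],
    "indices": [".geoip_databases", "test_index"],
}

print(summarize_affected_resources(affected_resources))
```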
2022-10-05 14:06:48 +01:00
Luca Cavanna 8f44713658
Don't shortcut the total hit count for text fields (#90341)
When we switched to using the FieldExistsQuery (see #88312) instead of the deprecated NormsFieldExistsQuery and
DocValuesFieldExistsQuery, we ended up shortcutting the total hit count for text fields to the doc count retrieved
from the terms enum. This does not take into account empty strings, as that converts to an empty token set for text
fields. In presence of text fields, we cannot shortcut, and this can be prevented by checking that the field has
doc_values. This was checked before indirectly by checking that the query is a DocValuesFieldExistsQuery.

Closes #89760
2022-10-05 10:43:04 +02:00
Ievgen Degtiarenko 4d6d979e0e
Deprecate state field in `/_cluster/reroute` response (#90399) 2022-10-05 08:18:27 +02:00
Przemyslaw Gomulka 3f3a95e2dc
Fix date rounding for date math parsing (#90458)
In #89693 the rounding logic was only applied when a field was present in a pattern. This is incorrect, as for dates like "2020" we want to default to "2020-01-01T23:59:59.999..." when rounding is enabled.

This commit always applies monthOfYear or dayofMonth defaulting (when rounding is enabled) except when the dayOfYear is set.
closes #90187
2022-10-04 10:43:44 +02:00
Martijn van Groningen 14cf04d74b
Introduce a new aggregation module (#90294)
This commit introduces a new aggregation module
and moves the `adjacency_matrix` to this new module.

The new module name is `aggregations`.
The new module will use the `org.elasticsearch.aggregations.bucket` package for all bucket aggregations. 

Relates to #90283
2022-09-30 16:24:52 +02:00
Yang Wang 90375ac5bc
Move user profile status to stable and private (#90439)
This PR moves the user profile feature and associated APIs from
experimental to stable since higher level features built on top of it
are going to be GA. The feature and APIs are still kept private because
they are meant to internally support higher level features and we don't
expect them to be directly used by end-users.

This PR also moves the security domain feature to GA by removing the
beta label. Security domain requires user configuration to work so it is
not something internally controlled by stack and solutions.
2022-09-29 10:44:09 +09:30
David Roberts d9ea080d10
[ML] Release native inference functionality as beta (#90418)
Previously this functionality was tech preview (aka experimental).
This PR changes it to beta.
2022-09-28 11:09:02 +01:00
Dimitris Athanasiou d1d0f6d623
[ML] Update `number_of_allocations` description in REST spec (#90413)
In 8.5 the definition of `number_of_allocations` parameter to the
start trained model deployment API was changed. This commit updates
the REST spec accordingly.
2022-09-27 19:13:40 +03:00
Jack Conradson 94f05da248
Add profiling information for knn vector queries (#90200)
This adds timers to the dfs phase to profile a knn vector query and provide a breakdown of several 
parts of the query.
2022-09-26 15:31:16 -07:00
Salvatore Campagna 13d946c81b
test: test date histogram aggregations using different date field types (#90342)
As a result of closing issue #75509 here we test date histogram
aggregations including auto date histograms and composite aggregations,
running the aggregation on two different indices having the same
field but with different date type.
2022-09-26 16:53:25 +02:00
Nik Everett d0cf9f5034
Synthetic `_source`: `ignore_malformed` for `ip` (#90038)
This adds synthetic `_source` support for `ip` fields with
`ignore_malformed` set to `true`. We save the field values in a hidden
stored field, just like we do for `ignore_above` keyword fields. Then we
load them at load time.
2022-09-26 09:28:55 -04:00
Iraklis Psaroudakis 2a701bae19
Nodes test to use 4 shards (#90297)
Fixes #90228
2022-09-26 11:03:11 +03:00
Nik Everett f2a1ee9995
Synthetic _source: test `top_hits` (#90137)
This adds a test for the `top_hits` aggregation using synthetic
`_source`. It works but let's be a bit paranoid here because it's a
whole new fetch phase.....
2022-09-24 05:55:23 +09:30
Nik Everett 5a1b26cc9a
Synthetic _source: test _source filtering (#90138)
This adds some tests for `_source` filtering during `GET` and
`POST _search` when the index uses synthetic `_source`. It works, but
let's be paranoid and have an explicit test just in case.
2022-09-21 09:42:25 -04:00
David Turner fb59f12101
Mute failing test (#90186) 2022-09-21 21:35:14 +09:30
Iraklis Psaroudakis 3ed7a04d22
Introduce node mappings stats (#89807)
So that they are visible in NodeIndicesStats only at the node and index (but not shard) levels. Also visible in the _cat/nodes table. And make an exact count yaml REST test.
2022-09-19 15:47:47 +03:00
Nik Everett 953a0dd707
REST tests for `moving_fn` agg (#90012)
This expands on the REST layer tests for the `moving_fn` agg asserting
the results of the various moving functions, some failure cases, and
some access edge cases. These tests buy us backwards compatibility tests
and, eventually, forwards compatibility testing.
2022-09-14 05:39:22 +09:30
David Kyle 54001e2215
[CI] Mute 110_max_metric (#89997)
For #89994
2022-09-12 11:01:55 +01:00
tmgordeeva 9ce5657d34
Fix merging with empty results (#86939)
* Fix merging with empty results

If we try to merge responses with an empty response, we might take the RAW
format from an empty response over a format from a non-empty response.

Closes #84622
2022-09-09 14:16:54 -07:00
tmgordeeva 5809b0d318
Date histogram range edge case fix (#88957)
* Date histogram range edge case fix

This fixes an illegal argument exception when using conflicting ranges.
Sometimes if the min of the range is high enough above the max, we get an error
in TimeUnitRounding.prepareOffsetOrJavaTimeRounding because our min time is
greater than our max. This appears to have started when we began using query
bounds to bound ranges.
2022-09-09 14:16:38 -07:00
Nik Everett 188990fea7
Fix segment stats in tsdb (#89754)
The serialization for segment stats was broken for tsdb because we
return a *slightly* different sort configuration. That caused
`_segments` and `_cat/segments` to break when any shard of the tsdb
index is hosted on another node.

Closes #89609
2022-09-09 01:41:11 +09:30
Nik Everett c4a77d572d
Synthetic _source: support dense_vector (#89840)
This adds support for synthetic _source to `dense_vector` fields.

![image](https://user-images.githubusercontent.com/215970/188734496-0f0772c7-4c7a-46b6-b978-0c220e73474d.png)
2022-09-09 00:54:59 +09:30
Christos Soulios f8d1d2afa6
[TSDB] Rename rollup public API to downsample (#89809)
This PR renames all public APIs for downsampling so that they contain the downsample 
keyword instead of the rollup that we had until now.

1. The API endpoint for the downsampling action is renamed to:

/source-index/_downsample/target-index

2. The ILM action is renamed to

PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "warm": {
        "actions": {
          "downsample": {
            "fixed_interval": "24h"
          }
        }
      }
    }
  }
}

3.  unsupported_aggregation_on_rollup_index was renamed to unsupported_aggregation_on_downsampled_index

4. Internal transport actions were renamed:

    indices:admin/xpack/rollup -> indices:admin/xpack/downsample
    indices:admin/xpack/rollup_indexer -> indices:admin/xpack/downsample_indexer

5. Renamed the following index settings:

    index.rollup.source.uuid -> index.downsample.source.uuid
    index.rollup.source.name -> index.downsample.source.name
    index.rollup.status -> index.downsample.status
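For tooling that still carries the old setting names, the renames in item 5 can be captured in a small lookup table. The helper `rename_index_settings` is hypothetical and only illustrates the mapping listed above.

```python
# Old -> new index setting names, copied from the list above.
RENAMED_INDEX_SETTINGS = {
    "index.rollup.source.uuid": "index.downsample.source.uuid",
    "index.rollup.source.name": "index.downsample.source.name",
    "index.rollup.status": "index.downsample.status",
}


def rename_index_settings(settings):
    """Return a copy of `settings` with the rollup keys renamed."""
    return {
        RENAMED_INDEX_SETTINGS.get(key, key): value
        for key, value in settings.items()
    }


print(rename_index_settings({
    "index.rollup.status": "success",
    "index.number_of_shards": "1",
}))
```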

Finally, we renamed many internal variables and classes from *Rollup* to *Downsample*. 
However, this effort will be completed in more than one PR so that we minimize conflicts with other in-flight PRs.

Relates to #74660
2022-09-07 19:23:44 +03:00
Nik Everett e399f8b2ed
Synthetic _source: fix extra fields in GET (#89778)
The fields loaded to support synthetic `_source` were all coming back in
the `fields` response of `GET` which was confusing. This removes them
from the results unless they are explicitly asked for.
2022-09-06 14:20:01 -04:00
Nik Everett 703571a4f1
Synthetic _source: support ignore_above (#89466)
This allows you to use `ignore_above` with `keyword` fields in synthetic
source. Ignored values are stored in a "backup" stored field and added
to the end of the list of results. This makes `ignore_above` work pretty
much the same way as it does when you don't have synthetic source. The
only difference is the order of the results. But synthetic source
changes the order of results anyway. That should be fine.
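The resulting order can be pictured with a toy model. It assumes that values within the limit come back sorted and de-duplicated, as doc values would return them, and that ignored values are appended in their original order; the function name is hypothetical.

```python
def synthetic_keyword_values(values, ignore_above):
    """Toy model of the value order in synthetic _source for a keyword field."""
    # Values within the limit: assumed to be sorted and de-duplicated.
    kept = sorted({value for value in values if len(value) <= ignore_above})
    # Values over the limit: read from the "backup" stored field, appended last.
    ignored = [value for value in values if len(value) > ignore_above]
    return kept + ignored


print(synthetic_keyword_values(
    ["zebra", "a-very-long-value", "apple", "apple"], ignore_above=5
))
```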
2022-09-01 10:43:34 -04:00
Yang Wang 80c6d9faea
Option to return profile uid in GetUser response (#89570)
This PR adds a new query parameter to the GetUser API to optionally return
profile uid for each user if there is an associated profile.
2022-08-31 12:19:54 +10:00
Jack Conradson 8c30b86fe2
Fix bug for kNN with filtered aliases (#89621)
This change adds the filter query for a filtered alias to the knn query during the dfs phase on the 
shard. This ensures the correct number of k results are returned instead of removing results as a post 
filter.

Fixes: #89561
2022-08-30 15:57:37 -07:00
Nik Everett c6ad2dac74
Paranoid tests for `meta` (#89677)
This adds some extra paranoid tests for the `meta` parameter in aggs,
specifically the `filters` agg. These tests are at the REST level so
they provide backwards compatibility tests as well. They make sure that
`meta: {}` and `meta: null` do what we expect - return a `meta: {}` and
return an error.

Relates to #89467
2022-08-29 11:38:43 -04:00
QY cdcf238787
Remove unexpected meta param in agg's response (#89467)
Only output the `meta` param from aggs if it was sent in the request.
2022-08-26 15:57:51 -04:00
Yang Wang e276fd9506
Docs and yaml tests for viewing API key's limited-by (#89443)
This PR updates relevant docs and yaml tests to cover the new feature
of viewing API key's limited-by role descriptors introduced in #89273

Relates: #89058
2022-08-24 09:39:59 +10:00
David Turner c238aa1b46
Add YAML spec docs about matching errors (#89370)
It's not obvious that a YAML test with a `catch` stanza also permits
`match` blocks to assert things about the structure of the error
response, but this structure may be an important part of the API spec.
This commit adds this info to the docs about YAML tests.
2022-08-18 22:20:13 +09:30
Nik Everett b46d95b2fb
REST tests for percentiles_bucket agg (#88029)
Adds REST tests for the `percentiles_bucket` pipeline bucket
aggregation. This gives us forwards and backwards compatibility tests
for these aggs as well as mixed version cluster tests for these aggs.

Relates to #26220
2022-08-17 13:19:49 -04:00
Nik Everett 63b850cac9
REST tests for cumulative pipeline aggs (#88966)
Adds REST tests for the `cumulative_cardinality` and `cumulative_sum`
pipeline aggregations. This gives us forwards and backwards compatibility
tests for these aggs as well as mixed version cluster tests for these
aggs.

Relates to #26220
2022-08-17 13:05:47 -04:00
Nik Everett 79a89790e3
Synthetic source: load text from stored fields (#87480)
Adds support for loading `text` and `keyword` fields that have
`store: true`. We could likely load *any* stored fields, but I
wanted to blaze the trail using something fairly useful.
2022-08-17 10:18:36 -04:00
Nik Everett b327b17653
Fix shard splitting for `nested` (#89351)
I broke shard splitting when `_routing` is required and you use `nested`
docs. The mapping would look like this:
```
"mappings": {
  "_routing": {
    "required": true
  },
  "properties": {
    "n": { "type": "nested" }
  }
}
```

If you attempt to split an index with a mapping like this it'll blow up
with an exception like this:
```
Caused by: [idx] org.elasticsearch.action.RoutingMissingException: routing is required for [idx]/[0]
	at org.elasticsearch.cluster.routing.IndexRouting$IdAndRoutingOnly.checkRoutingRequired(IndexRouting.java:181)
	at org.elasticsearch.cluster.routing.IndexRouting$IdAndRoutingOnly.getShard(IndexRouting.java:175)
```

This fixes the problem by entirely avoiding the branch of code. That
branch was trying to find any top level documents that don't have a
`_routing`. But we *know* that there aren't any top level documents
without a routing in this case - the routing is "required". ES wouldn't
have let you index any top level documents without the routing.

This also adds a small pile of REST layer tests for shard splitting that
hit various branches in this area. For extra paranoia.

Closes #88109
2022-08-16 11:55:46 -04:00
weizijun 104ad7fd92
TSDB: fix time series field caps bwc yaml test (#89236)
Stops the repeated test failures due to #89171
2022-08-15 09:46:09 +01:00
Yang Wang d663231a83
User Profile - GetProfile API now supports multiple UIDs (#89023)
This PR expands the existing GetProfile API to support getting multiple
profiles by IDs. As a result, the response format is also changed to
align with the latest version of API design guideline. Concretely, this
means moving the profiles as an array inside a top level "profiles"
field so that it (1) does not mix dynamic fields (uid) with static fields
and (2) enforces an order in the response, which is desirable for
clients.

The change also reports any errors encountered during retrieval in
a top level "errors" field.

Relates: #81910
2022-08-10 10:51:38 +09:30
Benjamin Trent d588d456f0
[ML] add new trained model deployment cache clear API (#89074)
This adds a new `_ml/trained_models/<model_id>/deployment/cache/_clear` API. This will clear the inference cache on every node where the model is allocated.
2022-08-04 19:45:15 +01:00
Nhat Nguyen e3c33e2acd
Deduplicate fetching doc-values fields (#89094)
If a docvalues field matches multiple field patterns, then ES will 
return the value of that doc-values field multiple times. Like fetching
fields from source, we should deduplicate the matching doc-values
fields.
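
The deduplication idea can be sketched as follows. This is an illustrative
Python version that uses shell-style wildcards via `fnmatchcase`; the
function name and the pattern syntax are assumptions, not the code in this
change.

```python
from fnmatch import fnmatchcase


def resolve_docvalue_fields(patterns, mapped_fields):
    """Expand field patterns, returning each matching field only once."""
    seen = set()
    resolved = []
    for pattern in patterns:
        for field in mapped_fields:
            if fnmatchcase(field, pattern) and field not in seen:
                seen.add(field)  # remember the field so later patterns skip it
                resolved.append(field)
    return resolved
```

A field matched by several patterns is kept at the position of its first match.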
2022-08-04 14:05:09 -04:00
likzn f28f4545b2
In the field capabilities API, re-add support for `fields` in the request body (#88972)
We previously removed support for `fields` in the request body, to ensure there
was only one way to specify the parameter. We've now decided to undo the
change, since it was disruptive and the request body is actually the best place to
pass variable-length data like `fields`.

This PR restores support for `fields` in the request body. It throws an error
if the parameter is specified both in the URL and the body.
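
A rough sketch of that rule in Python. The function name, the argument
shapes and the error text are hypothetical; only the behaviour of rejecting
`fields` in both places comes from this change.

```python
def resolve_fields(url_params, body):
    """Pick `fields` from the URL or the body, rejecting requests with both."""
    in_url = "fields" in url_params
    in_body = body is not None and "fields" in body
    if in_url and in_body:
        # Placeholder message, not the server's actual error text.
        raise ValueError("[fields] was given in both the URL and the request body")
    if in_url:
        # URL parameters arrive as a comma-separated string.
        return [f for f in url_params["fields"].split(",") if f]
    if in_body:
        return list(body["fields"])
    return []
```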

Closes #86875
2022-08-04 13:44:50 -04:00
Christos Soulios b81f4187ab
[TSDB] Metric fields in the field caps API (#88695)
To assist the user in configuring the visualizations correctly while leveraging TSDB
functionality, information about TSDB configuration should be exposed via the field 
caps API per field.

Especially for metrics fields, it must be clear which fields are metrics and if they belong 
to only time-series indexes or mixed time-series and non-time-series indexes.

It must also be possible to distinguish metric fields when they belong to any of the following indices:

  -  Standard (non-time-series) indexes
  -  Time series indexes
  -  Downsampled time series indexes

This PR modifies the field caps API so that the mapping parameters time_series_dimension
and time_series_metric are presented only when they are set on fields of time-series indexes.
Those parameters are completely ignored when they are set on standard (non-time-series) indexes.

This PR revisits some of the conventions adopted by #78790
2022-08-04 20:42:34 +03:00
Ed Savage 188f8872c6
[ML] ECS Grok patterns in the _text_structure/find_structure endpoint (#88982)
Also add support for new CATALINA/TOMCAT timestamp formats used by ECS Grok patterns

Relates #77065

Co-authored-by: David Roberts <dave.roberts@elastic.co>
2022-08-04 18:39:04 +01:00
Julie Tibshirani 0bed7f768a Fix failures in vector field usage mixed cluster test 2022-08-03 16:14:46 -04:00
Julie Tibshirani 21eb984e64
Deprecate the _knn_search endpoint (#88828)
This change deprecates the kNN search API in favor of the new 'knn' option
inside the search API. The 'knn' option is now the preferred way of performing
kNN search.

Relates to #87625
2022-08-03 15:19:01 -04:00
Nikolaj Volgushev a124bafe7e
REST tests and spec for bulk update API keys (#89027)
This PR adds REST API spec and YAML test files for the BulkUpdateApiKey
operation.
2022-08-03 12:42:54 +02:00
Artem Prigoda f4e617e894
Add a test for checking for misspelled "dry_run" parameters for Desired Nodes API (#88898)
Check that the API doesn't accept a misspelled parameter and returns a client error.
2022-07-28 16:15:43 +02:00
Nik Everett 3bcee8eaa0
Format runtime geo_points (#85449)
This formats the result of the `fields` section of the `_search` API for
runtime `geo_point` fields using the `format` parameter like we do for
non-runtime `geo_point` fields. This changes the default format for
those fields from `lat, lon` to `geojson` with the option to get `wkt`
or any other format we support.

The fix does so by preserving the `double, double` nature of the
`geo_point` rather than encoding it immediately in the script. Callers can
use the results. The field fetchers use the `double, double` natively,
preserving as much precision as possible. The queries quantize the points
exactly like Lucene indexing does, and like the script did before this PR.
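
For reference, the two output formats named above can be sketched in
Python. The function is hypothetical and only illustrates the shapes:
GeoJSON and WKT both put longitude before latitude.

```python
def format_geo_point(lat, lon, fmt="geojson"):
    """Render a (lat, lon) pair of doubles in the requested format."""
    if fmt == "geojson":
        # GeoJSON coordinates are ordered [longitude, latitude].
        return {"type": "Point", "coordinates": [lon, lat]}
    if fmt == "wkt":
        # WKT points are written "POINT (x y)", i.e. longitude then latitude.
        return f"POINT ({lon} {lat})"
    raise ValueError(f"unsupported format [{fmt}]")
```

Keeping the point as two doubles until this last step is what lets the formatter avoid losing precision.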

Closes #85245
2022-07-27 13:11:07 -04:00
Przemko Robakowski 539434dbb4
Add min_* conditions to rollover (#83345) 2022-07-26 11:46:39 -04:00
Julie Tibshirani abd561a277
Support kNN vectors in disk usage action (#88785)
This change adds support for kNN vector fields to the `_disk_usage` API. The
strategy:
* Iterate the vector values (using the same strategy as for doc values) to
estimate the vector data size
* Run some random vector searches to estimate the vector index size 

Co-authored-by: Yannick Welsch <yannick@welsch.lu>

Closes #84801
2022-07-26 07:57:47 -07:00
Artem Prigoda c0bc85522d
Clean up desired nodes in between dry run tests (#88797) 2022-07-26 12:04:06 +02:00
Artem Prigoda 72a6fdc2b8
Support "dry run" mode for updating Desired Nodes (#88305)
Add the dry_run query parameter to support simulating an update of desired nodes. The update request will be validated, but no cluster state updates will be performed. To indicate that the response was the result of a dry run, we add the dry_run field to the JSON representation of the response.
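
A toy Python model of the dry-run behaviour. The validation rule, the
`cluster_state` dict and the `node_name` key are placeholders invented for
the example; only the idea of validating without applying and echoing
`dry_run` in the response comes from this change.

```python
import copy


def update_desired_nodes(cluster_state, desired_nodes, dry_run=False):
    """Validate the request, then apply it unless this is a dry run."""
    for node in desired_nodes:
        # Hypothetical validation rule, for illustration only.
        if not node.get("node_name"):
            raise ValueError("every desired node needs a node_name")
    if not dry_run:
        # Only a real update touches the (toy) cluster state.
        cluster_state["desired_nodes"] = copy.deepcopy(desired_nodes)
    return {"dry_run": dry_run}
```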

See #82975
2022-07-26 09:03:12 +02:00
Keith Massey 4b060a6046
Removing the notion of components from the health API (#88663)
This commit removes the notion of components from the health API. They are gone from being
a top-level field in the response, and indicators is promoted into its place.
2022-07-25 12:29:06 -05:00
Andrei Dan da765ced7f
Remove help_url, rename summary to symptom, and user_actions to diagnosis (#88553)
Remove help_url, rename summary->symptom, user_actions->diagnosis
Separate the diagnosis `message` field into `cause` and `action`
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
2022-07-25 10:35:16 +01:00
Julie Tibshirani e3ede67262
Integrate ANN into _search endpoint (#88694)
This PR adds a new `knn` option to the `_search` API to support ANN search.
It's powered by the same Lucene ANN capabilities as the old `_knn_search`
endpoint. The `knn` option can be combined with other search features like
queries and aggregations.

Addresses #87625
2022-07-22 08:02:07 -07:00
Benjamin Trent 94f2544998
Adding cardinality support for random_sampler agg (#86838)
This adds support for the `cardinality` aggregation within a random_sampler.

This use case is helpful for determining the ratio of unique values to the total document count within the sampled set.
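
The ratio in question can be illustrated with a small Python sketch. It
counts distinct values exactly with a set, whereas the real `cardinality`
aggregation is approximate; both helper names are made up for the example.

```python
import random


def sample_docs(docs, probability, seed=42):
    """Keep each document with the given probability (seeded for repeatability)."""
    rng = random.Random(seed)
    return [doc for doc in docs if rng.random() < probability]


def unique_ratio(values):
    """Ratio of distinct values to the number of values in the sample."""
    if not values:
        return 0.0
    return len(set(values)) / len(values)
```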
2022-07-21 07:19:35 -04:00
Seth Michael Larson fffabae10a
Add pagination parameters to API spec and docs for 'snapshot.get' API 2022-07-20 06:35:52 -05:00
tmgordeeva ab2602ecb0
Propagate alias filters to significance aggs filters (#88221)
Propagate alias filters to significance aggs filters

If we have an alias filter, use it as part of the background filter on a
significant terms agg. Previously, alias filters did not apply to background
filters so this will change bg_count results for some significant terms aggs
using background filter.

Closes #81585
2022-07-19 10:03:08 -07:00
Seth Michael Larson 478c06ef29
Verify that 'details' aren't sent when explain=false 2022-07-18 09:48:11 -05:00
Benjamin Trent afa28d49b4
[ML] add new cache_size parameter to trained_model deployments API (#88450)
With: https://github.com/elastic/ml-cpp/pull/2305 we now support caching pytorch inference responses per node per model.

By default, the cache will be the same size as the model's size on disk. This is because our current best estimate for memory used (for deploying) is 2*model_size + constant_overhead.

This is due to the model having to be loaded in memory twice when serializing to the native process. 

But, once the model is in memory and accepting requests, its actual memory usage is reduced vs. what we have "reserved" for it within the node.

Consequently, having a cache layer that takes advantage of that unused (but reserved) memory is effectively free. When used in production, especially in search scenarios, caching inference results is critical for decreasing latency.
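
The arithmetic behind that reasoning, as a Python sketch. The steady-state
figure assumes that exactly one copy of the model plus the constant
overhead stays resident after loading; that is a simplification made for
this example, and the overhead value is whatever the caller supplies.

```python
def reserved_bytes(model_size, constant_overhead):
    """Memory reserved for a deployment: 2*model_size + constant_overhead."""
    return 2 * model_size + constant_overhead


def default_cache_size(model_size):
    """By default the cache is as large as the model on disk."""
    return model_size


def unused_after_load(model_size, constant_overhead):
    """Reserved memory left over once the second in-memory copy is gone.

    Assumes steady-state usage of model_size + constant_overhead.
    """
    steady_state = model_size + constant_overhead
    return reserved_bytes(model_size, constant_overhead) - steady_state
```

Under that assumption the leftover memory equals the default cache size, which is why the cache fits inside memory that is already reserved.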
2022-07-18 09:19:01 -04:00
Alan Woodward 5c11a81913
Add 'mode' option to `_source` field mapper (#88211)
Currently we have two parameters that control how the source of a document
is stored, `enabled` and `synthetic`, both booleans. However, there are only
three possible combinations of these, with `enabled:false` and `synthetic:true`
being disallowed. To make this easier to reason about, this commit replaces
the `enabled` parameter with a new `mode` parameter, which can take the values
`stored`, `synthetic` and `disabled`. The `mode` parameter cannot be set
in combination with `enabled`, and we will subsequently move towards
deprecating `enabled` entirely.
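
The mapping between the old booleans and the new `mode` values can be
written out as a small Python sketch. The function names and error
messages are illustrative only.

```python
def source_mode(enabled=True, synthetic=False):
    """Translate the legacy booleans into one of the three modes."""
    if not enabled and synthetic:
        # The one combination that was never allowed.
        raise ValueError("enabled:false cannot be combined with synthetic:true")
    if not enabled:
        return "disabled"
    return "synthetic" if synthetic else "stored"


def parse_source_mapping(mapping):
    """Resolve the mode of a `_source` mapping given as a dict."""
    if "mode" in mapping:
        if "enabled" in mapping:
            raise ValueError("[mode] cannot be set in combination with [enabled]")
        if mapping["mode"] not in ("stored", "synthetic", "disabled"):
            raise ValueError("unknown mode [%s]" % mapping["mode"])
        return mapping["mode"]
    return source_mode(mapping.get("enabled", True), mapping.get("synthetic", False))
```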
2022-07-18 12:50:10 +01:00
Chen Ni c45c205c33
Add test execution guide in yamlRestTest asciidoc (#88490) 2022-07-14 08:22:35 -07:00
Nhat Nguyen 227d80975b
Add tests for query/agg on lookup runtime fields (#88389)
Adds tests to ensure that querying and aggregating on lookup runtimes
aren't supported.

Relates #88296
2022-07-09 02:02:13 +09:30
Nikolaj Volgushev f42b15bc8c
Updatable API keys - REST API spec and tests (#88270)
This PR adds REST API spec and YAML test files for the UpdateApiKey
operation.
2022-07-08 11:48:02 +02:00
Ryan Ernst 9016883e1c
Add build_flavor back to info api rest response (#88336)
The build_flavor was previously removed since it is no longer relevant;
only the default distribution now exists. However, the removal of build
flavor included removing it from the version information on the info
response for the root path. This API is supposed to be stable, so
removing that key was a compatibility break. This commit adds the
build_flavor back to that API, hardcoded to `default`. Additionally, a
test is added to ensure the key exists going forward, until it can be
properly deprecated.

closes #88318
2022-07-08 09:54:29 +09:30
Mark Tozzi 9ee6a19187
Add ability to select execution mode for cardinality aggregation (#87704)
Plumbs through a new parameter for the cardinality aggregation, to allow configuring the execution mode.  This can have significant impacts on speed and memory usage.  This PR exposes three collection modes and two heuristics that we can tune going forward.  All of these are treated as hints and can be silently ignored, e.g. if not applicable to the given field type.  I've changed the default behavior to optimize for time, which potentially uses more memory.  Users can override this for the old behavior if needed.
2022-07-05 09:11:22 -04:00
Rene Groeschke 8ccae4da71
Setup elasticsearch dependency monitoring with Snyk for production code (#88036)
This adds logic to generate Gradle dependency graphs and upload them to Snyk.

We implemented a REST API based Snyk plugin directly because the existing Snyk Gradle plugin
delegates to the Snyk command line tool, and the command line tool uses custom Gradle logic,
injected through an init file, that

a) uses deprecated build logic, which we definitely want to avoid
b) uses Gradle APIs we avoid, like eager task creation.

Shipping this as an internal Gradle plugin gives us the most flexibility. As we only want to monitor
production code for now, we apply this plugin as part of the elasticsearch.build plugin;
that usage has so far been the de facto indicator of whether a project is considered a "production" project
that ends up in our distribution or public Maven repositories. This isn't ideal yet, and we will revisit
the distinction between production and non-production code / projects in a separate effort.

As part of this effort we added the elasticsearch.build plugin to more projects that actually end up
in the distribution. To unblock us on this, we have for now disabled a few check tasks that started failing after applying elasticsearch.build.

Addresses  #87620
2022-06-29 13:29:14 +02:00
Nik Everett d88dfb11c7
More REST tests for avg/max/min/sum_bucket aggs (#88027)
Adds REST layer tests for some sneaky cases in the `avg_bucket`,
`max_bucket`, `min_bucket`, and `sum_bucket` pipeline aggregations.
This gives us forwards and backwards compatibility tests for these
aggs as well as mixed version cluster tests for these aggs.

Relates to #26220
2022-06-27 13:49:29 -04:00
Ryan Ernst e3c4cddbe2
Remove legacy bootstrap plugins (#87775)
Bootstrap plugins were an internal mechanism added to allow a
filesystemprovider for cloud with the quota-aware-fs plugin. Since that
was removed, bootstrap plugins no longer serve a purpose. They were
never officially documented because they were for internal use only.
This commit removes the bootstrap plugins infrastructure.
2022-06-23 20:38:06 -04:00
Julie Tibshirani 572a5b9bb4 Skip dense_vector field usage test before 8.1
Fixes #87971.
2022-06-23 10:25:17 -07:00
Julie Tibshirani 3a9e511117
Move kNN search and dense vectors to core (#87815)
This PR moves kNN search and dense vector support out of an xpack plugin and
into server.

In #87625 we plan to integrate ANN search into the main `_search` endpoint as a
new top-level component called `knn`. So kNN will be a dedicated part of the
search request, and we'll have kNN logic within the search phases. The classes
and logic will live in server, matching the other search components like
suggesters, field collapsing, etc.
2022-06-22 21:10:20 -07:00
Nik Everett 463d46cd79
Add force_synthetic_source to mget (#87574)
This adds the option to force synthetic source to the MGET API. See
 #87068 for more discussion on why you'd want to do that - the short
version is to get an upper bound on the performance cost of using
synthetic source in MGET.
2022-06-22 08:55:55 -04:00
Mark Tozzi 5f2411a3b8
Revert "Correct skip versions for new flattened terms test (#87540)" (#87764)
This reverts commit f72c7da7ee.
2022-06-16 16:35:39 -04:00
Nik Everett cf154fd367
Tests for synthetic _source from translog (#87578)
This adds tests to make sure that we use all of the normal synthetic
source machinery, even when loading from the translog. So all GETs on
synthetic source indices will require an in memory index. That'll be an
extra cost on indices that are updated very very frequently.
2022-06-16 14:51:17 -04:00
Mark Tozzi f72c7da7ee
Correct skip versions for new flattened terms test (#87540)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2022-06-16 14:38:05 -04:00
Nik Everett 5b525a290e
REST tests for avg/max/min/sum_bucket aggs (#87009)
Adds REST layer tests for the `avg_bucket`, `max_bucket`, `min_bucket`,
and `sum_bucket` pipeline aggregations. This gives us forwards and
backwards compatibility tests for these aggs as well as mixed version
cluster tests for these aggs.

Relates to #26220
2022-06-16 12:16:06 -04:00
Nik Everett d74b45b9b1
REST tests for stats_bucket aggs (#87006)
Adds REST tests for `stats_bucket` and `extended_stats_bucket` aggs.

Relates to #26220
2022-06-16 12:14:52 -04:00
Nik Everett 48ab87f60b
Fix synthetic source highlighting tests (#87749)
The synthetic source highlighting tests would sometimes fail in a
strange way - they expect the entire search request to fail but it
*didn't* - only a single shard would fail. This locks the tests to
always make single shard indices so the failures are consistent.

Closes #87730
2022-06-16 12:07:43 -04:00
Nik Everett 8ebf39b7e1
Fixup highlighting with synthetic source (#87667)
Synthetic source has a habit of reordering text fields. This frustrates
highlighting because it *often* wants to use index structures to find
the offsets to values in the field. This disables the FVH highlighter
for multi-valued text fields when synthetic source is enabled and runs
the unified highlighter in "analyze" mode when synthetic source is
enabled. That's *enough* to stop them from spitting out wrong answers.

We might be leaving some performance on the table when the unified
highlighter works on a single valued text field that is indexed with
offsets or term vectors. We don't really expect that to be common at all
though because *generally* folks will enable synthetic source to save
space and adding offsets or term vectors is quite space inefficient. If
it comes up, we might be able to improve here.
2022-06-15 14:49:06 -04:00
David Turner fcf293f87c
Report overall mapping size in cluster stats (#87556)
Adds measures of the total size of all mappings and the total number of
fields in the cluster (both before and after deduplication).

Relates #86639
Relates #77466
2022-06-14 13:55:14 +01:00
Nik Everett a37edb7796
Add force_synthetic_source to GET (#87536)
This adds the option to force synthetic source to the GET API. See
 #87068 for more discussion on why you'd want to do that - the short
version is to get an upper bound on the performance cost of using
synthetic source in GET.
2022-06-09 09:40:36 -04:00
Nik Everett 2ec59e799b
Fix test in synthetic source (#87534)
Fixes a test for forcing synthetic source that sometimes fails if the
index has more than one shard. We're just looking for a sensible failure
message here so we can lock it to one shard.
2022-06-08 15:17:25 -04:00
Yang Wang f5ceed19fc
User Profile - remove feature flag (#87383)
The feature flag is no longer necessary in the 8.4 release cycle. The
feature itself is still in beta.
2022-06-08 10:18:18 -04:00
Mark Tozzi c9af118237
Fix a bug with flattened fields in terms aggregations (#87392)
The root cause here was that missing did not correctly delegate `supportsGlobalOrdinalsMapping` to the wrapped values source, instead falling back to the default.  I've added the delegation, and made the base method abstract so this doesn't happen again.
2022-06-08 08:08:18 -04:00
Salvatore Campagna 5d062f9fdd
Make the metric in the buckets_path parameter optional (#87220)
With this change the metric field name becomes optional if the
'buckets_path' is pointing to a multi-value aggregation with a single
metric field. Normally the full path would be required including
the aggregation name followed by the metric field.

If the metric is not specified in the path and the multi-value
aggregation computes more than one value an error is thrown.

The old notation is still supported for backward compatibility in case
the full path is specified and the target multi-value aggregation
computes a single value.
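
The resolution rule can be sketched in Python as follows. Aggregation
results are modelled as plain dicts of metric name to value; the function
name and the error text are assumptions for illustration.

```python
def resolve_bucket_metric(path, agg_results):
    """Resolve "agg" or "agg.metric" against {agg_name: {metric: value}}."""
    agg_name, dot, metric = path.partition(".")
    metrics = agg_results[agg_name]
    if dot:
        # Full path given: aggregation name followed by the metric name.
        return metrics[metric]
    if len(metrics) != 1:
        raise ValueError(
            "[%s] computes more than one value, a metric name is required" % agg_name
        )
    # Exactly one metric, so the name can be left out.
    return next(iter(metrics.values()))
```

Both `the_agg` and `the_agg.value` resolve to the same number when the aggregation has a single metric.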
2022-06-08 10:44:02 +02:00
Nik Everett d0b50b56a0
Add an option to _search to force synthetic source (#87068)
This adds `?force_synthetic_source` to, well, force running the fetch
phase with synthetic source. If the mapping is incompatible with
synthetic source it'll throw a 400 error.
2022-06-07 11:11:14 -04:00
Albert Zaharovits 7be60d6068
[DOCS] Profile Has Privileges API (#87360)
Docs for the new Has Privileges API for profiles from #85898.

[Has privileges user profile API
preview](https://elasticsearch_87360.docs-preview.app.elstc.co/guide/en/elasticsearch/reference/master/security-api-has-privileges-user-profiles.html).
2022-06-07 03:58:32 -04:00
Keith Massey c95230d155
Master stability health indicator part 1 (when a master has been seen recently) (#86524)
This is the first PR for the master stability check, which is part of the health API. It handles the case
when we have seen a master node recently. The more complicated case when we have not seen a
master node recently will be in subsequent PRs.
2022-06-06 14:40:15 -05:00
Nik Everett 5f70d30330
Synthetic source: paranoid tests for configuration (#87182)
This adds some paranoid REST layer tests for modifying the `synthetic`
configuration.
2022-06-06 09:37:57 -04:00
Luca Cavanna 50793a68a8
Fields API to allow fetching values when _source is disabled (#87267)
Back when we introduced the fields parameter to the search API, it could only fetch values from _source, hence
the corresponding sub-fetch phase fails early whenever _source is disabled. Today though runtime fields can
be retrieved from a separate value fetcher that reads from fielddata, and metadata fields can be retrieved
from stored fields. These two scenarios currently throw an unnecessary error whenever _source is disabled.

This commit removes the check for disabled _source, so that runtime fields and metadata fields can be retrieved even when _source is disabled. Fields that need to be loaded from _source are simply skipped whenever _source is disabled, similar to when a field is not found in _source.
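
A simplified Python model of the new behaviour. The three lookup sources
are passed in as plain dicts, and a disabled `_source` is modelled as
`None`; all names here are invented for the example.

```python
def fetch_fields(requested, source, runtime_values, metadata_values):
    """Collect field values, skipping _source-backed fields when it is disabled."""
    result = {}
    for field in requested:
        if field in runtime_values:
            # Runtime fields come from their own value fetcher.
            result[field] = runtime_values[field]
        elif field in metadata_values:
            # Metadata fields are read from stored fields.
            result[field] = metadata_values[field]
        elif source is not None and field in source:
            result[field] = source[field]
        # Otherwise skip quietly, the same as a field missing from _source.
    return result
```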

Closes #87072
2022-06-02 11:28:36 +02:00
Nik Everett f6958190cf
Synthetic source: tests for disabling subobjects (#87261)
This adds some paranoid tests for synthetic source with disabling
subobjects, as added by #86166. It turns out that synthetic source does
exactly what you'd expect with disabling subobjects - it creates fields
with dots in their names. This adds tests for that.
2022-06-01 09:53:31 -04:00
Fernando Briano 89bc5596fb
Rest API Spec: Wraps YAML tests values with colon in quotes (#87148) 2022-05-26 15:24:59 +01:00
James Baiera 0ec6998f16
Health api copy editing (#87010)
Edits to human readable text returned from the health api. A number of edits have been added for 
readability and correctness.
2022-05-24 16:32:30 -04:00