490 Commits

Author SHA1 Message Date
David Roberts bf00ab381e
[ML] Add ML memory stats API (#83802)
Adds an API that can be used to find out how much memory ML
is permitted to use and is currently using on each node, both
within the JVM heap, and natively, outside of the JVM.
2022-02-17 09:19:14 +00:00
Lisa Cawley 458ef91066
[DOCS] Move ML info and upgrade APIs (#84005) 2022-02-16 11:23:00 -08:00
Tobias Stadler e3deacf547
[DOCS] Fix typos (#83895) 2022-02-15 12:42:17 -05:00
Lisa Cawley 104efd4343
[DOCS] Minor edits to trained model APIs (#81549) 2022-02-09 13:44:13 -08:00
David Kyle c1fbf87de8
[ML] Add error counts to trained model stats (#82705)
Adds inference_count, timeout_count, rejected_execution_count
and error_count fields to trained model stats.
2022-01-27 16:18:20 +00:00
István Zoltán Szabó b42ba64019
[DOCS] Fixes geo function field names. (#83198) 2022-01-27 12:03:58 +01:00
Ugo Sangiorgi 305ff20b8f
[DOCS] Add missing HTML anchors to CCR and ML (#80287) 2022-01-26 11:00:40 -08:00
István Zoltán Szabó a5affc7104
[DOCS] Fixes field names in ML sum functions. (#83048) 2022-01-25 15:28:06 +01:00
Lisa Cawley 91cd38df57
[DOCS] Fix links to anomaly detection docs (#82836) 2022-01-19 17:54:18 -08:00
Lisa Cawley c98833f9c6
[DOCS] Fix links to anomaly detection docs (#82774) 2022-01-18 17:42:16 -08:00
Dimitris Athanasiou 93777b4e99
[ML] Add latest search interval to datafeed stats (#82620)
This commit adds `search_interval` to the datafeed stats API
`running_state` object. When the datafeed is running, it reports
the last search interval that was searched. It is useful to
understand the point in time where the datafeed is currently
searching.

Closes #82405
2022-01-16 16:04:35 +02:00
David Kyle 1473b09415
[ML] Add NLP inference configs to the inference processor docs (#82320) 2022-01-11 08:50:45 +00:00
Ed Savage e8a46649c5
[ML] Warn when creating job with an unusual bucket span (#82145)
Emit a deprecation warning when creating new jobs with bucket spans that
aren't an integral divisor or multiple of a day.

Relates #81645

Co-authored-by: lcawl <lcawley@elastic.co>
2022-01-10 17:04:18 +00:00
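The bucket span check described in #82145 above comes down to simple arithmetic. A minimal Python sketch, assuming bucket spans are expressed in whole seconds; the function name `is_usual_bucket_span` is hypothetical, and this is not the Elasticsearch implementation:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def is_usual_bucket_span(bucket_span_seconds: int) -> bool:
    """Return True if the span divides a day evenly or is a whole multiple of a day.

    Hypothetical helper: it only mirrors the rule stated in the commit message.
    """
    if bucket_span_seconds <= 0:
        raise ValueError("bucket span must be positive")
    divides_a_day = SECONDS_PER_DAY % bucket_span_seconds == 0
    multiple_of_a_day = bucket_span_seconds % SECONDS_PER_DAY == 0
    return divides_a_day or multiple_of_a_day
```

Under this rule a 15 minute or 2 day bucket span would pass, while a 7 hour span would trigger the warning.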
Benjamin Trent 9dc8aea1cb
[ML] adds new mpnet tokenization for nlp models (#82234)
This commit adds support for MPNet based models.

MPNet models differ from BERT style models in that:

 - Special tokens are different
 - Input to the model doesn't require token positions.

To configure an MPNet tokenizer for your PyTorch MPNet based model:

```
"tokenization": {
  "mpnet": {...}
}
```
The options provided to `mpnet` are the same as the previously supported `bert` configuration.
2022-01-05 12:56:47 -05:00
Dimitris Athanasiou 14a63ac115
[ML] Improve reporting of trained model size stats (#82000)
This improves reporting of trained model size in the response of the stats API.

In particular, it removes the `model_size_bytes` from the `deployment_stats` section and
replaces it with a top-level `model_size_stats` object that contains:

- `model_size_bytes`: the actual model size
- `required_native_memory_bytes`: the amount of memory required to load a model

In addition, these are now reported for PyTorch models regardless of their deployment state.
2021-12-22 18:20:47 +02:00
Ed Savage a646f55c57
[ML] Set default value of 30 days for model prune window (#81377)
For new jobs, when the analysis config field model_prune_window is not set, use a default value of 30 days or 20 times the bucket span, whichever is greater.

Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2021-12-20 11:27:30 +00:00
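The default described in #81377 above is the greater of two durations. A minimal Python sketch of that arithmetic, assuming the bucket span is available as a `timedelta`; the function name `default_model_prune_window` is hypothetical and the real logic lives in the Elasticsearch ML code:

```python
from datetime import timedelta

# Values taken from the commit message: 30 days, or 20 bucket spans.
MIN_PRUNE_WINDOW = timedelta(days=30)
BUCKET_SPAN_MULTIPLIER = 20


def default_model_prune_window(bucket_span: timedelta) -> timedelta:
    """Return the greater of 30 days and 20 times the bucket span."""
    return max(MIN_PRUNE_WINDOW, BUCKET_SPAN_MULTIPLIER * bucket_span)
```

With a 15 minute bucket span the 30 day floor applies; with a 2 day bucket span the result is 40 days.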
David Kyle d1ee756da8
[ML][DOCS] Add note about max values of thread settings (#81367) 2021-12-14 13:07:34 +00:00
David Roberts 0559dd087b
[ML] Model snapshot upgrade needs a stats endpoint (#81641)
Previously the ML model snapshot upgrade endpoint did not
provide a way to reliably monitor progress. This could lead
to the upgrade assistant UI thinking that a model snapshot
upgrade had finished when it actually hadn't.

This change adds a new "stats" API that allows external
interested parties to find out the status of each model
snapshot upgrade and which node (if any) each is running on.

Fixes #81519
2021-12-14 08:31:49 +00:00
Lisa Cawley 1751ced80a
[DOCS] Fix formatting in get anomaly job API (#81682) 2021-12-13 12:56:27 -08:00
David Kyle 3c974a1e5d
[ML][DOCS] Remove orphaned GET deployment stats doc (#81505) 2021-12-09 08:32:33 +00:00
Lisa Cawley 429bdd9afc
[DOCS] Move trained model APIs out of dataframe analytics (#81315) 2021-12-03 09:21:09 -08:00
David Kyle aba14aacfa
[ML][DOCS] Add zero shot example and setting truncation at inference (#81003)
More examples for the _infer endpoint
2021-12-01 11:44:04 +00:00
Lisa Cawley e5de9d8ad7
[DOCS] Add actual and typical values in ML alerting docs (#80571) 2021-11-25 10:06:52 -08:00
Lisa Cawley 8da1236bca
[DOCS] Clarify impact of force stop trained model deployment (#81026) 2021-11-25 09:08:46 -08:00
Lisa Cawley d1af86cfdd
[DOCS] Fixes start and stop trained model deployment APIs (#80978) 2021-11-24 10:09:45 -08:00
Lisa Cawley 38cbd116c9
[DOCS] Fixes query parameters for get buckets API (#80643) 2021-11-22 11:34:43 -08:00
Lisa Cawley f3a69ae4b1
[DOCS] Adds missing query parameters to ML APIs (#80863) 2021-11-22 09:25:01 -08:00
Lisa Cawley fffac5bd08
[DOCS] Adds missing query parameters in get influencer and get snapshot APIs (#80801) 2021-11-18 08:24:24 -08:00
Lisa Cawley d6f48dc5bd
[DOCS] Add query parameters to update datafeed API (#80777) 2021-11-17 07:40:31 -08:00
Dimitris Athanasiou c7f745b40a
[ML] Force delete trained models (#80595)
Adds a `force` parameter to the delete trained models API
which, when set to `true`, allows deletion of a model that
is referenced by ingest pipelines or has a started deployment.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-11-11 10:54:01 +02:00
Benjamin Trent 5627dc66e1
[ML] deprecate estimated_heap_memory_usage_bytes and replace with model_size_bytes (#80554)
This deprecates estimated_heap_memory_usage_bytes on model put and replaces it with model_size_bytes.

On GET, only model_size_bytes is returned unless v7 rest-api compatibility is requested.

For the ml/info API, only model_size_bytes is returned.

A forward-port of: #80545
2021-11-10 10:23:25 -05:00
David Roberts a61088063e
[ML] use_auto_machine_memory_percent now defaults max_model_memory_limit (#80532)
If the xpack.ml.use_auto_machine_memory_percent setting is true
and xpack.ml.max_model_memory_limit is not set, then
xpack.ml.max_model_memory_limit is now considered to be set to
the largest size that could be assigned in the cluster.

This functionality will be crucial for Cloud once the Elasticsearch
startup code is setting the Elasticsearch JVM heap size. Then the
Cloud code will no longer be able to accurately set
xpack.ml.max_model_memory_limit, so will not set it at all.
Instead the Cloud code will just set
xpack.ml.use_auto_machine_memory_percent and the ML code will
calculate the appropriate maximum model_memory_limit that should
be permitted.
2021-11-10 08:38:02 +00:00
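The defaulting rule from #80532 above can be sketched as a small decision function. This is an illustration only: the function and parameter names are hypothetical, the largest assignable size is passed in because the commit message does not say how it is calculated, and the assumption that an explicitly set limit takes precedence is inferred from the wording rather than stated:

```python
from typing import Optional


def effective_max_model_memory_limit(
    use_auto_machine_memory_percent: bool,
    configured_limit_bytes: Optional[int],
    largest_assignable_bytes: int,
) -> Optional[int]:
    """Return the limit to enforce, or None when no limit applies.

    Hypothetical helper mirroring the commit message, not the ML code.
    """
    # Assumption: an explicitly configured limit is used as-is.
    if configured_limit_bytes is not None:
        return configured_limit_bytes
    # Limit not set and auto mode on: fall back to the largest size
    # that could be assigned in the cluster.
    if use_auto_machine_memory_percent:
        return largest_assignable_bytes
    # Limit not set and auto mode off: no maximum is imposed.
    return None
```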
Lisa Cawley 6ecc495d15
[DOCS] Clarify parameters in delete expired data, forecast, and flush job APIs (#80517) 2021-11-09 14:57:35 -08:00
Lisa Cawley 1c98a23ca8
[DOCS] Edits stop and start datafeed APIs (#80461) 2021-11-09 14:39:13 -08:00
Benjamin Trent cf5f521fac
[ML] add deployment_stats to trained model stats (#80531)
This commit adds a new field deployment_stats that is optionally set for models that are deployed.

If a model does not have a deployment, it will be null.

Also, removes the get deployment stats API and makes the deployment stats action internal only.
2021-11-09 16:09:47 -05:00
Benjamin Trent c3c3f88000
[ML] validate model definition on start deployment (#80439)
Previously, when a deployment was started, we did not validate that the
definition documents were all present and not truncated. This commit adds a
validation on _start that prevents a bad state from occurring where the
deployment starts, but the model is incorrectly defined, or some unknown
error occurs too late in the deployment process.
2021-11-09 10:33:55 -05:00
Dimitris Athanasiou afe58ba6d8
[ML] Force stop deployment in use (#80431)
Implements a `force` parameter to the stop deployment API.
This allows a user to forcefully stop a deployment. Currently,
this specifically allows stopping a deployment that is in use
by ingest processors.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-11-08 14:35:52 +02:00
Lisa Cawley 733381bed2
[DOCS] Adds missing query parameters to datafeed APIs (#80314) 2021-11-05 16:31:04 -07:00
James Rodewig f56a0f4b66
[DOCS] Remove `testenv` annotations from doc snippet tests (#80023)
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.

Relates to #79309, #31619
2021-11-05 18:38:50 -04:00
István Zoltán Szabó f72e2da221
[DOCS] Adds missing query params to GET category and GET influencer APIs (#79448) 2021-11-05 10:59:57 +01:00
David Kyle 0635f2758f
[ML] Consistently apply the default truncation option for the BERT tokenizer (#80339)
The default is Truncate.First
2021-11-05 09:10:59 +00:00
Lisa Cawley 638fe2c26a
[DOCS] Fixes typo in start trained models API (#80368) 2021-11-04 14:23:03 -07:00
Dimitris Athanasiou d13baade69
[ML] Report start_time for trained model deployments and allocations (#80188)
Adds `start_time` to the get deployment stats API for the deployment
and each allocation.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-11-02 17:12:46 +02:00
David Kyle 58a517309a
[ML] [DOCS] Update the model part upload URL in example (#80181) 2021-11-02 11:33:04 +00:00
Benjamin Trent 8887cfa080
[ML] updating the infer trained model deployment docs (#80083)
The infer endpoint has changed its format.

Also, the result formats for the various tasks have changed. This updates the docs to match what is currently in 8.0.0.
2021-10-29 13:07:23 -04:00
Benjamin Trent f9bf4e57b9
[ML] adds new params to the start trained model deployment docs (#80016) 2021-10-28 11:23:25 -04:00
Benjamin Trent 375fc779b4
[ML] update truncation default & adding field output when input is truncated (#79942)
This commit makes the following two changes (along with some
refactoring):

 - NLP results will now indicate whether the input was truncated
 - The default truncation is now `none` instead of `first`
2021-10-28 10:40:49 -04:00
Benjamin Trent d2b638356b
[ML] Update trained model docs for truncate parameter for bert tokenization (#79652) 2021-10-28 07:19:10 -04:00
David Roberts 6b20e8e1b0
[ML] Fixing doc test substitution bug (#79943)
The substitutions should not have a space after the field
name.

Fixes #79931
2021-10-27 19:45:15 +01:00
Mark Vieira 8f79cfacab Mute documentation test 2021-10-27 09:48:20 -07:00