Commit Graph

255 Commits

Author SHA1 Message Date
David Kyle bbeda643a6
Delete expired data by job (#57337)
Deleting expired data can take a long time leading to timeouts if there
are many jobs. Often the problem is due to a few large jobs which 
prevent the regular maintenance of the remaining jobs. This change adds
a job_id parameter to the delete expired data endpoint to help clean up
those problematic jobs.
2020-06-05 13:32:35 +01:00
David Roberts 605b4d0ea9
[ML] Add per-partition categorization option (#57683)
This PR adds the initial Java side changes to enable
use of the per-partition categorization functionality
added in elastic/ml-cpp#1293.

There will be a followup change to complete the work,
as there cannot be any end-to-end integration tests
until elastic/ml-cpp#1293 is merged, and also
elastic/ml-cpp#1293 does not implement some of the
more peripheral functionality, like stop_on_warn and
per-partition stats documents.

The changes so far cover REST APIs, results object
formats, HLRC and docs.
2020-06-05 11:56:15 +01:00
Dimitris Athanasiou e116ac850f
[ML] Fix race condition when force stopping DF analytics job (#57680)
When we force delete a DF analytics job, we currently first force
stop it and then we proceed with deleting the job config.
This may result in logging errors if the job config is deleted
before it is retrieved while the job is starting.

Instead of force stopping the job, it would make more sense to
try to stop the job gracefully first. So we now try that out first.
If normal stop fails, then we resort to force stopping the job to
ensure we can go through with the delete.

In addition, this commit introduces `timeout` for the delete action
and makes use of it in the child requests.
2020-06-05 12:13:02 +03:00
István Zoltán Szabó 3a15d84af9
[DOCS] Changes parameter order in model_plot_config. (#57642) 2020-06-04 10:57:36 +02:00
Przemysław Witek c4c094c006
Introduce ModelPlotConfig. annotations_enabled setting (#57539) 2020-06-04 09:27:40 +02:00
Lisa Cawley 0f52cab495
[DOCS] Replaces docdir attributes in ML APIs (#57390) 2020-06-01 11:46:10 -07:00
Benjamin Trent 251b17009a
[ML] adds new for_export flag to GET _ml/inference API (#57351)
Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API.

This flag is useful for moving models between clusters.
2020-05-29 12:29:28 -04:00
Benjamin Trent ec67787a2e
[ML] add max_model_memory parameter to forecast request (#57254)
This adds a max_model_memory setting to forecast requests. 
This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb").

The default value is `20mb`.

There is a HARD limit at `500mb` which will throw an error if used.

If the limit is larger than 40% the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited.

related native change: https://github.com/elastic/ml-cpp/pull/1238

closes: https://github.com/elastic/elasticsearch/issues/56420
2020-05-29 08:59:50 -04:00
István Zoltán Szabó eaf0d5ffee
[DOCS] Puts a link into the loss_function variable description (#56678) 2020-05-28 09:42:27 +02:00
István Zoltán Szabó b9b3546985
[DOCS] Fixes formatting of admonition paragraph in PUT inference API docs. (#57196) 2020-05-27 13:42:50 +02:00
István Zoltán Szabó 90056edaf4
[DOCS] Improves navigation between forecast APIs and adds short description. (#57035) 2020-05-25 09:09:47 +02:00
István Zoltán Szabó 69b6041d57
[DOCS] Removes the Jobs section from the ML anomaly detection APIs page. (#57031) 2020-05-21 17:30:59 +02:00
Benjamin Trent 8fed077b0a
[ML] relax throttling on expired data cleanup (#56711)
Throttling nightly cleanup as much as we do has been over cautious.

Night cleanup should be more lenient in its throttling. We still
keep the same batch size, but now the requests per second scale
with the number of data nodes. If we have more than 5 data nodes,
we don't throttle at all.

Additionally, the API now has `requests_per_second` and `timeout` set.
So users calling the API directly can set the throttling.

This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`.
This will allow users to adjust throttling of the nightly maintenance.
2020-05-18 07:21:06 -04:00
David Roberts cbb8b17d74
[DOCS] Docs changes for overridden delimiter in find_file_structure (#56288)
Docs for #55735

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-05-14 09:24:07 +01:00
Lisa Cawley 84e28e42c8
[DOCS] Clarify model snapshot retention properties (#56477) 2020-05-11 07:41:47 -07:00
István Zoltán Szabó c994369893
[DOCS] Expands GET DFA stats API docs with new phases (#56407)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-05-11 09:22:30 +02:00
David Roberts c99021cdcb
[ML] More advanced model snapshot retention options (#56125)
This PR implements the following changes to make ML model snapshot
retention more flexible in advance of adding a UI for the feature in
an upcoming release.

- The default for `model_snapshot_retention_days` for new jobs is now
  10 instead of 1
- There is a new job setting, `daily_model_snapshot_retention_after_days`,
  that defaults to 1 for new jobs and `model_snapshot_retention_days`
  for pre-7.8 jobs
- For days that are older than `model_snapshot_retention_days`, all
  model snapshots are deleted as before
- For days that are in between `daily_model_snapshot_retention_after_days`
  and `model_snapshot_retention_days` all but the first model snapshot
  for that day are deleted
- The `retain` setting of model snapshots is still respected to allow
  selected model snapshots to be retained indefinitely

Closes #52150
2020-05-05 12:55:50 +01:00
Dimitris Athanasiou 6bf3834059
[ML] Add loss_function to regression (#56118)
Adds parameters `loss_function` and `loss_function_parameter`
to regression.
2020-05-05 12:36:05 +03:00
István Zoltán Szabó 86032ac56a
[DOCS] Simplifies footnote text in DFA APIs (#56105)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-05-05 09:03:16 +02:00
Lisa Cawley 52a2f7689f
[DOCS] Synchs and links hyperparameter descriptions (#55827) 2020-05-04 07:37:14 -07:00
Lisa Cawley 5ef7aacbf7
[DOCS] Adds documentation for secondary authorization headers (#55365)
Co-authored-by: Tim Vernum <tim@adjective.org>
2020-04-29 08:28:42 -07:00
István Zoltán Szabó d70cef3474
[DOCS] Makes the footnotes less verbose in configuring aggs page. (#55857) 2020-04-29 09:50:41 +02:00
István Zoltán Szabó ca2f98382f
[DOCS] Changes feature importance links to point to the new page (#55531)
* [DOCS] Changes feature importance links to point to the new page.

* [DOCS] Fixes line breaks.
2020-04-28 09:02:14 +02:00
David Roberts dcb6ed03cd
[ML] Adding failed_category_count to model_size_stats (#55716)
The failed_category_count statistic records the number of times
categorization wanted to create a new category but couldn't
because the job had reached its model_memory_limit.

Relates elastic/ml-cpp#1130
2020-04-25 08:01:21 +01:00
Lisa Cawley 7fafec0f8f
[DOCS] Update example and nesting in get data frame analytics job stats API (#55191)
Co-Authored-By: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>
2020-04-22 08:07:31 -07:00
David Roberts d1a9b3a545
[ML] Add effective max model memory limit to ML info (#55529)
The ML info endpoint returns the max_model_memory_limit setting
if one is configured.  However, it is still possible to create
a job that cannot run anywhere in the current cluster because
no node in the cluster has enough memory to accommodate it.

This change adds an extra piece of information,
limits.effective_max_model_memory_limit, to the ML info
response that returns the biggest model memory limit that could
be run in the current cluster assuming no other jobs were
running.

The idea is that the ML UI will be able to warn users who try to
create jobs with higher model memory limits that their jobs will
not be able to start unless they add a bigger ML node to their
cluster.

Relates elastic/kibana#63942
2020-04-22 11:36:58 +01:00
David Roberts 8906e76079
[ML] Return assigned node in start/open job/datafeed response (#55473)
Adds a "node" field to the response from the following endpoints:

1. Open anomaly detection job
2. Start datafeed
3. Start data frame analytics job

If the job or datafeed is assigned to a node immediately then
this field will return the ID of that node.

In the case where a job or datafeed is opened or started lazily
the node field will contain an empty string.  Clients that want
to test whether a job or datafeed was opened or started lazily
can therefore check for this.

Fixes #54067
2020-04-22 08:44:57 +01:00
István Zoltán Szabó f8bfab2dab
[DOCS] Provides further details on aggregations in datafeeds (#55462)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-04-22 08:53:34 +02:00
Benjamin Trent f72b2db29a
[ML] partitions model definitions into chunks (#55260)
This paves the data layer way so that exceptionally large models are partitioned across multiple documents.

This change means that nodes before 7.8.0 will not be able to use trained inference models created on nodes on or after 7.8.0.

I chose the definition document limit to be 100. This *SHOULD* be plenty for any large model. One of the largest models that I have created so far had the following stats:
~314MB of inflated JSON, ~66MB when compressed, ~177MB of heap. 
With the chunking sizes of `16 * 1024 * 1024` its compressed string could be partitioned to 5 documents. 
Supporting models 20 times this size (compressed) seems adequate for now.
2020-04-20 15:13:23 -04:00
Benjamin Trent c1afda4a23
[ML] adding prediction_field_type to inference config (#55128)
Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. 

Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. 

Here is added a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). 

Analytics provides the default `prediction_field_type` when the model is created from the process.
2020-04-15 08:32:48 -04:00
Lisa Cawley 1f0341db39
[DOCS] Removes unshared sections from ml-shared.asciidoc (#55129) 2020-04-14 15:19:31 -07:00
Lisa Cawley 998a085c14
[DOCS] Edits create data frame analytics job API (#54751) 2020-04-13 09:58:03 -07:00
István Zoltán Szabó b1b067c5ba
[DOCS] Adds link points to the data frame analytics supported fields (#55004)
Co-authored-by: lcawl <lcawley@elastic.co>
2020-04-09 11:16:13 -07:00
István Zoltán Szabó bb44726ad6
[DOCS] Reworks some parts of EMM API docs (#54872)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-04-08 09:50:12 +02:00
Lisa Cawley c355fea8f4
[DOCS] Remove text fields from classification dependent variables (#54849) 2020-04-07 10:43:15 -07:00
István Zoltán Szabó 1ae8bde756
[DOCS] Changes kibana_user to kibana_admin in DFA API prerequisites. (#54806) 2020-04-06 15:45:08 +02:00
István Zoltán Szabó a0662399c7
[DOCS] Makes PUT inference API docs collapsible (#54653)
Co-authored-by: lcawl <lcawley@elastic.co>
2020-04-03 09:45:42 +02:00
Benjamin Trent 4e1ff31c3c
[ML] add new inference_config field to trained model config (#54421)
A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. 

The inference processor can still override whatever is set as the default in the trained model config.
2020-04-02 10:34:17 -04:00
Benjamin Trent 1d24960ff8
[ML] prefer secondary authorization header for data[feed|frame] authz (#54121)
Secondary authorization headers are to be used to facilitate Kibana spaces support + ML jobs/datafeeds. 

Now on PUT/Update/Preview datafeed, and PUT data frame analytics the secondary authorization is preferred over the primary (if provided).

closes https://github.com/elastic/elasticsearch/issues/53801
2020-04-02 10:10:46 -04:00
Benjamin Trent bbd6e943de
[ML] add num_matches and preferred_to_categories to category defintion objects (#54214)
This adds two new fields to category definitions.

- `num_matches` indicating how many documents have been seen by this category
- `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized.

These fields are only guaranteed to be up to date after a `_flush` or `_close`

native change: https://github.com/elastic/ml-cpp/pull/1062
2020-04-02 07:49:09 -04:00
István Zoltán Szabó b0f6d4ee0e
[DOCS] Updates estimate model memory docs (#54574) 2020-04-01 15:53:53 +02:00
István Zoltán Szabó b96743cfc5
[DOCS] Adds data_counts object to the GET DFA stats API (#54498) 2020-04-01 10:05:00 +02:00
Jason Tedor 95a7eed9aa
Rename MetaData to Metadata in all of the places (#54519)
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
2020-03-31 15:52:01 -04:00
Lisa Cawley b90e491f68
[DOCS] Collapses nested objects in data frame analytics APIs (#54472) 2020-03-31 10:56:48 -07:00
Dimitris Athanasiou 5a98fc20e1
[ML] Fix DF analytics explain API request in docs (#54510)
The explain API expects a data frame analytics config
as its request.
2020-03-31 18:37:19 +03:00
István Zoltán Szabó 85d9b34dc5
[DOCS] Adds description of analysis_stats object and its properties to GET DFA stats API docs (#53881)
Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-03-31 13:27:54 +02:00
Lisa Cawley fdcd19483d
[DOCS] Collapses content in machine learning APIs (#54234) 2020-03-30 10:08:38 -07:00
Jason Tedor 1fc0432b24
Introduce formal role for remote cluster client (#53924)
This commit introduce a formal role for identifying nodes that are
capable of making connections to remote clusters.
2020-03-24 19:21:56 -04:00
David Roberts 8ee770560a
[ML] Add a model memory estimation endpoint for anomaly detection (#53507)
A new endpoint for estimating anomaly detection job
model memory requirements:

POST _ml/anomaly_detectors/estimate_model_memory

Closes #53219
2020-03-24 21:38:19 +00:00
David Roberts cbe063a074
[ML] Introduce a "starting" datafeed state for lazy jobs (#53918)
It is possible for ML jobs to open lazily if the "allow_lazy_open"
option in the job config is set to true.  Such jobs wait in the
"opening" state until a node has sufficient capacity to run them.

This commit fixes the bug that prevented datafeeds for jobs lazily
waiting assignment from being started.  The state of such datafeeds
is "starting", and they can be stopped by the stop datafeed API
while in this state with or without force.

Fixes #53763
2020-03-24 10:01:13 +00:00