elasticsearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	1b34c88d56	[ML] adding docs + hlrc for data frame analysis feature_processors (#61149) Adds HLRC and some docs for the new feature_processors field in Data frame analytics. Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co> Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-08-24 12:00:44 -04:00
James Rodewig	a94e5cb7c4	[DOCS] Replace Wikipedia links with attribute (#61171)	2020-08-17 09:44:24 -04:00
James Rodewig	6b9b8c5e31	[DOCS] Move script and stored fields content to search fields page (#60826) Changes: * Moves `Retrieve selected fields` to its own page and adds a title abbreviation. * Adds existing script and stored fields content to `Retrieve selected fields` * Adds a xref for `Retrieve selected fields` to `Search your data` * Adds related redirects and updates existing xrefs	2020-08-06 12:45:03 -04:00
Lisa Cawley	fb0157460f	[DOCS] Changes level offset of anomaly detection pages (#59911)	2020-07-20 16:33:54 -07:00
Benjamin Trent	b551f75ec3	[ML] add new `custom` field to trained model processors (#59542) This commit adds the new configurable field `custom`. `custom` indicates if the preprocessor was submitted by a user or automatically created by the analytics job. Eventually, this field will be used in calculating feature importance. When `custom` is true, the feature importance for the processed fields is calculated. When `false` the current behavior is the same (we calculate the importance for the originating field/feature). This also adds new required methods to the preprocessor interface. If users are to supply their own preprocessors in the analytics job configuration, we need to know the input and output field names.	2020-07-16 09:35:56 -04:00
Przemysław Witek	dfbb47dcaa	Add a "verbose" option to the data frame analytics stats endpoint (#59589)	2020-07-15 15:59:56 +02:00
Przemysław Witek	4a43b03855	Report peak model memory in ModelSizeStats (#59017)	2020-07-06 10:33:54 +02:00
István Zoltán Szabó	d0042fb791	[DOCS] Updates results_field description in the inference processor docs (#58554)	2020-06-29 11:28:17 +02:00
Przemysław Witek	76c7e3259f	Make ModelPlotConfig.annotations_enabled default to ModelPlotConfig.enabled if unset (#57808)	2020-06-08 15:31:37 +02:00
David Roberts	605b4d0ea9	[ML] Add per-partition categorization option (#57683) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs.	2020-06-05 11:56:15 +01:00
Przemysław Witek	c4c094c006	Introduce ModelPlotConfig.annotations_enabled setting (#57539)	2020-06-04 09:27:40 +02:00
Lisa Cawley	0f52cab495	[DOCS] Replaces docdir attributes in ML APIs (#57390)	2020-06-01 11:46:10 -07:00
Lisa Cawley	84e28e42c8	[DOCS] Clarify model snapshot retention properties (#56477)	2020-05-11 07:41:47 -07:00
David Roberts	c99021cdcb	[ML] More advanced model snapshot retention options (#56125) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days` all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Closes #52150	2020-05-05 12:55:50 +01:00
Lisa Cawley	52a2f7689f	[DOCS] Synchs and links hyperparameter descriptions (#55827)	2020-05-04 07:37:14 -07:00
István Zoltán Szabó	ca2f98382f	[DOCS] Changes feature importance links to point to the new page (#55531) * [DOCS] Changes feature importance links to point to the new page. * [DOCS] Fixes line breaks.	2020-04-28 09:02:14 +02:00
David Roberts	dcb6ed03cd	[ML] Adding failed_category_count to model_size_stats (#55716) The failed_category_count statistic records the number of times categorization wanted to create a new category but couldn't because the job had reached its model_memory_limit. Relates elastic/ml-cpp#1130	2020-04-25 08:01:21 +01:00
Lisa Cawley	7fafec0f8f	[DOCS] Update example and nesting in get data frame analytics job stats API (#55191) Co-Authored-By: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>	2020-04-22 08:07:31 -07:00
Benjamin Trent	c1afda4a23	[ML] adding prediction_field_type to inference config (#55128) Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. Here is added a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). Analytics provides the default `prediction_field_type` when the model is created from the process.	2020-04-15 08:32:48 -04:00
Lisa Cawley	1f0341db39	[DOCS] Removes unshared sections from ml-shared.asciidoc (#55129)	2020-04-14 15:19:31 -07:00
Lisa Cawley	998a085c14	[DOCS] Edits create data frame analytics job API (#54751)	2020-04-13 09:58:03 -07:00
István Zoltán Szabó	b1b067c5ba	[DOCS] Adds link points to the data frame analytics supported fields (#55004) Co-authored-by: lcawl <lcawley@elastic.co>	2020-04-09 11:16:13 -07:00
István Zoltán Szabó	a0662399c7	[DOCS] Makes PUT inference API docs collapsible (#54653) Co-authored-by: lcawl <lcawley@elastic.co>	2020-04-03 09:45:42 +02:00
Benjamin Trent	4e1ff31c3c	[ML] add new inference_config field to trained model config (#54421) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config.	2020-04-02 10:34:17 -04:00
István Zoltán Szabó	b96743cfc5	[DOCS] Adds data_counts object to the GET DFA stats API (#54498)	2020-04-01 10:05:00 +02:00
Lisa Cawley	b90e491f68	[DOCS] Collapses nested objects in data frame analytics APIs (#54472)	2020-03-31 10:56:48 -07:00
István Zoltán Szabó	85d9b34dc5	[DOCS] Adds description of analysis_stats object and its properties to GET DFA stats API docs (#53881) Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-03-31 13:27:54 +02:00
Lisa Cawley	fdcd19483d	[DOCS] Collapses content in machine learning APIs (#54234)	2020-03-30 10:08:38 -07:00
Jason Tedor	1fc0432b24	Introduce formal role for remote cluster client (#53924) This commit introduces a formal role for identifying nodes that are capable of making connections to remote clusters.	2020-03-24 19:21:56 -04:00
David Roberts	cbe063a074	[ML] Introduce a "starting" datafeed state for lazy jobs (#53918) It is possible for ML jobs to open lazily if the "allow_lazy_open" option in the job config is set to true. Such jobs wait in the "opening" state until a node has sufficient capacity to run them. This commit fixes the bug that prevented datafeeds for jobs lazily waiting assignment from being started. The state of such datafeeds is "starting", and they can be stopped by the stop datafeed API while in this state with or without force. Fixes #53763	2020-03-24 10:01:13 +00:00
Tom Veasey	58340c2dbe	[ML] Adds the class_assignment_objective parameter to classification (#52763) Adds a new parameter for classification that enables choosing whether to assign labels to maximise accuracy or to maximise the minimum class recall. Fixes #52427.	2020-03-12 18:39:29 +00:00
István Zoltán Szabó	77ec60baa0	[DOCS] Adds a warning about reindexing docs with the same ID to the PUT DFA docs. (#53490)	2020-03-12 18:00:36 +01:00
Benjamin Trent	4e1f029b04	[ML][Inference] adds new default_field_map field to trained models (#53294) Adds a new `default_field_map` field to trained model config objects. This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data. The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.	2020-03-11 12:23:56 -04:00
Dimitris Athanasiou	5a32f50d18	[ML] Rename data frame analytics maximum_number_trees to max_trees (#53300) Deprecates `maximum_number_trees` parameter of classification and regression and replaces it with `max_trees`.	2020-03-11 10:33:53 +02:00
István Zoltán Szabó	870e1891d9	[DOCS] Makes the naming convention of the DFA response objects coherent (#53172)	2020-03-05 16:25:43 +01:00
István Zoltán Szabó	d7fb6416dd	[DOCS] Expands GET DFA stat API docs with response objects. (#53107)	2020-03-05 15:30:30 +01:00
Lisa Cawley	7004216455	[DOCS] Adds link in datafeed indices_options (#53067)	2020-03-03 10:28:54 -08:00
István Zoltán Szabó	24fe7e5899	[DOCS] Adds response body documentation to GET inference API (#53050)	2020-03-03 16:25:24 +01:00
Lisa Cawley	b6534834f9	[DOCS] Adds cat anomaly detectors API (#52866)	2020-02-28 12:15:21 -08:00
Dimitris Athanasiou	dd331935b3	[ML] Parse and report memory usage for DF Analytics (#52778) Adds reporting of memory usage for data frame analytics jobs. This commit introduces a new index pattern `.ml-stats-*` whose first concrete index will be `.ml-stats-000001`. This index serves to store instrumentation information for those jobs.	2020-02-28 17:35:07 +02:00
Benjamin Trent	d7a63333b5	[ML] Add indices_options to datafeed config and update (#52793) This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index. This is necessary for the following use cases: - Reading from frozen indices - Allowing certain indices in multiple index patterns to not exist yet These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object. closes https://github.com/elastic/elasticsearch/issues/48056	2020-02-27 12:22:35 -05:00
Lisa Cawley	42fbca7dc6	[DOCS] Adds cat datafeeds API (#52738)	2020-02-26 09:20:36 -08:00
István Zoltán Szabó	490e8b47e6	[DOCS] Adds cat data frame analytics API (#52764) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-02-26 11:09:37 +01:00
Lisa Cawley	f41ebe47e3	[DOCS] Clarifies description of num_top_feature_importance_values (#52246) Co-Authored-By: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>	2020-02-18 08:48:24 -08:00
Lisa Cawley	ab139244d7	[DOCS] Fixes, sorts ML tagged regions (#52283)	2020-02-12 13:43:21 -08:00
István Zoltán Szabó	85e581282d	[DOCS] Refines description. (#51400)	2020-01-24 13:31:44 +01:00
Benjamin Trent	c9e285c1e6	[ML][Inference] add tags url param to GET (#51330) Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint. This parameter allows the list of models to be further reduced to those who contain all the provided tags.	2020-01-24 07:30:56 -05:00
Lisa Cawley	551a83a2ff	[DOCS] Clarify interval, frequency, and bucket span in ML APIs and example (#51280)	2020-01-22 08:08:31 -08:00
David Kyle	7978f0b8ef	[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061) The retention period is calculated relative to the last bucket result or snapshot time rather than wall clock	2020-01-22 10:08:41 +00:00
Dimitris Athanasiou	4d2be9bd32	[ML] Add num_top_feature_importance_values param to regression and classi… (#50914) Adds a new parameter to regression and classification that enables computation of importance for the top most important features. The computation of the importance is based on SHAP (SHapley Additive exPlanations) method.	2020-01-14 15:01:47 +02:00

65 Commits