elasticsearch

Commit Graph

Author	SHA1	Message	Date
Lisa Cawley	998a085c14	[DOCS] Edits create data frame analytics job API (#54751 )	2020-04-13 09:58:03 -07:00
István Zoltán Szabó	b1b067c5ba	[DOCS] Adds link points to the data frame analytics supported fields (#55004 ) Co-authored-by: lcawl <lcawley@elastic.co>	2020-04-09 11:16:13 -07:00
István Zoltán Szabó	bb44726ad6	[DOCS] Reworks some parts of EMM API docs (#54872 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-04-08 09:50:12 +02:00
Lisa Cawley	c355fea8f4	[DOCS] Remove text fields from classification dependent variables (#54849 )	2020-04-07 10:43:15 -07:00
István Zoltán Szabó	1ae8bde756	[DOCS] Changes kibana_user to kibana_admin in DFA API prerequisites. (#54806 )	2020-04-06 15:45:08 +02:00
István Zoltán Szabó	a0662399c7	[DOCS] Makes PUT inference API docs collapsible (#54653 ) Co-authored-by: lcawl <lcawley@elastic.co>	2020-04-03 09:45:42 +02:00
Benjamin Trent	4e1ff31c3c	[ML] add new inference_config field to trained model config (#54421 ) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config.	2020-04-02 10:34:17 -04:00
Benjamin Trent	1d24960ff8	[ML] prefer secondary authorization header for data[feed|frame] authz (#54121 ) Secondary authorization headers are to be used to facilitate Kibana spaces support + ML jobs/datafeeds. Now on PUT/Update/Preview datafeed, and PUT data frame analytics the secondary authorization is preferred over the primary (if provided). closes https://github.com/elastic/elasticsearch/issues/53801	2020-04-02 10:10:46 -04:00
Benjamin Trent	bbd6e943de	[ML] add num_matches and preferred_to_categories to category definition objects (#54214 ) This adds two new fields to category definitions. - `num_matches` indicating how many documents have been seen by this category - `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized. These fields are only guaranteed to be up to date after a `_flush` or `_close`. Native change: https://github.com/elastic/ml-cpp/pull/1062	2020-04-02 07:49:09 -04:00
István Zoltán Szabó	b0f6d4ee0e	[DOCS] Updates estimate model memory docs (#54574 )	2020-04-01 15:53:53 +02:00
István Zoltán Szabó	b96743cfc5	[DOCS] Adds data_counts object to the GET DFA stats API (#54498 )	2020-04-01 10:05:00 +02:00
Jason Tedor	95a7eed9aa	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 15:52:01 -04:00
Lisa Cawley	b90e491f68	[DOCS] Collapses nested objects in data frame analytics APIs (#54472 )	2020-03-31 10:56:48 -07:00
Dimitris Athanasiou	5a98fc20e1	[ML] Fix DF analytics explain API request in docs (#54510 ) The explain API expects a data frame analytics config as its request.	2020-03-31 18:37:19 +03:00
István Zoltán Szabó	85d9b34dc5	[DOCS] Adds description of analysis_stats object and its properties to GET DFA stats API docs (#53881 ) Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-03-31 13:27:54 +02:00
Lisa Cawley	fdcd19483d	[DOCS] Collapses content in machine learning APIs (#54234 )	2020-03-30 10:08:38 -07:00
Jason Tedor	1fc0432b24	Introduce formal role for remote cluster client (#53924 ) This commit introduces a formal role for identifying nodes that are capable of making connections to remote clusters.	2020-03-24 19:21:56 -04:00
David Roberts	8ee770560a	[ML] Add a model memory estimation endpoint for anomaly detection (#53507 ) A new endpoint for estimating anomaly detection job model memory requirements: POST _ml/anomaly_detectors/estimate_model_memory Closes #53219	2020-03-24 21:38:19 +00:00
David Roberts	cbe063a074	[ML] Introduce a "starting" datafeed state for lazy jobs (#53918 ) It is possible for ML jobs to open lazily if the "allow_lazy_open" option in the job config is set to true. Such jobs wait in the "opening" state until a node has sufficient capacity to run them. This commit fixes the bug that prevented datafeeds for jobs lazily waiting assignment from being started. The state of such datafeeds is "starting", and they can be stopped by the stop datafeed API while in this state with or without force. Fixes #53763	2020-03-24 10:01:13 +00:00
István Zoltán Szabó	8279f82dea	[DOCS] Fixes typo in start datafeed API docs. (#53811 )	2020-03-19 17:55:26 +01:00
István Zoltán Szabó	57321124ea	[DOCS] Changes seconds to milliseconds since the Epoch in AD docs. (#53797 )	2020-03-19 15:40:53 +01:00
Tom Veasey	58340c2dbe	[ML] Adds the class_assignment_objective parameter to classification (#52763 ) Adds a new parameter for classification that enables choosing whether to assign labels to maximise accuracy or to maximise the minimum class recall. Fixes #52427.	2020-03-12 18:39:29 +00:00
István Zoltán Szabó	77ec60baa0	[DOCS] Adds a warning about reindexing docs with the same ID to the PUT DFA docs. (#53490 )	2020-03-12 18:00:36 +01:00
Benjamin Trent	4e1f029b04	[ML][Inference] adds new default_field_map field to trained models (#53294 ) Adds a new `default_field_map` field to trained model config objects. This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data. The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.	2020-03-11 12:23:56 -04:00
Dimitris Athanasiou	5a32f50d18	[ML] Rename data frame analytics maximum_number_trees to max_trees (#53300 ) Deprecates `maximum_number_trees` parameter of classification and regression and replaces it with `max_trees`.	2020-03-11 10:33:53 +02:00
István Zoltán Szabó	54b66d3385	[DOCS] Makes the description clearer on how to use aggregations in an anomaly detection job (#53103 ) Co-authored-by: lcawl <lcawley@elastic.co>	2020-03-09 09:48:23 +01:00
István Zoltán Szabó	08fcc0b02f	[DOCS] Adds deleting flag to the GET job stats API docs (#53223 )	2020-03-06 16:03:09 +01:00
István Zoltán Szabó	870e1891d9	[DOCS] Makes the naming convention of the DFA response objects coherent (#53172 )	2020-03-05 16:25:43 +01:00
István Zoltán Szabó	d7fb6416dd	[DOCS] Expands GET DFA stat API docs with response objects. (#53107 )	2020-03-05 15:30:30 +01:00
Lisa Cawley	7004216455	[DOCS] Adds link in datafeed indices_options (#53067 )	2020-03-03 10:28:54 -08:00
István Zoltán Szabó	24fe7e5899	[DOCS] Adds response body documentation to GET inference API (#53050 )	2020-03-03 16:25:24 +01:00
Lisa Cawley	b6534834f9	[DOCS] Adds cat anomaly detectors API (#52866 )	2020-02-28 12:15:21 -08:00
Dimitris Athanasiou	dd331935b3	[ML] Parse and report memory usage for DF Analytics (#52778 ) Adds reporting of memory usage for data frame analytics jobs. This commit introduces a new index pattern `.ml-stats-*` whose first concrete index will be `.ml-stats-000001`. This index serves to store instrumentation information for those jobs.	2020-02-28 17:35:07 +02:00
Benjamin Trent	d7a63333b5	[ML] Add indices_options to datafeed config and update (#52793 ) This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index. This is necessary for the following use cases: - Reading from frozen indices - Allowing certain indices in multiple index patterns to not exist yet These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object. closes https://github.com/elastic/elasticsearch/issues/48056	2020-02-27 12:22:35 -05:00
Lisa Cawley	42fbca7dc6	[DOCS] Adds cat datafeeds API (#52738 )	2020-02-26 09:20:36 -08:00
István Zoltán Szabó	490e8b47e6	[DOCS] Adds cat data frame analytics API (#52764 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-02-26 11:09:37 +01:00
Lisa Cawley	cd069a861c	[DOCS] Updates custom rules example (#52731 )	2020-02-25 09:30:14 -08:00
David Roberts	ca80ad69f2	[ML] Use event.timezone in file_structure_finder ingest pipeline (#52720 ) This is because beat.timezone was renamed to event.timezone in elastic/beats#9458	2020-02-25 12:18:53 +00:00
lcawl	b590b49205	[DOCS] Adds anchor for custom rules	2020-02-24 10:04:34 -08:00
Lisa Cawley	f41ebe47e3	[DOCS] Clarifies description of num_top_feature_importance_values (#52246 ) Co-Authored-By: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>	2020-02-18 08:48:24 -08:00
Lisa Cawley	ab139244d7	[DOCS] Fixes, sorts ML tagged regions (#52283 )	2020-02-12 13:43:21 -08:00
David Kyle	f64c6359ed	[ML] Make Ensemble feature names optional (#51996 ) The featureNames field is requisite in individual models but is not required by the Ensemble.	2020-02-07 10:07:18 +00:00
David Roberts	72346b91f9	[ML] Add new categorization stats to model_size_stats (#51879 ) This change adds support for the following new model_size_stats fields: - categorized_doc_count - total_category_count - frequent_category_count - rare_category_count - dead_category_count - categorization_status Relates #50749	2020-02-06 17:08:43 +00:00
Darren LaCasse	ea67e24b7b	[DOCS] Remove extra word (#51757 )	2020-01-31 10:27:37 -08:00
István Zoltán Szabó	67f14c3978	[DOCS] Adds PUT inference API docs (#51231 ) Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com> Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-01-31 13:12:24 +01:00
Lisa Cawley	32adcd2c9d	[DOCS] Adds missing testenv attribute (#51719 )	2020-01-30 16:13:26 -08:00
David Roberts	a5a2e4eaee	[ML] Use CSV ingest processor in find_file_structure ingest pipeline (#51492 ) Changes the find_file_structure response to include a CSV ingest processor in the ingest pipeline it suggests. Previously the Kibana file upload functionality parsed CSV in the browser, but by parsing CSV in the ingest pipeline it makes the Kibana file upload functionality more easily interchangeable with Filebeat such that the configurations it creates can more easily be used to import data with the same structure repeatedly in production.	2020-01-28 12:46:00 +00:00
István Zoltán Szabó	85e581282d	[DOCS] Refines description. (#51400 )	2020-01-24 13:31:44 +01:00
Benjamin Trent	c9e285c1e6	[ML][Inference] add tags url param to GET (#51330 ) Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint. This parameter allows the list of models to be further reduced to those who contain all the provided tags.	2020-01-24 07:30:56 -05:00
Lisa Cawley	789aeaedab	[DOCS] Updates categorization examples with wizard screenshots (#51133 )	2020-01-22 11:26:10 -08:00
Lisa Cawley	551a83a2ff	[DOCS] Clarify interval, frequency, and bucket span in ML APIs and example (#51280 )	2020-01-22 08:08:31 -08:00
David Kyle	7978f0b8ef	[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061 ) The retention period is calculated relative to the last bucket result or snapshot time rather than wall clock	2020-01-22 10:08:41 +00:00
István Zoltán Szabó	087a048ee6	[DOCS] Adds text about data types to the categorization docs (#51145 )	2020-01-17 09:52:57 -08:00
Dimitris Athanasiou	24ce598239	[ML] DF Analytics _explain API should skip object fields (#51115 ) Object fields cannot be used as features. At the moment the _explain API includes them and, even worse, it does not error when an object field is excluded. This creates the expectation for the user that all child fields will also be excluded, which is not the case. This commit omits object fields from the _explain API and also adds an error if an object field is included or excluded.	2020-01-17 12:24:17 +02:00
David Kyle	5ad1d0d2cc	Fix hardcoded version replacement in put-dfanalytics.asciidoc (#51056 )	2020-01-16 10:06:45 +00:00
Przemysław Witek	999884d8fb	Add missing docs for new evaluation metrics (#50967 )	2020-01-15 14:23:37 +01:00
István Zoltán Szabó	406810c172	[DOCS] Describes the relationship of the time-related settings in anomaly detection docs (#50959 ) Co-Authored-By: David Roberts <dave.roberts@elastic.co>	2020-01-15 08:45:03 +01:00
Dimitris Athanasiou	4d2be9bd32	[ML] Add num_top_feature_importance_values param to regression and classi… (#50914 ) Adds a new parameter to regression and classification that enables computation of importance for the top most important features. The computation of the importance is based on SHAP (SHapley Additive exPlanations) method.	2020-01-14 15:01:47 +02:00
Lisa Cawley	979a28d2b5	[DOCS] Clarify detector_index property in ML APIs (#50723 )	2020-01-09 08:12:53 -08:00
István Zoltán Szabó	b3457154a3	[DOCS] Fine-tunes data frame analytics API docs formatting. (#50799 )	2020-01-09 16:21:01 +01:00
István Zoltán Szabó	b683f96e23	[DOCS] Moves analysis resources to PUT DFA API docs (#50704 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-01-09 13:57:11 +01:00
István Zoltán Szabó	659b4ceb97	[DOCS] Improves find_file_structure documentation (#50743 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-01-09 11:19:19 +01:00
István Zoltán Szabó	bc21500201	[DOCS] Forms role and privilege requirements as bulleted lists in DFA API docs (#50732 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2020-01-09 10:44:07 +01:00
István Zoltán Szabó	2f55c3566f	[DOCS] Clarifies model_size_stats.total_xxx_field_count objects and removes notes in GET job stats API docs. (#50728 )	2020-01-09 09:43:55 +01:00
István Zoltán Szabó	d5fcb73b1f	[DOCS] Improves description for forecast_stats (#50729 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2020-01-09 09:31:30 +01:00
Lisa Cawley	b13a755842	[DOCS] Adds missing timing_stats descriptions (#50574 )	2020-01-03 09:07:08 -08:00
István Zoltán Szabó	675b98f90c	[DOCS] Fine-tunes training_percent definition. (#50601 )	2020-01-03 14:49:43 +01:00
Dimitris Athanasiou	af0ce426cc	[ML] Implement force deleting a data frame analytics job (#50553 ) Adds a `force` parameter to the delete data frame analytics request. When `force` is `true`, the action force-stops the jobs and then proceeds to the deletion. This can be used in order to delete a non-stopped job with a single request. Closes #48124	2020-01-03 12:01:41 +02:00
István Zoltán Szabó	fd50169c74	[DOCS] Specifies the possible data types of classification dependent_variable (#50582 )	2020-01-03 10:41:38 +01:00
Lisa Cawley	dd4ede5c56	[DOCS] Adds filter and calendar attributes (#50566 )	2020-01-02 10:59:54 -08:00
lcawl	c7408a25f1	[DOCS] Minor fixes in ML APIs	2019-12-30 15:21:18 -08:00
James Rodewig	e8a6d4a3fb	[DOCS] Remove unneeded redirects (#50476 ) The docs/reference/redirects.asciidoc file stores a list of relocated or deleted pages for the Elasticsearch Reference documentation. This prunes several older redirects that are no longer needed and don't require work to fix broken links in other repositories.	2019-12-26 07:49:41 -05:00
Lisa Cawley	6501338a9e	[DOCS] Remove redundant results from ML APIs (#50477 )	2019-12-24 08:34:03 -08:00
Orhan Toy	48342740c5	[DOCS] Fixes "enables you to" typos (#50225 )	2019-12-23 14:38:37 -05:00
Lisa Cawley	362ce41eaf	[DOCS] Updates ML links (#50387 )	2019-12-19 14:47:28 -08:00
lcawl	d8a94f0397	[DOCS] Fixes security links	2019-12-18 11:51:03 -08:00
Lisa Cawley	68e02a19d8	[DOCS] Move machine learning results definitions into APIs (#50257 )	2019-12-18 09:50:31 -08:00
István Zoltán Szabó	50e26d40a2	[DOCS] Adds GET, GET stats and DELETE inference APIs (#50224 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-12-18 09:10:12 +01:00
Lisa Cawley	207094cd67	[DOCS] Moves model snapshot resource definitions into APIs (#50157 ) Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>	2019-12-16 10:42:30 -08:00
István Zoltán Szabó	3857e3d94f	[DOCS] Moves data frame analytics job resource definitions into APIs (#50021 )	2019-12-12 10:59:37 +01:00
Lisa Cawley	ca482127fa	[DOCS] Move job count resource definitions into API (#50057 ) Co-Authored-By: Przemysław Witek <przemyslaw.witek@elastic.co> Co-Authored-By: David Roberts <dave.roberts@elastic.co> Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>	2019-12-11 11:17:15 -08:00
Lisa Cawley	3d96e6b68e	[DOCS] Move datafeed resource definitions into APIs (#50005 ) Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>	2019-12-11 09:50:41 -08:00
Dimitris Athanasiou	269425b54d	[ML] Introduce randomize_seed setting for regression and classification (#49990 ) This adds a new `randomize_seed` for regression and classification. When not explicitly set, the seed is randomly generated. One can reuse the seed in a similar job in order to ensure the same docs are picked for training.	2019-12-10 10:22:53 +02:00
Lisa Cawley	0f51bc2f72	[DOCS] Move anomaly detection job resource definitions into APIs (#49700 ) Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>	2019-12-06 15:32:07 -08:00
István Zoltán Szabó	e5d512a8ed	[DOCS] Fixes classification evaluation example response. (#49905 )	2019-12-06 13:24:22 +01:00
István Zoltán Szabó	f7a5b73972	[DOCS] Adds an example of preprocessing actions to the PUT DFA API docs (#49831 )	2019-12-05 14:15:19 +01:00
István Zoltán Szabó	c793e80d3b	[DOCS] Fixes typo in the ML anomaly detection time functions docs. (#49834 )	2019-12-05 09:57:01 +01:00
Dimitris Athanasiou	bad07b76f7	[ML] Add optional source filtering during data frame reindexing (#49690 ) This adds a `_source` setting under the `source` setting of a data frame analytics config. The new `_source` is reusing the structure of a `FetchSourceContext` like `analyzed_fields` does. Specifying includes and excludes for source allows selecting which fields will get reindexed and will be available in the destination index. Closes #49531	2019-11-29 14:20:31 +02:00
lcawl	3b3f3ca925	[DOCS] Fixes typo in ML resources	2019-11-26 10:28:18 -08:00
lcawl	63b944c00f	[DOCS] Fixes data type formatting	2019-11-26 08:21:39 -08:00
David Roberts	40c951d781	[ML] Add default categorization analyzer definition to ML info (#49545 ) The categorization job wizard in the ML UI will use this information when showing the effect of the chosen categorization analyzer on a sample of input.	2019-11-25 13:20:12 +00:00
Dimitris Athanasiou	5a6967af57	[ML][DOCS] Anomaly detection job retention days settings do not require restart (#49546 )	2019-11-25 15:12:41 +02:00
Dimitris Athanasiou	0390ec3627	[ML] Explain data frame analytics API (#49455 ) This commit replaces the _estimate_memory_usage API with a new API, the _explain API. The API consolidates information that is useful before creating a data frame analytics job. It includes: - memory estimation - field selection explanation Memory estimation is moved here from what was previously calculated in the _estimate_memory_usage API. Field selection is a new feature that explains to the user whether each available field was selected to be included or not in the analysis. In the case it was not included, it also explains the reason why.	2019-11-22 20:08:14 +02:00
Lisa Cawley	8d214e851c	[DOCS] Clarify ML job closure prerequisites (#49265 )	2019-11-19 08:31:24 -08:00
David Roberts	b6c6387af5	[TEST] Mute docs snippet test in close-job.asciidoc (#49000 ) Due to https://github.com/elastic/elasticsearch/pull/48583#issuecomment-552991325	2019-11-12 17:31:07 +00:00
Benjamin Trent	ee8853fbc1	[ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050 ) Related PR: https://github.com/elastic/ml-cpp/pull/809	2019-11-11 13:21:18 -05:00
István Zoltán Szabó	7180b90646	[DOCS] Removes best practice about fields that are highly correlated to the dependent variable. (#48935 )	2019-11-11 10:00:11 -05:00
István Zoltán Szabó	e9cec6e1f7	[DOCS] Extends analyzed_fields description in PUT DFA API docs. (#48307 )	2019-11-11 09:53:59 -05:00
István Zoltán Szabó	6c3fed8d4d	[DOCS] Adds classification type DFA API docs and ml-shared.asciidoc (#48241 )	2019-11-06 07:40:27 -05:00
István Zoltán Szabó	fe92cd0a26	[DOCS] Adds classification type evaluation docs to the DFA evaluation API (#47657 )	2019-11-06 07:37:14 -05:00
Lisa Cawley	29ac34a45c	[DOCS] Re-enable code snippet testing in close anomaly detection job API (#48259 )	2019-10-28 08:08:38 -07:00
David Roberts	d308095b28	[ML] Add option to stop datafeed that finds no data (#47922 ) Adds a new datafeed config option, max_empty_searches, that tells a datafeed that has never found any data to stop itself and close its associated job after a certain number of real-time searches have returned no data.	2019-10-14 13:26:06 +01:00
David Roberts	fd83c18cc1	[ML] Add lazy assignment job config option (#47726 ) This change adds: - A new option, allow_lazy_open, to anomaly detection jobs - A new option, allow_lazy_start, to data frame analytics jobs Both work in the same way: they allow a job to be opened/started even if no ML node exists that can accommodate the job immediately. In this situation the job waits in the opening/starting state until ML node capacity is available. (The starting state for data frame analytics jobs is new in this change.) Additionally, the ML nightly maintenance task now creates audit warnings for ML jobs that are unassigned. This means that jobs that cannot be assigned to an ML node for a very long time will show a yellow warning triangle in the UI. A final change is that it is now possible to close a job that is not assigned to a node without using force. This is because previously jobs that were open but not assigned to a node were an aberration, whereas after this change they'll be relatively common.	2019-10-14 12:13:01 +01:00
István Zoltán Szabó	448d19f0ca	[DOCS] Adds supported fields section to the PUT DFA API description (#47842 )	2019-10-10 12:34:39 +02:00
István Zoltán Szabó	ab08c0cd76	[DOCS] Extends the analyzed_fields description in the PUT DFA API docs (#47791 )	2019-10-09 18:13:33 +02:00
Dimitris Athanasiou	e99435a7f6	[ML] Additional outlier detection parameters (#47600 ) Adds the following parameters to `outlier_detection`: - `compute_feature_influence` (boolean): whether to compute or not feature influence scores - `outlier_fraction` (double): the proportion of the data set assumed to be outlying prior to running outlier detection - `standardization_enabled` (boolean): whether to apply standardization to the feature values	2019-10-07 15:28:21 +03:00
Lisa Cawley	4e4990c6a0	[DOCS] Cleans up links to security content (#47610 )	2019-10-04 16:10:26 -07:00
István Zoltán Szabó	b03be6e816	[DOCS] Fixes an attribute in the update datafeed API docs. (#47551 )	2019-10-04 08:42:30 +02:00
István Zoltán Szabó	c0da956b6e	[DOCS] Amends update datafeed API docs (#47448 )	2019-10-03 13:12:19 +02:00
István Zoltán Szabó	4977baf63a	[DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs (#46966 ) * [DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs. * [DOCS] Removes extra lines from examples. * Update docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc Co-Authored-By: Lisa Cawley <lcawley@elastic.co> * Update docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc Co-Authored-By: Lisa Cawley <lcawley@elastic.co> * [DOCS] Explains examples.	2019-10-02 10:26:20 +02:00
István Zoltán Szabó	aa7c4030cd	[DOCS] Fine tunes update anomaly detection job API documentation (#47280 ) * [DOCS] Fine tunes update anomaly detection job API documentation. * [DOCS] Removes delimiter to fix the table.	2019-10-02 10:04:35 +02:00
István Zoltán Szabó	4073499f43	[DOCS] Fixes typos in the PUT dfa and the evaluate dfa documentation. (#47348 )	2019-10-02 09:49:59 +02:00
István Zoltán Szabó	a6c517a96e	[DOCS] Changes wording to move away from data frame terminology in the ES repo (#47093 ) * [DOCS] Changes wording to move away from data frame terminology in the ES repo. Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-10-01 08:04:06 +02:00
István Zoltán Szabó	14227106b0	[DOCS] Adds regression analytics resources and examples to the data frame analytics APIs and the evaluation API (#46176 ) * [DOCS] Adds regression analytics resources and examples to the data frame analytics APIs. Co-Authored-By: Benjamin Trent <ben.w.trent@gmail.com> Co-Authored-By: Tom Veasey <tveasey@users.noreply.github.com>	2019-09-19 09:10:11 +02:00
István Zoltán Szabó	bd4d46c416	[DOCS] Adds outlier detection params to the data frame analytics resources (#46323 ) * [DOCS] Adds outlier detection params to the data frame analytics resources. Co-Authored-By: Tom Veasey <tveasey@users.noreply.github.com> Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-09-16 14:21:50 +02:00
James Rodewig	5c78f606c2	[DOCS] Change // CONSOLE comments to [source,console] (#46440 )	2019-09-09 10:45:37 -04:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449 )	2019-09-06 14:05:36 -04:00
James Rodewig	97802d8aff	[DOCS] Change // CONSOLE comments to [source,console] (#46441 )	2019-09-06 10:55:16 -04:00
István Zoltán Szabó	e39cdd63c3	[DOCS] Adds progress parameter description to the GET stats data frame analytics API doc. (#46434 )	2019-09-06 15:17:18 +02:00
James Rodewig	466c59a4a7	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295 )	2019-09-05 16:47:18 -04:00
István Zoltán Szabó	626bbccd6e	[DOCS] [PUT DFA] Documents inline the child params of source and dest (#45649 ) * [DOCS] [PUT DFA] Documents inline the child params of source and dest. * [DOCS] Fixes indentation issues and amends dfa definitions.	2019-08-29 14:38:14 +02:00
Benjamin Trent	9f716fbd4c	[ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044 ) Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors that are difficult to troubleshoot. This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.	2019-08-28 15:06:26 -05:00
Dimitris Athanasiou	f6a97decac	[ML] Improve progress reportings for DF analytics (#45856 ) Previously, the stats API reported a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish a task that never ran from one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 17:31:36 +03:00
Przemysław Witek	31f6e78acd	Allow the user to specify 'query' in Evaluate Data Frame request (#45775 )	2019-08-22 08:27:38 +02:00
Dimitris Athanasiou	8af319481e	[ML] Add description to DF analytics (#45774 )	2019-08-21 19:58:09 +03:00
Przemysław Witek	c6a25a818d	Add docs for HLRC for Estimate memory usage API (#45538 )	2019-08-21 12:52:17 +02:00
Przemysław Witek	7107c221a7	Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188 )	2019-08-13 20:59:35 +02:00
István Zoltán Szabó	78e35c2c3d	[DOCS] Adds supported time units ref to the ML and DF API params. (#45322 )	2019-08-08 13:43:55 +02:00
Lisa Cawley	46912c8f3d	[DOCS] Reformats ML update APIs (#45253 )	2019-08-06 11:05:01 -07:00
István Zoltán Szabó	fbd9c9e2e3	[DOCS] Makes clearer the note under freq_rare. (#45193 )	2019-08-05 13:28:22 +02:00
James Rodewig	8b152d6d79	Rename "indices APIs" to "index APIs" (#44863 )	2019-08-02 14:09:46 -04:00
Lisa Cawley	53980c6267	[DOCS] Clarifies bucket span in overall buckets API (#45110 )	2019-08-02 08:36:39 -07:00
Lisa Cawley	285f2e0625	[DOCS] Updates terms in machine learning get APIs (#44986 )	2019-07-30 10:52:23 -07:00
István Zoltán Szabó	c22296d0c2	[DOCS] Adds allow no jobs param to the GET, GET stats and Close APIs (#44503 )	2019-07-30 14:22:14 +02:00
Lisa Cawley	75999ff83c	[DOCS] Updates anomaly detection terminology (#44888 )	2019-07-26 11:07:01 -07:00
Lisa Cawley	3f31859669	[DOCS] Updates terms in machine learning datafeed APIs (#44883 )	2019-07-26 10:47:03 -07:00
István Zoltán Szabó	84793476ba	[DOCS] Amends data frame analytics resources, GET, and PUT API docs (#44806 ) This PR addresses the feedback in https://github.com/elastic/ml-team/issues/175#issuecomment-512215731. * Adds an example to `analyzed_fields` * Includes `source` and `dest` objects inline in the resource page * Lists `model_memory_limit` in the PUT API page * Amends the `analysis` section in the resource page * Removes Properties headings in subsections	2019-07-26 11:39:59 +02:00
Lisa Cawley	aefb72040c	[DOCS] Updates terms in machine learning calendar APIs (#44866 )	2019-07-25 11:20:42 -07:00
Lisa Cawley	990e037728	[DOCS] Updates terms in anomaly detection job APIs (#44839 )	2019-07-25 08:58:16 -07:00
István Zoltán Szabó	5275392b47	[DOCS] Adds allow no datafeeds query param to the GET, GET stats and STOP datafeed APIs (#44499 )	2019-07-25 16:45:06 +02:00
James Rodewig	ea1adb61c2	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:16:35 -04:00
Lisa Cawley	dbe7a48e82	[DOCS] Fixes query default value (#44572 )	2019-07-18 08:15:28 -07:00
Lisa Cawley	4fd8e34662	[DOCS] Moves content to ML anomaly-detection folder (#44520 )	2019-07-17 13:48:12 -07:00
Lisa Cawley	146be77ec3	[DOCS] Separates data frame analytics APIs (#44451 ) * [DOCS] Separates data frame analytics APIs * [DOCS] Adds links between new pages	2019-07-16 13:22:27 -07:00
James Rodewig	bd52e148c5	[DOCS] Remove :edit_url: overrides. (#44445 ) These overrides do not work in Asciidoctor and are no longer needed.	2019-07-16 15:02:38 -04:00
Lisa Cawley	2316703b93	[DOCS] Removes unnecessary resource definition pages (#44289 ) * [DOCS] Removes calendar resource definition page * [DOCS] Removes scheduled event and filter resource definitions	2019-07-15 09:44:57 -07:00
David Kyle	4402cf38bf	Wait for pending tasks in docs tests cleanup (#44123 ) ML and Data Frame tests should wait for pending tasks	2019-07-15 11:58:09 +01:00
James Rodewig	e5a3ae97e2	Revert "[DOCS] Fix broken links for ES API docs move (#44279 )" This reverts commit `3bdd2f4432`.	2019-07-12 17:06:51 -04:00
James Rodewig	860984536c	Revert "[DOCS] Fix broken link reused in Stack Overview" This reverts commit `c08c253432`.	2019-07-12 17:06:44 -04:00
James Rodewig	f9c09fa7f6	Revert "[DOCS] Fix broken links" This reverts commit `313030263f`.	2019-07-12 17:06:28 -04:00
James Rodewig	313030263f	[DOCS] Fix broken links	2019-07-12 14:03:30 -04:00
James Rodewig	c08c253432	[DOCS] Fix broken link reused in Stack Overview	2019-07-12 13:15:05 -04:00
James Rodewig	3bdd2f4432	[DOCS] Fix broken links for ES API docs move (#44279 ) * [DOCS] Fix broken links for ES API docs move Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-07-12 12:46:22 -04:00
Lisa Cawley	b3a7b2221b	[DOCS] Reformats API parameter details (#44194 )	2019-07-12 08:26:31 -07:00
Lisa Cawley	727199e398	[DOCS] Removes links to ML tutorial (#44251 )	2019-07-12 08:25:23 -07:00
István Zoltán Szabó	74c16efe2a	[DOCS] Adds data frame analytics API and evaluate API resource documentation (#43972 ) This PR adds the resource documentation of the data frame analytics APIs and the evaluate API to the ML API doc pool.	2019-07-11 18:05:05 +02:00
lcawl	c9a265b092	[DOCS] Fixes formatting in data frame analytics API	2019-07-10 17:58:17 -07:00
Przemysław Witek	1572080a63	[ML] Add DatafeedTimingStats to datafeed GetDatafeedStatsAction.Response (#43045 )	2019-07-09 16:07:27 +02:00
David Kyle	071b652874	Mute put job docs test Relates to #43271	2019-07-09 13:15:25 +01:00
Lisa Cawley	0601aaf621	[DOCS] Enables testing for create job ML API (#44022 )	2019-07-08 11:25:21 -07:00
Lisa Cawley	f60b35cbcc	[DOCS] Fixes earliest_record_timestamp data type (#44030 )	2019-07-08 10:14:37 -07:00
István Zoltán Szabó	cccf5bac43	[DOCS] Adds data frame analytics APIs to the ML APIs (#43875 ) This PR adds the reference documentation pages of the data frame analytics APIs (PUT, START, STOP, GET, GET stats, DELETE, Evaluate) to the ML APIs pool.	2019-07-05 13:34:05 +02:00
Lisa Cawley	f1e3a8fd6c	[DOCS] Adds data frame API response codes for allow_no_match (#43666 )	2019-06-27 15:16:24 -07:00
Lisa Cawley	c75773745c	[DOCS] Updates ML APIs to use new API template (#43711 )	2019-06-27 13:58:42 -07:00
lcawl	66e1853f34	[DOCS] Adds anchors and attributes to ML APIs	2019-06-27 09:43:43 -07:00
Matthew Adams	4c8f089ebd	Clarify storage location of ML Snapshots (#43437 ) The existing language was misleading about the model snapshots and where they are located. Saying "to disk" sounds like files external to Elasticsearch IMO. It raises the obvious question, where on disk? which node? Is it in the Elasticsearch snapshot repo? The model snapshots are held in an internal index.	2019-06-24 09:13:21 +01:00
Przemysław Witek	13596c807a	Report exponential_avg_bucket_processing_time which gives more weight to recent buckets (#43189 )	2019-06-16 20:41:27 +02:00
Ryan Ernst	a3f2f4079c	Add native code info to ML info api (#43172 ) The machine learning feature of xpack has native binaries with a different commit id than the rest of code. It is currently exposed in the xpack info api. This commit adds that commit information to the ML info api, so that it may be removed from the info api.	2019-06-13 11:38:29 -07:00
lcawl	aa4ff855a6	[DOCS] Fix link to ML node description	2019-06-13 11:17:12 -07:00
Benjamin Trent	82adbce9ca	[ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds (#42969 ) * [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds * only supporting doc_values for geo_point fields * moving validation into GeoPointField ctor	2019-06-10 16:48:36 -05:00
David Roberts	b3a778093c	[ML] Add earliest and latest timestamps to field stats (#42890 ) This change adds the earliest and latest timestamps into the field stats for fields of type "date" in the output of the ML find_file_structure endpoint. This will enable the cards for date fields in the file data visualizer in the UI to be made to look more similar to the cards for date fields in the index data visualizer in the UI.	2019-06-06 08:56:57 +01:00
David Roberts	bda2d5809c	[ML] Add a limit on line merging in find_file_structure (#42501 ) When analysing a semi-structured text file the find_file_structure endpoint merges lines to form multi-line messages using the assumption that the first line in each message contains the timestamp. However, if the timestamp is misdetected then this can lead to excessive numbers of lines being merged to form massive messages. This commit adds a line_merge_size_limit setting (default 10000 characters) that halts the analysis if a message bigger than this is created. This prevents significant CPU time being spent subsequently trying to determine the internal structure of the huge bogus messages.	2019-06-03 13:44:06 +01:00
Benjamin Trent	f2cde97a3b	[ML] adding delayed_data_check_config to datafeed update docs (#42095 ) * [ML] adding delayed_data_check_config to datafeed update docs * [DOCS] Edits delayed data configuration details	2019-05-28 10:03:39 -04:00
David Roberts	a15f1ee4f6	[ML] Improve file structure finder timestamp format determination (#41948 ) This change contains a major refactoring of the timestamp format determination code used by the ML find file structure endpoint. Previously timestamp format determination was done separately for each piece of text supplied to the timestamp format finder. This had the drawback that it was not possible to distinguish dd/MM and MM/dd in the case where both numbers were 12 or less. In order to do this sensibly it is best to look across all the available timestamps and see if one of the numbers is greater than 12 in any of them. This necessitates making the timestamp format finder an instantiable class that can accumulate evidence over time. Another problem with the previous approach was that it was only possible to override the timestamp format to one of a limited set of timestamp formats. There was no way out if a file to be analysed had a timestamp that was sane yet not in the supported set. This is now changed to allow any timestamp format that can be parsed by a combination of these Java date/time formats: yy, yyyy, M, MM, MMM, MMMM, d, dd, EEE, EEEE, H, HH, h, mm, ss, a, XX, XXX, zzz Additionally S letter groups (fractional seconds) are supported providing they occur after ss and separated from the ss by a dot, comma or colon. Spacing and punctuation is also permitted with the exception of the question mark, newline and carriage return characters, together with literal text enclosed in single quotes. 
The full list of changes/improvements in this refactor is: - Make TimestampFormatFinder an instantiable class - Overrides must be specified in Java date/time format - Joda format is no longer accepted - Joda timestamp formats in outputs are now derived from the determined or overridden Java timestamp formats, not stored separately - Functionality for determining the "best" timestamp format in a set of lines has been moved from TextLogFileStructureFinder to TimestampFormatFinder, taking advantage of the fact that TimestampFormatFinder is now an instantiable class with state - The functionality to quickly rule out some possible Grok patterns when looking for timestamp formats has been changed from using simple regular expressions to the much faster approach of using the Shift-And method of sub-string search, but using an "alphabet" consisting of just 1 (representing any digit) and 0 (representing non-digits) - Timestamp format overrides are now much more flexible - Timestamp format overrides that do not correspond to a built-in Grok pattern are mapped to a %{CUSTOM_TIMESTAMP} Grok pattern whose definition is included within the date processor in the ingest pipeline - Grok patterns that correspond to multiple Java date/time patterns are now handled better - the Grok pattern is accepted as matching broadly, and the required set of Java date/time patterns is built up considering all observed samples - As a result of the more flexible acceptance of Grok patterns, when looking for the "best" timestamp in a set of lines timestamps are considered different if they are preceded by a different sequence of punctuation characters (to prevent timestamps far into some lines being considered similar to timestamps near the beginning of other lines) - Out-of-the-box Grok patterns that are considered now include %{DATE} and %{DATESTAMP}, which have indeterminate day/month ordering - The order of day/month in formats with indeterminate day/month order is determined by considering all 
observed samples (plus the server locale if the observed samples still do not suggest an ordering) Relates #38086 Closes #35137 Closes #35132	2019-05-23 21:06:47 +01:00
Zachary Tong	290c8b8256	Force selection of calendar or fixed intervals in date histo agg (#33727 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (i.e. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and Java clients.	2019-05-06 17:17:11 -04:00
James Rodewig	ba6135f0c7	[DOCS] Allow attribute substitution in titleabbrevs for Asciidoctor migration (#41574 ) * [DOCS] Replace attributes in titleabbrevs for Asciidoctor migration * [DOCS] Add [subs="attributes"] so attributes render in Asciidoctor * Revert "[DOCS] Replace attributes in titleabbrevs for Asciidoctor migration" This reverts commit `98f130257a`. * [DOCS] Fix merge conflict	2019-04-30 13:46:13 -04:00
Russ Cam	b3493631a8	[ML] Specify from and size as querystring parameters (#40575 ) This commit updates the header for from and size to indicate that they can be specified as querystring parameters.	2019-04-29 11:06:23 +01:00
David Roberts	2e2c08b011	[DOCS] Use "source" instead of "inline" in ML docs (#40635 ) Specifying an inline script in an "inline" field was deprecated in 5.x. The new field name is "source". (Since 6.x still accepts "inline" I will only backport this docs change as far as 7.0.)	2019-03-29 16:40:25 +00:00
David Kyle	abb814012b	[ML] Correct small inconsistencies in ml APIs spec and docs (#39801 )	2019-03-11 10:01:02 +00:00
David Roberts	6242beef7a	[ML] Use scaling thread pool and xpack.ml.max_open_jobs cluster-wide dynamic (#39320 ) This change does the following: 1. Makes the per-node setting xpack.ml.max_open_jobs into a cluster-wide dynamic setting 2. Changes the job node selection to continue to use the per-node attributes storing the maximum number of open jobs if any node in the cluster is older than 7.1, and use the dynamic cluster-wide setting if all nodes are on 7.1 or later 3. Changes the docs to reflect this 4. Changes the thread pools for native process communication from fixed size to scaling, to support the dynamic nature of xpack.ml.max_open_jobs 5. Renames the autodetect thread pool to the job comms thread pool to make clear that it will be used for other types of ML jobs (data frame analytics in particular) Closes #29809	2019-03-06 09:45:13 +00:00
Tal Levy	e0e47c53c9	relax ML Info Docs expected response (#38993 ) the get-ml-info API documentation tested that the response show that ML's `upgrade_mode` was false. For reasons that may be true due to other tests running in parallel or not cleaning themselves up, this may not be guaranteed. Since the actual value here is not of importance, this commit relaxes the requirement that upgrade_mode be static.	2019-02-15 16:26:41 -08:00
Alexander Reelsen	5f7168ea74	Remove joda time mentions in documentation (#38720 ) This is the forward port of #38720 (not containing the 7.0 migration docs)	2019-02-14 10:18:48 +01:00
David Roberts	4184524474	[DOCS] Add warning about bypassing ML PUT APIs (#38509 ) Now that ML configurations are stored in the .ml-config index rather than in cluster state there is a possibility that some users may try to add configurations directly to the index. Allowing this creates a variety of problems including possible data exfiltration attacks (depending on how security is set up), so this commit adds warnings against allowing writes to the .ml-config index other than via the ML APIs.	2019-02-08 10:44:52 +00:00
David Roberts	1fa413a16d	[ML] Remove "8" prefixes from file structure finder timestamp formats (#38016 ) In 7.x Java timestamp formats are the default timestamp format and there is no need to prefix them with "8". (The "8" prefix was used in 6.7 to distinguish Java timestamp formats from Joda timestamp formats.) This change removes the "8" prefixes from timestamp formats in the output of the ML file structure finder.	2019-02-01 15:36:04 +00:00
Benjamin Trent	8280a20664	ML: Add upgrade mode docs, hlrc, and fix bug (#37942 ) * ML: Add upgrade mode docs, hlrc, and fix bug * [DOCS] Fixes build error and edits text * adjusting docs * Update docs/reference/ml/apis/set-upgrade-mode.asciidoc Co-Authored-By: benwtrent <ben.w.trent@gmail.com> * Update set-upgrade-mode.asciidoc * Update set-upgrade-mode.asciidoc	2019-01-30 06:51:11 -06:00
Lisa Cawley	19529da2db	[DOCS] Delayed data annotations (#37939 )	2019-01-28 13:04:38 -08:00
Benjamin Trent	7e4c0e6991	ML: Adds set_upgrade_mode API endpoint (#37837 ) * ML: Add MlMetadata.upgrade_mode and API * Adding tests * Adding wait conditionals for the upgrade_mode call to return * Adding tests * adjusting format and tests * Adjusting wait conditions for api return and msgs * adjusting doc tests * adding upgrade mode tests to black list	2019-01-28 09:07:30 -06:00
David Roberts	f2c0c26d15	[ML] Adjust structure finder for Joda to Java time migration (#37306 ) The ML file structure finder has always reported both Joda and Java time format strings. This change makes the Java time format strings the ones that are incorporated into mappings and ingest pipeline definitions. The BWC syntax of prepending "8" to these formats is used. This will need to be removed once Java time format strings become the default in Elasticsearch. This commit also removes direct imports of Joda classes in the structure finder unit tests. Instead the core Joda BWC class is used.	2019-01-26 20:19:57 +00:00
Christoph Büscher	34f2d2ec91	Remove remaining occurrences of "include_type_name=true" in docs (#37646 )	2019-01-22 15:13:52 +01:00
David Kyle	0ae7f8630c	Document ml datafeed Id limitations (#37653 )	2019-01-21 14:12:20 +00:00
Benjamin Trent	12cdf1cba4	ML: Add support for single bucket aggs in Datafeeds (#37544 ) Single bucket aggs are now supported in datafeed aggregation configurations.	2019-01-18 15:08:53 -06:00
Lisa Cawley	6dcb3af4c8	[DOCS] Adds size limitation to the get datafeeds APIs (#37578 )	2019-01-17 10:47:15 -08:00
Lisa Cawley	a2d9c464b2	[DOCS] Adds limitation to the get jobs API (#37549 )	2019-01-17 08:21:37 -08:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
lcawl	2d5a8ec59d	[DOCS] Remove unused screenshots	2019-01-10 11:10:25 -08:00
lcawl	382e4d39ef	[DOCS] Cleans up xpackml attributes	2019-01-07 14:33:10 -08:00
Dimitris Athanasiou	586453fef1	[ML] Remove types from datafeed (#36538 ) Closes #34265	2019-01-04 09:43:44 +02:00
lcawl	32bed098bb	[DOCS] Synchs titles of X-Pack APIs	2018-12-20 10:27:24 -08:00
Lisa Cawley	4140b9eede	[DOCS] Update X-Pack terminology in security docs (#36564 )	2018-12-19 14:53:37 -08:00
Benjamin Trent	75f1c79d9f	Adding more docs for delayed data detection (#36738 ) * Adding more docs for delayed data detection	2018-12-18 19:14:18 -06:00
David Kyle	e294056bbf	[ML] Merge the Jindex master feature branch (#36702 ) * [ML] Job and datafeed mappings with index template (#32719) Index mappings for the configuration documents * [ML] Job config document CRUD operations (#32738) * [ML] Datafeed config CRUD operations (#32854) * [ML] Change JobManager to work with Job config in index (#33064) * [ML] Change Datafeed actions to read config from the config index (#33273) * [ML] Allocate jobs based on JobParams rather than cluster state config (#33994) * [ML] Return missing job error when .ml-config is does not exist (#34177) * [ML] Close job in index (#34217) * [ML] Adjust finalize job action to work with documents (#34226) * [ML] Job in index: Datafeed node selector (#34218) * [ML] Job in Index: Stop and preview datafeed (#34605) * [ML] Delete job document (#34595) * [ML] Convert job data remover to work with index configs (#34532) * [ML] Job in index: Get datafeed and job stats from index (#34645) * [ML] Job in Index: Convert get calendar events to index docs (#34710) * [ML] Job in index: delete filter action (#34642) This changes the delete filter action to search for jobs using the filter to be deleted in the index rather than the cluster state. * [ML] Job in Index: Enable integ tests (#34851) Enables the ml integration tests excluding the rolling upgrade tests and a lot of fixes to make the tests pass again. * [ML] Reimplement established model memory (#35500) This is the 7.0 implementation of a master node service to keep track of the native process memory requirement of each ML job with an associated native process. The new ML memory tracker service works when the whole cluster is upgraded to at least version 6.6. For mixed version clusters the old mechanism of established model memory stored on the job in cluster state was used. This means that the old (and complex) code to keep established model memory up to date on the job object has been removed in 7.0. 
Forward port of #35263 * [ML] Need to wait for shards to replicate in distributed test (#35541) Because the cluster was expanded from 1 node to 3 indices would initially start off with 0 replicas. If the original node was killed before auto-expansion to 1 replica was complete then the test would fail because the indices would be unavailable. * [ML] DelayedDataCheckConfig index mappings (#35646) * [ML] JIndex: Restore finalize job action (#35939) * [ML] Replace Version.CURRENT in streaming functions (#36118) * [ML] Use 'anomaly-detector' in job config doc name (#36254) * [ML] Job In Index: Migrate config from the clusterstate (#35834) Migrate ML configuration from clusterstate to index for closed jobs only once all nodes are v6.6.0 or higher * [ML] Check groups against job Ids on update (#36317) * [ML] Adapt to periodic persistent task refresh (#36633) * [ML] Adapt to periodic persistent task refresh If https://github.com/elastic/elasticsearch/pull/36069/files is merged then the approach for reallocating ML persistent tasks after refreshing job memory requirements can be simplified. This change begins the simplification process. * Remove AwaitsFix and implement TODO * [ML] Default search size for configs * Fix TooManyJobsIT.testMultipleNodes Two problems: 1. Stack overflow during async iteration when lots of jobs on same machine 2. Not effectively setting search size in all cases * Use execute() instead of submit() in MlMemoryTracker We don't need a Future to wait for completion * [ML][TEST] Fix NPE in JobManagerTests * [ML] JIindex: Limit the size of bulk migrations (#36481) * [ML] Prevent updates and upgrade tests (#36649) * [FEATURE][ML] Add cluster setting that enables/disables config migration (#36700) This commit adds a cluster settings called `xpack.ml.enable_config_migration`. The setting is `true` by default. When set to `false`, no config migration will be attempted and non-migrated resources (e.g. jobs, datafeeds) will be able to be updated normally. 
Relates #32905 * [ML] Snapshot ml configs before migrating (#36645) * [FEATURE][ML] Split in batches and migrate all jobs and datafeeds (#36716) Relates #32905 * SQL: Fix translation of LIKE/RLIKE keywords (#36672) * SQL: Fix translation of LIKE/RLIKE keywords Refactor Like/RLike functions to simplify internals and improve query translation when chained or within a script context. Fix #36039 Fix #36584 * Fixing line length for EnvironmentTests and RecoveryTests (#36657) Relates #34884 * Add back one line removed by mistake regarding java version check and COMPAT jvm parameter existence * Do not resolve addresses in remote connection info (#36671) The remote connection info API leads to resolving addresses of seed nodes when invoked. This is problematic because if a hostname fails to resolve, we would not display any remote connection info. Yet, a hostname not resolving can happen across remote clusters, especially in the modern world of cloud services with dynamically changing IPs. Instead, the remote connection info API should be providing the configured seed nodes. This commit changes the remote connection info to display the configured seed nodes, avoiding a hostname resolution. Note that care was taken to preserve backwards compatibility with previous versions that expect the remote connection info to serialize a transport address instead of a string representing the hostname. * [Painless] Add boxed type to boxed type casts for method/return (#36571) This adds implicit boxed type to boxed types casts for non-def types to create asymmetric casting relative to the def type when calling methods or returning values. This means that a user calling a method taking an Integer can call it with a Byte, Short, etc. legally which matches the way def works. This creates consistency in the casting model that did not previously exist. 
* SNAPSHOTS: Adjust BwC Versions in Restore Logic (#36718) * Re-enables bwc tests with adjusted version conditions now that #36397 enables concurrent snapshots in 6.6+ * ingest: fix on_failure with Drop processor (#36686) This commit allows a document to be dropped when a Drop processor is used in the on_failure fork of the processor chain. Fixes #36151 * Initialize startup `CcrRepositories` (#36730) Currently, the CcrRepositoryManger only listens for settings updates and installs new repositories. It does not install the repositories that are in the initial settings. This commit, modifies the manager to install the initial repositories. Additionally, it modifies the ccr integration test to configure the remote leader node at startup, instead of using a settings update. * [TEST] fix float comparison in RandomObjects#getExpectedParsedValue This commit fixes a test bug introduced with #36597. This caused some test failure as stored field values comparisons would not work when CBOR xcontent type was used. Closes #29080 * [Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320) This commit exposes lucene's LatLonShape field as the default type in GeoShapeFieldMapper. To use the new indexing approach, simply set "type" : "geo_shape" in the mappings without setting any of the strategy, precision, tree_levels, or distance_error_pct parameters. Note the following when using the new indexing approach: * geo_shape query does not support querying by MULTIPOINT. * LINESTRING and MULTILINESTRING queries do not yet support WITHIN relation. * CONTAINS relation is not yet supported. The tree, precision, tree_levels, distance_error_pct, and points_only parameters are deprecated. * TESTS:Debug Log. IndexStatsIT#testFilterCacheStats * ingest: support default pipelines + bulk upserts (#36618) This commit adds support to enable bulk upserts to use an index's default pipeline. 
Bulk upsert, doc_as_upsert, and script_as_upsert are all supported. However, bulk script_as_upsert has slightly surprising behavior since the pipeline is executed _before_ the script is evaluated. This means that the pipeline only has access to the data found in the upsert field of the script_as_upsert. The non-bulk script_as_upsert (existing behavior) runs the pipeline _after_ the script is executed. This commit does _not_ attempt to consolidate the bulk and non-bulk behavior for script_as_upsert. This commit also adds additional testing for the non-bulk behavior, which remains unchanged with this commit. fixes #36219 * Fix duplicate phrase in shrink/split error message (#36734) This commit removes a duplicate "must be a" from the shrink/split error messages. * Deprecate types in get_source and exist_source (#36426) This change adds a new untyped endpoint `{index}/_source/{id}` for both the GET and the HEAD methods to get the source of a document or check for its existence. It also adds deprecation warnings to RestGetSourceAction that emit a warning when the old deprecated "type" parameter is still used. Also updating documentation and tests where appropriate. Relates to #35190 * Revert "[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)" This reverts commit `5bc7822562`. * Enhance Invalidate Token API (#35388) This change: - Adds functionality to invalidate all (refresh+access) tokens for all users of a realm - Adds functionality to invalidate all (refresh+access) tokens for a user in all realms - Adds functionality to invalidate all (refresh+access) tokens for a user in a specific realm - Changes the response format for the invalidate token API to contain information about the number of the invalidated tokens and possible errors that were encountered. 
- Updates the API Documentation After back-porting to 6.x, the `created` field will be removed from master as a field in the response Resolves: #35115 Relates: #34556 * Add raw sort values to SearchSortValues transport serialization (#36617) In order for CCS alternate execution mode (see #32125) to be able to do the final reduction step on the CCS coordinating node, we need to serialize additional info in the transport layer as part of each `SearchHit`. Sort values are already present but they are formatted according to the provided `DocValueFormat`. The CCS node needs to be able to reconstruct the lucene `FieldDoc` to include in the `TopFieldDocs` and `CollapseTopFieldDocs` which will feed the `mergeTopDocs` method used to reduce multiple search responses (one per cluster) into one. This commit adds such information to the `SearchSortValues` and exposes it through a new getter method added to `SearchHit` for retrieval. This info is only serialized at transport and never printed out at REST. * Watcher: Ensure all internal search requests count hits (#36697) In previous commits only the stored toXContent version of a search request was using the old format. However an executed search request was already disabling hit counts. In 7.0 hit counts will stay enabled by default to allow for proper migration. Closes #36177 * [TEST] Ensure shard follow tasks have really stopped. Relates to #36696 * Ensure MapperService#getAllMetaFields elements order is deterministic (#36739) MapperService#getAllMetaFields returns an array, which is created out of an `ObjectHashSet`. Such set does not guarantee deterministic hash ordering. The array returned by its toArray may be sorted differently at each run. This caused some repeatability issues in our tests (see #29080) as we pick random fields from the array of possible metadata fields, but that won't be repeatable if the input array is sorted differently at every run. 
Once setting the tests seed, hppc picks that up and the sorting is deterministic, but failures don't repeat with the seed that gets printed out originally (as a seed was not originally set). See also https://issues.carrot2.org/projects/HPPC/issues/HPPC-173. With this commit, we simply create a static sorted array that is used for `getAllMetaFields`. The change is in production code but really affects only testing as the only production usage of this method was to iterate through all values when parsing fields in the high-level REST client code. Anyways, this seems like a good change as returning an array would imply that it's deterministically sorted. * Expose Sequence Number based Optimistic Concurrency Control in the rest layer (#36721) Relates #36148 Relates #10708 * [ML] Mute MlDistributedFailureIT	2018-12-18 17:45:31 +00:00
David Roberts	9e8cfbb40d	[ML] Deprecate X-Pack centric ML endpoints (#36315 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs. Relates #35958	2018-12-07 20:34:11 +00:00
Lisa Cawley	c24be278e4	[DOCS] Refreshes population job examples (#36101 )	2018-11-30 08:55:29 -08:00
Ed Savage	13e11966ca	[HLRC][ML] Add delete expired data API (#35906 ) Relates to #29827	2018-11-26 16:15:54 +00:00
David Roberts	6f46584380	[ML] Add docs for ML info endpoint (#35783 ) This endpoint was not previously documented as it was not particularly useful to end users. However, since the HLRC will support the endpoint we need some documentation to link to. The purpose of the endpoint is to provide defaults and limits used by ML. These are needed to fully understand configurations that have missing values because the missing value means the default should be used. Relates #35777	2018-11-22 16:23:31 +00:00
Benjamin Trent	bc7dea4480	ML: changing automatic check_window calculation (#35643 ) * ML: changing automatic check_window calculation * adding docs on how we calculate the default	2018-11-19 08:03:34 -06:00
Benjamin Trent	f7ada9b29b	Add delayed datacheck to the datafeed job runner (#35387 ) * ML: Adding missing datacheck to datafeedjob * Adding client side and docs * Making adjustments to validations * Making values default to on, having more sensible limits * Intermittent commit, still need to figure out interval * Adjusting delayed data check interval * updating docs * Making parameter Boolean, so it is nullable * bumping bwc to 7 before backport * changing to version current * moving delayed data check config its own object * Separation of duties for delayed data detection * fixing checkstyles * fixing checkstyles * Adjusting default behavior so that null windows are allowed * Mentioning the default value * Fixing comments, syncing up validations	2018-11-15 13:32:45 -06:00
Lisa Cawley	9aeaceac4b	[DOCS] Clarify results_index_name description (#35463 )	2018-11-12 13:08:57 -08:00
David Roberts	c455be7bc2	[ML] Rename the json file structure to ndjson (#34901 ) The file structure finder endpoint can find the NDJSON (newline-delimited JSON) file format, but called it `json`. This change renames the `format` for this file structure to `ndjson`, which is more precise and will hopefully avoid confusion.	2018-10-29 10:06:12 +01:00
Julie Tibshirani	f854330e06	Make sure to use the type _doc in the REST documentation. (#34662 ) * Replace custom type names with _doc in REST examples. * Avoid using two mapping types in the percolator docs. * Rename doc -> _doc in the main repository README. * Also replace some custom type names in the HLRC docs.	2018-10-22 11:54:04 -07:00
Ryan Ernst	d445785f1a	Scripting: Convert domainSplit function for ML to whitelist (#34426 ) This commit moves the definition of domainSplit into java and exposes it as a painless whitelist extension. The method also no longer needs params, and version which ignores params is added and deprecated.	2018-10-17 15:54:21 -07:00
David Roberts	21c759af0e	[ML] Add an ingest pipeline definition to structure finder (#34350 ) The ingest pipeline that is produced is very simple. It contains a grok processor if the format is semi-structured text, a date processor if the format contains a timestamp, and a remove processor if required to remove the interim timestamp field parsed out of semi-structured text. Eventually the UI should offer the option to customize the pipeline with additional processors to perform other data preparation steps before ingesting data to an index.	2018-10-12 07:56:35 +01:00
Dimitris Athanasiou	4dacfa95d2	[ML] Allow asynchronous job deletion (#34058 ) This changes the delete job API by adding the choice to delete a job asynchronously. The commit adds a `wait_for_completion` parameter to the delete job request. When set to `false`, the action returns immediately and the response contains the task id. This also changes the handling of subsequent delete requests for a job that is already being deleted. It now uses the task framework to check if the job is being deleted instead of the cluster state. This is a beneficial for it is going to also be working once the job configs are moved out of the cluster state and into an index. Also, force delete requests that are waiting for the job to be deleted will not proceed with the deletion if the first task fails. This will prevent overloading the cluster. Instead, the failure is communicated better via notifications so that the user may retry. Finally, this makes the `deleting` property of the job visible (also it was renamed from `deleted`). This allows a client to render a deleting job differently. Closes #32836	2018-10-05 02:41:28 +03:00
Ed Savage	577261ee57	[ML] Label anomalies with multi_bucket_impact (#34233 ) * [ML] Label anomalies with multi_bucket_impact Add the multi_bucket_impact field to record results.	2018-10-04 09:08:21 +01:00
Kazuhiro Sera	d45fe43a68	Fix a variety of typos and misspelled words (#32792 )	2018-10-03 18:11:38 +01:00
David Roberts	a1d2ded98d	[ML] Fix unit test deadlock problem (#34174 ) This change fixes a potential deadlock problem in the unit test introduced in #34117. It also removes a piece of debug code and corrects a docs formatting problem that were both added in that same PR.	2018-10-01 15:35:37 +01:00
lcawl	57052f617a	[DOCS] Fixes callout in ML API	2018-09-28 10:06:02 -07:00
David Roberts	f709c2f694	[ML] Add a timeout option to file structure finder (#34117 ) This can be used to restrict the amount of CPU a single structure finder request can use. The timeout is not implemented precisely, so requests may run for slightly longer than the timeout before aborting. The default is 25 seconds, which is a little below Kibana's default timeout of 30 seconds for calls to Elasticsearch APIs.	2018-09-28 17:32:35 +01:00
David Roberts	dfe5af0411	[ML] Return both Joda and Java formats from structure finder (#33900 ) Previously the timestamp_formats field in the response from the find_file_structure endpoint contained Joda timestamp formats. This change makes that clear by renaming the field to joda_timestamp_formats, and also adds a java_timestamp_formats field containing the equivalent Java time format strings.	2018-09-25 12:52:51 +01:00
David Roberts	b89551c452	[ML] Display integers without .0 in file structure field stats (#33947 ) Previously numeric values in the field_stats created by the find_file_structure endpoint were always output with a decimal point. This looked unfriendly and unnatural for fields that clearly store integer values. This change converts integer values to type Integer before output in the file structure field stats.	2018-09-22 15:48:59 +01:00
David Roberts	f5a2ffc3f6	[DOCS][ML] Document the ML find_file_structure endpoint (#33723 ) Relates #33471 Relates #33630	2018-09-20 12:17:09 +01:00
Lisa Cawley	7441c0376e	[DOCS] Adds delete forecast API (#33401 )	2018-09-06 09:20:42 -07:00
Lisa Cawley	b7a63f7e7d	[DOCS] Moves machine learning APIs to docs folder (#31118 )	2018-08-31 16:49:24 -07:00
Lisa Cawley	874ebcb6d4	[DOCS] Moves ml folder from x-pack/docs to docs (#33248 )	2018-08-31 11:56:26 -07:00

424 Commits