Adds `start_time` to the get deployment stats API for the deployment
and each allocation.
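For illustration, a minimal sketch of the request and an abbreviated response (the endpoint path and surrounding fields are assumptions based on the 8.0-era deployment stats API; `start_time` is epoch milliseconds):
```js
GET _ml/trained_models/my-pytorch-model/deployment/_stats

// Abbreviated response; only the new fields are shown.
{
  "deployment_stats": [
    {
      "model_id": "my-pytorch-model",
      "start_time": 1632847017541, // <snip> other deployment-level stats </snip>
      "nodes": [
        {
          "node": { /* <snip> node info </snip> */ },
          "start_time": 1632847021269 // <snip> other per-allocation stats </snip>
        }
      ]
    }
  ]
}
```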
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The infer endpoint has changed its format. Also, the results format for the various tasks has changed. This updates the docs to match what is currently in 8.0.0.
* [ML] add documentation for get deployment stats API
* Apply suggestions from code review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Zero-shot classification allows for text classification tasks without a pre-trained collection of target labels.
This is achieved through models trained on the Multi-Genre Natural Language Inference (MNLI) dataset. This dataset pairs text sequences with "entailment" clauses. For example, "Throughout all of history, mankind has shown itself resourceful, yet astoundingly short-sighted" could be paired with the entailment clauses: ["This example is history", "This example is sociology", ...].
This training set, combined with the attention and semantic knowledge in modern-day NLP models (BERT, BART, etc.), affords a powerful tool for ad-hoc text classification.
See https://arxiv.org/abs/1909.00161 for a deeper explanation of the MNLI training and how zero-shot works.
The zero-shot classification task is configured as follows:
```js
{
  // <snip> model configuration </snip>
  "inference_config": {
    "zero_shot_classification": {
      "classification_labels": ["entailment", "neutral", "contradiction"], // <1>
      "labels": ["sad", "glad", "mad", "rad"], // <2>
      "multi_label": false, // <3>
      "hypothesis_template": "This example is {}.", // <4>
      "tokenization": { /* <snip> tokenization configuration </snip> */ }
    }
  }
}
```
* <1> All zero-shot models return these three particular labels when classifying the target sequence: "entailment" is the positive case, "neutral" is the case where the sequence is neither positive nor negative, and "contradiction" is the negative case.
* <2> An optional parameter supplying the default labels to attempt to classify against.
* <3> When returning the probabilities, indicates whether the results should assume there is only one true label or that multiple labels may be true.
* <4> The hypothesis template used when tokenizing the labels. When combined with `sad`, the sequence looks like `This example is sad.`
For inference in an ingest pipeline, one may provide label updates:
```js
{
  // <snip> pipeline definition </snip>
  "processors": [
    // <snip> other processors </snip>
    {
      "inference": {
        // <snip> general configuration </snip>
        "inference_config": {
          "zero_shot_classification": {
            "labels": ["humanities", "science", "mathematics", "technology"], // <1>
            "multi_label": true // <2>
          }
        }
      }
    }
    // <snip> other processors </snip>
  ]
}
```
* <1> The `labels` we care about; these replace the default ones if they exist.
* <2> Indicates whether the results should allow multiple true labels.
Similarly, one may provide label changes against the `_infer` endpoint:
```js
{
  "docs": [{ "text_field": "This is a very happy person" }],
  "inference_config": {
    "zero_shot_classification": {
      "labels": ["glad", "sad", "bad", "rad"],
      "multi_label": false
    }
  }
}
```
This commit removes the ability to set the vocabulary location in the model config.
Instead, sane defaults are set and used, and the behavior is wrapped up in an API.
The index is now always the internally managed .ml-inference-native index,
and the document ID is always <model_id>_vocabulary.
This API only works for pytorch/nlp type models.
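For illustration, a hedged sketch of setting a vocabulary through the API (the path mirrors the released trained models vocabulary endpoint; the model ID and tokens are illustrative):
```js
PUT _ml/trained_models/my-pytorch-model/vocabulary
{
  "vocabulary": [
    "[PAD]", "[UNK]", "[CLS]", "[SEP]",
    "the", "quick", "brown" // <snip> rest of the vocabulary </snip>
  ]
}
```
The document lands in the internally managed index under the `my-pytorch-model_vocabulary` ID, per the rule described above.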
Previously, if a model failed to be allocated on any node, the deployment failed.
This commit allows for an allocation to be partially_started and indicates its
current state via a new state value in the deployment stats API.
Additionally, when starting a deployment, the user may specify a `wait_for` value of
`starting`, `partially_started`, or `started`, and the API will block (as long as the timeout doesn't expire) until that state is reached.
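As a sketch, starting a deployment and blocking only until at least one allocation is up (the endpoint path is an assumption; `wait_for` and `timeout` are the parameters described above):
```js
POST _ml/trained_models/my-pytorch-model/deployment/_start?wait_for=partially_started&timeout=1m
```
If the requested state is not reached before the timeout expires, the call returns a timeout error rather than waiting indefinitely.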
This new boolean parameter allows users to put in a compressed model without it having to be inflated on the master node during the put request.
This is useful for system/module setup, with the model later being validated and fully parsed when it is loaded on a node for usage.
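A hedged sketch of the put request (assuming the released parameter name `defer_definition_decompression`; the model ID and definition are placeholders):
```js
PUT _ml/trained_models/my-pytorch-model?defer_definition_decompression=true
{
  "compressed_definition": "<base64-encoded, gzipped definition>",
  "inference_config": { /* <snip> task configuration </snip> */ }
  // <snip> other model configuration </snip>
}
```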
This commit adds a new `_preview` endpoint for data frame analytics.
It allows users to see the data on which their model will be trained, which is especially useful
with the arrival of custom feature processors.
The API design is similar to the datafeed `_preview` and data frame analytics `_explain` endpoints.
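A hedged sketch of calling it (the path is assumed to mirror the other data frame analytics endpoints, and the response shape is an assumption):
```js
GET _ml/data_frame/analytics/my-analytics-job/_preview

// Abbreviated response: the rows, post feature processing, that the model would train on.
{
  "feature_values": [
    { "feature_a": "1.0", "feature_b": "0.5" } // <snip> more rows </snip>
  ]
}
```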
Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.
Closes #65056
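For example, a sketch of a job whose source defines a runtime field (the index and field names are illustrative):
```js
PUT _ml/data_frame/analytics/my-runtime-field-job
{
  "source": {
    "index": "my-source-index",
    "runtime_mappings": {
      "total_price": { // becomes part of the destination index mapping
        "type": "double",
        "script": { "source": "emit(doc['price'].value * doc['quantity'].value)" }
      }
    }
  },
  "dest": { "index": "my-dest-index" },
  "analysis": { /* <snip> analysis configuration </snip> */ }
}
```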
A `model_alias` allows trained models to be referred to by a user-defined moniker.
This not only improves the readability and simplicity of numerous API calls, but also allows for simpler deployment and upgrade procedures for trained models.
Previously, if you referenced a model ID directly within an ingest pipeline and a new model came along that performed better than the earlier referenced model, you had to update the pipeline itself. If that model was used in numerous pipelines, ALL of those pipelines had to be updated.
When using a `model_alias` in an ingest pipeline, only the `model_alias` needs to be updated. The underlying referenced model then changes in place for all ingest pipelines automatically.
An additional benefit is that the referenced model is not changed until it is fully loaded into cache, so throughput is not hampered by changing models.
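A sketch of the flow (the alias endpoint and `reassign` parameter match the released `model_aliases` API; the model IDs are illustrative):
```js
// Point the alias at the current model.
PUT _ml/trained_models/my-model-v1/model_aliases/my-model

// Pipelines reference the alias rather than a concrete model ID:
//   "inference": { "model_id": "my-model", ... }

// Later, repoint the alias at the better model; all pipelines pick it up automatically.
PUT _ml/trained_models/my-model-v2/model_aliases/my-model?reassign=true
```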
This PR adds the optional `early_stopping_enabled` data frame analysis configuration parameter. The enhancement was already described in elastic/ml-cpp#1676, so I mark it here as a non-issue.
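A hedged sketch of where the parameter sits (assuming it hangs off the analysis object like the other hyperparameters, and that it defaults to `true`; the job and field names are illustrative):
```js
PUT _ml/data_frame/analytics/my-regression-job
{
  "source": { "index": "my-source-index" },
  "dest": { "index": "my-dest-index" },
  "analysis": {
    "regression": {
      "dependent_variable": "target",
      "early_stopping_enabled": false // optional; assumed default is true
    }
  }
}
```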