elasticsearch

Commit Graph

Author	SHA1	Message	Date
Lisa Cawley	6b7320790f	[DOCS] Updates example output for start trained model deployment API (#86824)	2022-05-17 07:27:44 -07:00
Dimitris Athanasiou	68c51f3ada	[ML] Rename threading params in _start trained model deployment API (#86597) When starting a trained model deployment, the user can tune performance by setting the `model_threads` and `inference_threads` parameters. These parameters are hard to understand and cause confusion. This commit renames them, as well as the fields where their values are reported in the stats API: `model_threads` => `number_of_allocations` and `inference_threads` => `threads_per_allocation`. The terminology is now as follows. A model deployment starts with a requested `number_of_allocations`. Each allocation gives the model another thread for executing inference requests in parallel, so more allocations should increase throughput. In turn, each allocation may use a number of threads to parallelize each individual inference request. This is the `threads_per_allocation` setting, which increases inference speed (and might also improve throughput).	2022-05-10 17:41:00 +03:00
Lisa Cawley	89a3e18e10	[DOCS] Add preview admonition to infer API (#86486)	2022-05-05 13:49:02 -07:00
Benjamin Trent	a907f0bb6f	[ML] add new trained_models/{model_id}/_infer endpoint for all supervised models and deprecate deployment infer api (#86361) This commit adds a new `_ml/trained_models/{model_id}/_infer` API. This API works for both native NLP models and supervised models trained via data frame analytics. The request format is the same as that of the old `_ml/trained_models/{model_id}/deployment/_infer` API: it takes a `docs` and an `inference_config` parameter. This PR also deprecates the old experimental `_ml/trained_models/{model_id}/deployment/_infer` API. The biggest difference is that the response now nests all results under an "inference_results" object. Closes: https://github.com/elastic/elasticsearch/issues/86032	2022-05-05 14:58:59 -04:00
Benjamin Trent	25d1afbe6f	[ML] rename trained model allocations to assignments (#85503) This renames the internal concept of a trained model allocation to an assignment. Models are now "assigned" to a node, and routes are created for inference, rather than being "allocated". This is an internal rename only. The user-facing concepts of trained models and deployments are untouched.	2022-04-18 11:35:10 -04:00
Dimitris Athanasiou	5d670e45ac	Revert "[ML] Only one of `inference_threads` and `model_threads` may be great… (#84794)" (#85089) This reverts commit `4eaedb265d`. On further investigation of how to improve allocation of trained models, we concluded that being able to set `inference_threads` in combination with `model_threads` is fundamental for scalability.	2022-03-18 09:41:27 +02:00
Dimitris Athanasiou	4eaedb265d	[ML] Only one of `inference_threads` and `model_threads` may be great… (#84794) When starting a trained model deployment, the user may set values for `inference_threads` or `model_threads`. The former improves latency whereas the latter improves throughput. It is easier to reason about how a model allocation uses resources if we ensure that only one of the two may be greater than one. In addition, it allows us to distribute the cores of the ML nodes in the cluster across the model allocations in the future. This commit adds a validation that prevents both `inference_threads` and `model_threads` from being greater than one.	2022-03-09 16:33:35 +02:00
David Kyle	1473b09415	[ML] Add NLP inference configs to the inference processor docs (#82320)	2022-01-11 08:50:45 +00:00
David Kyle	d1ee756da8	[ML][DOCS] Add note about max values of thread settings (#81367)	2021-12-14 13:07:34 +00:00
Lisa Cawley	429bdd9afc	[DOCS] Move trained model APIs out of dataframe analytics (#81315)	2021-12-03 09:21:09 -08:00

10 Commits