[role="xpack"]
[[start-trained-model-deployment]]
= Start trained model deployment API
[subs="attributes"]
++++
<titleabbrev>Start trained model deployment</titleabbrev>
++++

experimental::[]

Starts a new trained model deployment.

[[start-trained-model-deployment-request]]
== {api-request-title}

`POST _ml/trained_models/<model_id>/deployment/_start`

[[start-trained-model-deployment-prereq]]
== {api-prereq-title}
Requires the `manage_ml` cluster privilege. This privilege is included in the
`machine_learning_admin` built-in role.

[[start-trained-model-deployment-desc]]
== {api-description-title}

Currently only `pytorch` models are supported for deployment. When deployed,
the model attempts allocation to every machine learning node.

[[start-trained-model-deployment-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[start-trained-model-deployment-query-params]]
== {api-query-parms-title}

`inference_threads`::
(Optional, integer)
Sets the number of threads used by the inference process. Increasing this value
generally increases inference speed. Because inference is compute-bound, a value
greater than the number of available CPU cores on the machine does not increase
inference speed further. Defaults to 1.

`model_threads`::
(Optional, integer)
Indicates how many threads are used when sending inference requests to
the model. Increasing this value generally increases the throughput. Defaults to
1.

`queue_capacity`::
(Optional, integer)
Controls how many inference requests are allowed in the queue at a time. Once the
number of requests exceeds this value, new requests are rejected with a 429 error.
Defaults to 1024.

`timeout`::
(Optional, time)
Controls the amount of time to wait for the model to deploy. Defaults
to 20 seconds.

`wait_for`::
(Optional, string)
Specifies the allocation status to wait for before returning. Defaults to
`started`. The value `starting` indicates that the deployment is starting but
has not yet started on any node. The value `started` indicates that the model
has started on at least one node. The value `fully_allocated` indicates that
the deployment has started on all valid nodes.
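
As a rough illustration of how these query parameters combine, the following
Python sketch assembles the request path by using only the standard library.
The helper names `build_start_deployment_path` and `cap_inference_threads`, the
`node_cpu_cores` argument, and the `my-model` identifier are hypothetical and
not part of {es}. The sketch assumes that the core count of the target
{ml} node is known to the caller.

[source,python]
----
from urllib.parse import quote, urlencode

# Values described for `wait_for` in the parameter list above.
WAIT_FOR_VALUES = ("starting", "started", "fully_allocated")


def build_start_deployment_path(model_id, **query_params):
    """Return the request path for starting a deployment (hypothetical helper)."""
    wait_for = query_params.get("wait_for")
    if wait_for is not None and wait_for not in WAIT_FOR_VALUES:
        raise ValueError(f"unsupported wait_for value: {wait_for!r}")
    path = f"_ml/trained_models/{quote(model_id, safe='')}/deployment/_start"
    # Drop parameters left as None so that the documented defaults apply.
    params = {key: value for key, value in query_params.items() if value is not None}
    return f"{path}?{urlencode(params)}" if params else path


def cap_inference_threads(requested, node_cpu_cores):
    """Cap the thread count at the core count of the target node (assumed known)."""
    return max(1, min(requested, node_cpu_cores))


print(build_start_deployment_path(
    "my-model",
    inference_threads=cap_inference_threads(8, node_cpu_cores=4),
    queue_capacity=2048,
    wait_for="fully_allocated",
    timeout="1m",
))
----

Omitting a parameter leaves it out of the query string, so the default
described above applies.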

[[start-trained-model-deployment-example]]
== {api-examples-title}

The following example starts a new deployment for the
`elastic__distilbert-base-uncased-finetuned-conll03-english` trained model:

[source,console]
--------------------------------------------------
POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_start?wait_for=started&timeout=1m
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
    "allocation": {
        "task_parameters": {
            "model_id": "elastic__distilbert-base-uncased-finetuned-conll03-english",
            "model_bytes": 265632637
        },
        "routing_table": {
            "uckeG3R8TLe2MMNBQ6AGrw": {
                "routing_state": "started",
                "reason": ""
            }
        },
        "allocation_state": "started",
        "start_time": "2021-11-02T11:50:34.766591Z"
    }
}
----
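
A client might inspect this response to see which nodes report the model as
started. The following Python sketch parses the example response above with the
standard library. The `started_nodes` helper is hypothetical, and the sketch
assumes that every routing table entry has the `routing_state` field shown in
the example.

[source,python]
----
import json

# Response body copied from the example above.
response_body = """
{
    "allocation": {
        "task_parameters": {
            "model_id": "elastic__distilbert-base-uncased-finetuned-conll03-english",
            "model_bytes": 265632637
        },
        "routing_table": {
            "uckeG3R8TLe2MMNBQ6AGrw": {
                "routing_state": "started",
                "reason": ""
            }
        },
        "allocation_state": "started",
        "start_time": "2021-11-02T11:50:34.766591Z"
    }
}
"""


def started_nodes(response):
    """Return the IDs of nodes whose routing state is `started` (hypothetical helper)."""
    routing_table = response["allocation"]["routing_table"]
    return [
        node_id
        for node_id, route in routing_table.items()
        if route["routing_state"] == "started"
    ]


response = json.loads(response_body)
print(response["allocation"]["allocation_state"])
print(started_nodes(response))
----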