Adds `start_time` to the get deployment stats API for the deployment
and each allocation.
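For illustration, a minimal sketch of the request and an abbreviated response (the endpoint path and surrounding fields are assumptions based on the 8.0-era deployment stats API; `start_time` is epoch milliseconds):
```js
GET _ml/trained_models/my-pytorch-model/deployment/_stats

// Abbreviated response; only the new fields are shown.
{
  "deployment_stats": [
    {
      "model_id": "my-pytorch-model",
      "start_time": 1632847017541, // <snip> other deployment-level stats </snip>
      "nodes": [
        {
          "node": { /* <snip> node info </snip> */ },
          "start_time": 1632847021269 // <snip> other per-allocation stats </snip>
        }
      ]
    }
  ]
}
```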
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The infer endpoint has changed its format. Also, the results format for the various tasks has changed. This updates the docs to match what is currently in 8.0.0.
* [ML] add documentation for get deployment stats API
* Apply suggestions from code review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Zero-shot classification allows for text classification tasks without a pre-trained collection of target labels.
This is achieved through models trained on the Multi-Genre Natural Language Inference (MNLI) dataset. This dataset pairs text sequences with "entailment" clauses. For example, "Throughout all of history, mankind has shown itself resourceful, yet astoundingly short-sighted" could be paired with the entailment clauses: ["This example is history", "This example is sociology", ...].
This training set, combined with the attention and semantic knowledge in modern-day NLP models (BERT, BART, etc.), affords a powerful tool for ad-hoc text classification.
See https://arxiv.org/abs/1909.00161 for a deeper explanation of the MNLI training and how zero-shot works.
The zero-shot classification task is configured as follows:
```js
{
  // <snip> model configuration </snip>
  "inference_config": {
    "zero_shot_classification": {
      "classification_labels": ["entailment", "neutral", "contradiction"], // <1>
      "labels": ["sad", "glad", "mad", "rad"], // <2>
      "multi_label": false, // <3>
      "hypothesis_template": "This example is {}.", // <4>
      "tokenization": { /* <snip> tokenization configuration </snip> */ }
    }
  }
}
```
* <1> All zero-shot models return these three particular labels when classifying the target sequence: "entailment" is the positive case, "neutral" is the case where the sequence is neither positive nor negative, and "contradiction" is the negative case.
* <2> An optional parameter supplying the default labels to attempt to classify against.
* <3> When returning the probabilities, indicates whether the results should assume there is only one true label or that multiple labels may be true.
* <4> The hypothesis template used when tokenizing the labels. When combined with `sad`, the sequence looks like `This example is sad.`
For inference in an ingest pipeline, one may provide label updates:
```js
{
  // <snip> pipeline definition </snip>
  "processors": [
    // <snip> other processors </snip>
    {
      "inference": {
        // <snip> general configuration </snip>
        "inference_config": {
          "zero_shot_classification": {
            "labels": ["humanities", "science", "mathematics", "technology"], // <1>
            "multi_label": true // <2>
          }
        }
      }
    }
    // <snip> other processors </snip>
  ]
}
```
* <1> The `labels` we care about; these replace the default ones if they exist.
* <2> Indicates whether the results should allow multiple true labels.
Similarly, one may provide label changes against the `_infer` endpoint:
```js
{
  "docs": [{ "text_field": "This is a very happy person" }],
  "inference_config": {
    "zero_shot_classification": {
      "labels": ["glad", "sad", "bad", "rad"],
      "multi_label": false
    }
  }
}
```
This commit removes the ability to set the vocabulary location in the model config.
Instead, sane defaults are set and used, and the behavior is wrapped up in an API.
The index is now always the internally managed .ml-inference-native index,
and the document ID is always <model_id>_vocabulary.
This API only works for pytorch/nlp type models.
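For illustration, a hedged sketch of setting a vocabulary through the API (the path mirrors the released trained models vocabulary endpoint; the model ID and tokens are illustrative):
```js
PUT _ml/trained_models/my-pytorch-model/vocabulary
{
  "vocabulary": [
    "[PAD]", "[UNK]", "[CLS]", "[SEP]",
    "the", "quick", "brown" // <snip> rest of the vocabulary </snip>
  ]
}
```
The document lands in the internally managed index under the `my-pytorch-model_vocabulary` ID, per the rule described above.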
Previously, if a model failed to be allocated on any node, the deployment failed.
This commit allows for an allocation to be partially_started and indicates its
current state via a new state value in the deployment stats API.
Additionally, when starting a deployment, the user may specify a `wait_for` value of
`starting`, `partially_started`, or `started`, and the API will block (as long as the timeout doesn't expire) until that state is reached.
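As a sketch, starting a deployment and blocking only until at least one allocation is up (the endpoint path is an assumption; `wait_for` and `timeout` are the parameters described above):
```js
POST _ml/trained_models/my-pytorch-model/deployment/_start?wait_for=partially_started&timeout=1m
```
If the requested state is not reached before the timeout expires, the call returns a timeout error rather than waiting indefinitely.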
This new boolean parameter allows users to put in a compressed model without it having to be inflated on the master node during the put request.
This is useful for system/module setup, with the model later being validated and fully parsed when it is loaded on a node for usage.
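A hedged sketch of the put request (assuming the released parameter name `defer_definition_decompression`; the model ID and definition are placeholders):
```js
PUT _ml/trained_models/my-pytorch-model?defer_definition_decompression=true
{
  "compressed_definition": "<base64-encoded, gzipped definition>",
  "inference_config": { /* <snip> task configuration </snip> */ }
  // <snip> other model configuration </snip>
}
```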
This commit adds a new `_preview` endpoint for data frame analytics.
It allows users to see the data on which their model will be trained, which is especially useful
with the arrival of custom feature processors.
The API design is similar to the datafeed `_preview` and data frame analytics `_explain` endpoints.
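A hedged sketch of calling it (the path is assumed to mirror the other data frame analytics endpoints, and the response shape is an assumption):
```js
GET _ml/data_frame/analytics/my-analytics-job/_preview

// Abbreviated response: the rows, post feature processing, that the model would train on.
{
  "feature_values": [
    { "feature_a": "1.0", "feature_b": "0.5" } // <snip> more rows </snip>
  ]
}
```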
Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.
Closes #65056
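For example, a sketch of a job whose source defines a runtime field (the index and field names are illustrative):
```js
PUT _ml/data_frame/analytics/my-runtime-field-job
{
  "source": {
    "index": "my-source-index",
    "runtime_mappings": {
      "total_price": { // becomes part of the destination index mapping
        "type": "double",
        "script": { "source": "emit(doc['price'].value * doc['quantity'].value)" }
      }
    }
  },
  "dest": { "index": "my-dest-index" },
  "analysis": { /* <snip> analysis configuration </snip> */ }
}
```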
A `model_alias` allows trained models to be referred to by a user-defined moniker.
This not only improves the readability and simplicity of numerous API calls, but also allows for simpler deployment and upgrade procedures for trained models.
Previously, if you referenced a model ID directly within an ingest pipeline and a new model came along that performed better than the earlier referenced model, you had to update the pipeline itself. If that model was used in numerous pipelines, ALL of those pipelines had to be updated.
When using a `model_alias` in an ingest pipeline, only the `model_alias` needs to be updated. The underlying referenced model then changes in place for all ingest pipelines automatically.
An additional benefit is that the referenced model is not changed until it is fully loaded into cache, so throughput is not hampered by changing models.
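A sketch of the flow (the alias endpoint and `reassign` parameter match the released `model_aliases` API; the model IDs are illustrative):
```js
// Point the alias at the current model.
PUT _ml/trained_models/my-model-v1/model_aliases/my-model

// Pipelines reference the alias rather than a concrete model ID:
//   "inference": { "model_id": "my-model", ... }

// Later, repoint the alias at the better model; all pipelines pick it up automatically.
PUT _ml/trained_models/my-model-v2/model_aliases/my-model?reassign=true
```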
This PR adds the optional `early_stopping_enabled` data frame analysis configuration parameter. The enhancement was already described in elastic/ml-cpp#1676, so I mark it here as a non-issue.
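A hedged sketch of where the parameter sits (assuming it hangs off the analysis object like the other hyperparameters, and that it defaults to `true`; the job and field names are illustrative):
```js
PUT _ml/data_frame/analytics/my-regression-job
{
  "source": { "index": "my-source-index" },
  "dest": { "index": "my-dest-index" },
  "analysis": {
    "regression": {
      "dependent_variable": "target",
      "early_stopping_enabled": false // optional; assumed default is true
    }
  }
}
```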