The feature branch contains changes to configure PyTorch models with a
`TrainedModelConfig` and defines a format for storing the binary models.
The `_start` and `_stop` deployment actions control the model lifecycle,
and the model can be evaluated directly with the `_infer` endpoint.
Two NLP task types are supported: Named Entity Recognition and Fill Mask.
The feature branch consists of these PRs: #73523, #72218, #71679, #71323, #71035, #71177, #70713
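As a minimal sketch of the lifecycle above, the calls might look like the following from Python's `requests` against a local cluster. The endpoint paths, model id, and request body are assumptions based on the trained model APIs as currently documented and may differ from what is on the branch.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster without security
MODEL_ID = "my_ner_model"     # hypothetical model id

# Start a deployment so the PyTorch model is loaded and ready to serve.
requests.post(f"{ES}/_ml/trained_models/{MODEL_ID}/deployment/_start")

# Evaluate the model directly with the _infer endpoint.
# The input field name depends on the model's configured input fields.
resp = requests.post(
    f"{ES}/_ml/trained_models/{MODEL_ID}/_infer",
    json={"docs": [{"text_field": "Elasticsearch was created by Shay Banon"}]},
)
print(resp.json())

# Stop the deployment to release resources.
requests.post(f"{ES}/_ml/trained_models/{MODEL_ID}/deployment/_stop")
```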
A new field, `inference_config`, has been added to the trained model config object. It allows default inference settings to be supplied by analytics or an external model builder.
The inference processor can still override whatever is set as the default in the trained model config.
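A sketch of how the override works: the default lives on the trained model config, and an inference processor in an ingest pipeline can pass its own `inference_config`, which takes precedence. The model id, field names, and the `results_field` option here are illustrative assumptions.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster
MODEL_ID = "my_ner_model"     # hypothetical model id

# Fragment of the trained model config holding the default inference settings
# (shown for illustration only; it is stored with the model, not sent here).
model_config_fragment = {
    "inference_config": {"ner": {"results_field": "entities"}},
}

# An ingest pipeline whose inference processor overrides that default.
pipeline = {
    "processors": [
        {
            "inference": {
                "model_id": MODEL_ID,
                # Takes precedence over the default stored on the model.
                "inference_config": {"ner": {"results_field": "ner_override"}},
            }
        }
    ]
}
requests.put(f"{ES}/_ingest/pipeline/ner-pipeline", json=pipeline)
```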
When `PUT` is called to store a trained model, it is useful to return the newly created model config, but it is NOT useful to return the inflated definition.
These definitions can be large, and returning the inflated definition causes undue work on both the server and client side.
This adds the `PUT` API for creating trained models that support our format (a sketch of the call follows the list below).
This includes:
* HLRC change for the API
* API creation
* Validations of model format and call
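A rough sketch of the `PUT` call, under the same assumptions as above; the field names in the body (`model_type`, `input`, the task settings) are illustrative and may not match the branch exactly. The response is expected to echo the stored config without the inflated definition.

```python
import requests

ES = "http://localhost:9200"     # assumed local cluster
MODEL_ID = "my_fill_mask_model"  # hypothetical model id

# Create the trained model config; the binary model parts are stored separately.
body = {
    "model_type": "pytorch",                # assumed field name
    "inference_config": {"fill_mask": {}},  # default task settings
    "input": {"field_names": ["text_field"]},
}
resp = requests.put(f"{ES}/_ml/trained_models/{MODEL_ID}", json=body)

# The response returns the newly created config, not the inflated definition.
print(resp.json())
```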