This commit builds on our suite of LTR feature extractors. In addition
to runtime fields and document fields, query interaction features are
now supported.
A user can store a query with the model configuration (for features
that don't need `_search`-time information) or supply one via the
`inference_config` object in the `inference` rescorer.
An example stored configuration:
```
{
  //<snip>
  "input": {"field_names": ["cost", "product"]},
  "inference_config": {
    "learn_to_rank": {
      "feature_extractors": [
        {
          "query_extractor": {
            "feature_name": "two",
            "query": {
              "script_score": {
                "query": {"match_all": {}},
                "script": {"source": "return 2.0;"}
              }
            }
          }
        }
      ]
    }
  }
  //</snip>
}
```
The above will provide the document/runtime fields `cost` and `product`
to the model at inference time, along with an extracted feature called
`"two"`.
However, the more general usage is supplying features that require
search-time information.
```
POST _search
{
  "query": {"match": {"field": {"query": "quick brown fox"}}},
  "rescore": {
    "window_size": 10,
    "inference": {
      "model_id": "ltr_model",
      "inference_config": {
        "learn_to_rank": {
          "feature_extractors": [
            {
              "query_extractor": {
                "feature_name": "field_bm25",
                "query": {"match": {"field": {"query": "quick brown fox"}}}
              }
            }
          ]
        }
      }
    }
  }
}
```
All queries are extracted as early as possible within the search
request so that the appropriate query rewrites can occur. Additionally,
the parsed queries from the rescorer are provided to the DFS phase, so
term statistics can be gathered without any additional configuration by
the user. This means that terms used only via a `feature_extractor` can
still have accurate term statistics when using DFS.
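Concretely, no extra setup is needed to get distributed term statistics for extractor-only terms; enabling DFS on the request is enough. A sketch (the model id, feature name, and query are illustrative, not part of this change):

```
POST _search?search_type=dfs_query_then_fetch
{
  "query": {"match_all": {}},
  "rescore": {
    "window_size": 10,
    "inference": {
      "model_id": "ltr_model",
      "inference_config": {
        "learn_to_rank": {
          "feature_extractors": [
            {
              "query_extractor": {
                "feature_name": "rare_term_bm25",
                "query": {"match": {"field": "some rare term"}}
              }
            }
          ]
        }
      }
    }
  }
}
```

Here `some rare term` appears only in the extractor's query, not in the main `query`, yet its term statistics are still gathered across shards during the DFS phase.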