[role="xpack"]
[[query-dsl-pinned-query]]
=== Pinned Query
Promotes selected documents to rank higher than those matching a given query.
This feature is typically used to guide searchers to curated documents that are
promoted over and above any "organic" matches for a search.
The promoted or "pinned" documents are identified using the document IDs stored in
the <<mapping-id-field,`_id`>> field.

==== Example request

[source,console]
--------------------------------------------------
GET /_search
{
  "query": {
    "pinned": {
      "ids": [ "1", "4", "100" ],
      "organic": {
        "match": {
          "description": "iphone"
        }
      }
    }
  }
}
--------------------------------------------------
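The ordering this request asks for can be pictured with a small Python model. This is only a conceptual sketch, not how Elasticsearch implements the query: the function name `apply_pinned`, the in-memory `organic_hits` list and the `existing_ids` set are hypothetical, and the model assumes that pinned IDs that do not exist in the searched indices are simply skipped.

```python
def apply_pinned(pinned_ids, organic_hits, existing_ids):
    """Return document IDs in the order a pinned query is expected to rank them.

    pinned_ids:   IDs in the order they should appear (the `ids` parameter).
    organic_hits: (doc_id, score) pairs matched by the `organic` query.
    existing_ids: IDs present in the searched indices (an assumption of this model).
    """
    # Pinned documents come first, in the listed order, if they exist.
    pinned = [doc_id for doc_id in pinned_ids if doc_id in existing_ids]
    pinned_set = set(pinned)
    # Organic matches follow, best score first, without repeating pinned documents.
    organic = sorted(
        (hit for hit in organic_hits if hit[0] not in pinned_set),
        key=lambda hit: hit[1],
        reverse=True,
    )
    return pinned + [doc_id for doc_id, _ in organic]


# Hypothetical data: document "100" does not exist, "4" also matches organically.
ranked = apply_pinned(
    pinned_ids=["1", "4", "100"],
    organic_hits=[("7", 2.3), ("4", 1.9), ("9", 0.4)],
    existing_ids={"1", "4", "7", "9"},
)
print(ranked)  # ['1', '4', '7', '9']
```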

[[pinned-query-top-level-parameters]]
==== Top-level parameters for `pinned`

`ids`::
(Optional, array) <<mapping-id-field, Document IDs>> listed in the order they are to appear in results.
Required if `docs` is not specified.
`docs`::
(Optional, array) Documents listed in the order they are to appear in results.
Required if `ids` is not specified.
You can specify the following attributes for each document:
+
--
`_id`::
(Required, string) The unique <<mapping-id-field, document ID>>.

`_index`::
(Optional, string) The index that contains the document.
--
`organic`::
Any query used to rank the remaining documents, which appear below the "pinned" documents.

==== Pin documents in a specific index

If you're searching over multiple indices, you can pin a document within a specific index using `docs`:

[source,console]
--------------------------------------------------
GET /_search
{
  "query": {
    "pinned": {
      "docs": [
        {
          "_index": "my-index-000001", <1>
          "_id": "1"
        },
        {
          "_id": "4" <2>
        }
      ],
      "organic": {
        "match": {
          "description": "iphone"
        }
      }
    }
  }
}
--------------------------------------------------
<1> The document with id `1` from `my-index-000001` will be the first result.
<2> When `_index` is missing, all documents with id `4` from the queried indices will be pinned with the same score.
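
The matching rule described in the callouts can be sketched in Python as follows. This is a hedged, conceptual model rather than the actual implementation: the function `pin_docs` and the sample hits are hypothetical, every candidate document is assumed to carry an organic score, and because documents pinned by the same entry share a score, the order shown among them (input order) is an assumption of the sketch.

```python
def pin_docs(docs, hits):
    """Order hits the way the `docs` parameter is described to behave.

    docs: list of dicts with "_id" and an optional "_index" (the `docs` parameter).
    hits: list of dicts with "_index", "_id" and "_score" for candidate documents.
    """
    def matches(doc, hit):
        # Without `_index`, a pinned entry matches that ID in every queried index.
        return doc["_id"] == hit["_id"] and doc.get("_index") in (None, hit["_index"])

    ranked, seen = [], set()
    for doc in docs:  # pinned entries keep their listed order
        for hit in hits:
            key = (hit["_index"], hit["_id"])
            if key not in seen and matches(doc, hit):
                seen.add(key)
                ranked.append(key)
    # Remaining (organic) hits follow in descending score order.
    rest = sorted(
        (h for h in hits if (h["_index"], h["_id"]) not in seen),
        key=lambda h: h["_score"],
        reverse=True,
    )
    return ranked + [(h["_index"], h["_id"]) for h in rest]


# Hypothetical candidates spread over two indices.
hits = [
    {"_index": "my-index-000001", "_id": "1", "_score": 0.2},
    {"_index": "my-index-000001", "_id": "4", "_score": 0.9},
    {"_index": "my-index-000002", "_id": "1", "_score": 0.5},
    {"_index": "my-index-000002", "_id": "4", "_score": 0.1},
    {"_index": "my-index-000002", "_id": "9", "_score": 1.2},
]
docs = [{"_index": "my-index-000001", "_id": "1"}, {"_id": "4"}]
order = pin_docs(docs, hits)
for index_name, doc_id in order:
    print(index_name, doc_id)
```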