[[doc-values]]
=== `doc_values`

Most fields are <<mapping-index,indexed>> by default, which makes them
searchable. The inverted index allows queries to look up the search term in a
unique sorted list of terms, and from that immediately have access to the list
of documents that contain the term.

Sorting, aggregations, and access to field values in scripts require a
different data access pattern. Instead of looking up the term and finding
documents, we need to be able to look up the document and find the terms that
it has in a field.

Doc values are the on-disk data structure, built at document index time, which
makes this data access pattern possible. They store the same values as the
`_source` but in a column-oriented fashion that is far more efficient for
sorting and aggregations. Doc values are supported on almost all field types,
with the __notable exception of `text` and `annotated_text` fields__.

<<number,Numeric types>>, <<date,date types>>, the <<boolean,boolean type>>,
<<ip,ip type>>, <<geo-point,geo_point type>> and the <<keyword,keyword type>>
can also be queried using term or range-based queries
when they are not <<mapping-index,indexed>> but only have doc values enabled.
Query performance on doc values is much slower than on index structures, but
offers an interesting tradeoff between disk usage and query performance for
fields that are only rarely queried and where query performance is not as
important.

All fields which support doc values have them enabled by default. If you are
sure that you don't need to sort or aggregate on a field, or access the field
value from a script, you can disable doc values in order to save disk space:

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": { <1>
        "type": "keyword"
      },
      "session_id": { <2>
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}
--------------------------------------------------

<1> The `status_code` field has `doc_values` enabled by default.
<2> The `session_id` field has `doc_values` disabled, but can still be queried.

NOTE: You cannot disable doc values for <<wildcard-field-type,`wildcard`>>
fields.
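As a minimal sketch of what the mapping above permits, assuming the
`my-index-000001` index from the example has been created and populated, the
following request runs a `terms` aggregation on `status_code`, which relies on
that field's doc values. The aggregation name `status_codes` is a hypothetical
label chosen for illustration. The same aggregation on `session_id` would be
expected to fail, because doc values are disabled for that field.

[source,console]
--------------------------------------------------
GET my-index-000001/_search
{
  "size": 0,
  "aggs": {
    "status_codes": {
      "terms": { "field": "status_code" }
    }
  }
}
--------------------------------------------------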
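The doc-values-only search described earlier can be sketched as follows. This
is an illustration under stated assumptions rather than part of the original
example: the index name `my-index-000002` and the `response_time` field are
hypothetical. The field sets `index` to `false` while leaving `doc_values` at
its default of `true`, so the `range` query is answered from doc values alone,
at the cost of slower execution than an indexed field would give.

[source,console]
--------------------------------------------------
PUT my-index-000002
{
  "mappings": {
    "properties": {
      "response_time": {
        "type": "long",
        "index": false
      }
    }
  }
}

GET my-index-000002/_search
{
  "query": {
    "range": {
      "response_time": { "gte": 500 }
    }
  }
}
--------------------------------------------------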