Commit Graph

741 Commits

Author SHA1 Message Date
Christos Soulios 666f4acab2
Fix typo in fields doc (#64600) 2020-11-04 19:51:14 +02:00
Mayya Sharipova 0ffbcd3b3c
Disable using unsigned_long in scripts (#64523)
Relates to #64361
2020-11-03 14:20:46 -05:00
Christos Soulios 4dc833fa44
Add doc_count field mapper (#64503)
Bucket aggregations compute bucket doc_count values by incrementing the doc_count by 1 for every document collected in the bucket.

When using summary fields (such as aggregate_metric_double) one field may represent more than one document. To provide this functionality we have implemented a new field mapper (named doc_count field mapper). This field is a positive integer representing the number of documents aggregated in a single summary field.

Bucket aggregations will check if a field of type doc_count exists in a document and will take this value into consideration when computing doc counts.
2020-11-03 17:47:17 +02:00
James Rodewig 1ea83359bb
[DOCS] Fix case for 'Boolean' (#64299) 2020-10-29 09:04:43 -04:00
Andrew Kroh 24cae6d7f8
[DOCS] Sort field data types in docs (#64288)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2020-10-28 12:13:01 -04:00
Adrien Grand 62348b6a8a
Document standard metadata entries. (#61941)
We standardize on some metadata entries that we plan to later leverage
in Kibana in order to provide a better out-of-the-box experience, e.g.
different visualizations make sense on gauges and counters.
2020-10-12 09:49:39 +02:00
Mayya Sharipova c45724079c
Fix fields retrieval on unsinged_long field (#63119)
This fixes fields retrieval on unsigned_long field

1) For docvalue_fields a custom UnsignedLongLeafFieldData::getLeafValueFetcher
is implemented that correctly retrieves doc values.

2) For stored fields, an error was fixed in UnsignedLongFieldMapper
 how stored values were stored. Before they were incorrectly
stored in the shifted format, now they are stored as original
values in String format.

Relates to #60050
2020-10-06 05:44:50 -04:00
Alan Woodward 981258b02b
Remove TypeFieldMapper (#62838)
We don't need a special TypeFieldMapper for anything in particular; all access
to the type field can be done via a TypeFieldType that issues appropriate
deprecation warnings.

Relates to #41059
2020-09-30 15:47:29 +01:00
Mayya Sharipova ff55296f7a
Introduce 64-bit unsigned long field type (#60050)
This field type supports
- indexing of integer values from [0, 18446744073709551615]
- precise queries (term, range)
- precise sort and terms aggregations
- other aggregations are based on conversion of long values
  to double and can be imprecise for large values.

Closes #32434
2020-09-23 12:06:21 -04:00
Christoph Büscher ea2dbd93b4
Add field type for version strings (#59773)
This PR adds a new 'version' field type that allows indexing string values
representing software versions similar to the ones defined in the Semantic
Versioning definition (semver.org). The field behaves very similar to a
'keyword' field but allows efficient sorting and range queries that take into
accound the special ordering needed for version strings. For example, the main
version parts are sorted numerically (ie 2.0.0 < 11.0.0) whereas this wouldn't
be possible with 'keyword' fields today.

Valid version values are similar to the Semantic Versioning definition, with the
notable exception that in addition to the "main" version consiting of
major.minor.patch, we allow less or more than three numeric identifiers, i.e.
"1.2" or "1.4.6.123.12" are treated as valid too.

Relates to #48878
2020-09-21 11:04:22 +02:00
Christos Soulios b857768bb5
Histogram field type support for min/max aggregations (#62532)
Implement min/max aggregations for histogram fields.

Closes #60951
2020-09-19 23:34:43 +03:00
Wylie Conlon 4be761fde4
[DOCS] Update range field type docs (#62112) 2020-09-16 09:07:51 -04:00
James Rodewig 95fccbebbb [DOCS] Fix keyword xref 2020-09-02 11:46:40 -04:00
Julie Tibshirani ceb4c02ee8
Link to the keyword family page from the field types docs. (#61819)
We now link to the top-level keyword type family page instead of its individual
subsections. This better fits the page format, where each type name is a link.
2020-09-01 16:21:25 -07:00
James Rodewig f881a695e1
[DOCS] Add redirects for wildcard and constant keyword (#61815) 2020-09-01 15:32:35 -04:00
James Rodewig 5857c02b12
[DOCS] Combine keyword family docs (#61662) 2020-09-01 14:51:05 -04:00
James Rodewig 38b438dd86
[DOCS] Change 'data type' to 'field type' (#61633) 2020-08-27 09:44:35 -04:00
James Rodewig 49350ddae8
[DOCS] Reorg field data types page (#61117) 2020-08-26 14:01:34 -04:00
James Rodewig c688cb6bfd
[DOCS] Fix hyphenation for "time series" (#61472) 2020-08-24 10:34:41 -04:00
James Rodewig 751798f95f [DOCS] Fix indentation in wildcard type docs 2020-08-21 12:29:06 -04:00
jessepeixoto 9db974aed7
[DOCS] Fix query example for wildcard datatype (#61398) 2020-08-21 12:24:21 -04:00
Ryan Ernst fc9644dc5c
Add note about negative epoch times (#61379)
This commit adds a reminder to date type documentation that negative
epoch times are not supported.

relates #40983
2020-08-20 13:51:57 -07:00
James Rodewig 2a49ba3252
[DOCS] Document empty string boolean value as `false` (#61341) 2020-08-19 12:56:57 -04:00
James Rodewig a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
James Rodewig bb5a4b3f82
[DOCS] Cross-link `copy_to` and search speed docs (#60926) 2020-08-10 14:34:59 -04:00
Alan Woodward 0e3f7c2fb2
Cut over IPFieldMapper to parametrized form (#60602)
This commit makes IpFieldMapper extend ParametrizedFieldMapper. It also
updates the IpFieldMapper docs to add the ignore_malformed parameter,
which was not previously documented.
2020-08-10 10:57:06 +01:00
James Rodewig 6b9b8c5e31
[DOCS] Move script and stored fields content to search fields page (#60826)
Changes:

* Moves `Retrieve selected fields` to its own page and adds a title abbreviation.
* Adds existing script and stored fields content to `Retrieve selected fields`
* Adds a xref for `Retrieve selected fields` to `Search your data`
* Adds related redirects and updates existing xrefs
2020-08-06 12:45:03 -04:00
James Rodewig 29e957ecf8
[DOCS] Remove metrics sidebar in `_source` docs (#60777) 2020-08-05 15:57:29 -04:00
James Rodewig 56c778235c
[DOCS] Fix metadata field refs (#60764) 2020-08-05 13:21:00 -04:00
Martijn van Groningen de5cf82bba
Fix mistake in notes around dynamic template validation. (#60726)
The double bracket notation is incorrect.
2020-08-05 14:39:43 +02:00
James Rodewig ae01606785
[DOCS] Replace `twitter` dataset in docs (#60604) 2020-08-03 12:49:56 -04:00
Alexander Reelsen c7ac9e7073
[DOCS] http -> https, remove outdated plugin docs (#60380)
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an oracle JDK, both of which is no
longer valid.

While noticing that the instructions used cleartext HTTP to install
packages, this commit replaces HTTPs links instead of HTTP where possible.

In addition a few community links have been removed, as they do not seem
to exist anymore.
2020-07-31 15:58:38 -04:00
Julie Tibshirani 8a89d95372
Add search `fields` parameter to support high-level field retrieval. (#60100)
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.

Addresses #49028 and #55363.
2020-07-27 13:25:55 -07:00
James Rodewig 441c3a21b1
[DOCS] Update my-index examples (#60132)
Changes the following example index names to `my-index-000001` for consistency:

* `my-index`
* `my_index`
* `myindex`
2020-07-27 14:46:39 -04:00
James Rodewig 2774cd6938
[DOCS] Swap `[float]` for `[discrete]` (#60124)
Changes instances of `[float]` in our docs for `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 11:48:22 -04:00
James Rodewig 80b674fb25
[DOCS] Reformat snippets to use two-space indents (#59973) 2020-07-21 12:24:26 -04:00
Julie Tibshirani 6b21a4a87a
Add 'point' to the top-level field type docs. (#59731)
Before it was missing from the list. This PR also renames the 'geo data types'
section to 'spatial data types' and consolidates the geo and cartesian types
into that section.
2020-07-20 16:29:32 -07:00
James Rodewig aa3ddfeefb
[DOCS] Move highlighting docs to separate page (#59768)
Moves the highlighting docs from the deprecated 'Request Body Search'
chapter to the new subpage of the 'Run a search chapter' section.

No substantive changes were made to the content.
2020-07-17 10:15:20 -04:00
Christos Soulios 2976ba471a
Histogram integration on Histogram field type (#58930)
Implements histogram aggregation over histogram fields as requested in #53285.
2020-07-13 17:07:16 +03:00
James Rodewig cef242db20
[DOCS] Document custom routing support for data streams (#59323) 2020-07-09 16:35:44 -04:00
James Rodewig 2be9db01c8
[DOCS] Replace `datatype` with `data type` (#58972) 2020-07-07 13:52:10 -04:00
James Rodewig a00de7ec8e
[DOCS] Remove problematic terms (#58832) 2020-07-01 11:23:57 -04:00
Przemyslaw Gomulka ed43839a60
Update format.asciidoc to describe strict_date_optional_time_nanos (#57527)
closes #57019
2020-06-26 08:29:52 +02:00
James Rodewig 7826bbee87
[DOCS] Move search API's `docvalue_fields` examples (#57760)
Changes:

* Condenses and relocates the `docvalue_fields` example to the 'Run a search' 
   page.
* Adds docs for the `docvalue_fields` request body parameter.
* Updates several related xrefs.

Co-authored-by: debadair <debadair@elastic.co>
2020-06-11 10:57:15 -04:00
James Rodewig 51e3d5ab63
[DOCS] Fix source filtering xrefs (#57720) 2020-06-05 08:46:26 -04:00
James Rodewig 62e2778efb
[DOCS] Edit validation section of dynamic templates docs (#57510) (#57557)
Co-authored-by: Jess <13388033+liebeslied@users.noreply.github.com>
2020-06-02 16:43:32 -04:00
Tal Levy cfb36c3fcc
Update geo_point docs for geo_shape queries (#57487)
This commit highlights the ability for geo_point fields to be
used in geo_shape queries. It also adds an explicit geo_point
example in the geo_shape query documentation

Closes #56927.
2020-06-02 12:56:13 -07:00
Lisa Cawley 8b9293b3bf
[DOCS] Replace docdir attribute with es-repo-dir (#57489) 2020-06-01 15:55:05 -07:00
Adam Locke d77388f919
[DOCS] Add links to `flattened` datatype (#56794)
* Changes for #52239.

* Incorporating review feedback from Julie T. Also single-sourcing nexted options in the Mapping page and referencing them in the Nested page.

* Moving tip after the introduction and clarifying limits.

* Update docs/reference/mapping.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/mapping/types/nested.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-05-19 13:40:26 -04:00
James Rodewig 65cc09920f [DOCS] Remove outdated links for `similarity` mapping param args (#56925) 2020-05-19 11:20:19 -04:00
Julie Tibshirani bb04fbcd96
For constant_keyword, make sure exists query handles missing values. (#55757)
It's possible for a constant_keyword to have a 'null' value before any documents
are seen that contain a value for the field. In this case, no documents have a
value for the field, and 'exists' queries should return no documents.
2020-05-04 09:06:34 -07:00
Christos Soulios caf6c5ac19
Histogram field type support for ValueCount and Avg aggregations (#55933)
Implements value_count and avg aggregations over Histogram fields as discussed in #53285

- value_count returns the sum of all counts array of the histograms
- avg computes a weighted average of the values array of the histogram by multiplying each value with its associated element in the counts array
2020-05-04 10:24:35 +03:00
Christos Soulios cefc6af25b
Histogram field type support for Sum aggregation (#55681)
Implements Sum aggregation over Histogram fields by summing the value of each bucket multiplied by their count as requested in #53285
2020-04-29 11:09:25 +03:00
Ben Skelker a085183d6e
[DOCS] Add `ip_range` datatype to core datatypes range list (#55446) 2020-04-20 08:54:19 -04:00
markharwood 35188e0091
Remove normalizer support from wildcard field while we decide on approach for handling case insensitvity (#55294)
Closes #55288
2020-04-17 10:47:33 +01:00
Ignacio Vera 6182db5b77
Add new point field. (#53804)
This commit adds a new point field that is able to index arbitrary pair of values (x/y)
 in the cartesian space. It only supports filtering using shape queries at the moment.
2020-04-07 13:08:02 +02:00
markharwood d83798f237
Add pre-configured “lowercase” normalizer (#53882)
Add pre-configured “lowercase” normalizer
Includes tests that user-defined "lowercase" normalizer overrides the default one.

Closes #53872
2020-04-03 10:12:06 +01:00
bellengao 5865a899d8
[DOCS] Correct field name in date_nanos doc (#54556) 2020-04-01 14:22:04 +02:00
markharwood 2537e02a7d
Wildcard field - add normalizer support (#53851)
* Add support for normalisation to wildcard field

* Tidied imports

* Added docs about params

* Fix outdated error message

* Avoid normaliser butchering wildcard query special characters

* Fix broken test expectations

* Fix wrong toString method

* Address review comments - common method for normalising wildcard patterns and checkCompatibility

* Remove unused import
2020-03-24 09:35:29 +00:00
Julie Tibshirani 7161bd44cb
Remove the top-level 'mapping type' section. (#53374)
It seemed confusing for users that our top-level mapping page still had a
prominent section named 'Mapping Type'. This PR reworks the docs to remove this
reference and adds a note about types removal (similar to the note we added to
other APIs like put mapping).
2020-03-23 12:26:22 -07:00
Lisa Cawley 0ea4324e22
[DOCS] Adds data nanos transform limitation (#53826) 2020-03-23 09:48:00 -07:00
C.J. Jameson 34feb3cde1
[DOCS] Clarify routing enforcement in docs (#53945)
Removes a mention of the `_doc` mapping type that's
no longer applicable now that mapping types are
removed/deprecated.
2020-03-23 12:34:49 -04:00
markharwood 024deb41a4
Wildcard field docs formatting fix
Bullet points weren't rendering correctly
2020-03-18 15:38:00 +00:00
Alan Woodward 534a4a9b32
Remove [removal-of-types] docs page, and point to 7x docs (#53670)
Given that types have been removed entirely in 8.0, we don't need a detailed page
explaining what they are and why they are going away. This commit replaces the
page with a short paragraph saying that types are no longer supported, and a link
to the [removal-of-types] page in 7x

Relates to #41059
2020-03-18 14:13:00 +00:00
James Rodewig f2a2dcbb8a
[DOCS] Streamline `analyzer` mapping parm def (#51874)
Simplifies the `analyzer` mapping parameter definition to remove
duplicated analysis content and examples.
2020-03-18 09:42:25 -04:00
Alan Woodward 795a92707f
Remove deprecation warning when doc scripts refer to '_type' field (#53605)
We currently emit a warning in 8x when a script refers to the _type field for
a document. However, in 8x this field no longer exists, so the deprecation
warning is not required.

Relates to #41059
2020-03-18 11:50:36 +00:00
markharwood a2a4756736
New wildcard field optimised for wildcard queries (#49993)
Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values.  Also supports aggregations and sorting.
2020-03-16 09:54:10 +00:00
James Rodewig febe7af62e
[DOCS] Clarify `max_shingle_size` parm def (#53480)
Rewrites the `search_as_you_type` field datatype's `max_shingle_size`
mapping parameter to improve clarity and better communicate trade-offs
regarding index size.

Relates to [elastic/kibana#55161][0].

Closes #51774.

[0]: https://github.com/elastic/kibana/pull/55161#discussion_r368107177
2020-03-13 04:07:40 -04:00
Ignacio Vera dba2a6e199
remove sneaked placeholder from histogram docs (#53391) 2020-03-11 13:47:01 +01:00
James Rodewig 9f641dc07d
[DOCS] Fix several Asciidoctor double arrow replacements (#52827)
Per the [Asciidoctor docs][0], Asciidoctor replaces the following
syntax with double arrows in the rendered HTML:

* => renders as ⇒
* <= renders as ⇐

This escapes several unintended replacements, such as in the Painless
docs.

Where appropriate, it also replaces some double arrow instances with
single arrows for consistency.

[0]: https://asciidoctor.org/docs/user-manual/#replacements
2020-03-04 08:42:37 -05:00
Adrien Grand 0b2a62822c
Introduce a `constant_keyword` field. (#49713)
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.

The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?

For this field there is a choice between
 1. accepting values in `_source` when they are equal to the value configured
    in mappings, but rejecting mapping updates
 2. rejecting values in `_source` but then allowing updates to the value that
    is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.
2020-03-02 16:46:46 +01:00
James Rodewig 340c08a6a8
[DOCS] Correct guidance for `index_options` mapping parm (#52899)
Adds a warning admonition stating that the `index_options` mapping
parameter is intended only for `text` fields.

Removes an outdated statement regarding default values for numeric
and other datatypes.
2020-03-02 07:37:48 -05:00
Martijn van Groningen 31b29875c9
Add validation for dynamic templates (#51233)
Tries to load a `Mapper` instance for the mapping snippet of a dynamic template.
This should catch things like using an analyzer that is undefined or mapping attributes that are unused.

This is best effort:
* If `{{name}}` placeholder is used in the mapping snippet then validation is skipped.
* If `match_mapping_type` is not specified then validation is performed for all mapping types.
  If parsing succeeds with a single mapping type then this the dynamic mapping is considered valid.

If is detected that a dynamic template mapping snippet is invalid at mapping update time then the mapping update is failed for indices created on 8.0.0-alpha1 and later. For indices created on prior version a deprecation warning is omitted instead. In 7.x clusters the mapping update will never fail in case of an invalid dynamic template mapping snippet and a deprecation warning will always be omitted.

Closes #17411
Closes #24419

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2020-02-27 11:52:27 +01:00
James Rodewig 12ed6f12e9
[DOCS] Document `include_in_*` nested mapping parms (#52648)
Adds documentation for the `include_in_parent` and `include_in_root`
mapping parameters for the `nested` mapping datatype.
2020-02-25 07:12:34 -05:00
Ignacio Vera 3b004b1936
Add support for multipoint shape queries (#52564) 2020-02-24 12:42:17 +01:00
Ignacio Vera e2b410e15e
Add support for multipoint geoshape queries (#52133)
Currently multi-point queries are not supported when indexing your data using BKD-backed geoshape strategy. This commit removes this limitation.
2020-02-20 08:53:01 +01:00
Valentin Crettaz 3c21eca5e9 [DOCS] Clarify that "now" cannot be used in `date_range` at index time (#52446)
`date_range` fields do not accept `"now"` as a value of either bounds at indexing time.

This corrects an error in the range data type mapping docs.
2020-02-19 12:55:06 -05:00
James Rodewig 27f45aeb9d
[DOCS] Remove 'analyzed string' references (#51946)
The `string` field datatype was replaced by the `text` and `keyword`
field datatypes in [5.0][0].

This removes several outdated references to 'analyzed string' fields.

[0]:https://www.elastic.co/guide/en/elasticsearch/reference/5.0/breaking_50_mapping_changes.html#_string_fields_replaced_by_textkeyword_fields
2020-02-14 12:33:21 -05:00
Igor Motov 0898df4aac
Add histogram field type support to boxplot aggs (#52265)
Add support for the histogram field type to boxplot aggs.

Closes #52233
Relates to #33112
2020-02-13 08:59:44 -05:00
Ryan Ernst f7c989731e
Fix incorrect date nanos docs example (#52249)
The example of how to access the nano value of a date_nanos field has
been broken since it was created. This commit fixes it to use the
correct scripting methods.

closes #51931
2020-02-12 15:54:39 -08:00
Marios Trivyzas a8b39ed842
Add a cluster setting to disallow expensive queries (#51385)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have 
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
2020-02-12 18:06:04 +01:00
Mayya Sharipova 620996287a
Remove docs related to index time boosting (#51704)
As there is no really index time boosting,
as boost is only applied during query time,
this removes mentions of index time boosting.
2020-01-31 09:01:52 -05:00
Yannick Welsch a57a9a31c3 Stricter checks of setup and teardown in docs tests (#51430)
Made checks stricter after backporting PR.
2020-01-28 17:53:57 +01:00
junmuz 06cec760da [DOCS] Correct typo in `ignore_malformed` mapping parm docs (#50780) 2020-01-13 09:49:10 -05:00
James Rodewig d3cd4fbd62 [DOCS] Fix typo in mapping date format docs 2020-01-08 07:55:21 -06:00
arkel-s f20367e405 [DOCS] Add example format for `date_optional_time` (#50458)
Adds an example format for `date_optional_time` to the `format` mapping
parameter docs.

Closes #50457
2020-01-07 10:07:29 -06:00
James Rodewig e8a6d4a3fb
[DOCS] Remove unneeded redirects (#50476)
The docs/reference/redirects.asciidoc file stores a list of relocated or
deleted pages for the Elasticsearch Reference documentation.

This prunes several older redirects that are no longer needed and
don't require work to fix broken links in other repositories.
2019-12-26 07:49:41 -05:00
Xiang Dai 432bd0e92c Fix docs typos (#50365)
Fixes a few typos in the docs.

Signed-off-by: Xiang Dai 764524258@qq.com
2019-12-23 10:35:14 -05:00
Adrien Grand 2d627ba757
Add per-field metadata. (#49419)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2019-12-18 17:27:38 +01:00
James Rodewig 230d4765d3
[DOCS] Add identifier mapping tip to numeric and keyword datatype docs (#49933)
Users often mistakenly map numeric IDs to numeric datatypes. However,
this is often slow for the `term` and other term-level queries.

The "Tune for search speed" docs includes advice for mapping numeric
IDs to `keyword` fields. However, this tip is not included in the
`numeric` or `keyword` field datatype doc pages.

This rewords the tip in the "Tune for search speed" docs, relocates it
to the `numeric` field docs, and reuses it using tagged regions.
2019-12-17 09:31:07 -05:00
Ignacio Vera 7c559be31c
"CONTAINS" support for BKD-backed geo_shape and shape fields (#50141)
Lucene 8.4 added support for "CONTAINS", therefore in this commit those
changes are integrated in Elasticsearch. This commit contains as well a
bug fix when querying with a geometry collection with "DISJOINT" relation.
2019-12-16 07:43:42 +01:00
Adrien Grand 1329acc094
Upgrade to lucene 8.4.0-snapshot-662c455. (#50016)
Lucene 8.4 is about to be released so we should check it doesn't cause problems
with Elasticsearch.
2019-12-10 17:09:36 +01:00
Ignacio Vera eade4f03f4
New Histogram field mapper that supports percentiles aggregations. (#48580)
This commit adds  a new histogram field mapper that consists in a pre-aggregated format of numerical data to be used in percentiles aggregations.
2019-11-28 13:58:20 +01:00
Jim Ferenczi c2deb287f1
Add a cluster setting to disallow loading fielddata on _id field (#49166)
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.

Closes #43599
2019-11-27 13:38:09 +01:00
Julie Tibshirani 548fcd09fb
Stop ignoring types warnings in REST tests. (#49333)
In 7.x we added logic to the REST test harness to ignore warnings related to
types removal. This allowed us to continue to run mixed-cluster tests that
included 6.x nodes.

Now that master is on 8.x, we've no longer need to include 6.x nodes in testing
and have removed almost all typed calls. The logic to ignore warnings can
therefore be removed.
2019-11-20 08:23:56 -08:00
Mayya Sharipova 19b6b3693a
Increase the number of vector dims to 2048 (#46895) 2019-11-20 07:43:14 -05:00
Julie Tibshirani dd6f0a35e4
Remove the 'experimental' marking from vector fields. (#49120)
We wrapped up the API changes we wanted to make, and vector fields can now be
considered GA.
2019-11-18 11:57:18 -08:00
Antoine Garcia 529ebea2b9 [Docs] Specify field types not supporting doc values (#49041)
The `string` type (with option `analyzed`) has been replaced by `text` after `6.0`, 
also the `annonated_text` field do not support doc values and should be mentioned.
2019-11-18 16:37:51 +01:00
Julie Tibshirani 460d545921
Remove support for sparse vectors. (#48781)
Follow up to #48368. This PR removes support for `sparse_vector` on new
indices. On 7.x indices a `sparse_vector` can still be defined, but it is not
possible to index or search on it.
2019-11-14 16:54:48 -05:00
Julie Tibshirani 7d500bacee
Ensure parameters are updated when merging flattened mappings. (#48971)
This PR makes the following two fixes around updating flattened fields:

* Make sure that the new value for ignore_above is immediately taken into
  affect. Previously we recorded the new value but did not use it when parsing
  documents.
* Allow depth_limit to be updated dynamically. It seems plausible that a user
  might want to tweak this setting as they encounter more data.
2019-11-12 13:44:53 -08:00
Alan Woodward 37a997a9a9
Remove `include_type_name` parameter from REST layer (#48632)
This commit removes support for the include_type_name parameter from all
REST actions that receive or return mapping configurations.

Relates to #41059
2019-11-01 09:30:47 +00:00
lgypro 95c7e6ebff [Docs] Fix syntax error leading to wrong doc ID (#48554)
In order to index a document with id 2, the "&" should be replaced by "?"
2019-10-29 10:26:33 +01:00
Julie Tibshirani 7e6199e7f2
Correct outdated information in _index docs. (#48436)
This PR makes the following updates:
* Update the supported query types to include `prefix` and `wildcard`.
* Specify that queries accept index aliases.
* Clarify that when querying on a remote index name, the separator `:` must be
  present.
2019-10-24 11:00:02 -07:00
Julie Tibshirani 5478fff640
Deprecate the sparse_vector field type. (#48315)
We have not seen much adoption of this experimental field type, and don't see a
clear use case as it's currently designed. This PR deprecates the field type in
7.x. It will be removed from 8.0 in a follow-up PR.
2019-10-22 18:06:50 -07:00
Christoph Büscher c60139a345
Clarify mapping types that support ignore_malformed (#48206)
The `ignore_malformed` setting only works on selected mapping types, otherwise
we throw an mapper_parsing_exception. We should add a list of all the mapping
types that support it, since the number of types not supporting it seems larger.

Closes #47166
2019-10-18 20:39:07 +02:00
Alan Woodward 566e1b7d33
Remove type field from DocWriteRequest and associated Response objects (#47671)
This commit removes the type field from index, update and delete requests, and their
associated responses.

Relates to #41059
2019-10-11 10:23:55 +01:00
Julie Tibshirani f3420afc8e
Mention ip fields in the global ordinals docs. (#47045)
Although they do not support eager_global_ordinals, ip fields use global
ordinals for certain aggregations like 'terms'.

This commit also corrects a reference to the sampler aggregation.
2019-09-24 12:38:23 -07:00
Alan Woodward c1f99e2d75
Remove `_type` from SearchHit (#46942)
This commit removes the `_type` field from all search hit responses.

Relates to #41059
2019-09-23 19:14:54 +01:00
Christoph Büscher 2351aa3efb
Disallow `_field_names` enabled setting (#46681)
After deprecating the `enabled` setting for `_field_names`starting with 7.5, 
this change disallows the setting on new indices in 8.0. The setting continues
to work for indices created in 7.x where we continue emitting a deprecation warning.

Relates to #42854
2019-09-23 14:13:44 +02:00
Alan Woodward 7c90801aff
Remove types from Get/MultiGet (#46587)
This commit removes types from the ShardGetService, and propagates this API change
up through the Transport and Rest actions for Get and MultiGet

Relates to #41059
2019-09-20 14:22:57 +01:00
James Rodewig e355759086
[DOCS] Replace "// CONSOLE" comments with [source,console] (#46679) 2019-09-13 11:23:53 -04:00
Christoph Büscher d0a7bbcb69
Deprecate `_field_names` disabling (#42854)
Currently we allow `_field_names` fields to be disabled explicitely, but since
the overhead is negligible now we decided to keep it turned on by default and
deprecate the `enable` option on the field type. This change adds a deprecation
warning whenever this setting is used, going forward we want to ignore and finally
remove it.

Closes #27239
2019-09-11 14:55:30 +02:00
Julie Tibshirani cb321575d0
Expand documentation around global ordinals. (#46517)
This commit updates the eager_global_ordinals documentation to give more
background on what global ordinals are and when they are used. The docs also now
mention that global ordinal loading may be expensive, and describes the cases
where in which loading them can be avoided.
2019-09-10 11:01:45 -07:00
Julie Tibshirani c8fca03e2d
Use a literal block in the field data docs. (#46469)
Currently we use `quote`, which renders a bit strangely on the website.
2019-09-09 11:10:54 -07:00
James Rodewig e43be90e6c
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) 2019-09-06 14:05:36 -04:00
Anton 8167000951 [Docs] Fix typo in field-names-field.asciidoc (#46430) 2019-09-06 18:04:57 +02:00
James Rodewig 97802d8aff
[DOCS] Change // CONSOLE comments to [source,console] (#46441) 2019-09-06 10:55:16 -04:00
James Rodewig 466c59a4a7
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295) 2019-09-05 16:47:18 -04:00
Julie Tibshirani 126f87b36b
First round of optimizations for vector functions. (#46294)
This PR merges the `vectors-optimize-brute-force` feature branch, which makes
the following changes to how vector functions are computed:
* Precompute the L2 norm of each vector at indexing time. (#45390)
* Switch to ByteBuffer for vector encoding. (#45936)
* Decode vectors and while computing the vector function. (#46103) 
* Use an array instead of a List for the query vector. (#46155)
* Precompute the normalized query vector when using cosine similarity. (#46190)

Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
2019-09-03 16:30:12 -07:00
Jim Ferenczi 55d4581ee4
Remove classic similarity (#46078)
This commit removes the `classic` similarity from code and docs in master (8.0). The `classic` similarity cannot be used on indices created after 7.0.

Closes #46058
2019-08-29 22:26:35 +02:00
Nick Knize c89b66afcb
[SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108)
This commit adds a new ShapeQueryBuilder to the xpack spatial module for
querying arbitrary Cartesian geometries indexed using the new shape field
type.

The query builder extends AbstractGeometryQueryBuilder and leverages the
ShapeQueryProcessor added in the previous field mapper commit.

Tests are provided in ShapeQueryTests in the same manner as
GeoShapeQueryTests and docs are updated to explain how the query works.
2019-08-08 15:28:35 -05:00
Julie Tibshirani 5fc788012d
Correct a code snippet in removal_of_types. (#45118)
Previously, the reindex examples did not include `_doc` as the destination type.
This would result in the reindex failing with the error "Rejecting mapping
update to [users] as the final mapping would have more than 1 type: [_doc,
user]".

Relates to #43100.
2019-08-06 13:54:53 -07:00
James Rodewig 8b152d6d79
Rename "indices APIs" to "index APIs" (#44863) 2019-08-02 14:09:46 -04:00
Nick Knize b07310022d
[SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980)
This commit adds a new ShapeFieldMapper to the xpack spatial module for
indexing arbitrary cartesian geometries using a new field type called shape.
The indexing approach leverages lucene's new XYShape field type which is
backed by BKD in the same manner as LatLonShape but without the WGS84
latitude longitude restrictions. The new field mapper builds on and
extends the refactoring effort in AbstractGeometryFieldMapper and accepts
shapes in either GeoJSON or WKT format (both of which support non geospatial geometries).

Tests are provided in the ShapeFieldMapperTest class in the same manner
as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests.
Documentation for how to use the new field type and what parameters are
accepted is included. The QueryBuilder for searching indexed shapes is provided in a separate commit.
2019-07-30 21:52:59 -05:00
David Turner 534d2e502b
Expand docs on force-merge and global ordinals (#44684)
Some small clarifications about force-merging and global ordinals, particularly
that global ordinals are cheap on a single-segment index and how this relates
to frozen indices.

Fixes #41687
2019-07-23 07:32:44 +01:00
James Rodewig c2aabb5398
[DOCS] Make field datatype titles consistent (#43933)
* [DOCS] Make field datatype titles consistent

* Add titleabbrev for array
2019-07-22 08:51:34 -04:00
James Rodewig ea1adb61c2
[DOCS] Update anchors and links for Elasticsearch API relocation (#44500) 2019-07-19 09:16:35 -04:00
Mayya Sharipova 159345c493
Add positions info into term_vector doc (#44379) 2019-07-16 16:22:11 -04:00
Mark Walkom 72f7b02320 [DOCS] Update id-field.asciidoc (#42482)
Adding a note around the size limit for `_id`
2019-07-16 14:57:28 +02:00
Nikita Glashenko a85199286d Support WKT point conversion to geo_point type (#44107)
This PR adds support for parsing geo_point values from WKT POINT format.
Also, a few minor bugs in geo_point parsing were fixed.

Closes #41821
2019-07-12 11:44:59 -04:00
James Rodewig f339df59e0
[DOCS] Clarify array is not a field datatype (#43931) 2019-07-08 08:56:51 -04:00
Mayya Sharipova 66e1e5643f
Add dims parameter to dense_vector mapping (#43444)
Typically, dense vectors of both documents and queries must have the same
number of dimensions. Different number of dimensions among documents
or query vector indicate an error. This PR enforces that all vectors
for the same field have the same number of dimensions. It also enforces
that query vectors have the same number of dimensions.
2019-07-02 16:21:10 -04:00
Alexander Reelsen d52972e9e2
Update docs to refer to 6.8 instead of 6.7 (#43685)
A few places in the documentation had mentioned 6.7 as the version to
upgrade from, when doing an upgrade to 7.0. While this is technically
possible, this commit will replace all those mentions to 6.8, as this is
the latest version with the latest bugfixes, deprecation checks and
ugprade assistant features - which should be the one used for upgrades.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-07-02 09:06:14 +02:00
Julie Tibshirani f3317eb82d
Add support for 'flattened object' fields. (#42541)
This commit merges the `object-fields` feature branch. The new 'flattened
object' field type allows an entire JSON object to be indexed into a field, and
provides limited search functionality over the field's contents.
2019-06-28 15:33:24 +03:00
Henning Andersen b92de2845b
Enabled cannot be updated (#43701)
Removed the invalid tip that enabled can be updated for existing fields
and clarified instead that it cannot.

Related to #33566 and #33933
2019-06-28 12:58:22 +02:00
Julie Tibshirani 98ed5e985f
Make the ignore_above docs tests more robust. (#43349)
It is possible for internal ML indices like `.data-frame-notifications-1` to leak,
causing other docs tests to fail when they accidentally search over these
indices. This PR updates the ignore_above tests to only search a specific index.
2019-06-27 08:27:01 +03:00
James Rodewig 97f70c5e27
[DOCS] Rewrite term-level queries overview (#43337) 2019-06-21 11:53:01 -04:00
Igor Motov 77639213bb
Docs: Add description of the coerce parameter in geo_shape mapper (#43340)
Explains the effect of the coerce parameter on the geo_shape field.

Relates #35059
2019-06-20 05:15:59 -07:00
Andrei Stefan c5190106cb
Remove mentions of "fields with the same name in the same index" (#43077)
Together with types removal, any mention of "fields with the same name in the same index" doesn't make sense anymore.
2019-06-20 10:15:43 +03:00
Mayya Sharipova 952ddf247a
Move dense_vector and sparse_vector to module (#43280) 2019-06-18 08:15:46 -04:00
Julie Tibshirani 2eca4a614b
Clarify that inner_hits must be used to access nested fields. (#42724)
This PR updates the docs for `docvalue_fields` and `stored_fields` to clarify
that nested fields must be accessed through `inner_hits`. It also tweaks the
nested fields documentation to make this point more visible.

Addresses #23766.
2019-05-31 08:53:59 -07:00
Julie Tibshirani 2bc436309d
Clarify the settings around limiting nested mappings. (#42686)
* Previously, we mentioned multiple times that each nested object was indexed as its own document. This is repetitive, and is also a bit confusing in the context of `index.mapping.nested_fields.limit`, as that applies to the number of distinct `nested` types in the mappings, not the number of nested objects. We now just describe the issue once at the beginning of the section, to illustrate why `nested` types can be expensive.
* Reference the ongoing example to clarify the meaning of the two settings.

Addresses #28363.
2019-05-30 09:23:38 -07:00
Julie Tibshirani a1e78585d1 Fix a callout in the field alias docs. 2019-05-28 17:40:07 -07:00
James Rodewig 79a3de4152
[DOCS] Set explicit anchors for Asciidoctor (#42521) 2019-05-28 14:20:42 -04:00
Julie Tibshirani 148df31639
Fix a rendering issue in the geo envelope docs. (#42332)
Previously the formatting information didn't display in the docs, and the
sentence just rendered as "bounding rectangle in the format :".
2019-05-22 09:19:14 -07:00
Julie Tibshirani d75d61499c
Clarify that path_match also considers object fields. (#41658)
The `path_match` and `path_unmatch` parameters in dynamic templates match on
object fields in addition to leaf fields. This is not obvious and can cause
surprising errors when a template is meant for a leaf field, but there are
object fields that match. This PR adds a note to the docs to describe the
current behavior.
2019-05-06 11:52:22 -07:00
Julie Tibshirani 8449eff026
Clarify _doc is a permanent part of certain document APIs. (#41727)
We received some feedback that it is not completely clear why `_doc` is present
in the typeless document APIs:

> The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST
{index}/_doc for auto-generated ids."_ Isn't this contradicting? Specifying
*types in requests is deprecated*, but we are supposed to still mention *_doc*
in write requests?

This PR updates the 'removal of types' documentation to try to clarify that
`_doc` now represents the endpoint name, as opposed to a type.
2019-05-06 10:38:54 -07:00
James Rodewig 8e3ff6793f
[DOCS] Escape commas in deprecated[] for Asciidoctor migration (#41598) 2019-04-30 15:50:52 -04:00
James Rodewig adf67053f4
[DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:19:09 -04:00
Christoph Büscher 34ae1f9c7d
[Docs] Fix common word repetitions (#39703) 2019-04-25 20:47:03 +02:00
James Rodewig 5a44cab87f
[DOCS] Document limits for JSON objects with `ignore_malformed` mapping setting (#40976) 2019-04-10 14:46:51 -04:00
Adrien Grand 89d6580885
Update headline of the "removal of types" doc page to match changes in 7.0. (#40868)
Currently it describes what broke in 6.0.
2019-04-10 11:33:49 +02:00
Julie Tibshirani 75fe3d0c84
Some clarifications in the 'enabled' documentation. (#40989)
This PR makes a few clarifications to the docs for the `enabled` setting:
- Replace references to 'mapping type' with 'mapping' or 'mapping definition'.
- In code examples, clarify that the disabled fields have type `object`.
- Add a section on how disabled fields can hold non-object data.
2019-04-09 10:49:01 -07:00
Alexander Reelsen 6562c62bc0
Fix dense/sparse vector limit documentation (#40852)
The documentation stated a wrong limit of dense/sparse vector sizes.
This was changed in #40597 but the documentation was not fixed.
2019-04-05 09:35:10 +02:00
Jim Ferenczi ffc0b9bb17
remove experimental label from search_as_you_type documentation (#40744) 2019-04-03 09:41:50 +02:00
Andy Bristol 6bba9fc83b
search as you type fieldmapper (#35600)
Adds the search_as_you_type field type that acts like a text field optimized
for as-you-type search completion. It creates a couple subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyze terms as the largest shingle size used and
edge-ngrams, against which partial terms are queried

Adds a match_bool_prefix query type that creates a boolean clause of a term
query for each term except the last, for which a boolean clause with a prefix
query is created.

The match_bool_prefix query is the recommended way of querying a search as you
type field, which will boil down to term queries for each shingle of the input
text on the appropriate shingle field, and the final (possibly partial) term
as a term query on the prefix field. This field type also supports phrase and
phrase prefix queries however
2019-03-27 10:03:30 -07:00
Julie Tibshirani 91181e6779
Document the limitation around field aliases and percolator. (#40073)
Currently if a field alias is updated, any percolator queries that contain the
alias will still refer to its old target. This PR documents the issue while we
look into addressing it.

Relates to #37212.
2019-03-15 10:34:58 -07:00
Mayya Sharipova 3260fd1fc8
Distance measures for dense and sparse vectors (#37947)
* Distance measures for dense and sparse vectors

Introduce painless functions of
cosineSimilarity and dotProduct distance
measures for dense and sparse vector fields.

```js
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilarity(params.queryVector, doc['my_dense_vector'].value)",
        "params": {
          "queryVector": [4, 3.4, -1.2]
        }
      }
    }
  }
}
```

```js
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilaritySparse(params.queryVector, doc['my_sparse_vector'].value)",
        "params": {
          "queryVector": {"2": -0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
        }
      }
    }
  }
}
```

Closes #31615
2019-02-20 07:01:17 -05:00
Alexander Reelsen 5f7168ea74
Remove joda time mentions in documentation (#38720)
This is the forward port of #38720 (not containing the 7.0 migration docs)
2019-02-14 10:18:48 +01:00
Julie Tibshirani cb88e7ed03
Update the removal of types docs with the new 6.7 behavior. (#38869)
Follow-up to #38825, where we made a tweak to the deprecation behavior.
2019-02-13 14:43:35 -08:00
Alexander Reelsen 87f3579125
Add nanosecond field mapper (#37755)
This adds a dedicated field mapper that supports nanosecond resolution -
at the price of a reduced date range.

When using the date field mapper, the time is stored as milliseconds since the epoch
in a long in lucene. This field mapper stores the time in nanoseconds
since the epoch - which means its range is much smaller, ranging roughly from
1970 to 2262.

Note that aggregations will still be in milliseconds.
However docvalue fields will have full nanosecond resolution

Relates #27330
2019-02-04 11:31:16 +01:00
Nick Knize 603cdf40f1
Update geo_shape docs to include unsupported features (#38138)
There are a two major features that are not yet supported by BKD Backed geo_shape: MultiPoint queries, and CONTAINS relation. It is important we are explicitly clear in the documentation that using the new approach may not work for users that depend on these features. This commit adds an IMPORTANT NOTE section to geo_shape docs that explicitly highlights these missing features and what should be done if they are an absolute necessity.
2019-02-01 10:41:41 -06:00
Julie Tibshirani 91b79ebed4
Update 'removal of types' docs to reflect the new plan. (#38003) 2019-01-31 10:26:24 -08:00
Adrien Grand 3332e332c7
Fix typo in docs. (#38018)
This has been introduced in #37871.
2019-01-31 09:51:03 +01:00
Adrien Grand b63b50b945
Give precedence to index creation when mixing typed templates with typeless index creation and vice-versa. (#37871)
Currently if you mix typed templates and typeless index creation or typeless
templates and typed index creation then you will end up with an error because
Elasticsearch tries to create an index that has multiple types: `_doc` and
the explicit type name that you used.

This commit proposes to give precedence to the index creation call so that
the type from the template will be ignored if the index creation call is
typeless while the template is typed, and the type from the index creation
call will be used if there is a typeless template.

This is consistent with the fact that index creation already "wins" if a field
is defined differently in the index creation call and in a template: the
definition from the index creation call is used in such cases.

Closes #37773
2019-01-30 10:28:24 +01:00
Mayya Sharipova a30ce6a00a
Rename feature, feature_vector and feature_query (#37794)
Ranaming as follows:
feature -> rank_feature
feature_vector -> rank_features
feature query -> rank_feature query

Ranaming is done to distinguish from other vector types.

Closes #36723
2019-01-24 19:18:48 -05:00
Christoph Büscher b3f9becf5f
Modify removal_of_types.asciidoc (#37648)
After switching the default behaviour of "include_type_name" to "false" in 7.0,
some parts of the types removal documentation can be adapted as well.
2019-01-23 09:48:00 +01:00
Christoph Büscher 34f2d2ec91
Remove remaining occurances of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Christoph Büscher 25aac4f77f
Remove `include_type_name` in asciidoc where possible (#37568)
The "include_type_name" parameter was temporarily introduced in #37285 to facilitate
moving the default parameter setting to "false" in many places in the documentation
code snippets. Most of the places can simply be reverted without causing errors.
In this change I looked for asciidoc files that contained the
"include_type_name=true" addition when creating new indices but didn't look
likey they made use of the "_doc" type for mappings. This is mostly the case
e.g. in the analysis docs where index creating often only contains settings. I
manually corrected the use of types in some places where the docs still used an
explicit type name and not the dummy "_doc" type.
2019-01-18 09:34:11 +01:00
Julie Tibshirani 36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Peter Dyson 96cfa000a5
[DOCS] copy_to only works one level deep, not recursively (#37249) 2019-01-13 16:24:34 +10:00
Josh Soref edb48321ba [DOCS] Various spelling corrections (#37046) 2019-01-07 14:44:12 +01:00
Nick Knize ec0dc2c0e9
[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#36751)
* [Geo] Expose BKDBackedGeoShapes as new VECTOR strategy

This commit exposes lucene's LatLonShape field as a new
strategy in GeoShapeFieldMapper. To use the new indexing
approach, strategy should be set to "vector" in the
geo_shape field mapper. If the tree parameter is set
the mapper will throw an IAE. Note the following:

When using vector strategy:

* geo_shape query does not support querying by POINT,
MULTIPOINT, or GEOMETRYCOLLECTION.
* LINESTRING and MULTILINESTRING queries do not support
WITHIN relation.
* CONTAINS relation is not supported.
* The tree, precision, tree_levels, distance_error_pct,
and points_only parameters will not throw an exception
but they have no effect and will be marked as
deprecated..

All other features are supported.

* revert change to PercolatorFieldMapper

* fix ExistsQuery for geo_shape vector strategy

* add deprecation logging for tree, precision, tree_levels, distance_error_pct, and points_only

* initial update to geoshape docs, including mapping migration updates

* initial support for GeoCollection queries

* fix docs and javadoc errors

* clean up geocollection queries

* set deprecated mapping tests to NOTCONSOLE

* fix geo-shape mapper asciidoc mapping and test warnings

* add support for point queries using LatLonShapeBoundingBoxQuery

* update GeoShapeQueryBuilderTests to include POINT queries for VECTOR strategy. Other comment cleanups

* add lucene geometry build testing to ShapeBuilder tests

* remove deprecated prefix tree mapping from geo-shape.asciidoc

* refactor GeoShapeFieldMapper into LegacyGeoShapeFieldMapper and GeoShapeFieldMapper

Both classes derive from BaseGeoShapeFieldMapper that provides shared parameters:
coerce, ignoreMalformed, ignore_z_value, orientation.

* update docs to remove vector strategy

* fix GeometryCollectionBuilder#buildLucene to return the object created by the shape builder

* fix LineLength failure in GeoJsonShapeParserTests

* ShapeMapper refactor changes from PR feedback

* fix typo in geo-shape.asciidoc

* ignore circle test in docs

* update indexing-approach ref to geoshape-indexing-approach

* add warnings check for LegacyGeoShapeFieldMapper to AbstractBuilderTestCase

* fix deprecatedParameters setup

* update indexing approach

* fixing unexpected warnings failures

* move orientation back to field type

* remove if in LegacyGeoShapeFieldMapper#doXContent. Fix GeoShapeFieldMapper to work with double array as a point

* fix indexing-approach link in circle section of geoshape docs

* add strategy to deprecation warnings check

* fix test failures

* fix typo in QueryStringQueryBuilderTests

* fix total hits to totalHits().value

* fix version number

* add version check to BaseGeoShapeFieldMapper

* fix line length!

* revert version check in BaseGeoShapeFieldMapper

* Fix serialization of mappings of legacy shapes.
2018-12-18 09:54:56 -06:00
Nicholas Knize 96d279ed83 Revert "[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)"
This reverts commit 5bc7822562.
2018-12-17 20:09:46 -06:00
Nick Knize 5bc7822562
[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)
This commit  exposes lucene's LatLonShape field as the
default type in GeoShapeFieldMapper. To use the new 
indexing approach, simply set "type" : "geo_shape" in 
the mappings without setting any of the strategy, precision, 
tree_levels, or distance_error_pct parameters. Note the 
following when using the new indexing approach:

* geo_shape query does not support querying by 
MULTIPOINT.
* LINESTRING and MULTILINESTRING queries do not 
yet support WITHIN relation.
* CONTAINS relation is not yet supported.
The tree, precision, tree_levels, distance_error_pct, 
and points_only parameters are deprecated.
2018-12-17 14:38:14 -06:00
Mayya Sharipova bda03163e7 Make vector fields experimental feature
Relates to #33022
2018-12-13 07:17:52 -05:00
Mayya Sharipova b5d532f9e3
Vector field (#33022)
1. Dense vector

PUT dindex
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_vector": {
          "type": "dense_vector"
        },
        "my_text" : {
          "type" : "keyword"
        }
      }
    }
  }
}

PUT dinex/_doc/1
{
  "my_text" : "text1",
  "my_vector" : [ 0.5, 10, 6 ]
}

2. Sparse vector

PUT sindex
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_vector": {
          "type": "sparse_vector"
        },
        "my_text" : {
          "type" : "keyword"
        }
      }
    }
  }
}

PUT sindex/_doc/1
{
  "my_text" : "text1",
  "my_vector" : {"1": 0.5, "99": -0.5,  "5": 1}
}
2018-12-12 21:20:53 -05:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Alan Woodward 73ceaad03a
Update to lucene-8.0.0-snapshot-c78429a554 (#36212)
Includes:

* A fix for a bug in Intervals.or() (https://issues.apache.org/jira/browse/LUCENE-8586)
* The ability to disable offset mangling in WordDelimiterGraphFilter
        (https://issues.apache.org/jira/browse/LUCENE-8509)
* BM25Similarity no longer multiplies scores by k1 + 1
2018-12-05 12:43:56 +00:00
Guido Lena Cota 89fae42833 (Minor) Fix some typos (#36180) 2018-12-04 11:10:30 +01:00
Peter Dyson 1f25a0bd31 [Docs] Add example for updating meta field (#35893) 2018-11-28 12:04:57 +01:00
Alan Woodward be8097f9ce
Improve docs for index_prefixes option (#35778)
This commit moves the documentation and examples for the `index_prefixes`
option on text fields to its own file, to bring it in line with other mapping 
parameters, and expands a bit on both.
2018-11-22 09:20:46 +00:00
Alan Woodward 26cc8ff8c3
Add pointer to the index-phrases option in shingle filter docs (#35771)
We should be discouraging the use of shingle filters and instead pointing users to the
index-phrases parameter on text fields.
2018-11-21 15:27:11 +00:00
Takuro Wada 7b2d547e8e [Docs] Delete inappropriate backtick (#35722) 2018-11-20 10:08:32 +01:00
Julie Tibshirani ec53288fc0
Remove include_type_name from the relevant APIs. (#35192)
We've decided that the bulk, delete, get, index, update, and search APIs should not
contain this request parameter, and we will instead accept both typed and typeless calls.
2018-11-06 14:33:48 -08:00
Julie Tibshirani 70da490f34
Remove some documentation that only makes sense with multiple types. (#35066)
* Remove a tip about ignore_above that only makes sense with multiple types.
* Remove a line from the percolator documentation that refers to multiple types.
2018-10-30 10:19:12 -07:00
Julie Tibshirani f854330e06
Make sure to use the type _doc in the REST documentation. (#34662)
* Replace custom type names with _doc in REST examples.
* Avoid using two mapping types in the percolator docs.
* Rename doc -> _doc in the main repository README.
* Also replace some custom type names in the HLRC docs.
2018-10-22 11:54:04 -07:00
Igor Motov 94bde37bcf
Geo: Don't flip longitude of envelopes crossing dateline (#34535)
When a envelope that crosses the dateline is specified as a part of
geo_shape query is parsed it shouldn't have its left and right points
flipped.

Fixes #34418
2018-10-19 13:53:54 -04:00
Daniel Mitterdorfer 02fb5aa4ec
Remove leftover doc about format being updatable
With this commit we remove a leftover in the docs about the `format`
field being updatable. This is not true since we removed support for
updates in #25285.

Closes #33986
Relates #25285
Relates #34006
2018-09-25 10:13:23 +02:00
markharwood 2fa09f062e
New plugin - Annotated_text field type (#30364)
New plugin for annotated_text field type.
Largely a copy of `text` field type but adds ability to include markdown-like syntax in the text.
The “AnnotatedText” class parses text+markup and converts into plain text and AnnotationTokens.
The annotation token values are injected unchanged alongside the regular text tokens to provide a
form of additional indexed overlay useful in positional searches and highlighting.
Annotated_text fields do not support fielddata as we want to phase this out.
Also includes a new "annotated" highlighter type that retains annotations and merges in search
hits as additional annotation markup.

Closes #29467
2018-09-18 10:25:27 +01:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Pablo Musa a88f8789a0
Highlight that index_phrases only works if no slop is used (#33303)
Highlight that `index_phrases` only works if no slop is used at query time.
2018-08-31 14:48:55 +02:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Dimitrios Liappis abb4c183f1
Clarify ignore_above behavior with arrays of strings
Currently docs don't explain how `ignore_above` behaves with arrays of
strings.

Clarify how `ignore_above` applies for arrays of strings and
also note that all string(s) will still be visible in the
`_source` field.

Relates #33057
2018-08-22 18:18:30 +03:00
Julie Tibshirani 815c56b677
Fix an inaccuracy in the dynamic templates documentation. (#32890) 2018-08-20 11:00:11 -07:00
Julie Tibshirani 0f0068b91c
Ensure that field aliases cannot be used in multi-fields. (#32219) 2018-07-20 00:18:54 -07:00
Julie Tibshirani 15ff3da653
Add support for field aliases. (#32172)
* Add basic support for field aliases in index mappings. (#31287)
* Allow for aliases when fetching stored fields. (#31411)
* Add tests around accessing field aliases in scripts. (#31417)
* Add documentation around field aliases. (#31538)
* Add validation for field alias mappings. (#31518)
* Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671)
* Make sure that field-level security is enforced when using field aliases. (#31807)
* Add more comprehensive tests for field aliases in queries + aggregations. (#31565)
* Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)
2018-07-18 09:33:09 -07:00
Nik Everett 0522c6644d Docs: Remove duplicate test setup
The range docs had an introductory section that described how to set up
and index *and* a test setup section in `docs/build.gradle` that
duplicated that section. This is bad because these section can (and do)
drift from one another. This change removes the setup in build.gradle
and marks the introductor snippet with `// TESTSETUP` so it is used on
all the snippets.
2018-06-28 10:59:35 -04:00
Peter Dyson e7a7b9689d [Docs] Mention ip_range datatypes on ip type page (#31416)
A link to the ip_range datatype page provides a way for newer users to know
it exists if they land directly on the ip datatype page first via a search.
2018-06-20 13:04:03 +02:00
Julie Tibshirani 3f5ebb862d
Clarify that IP range data can be specified in CIDR notation. (#31374) 2018-06-18 08:21:41 -07:00