1070 Commits

Author SHA1 Message Date
James Rodewig 590b7f1cb8 [DOCS] Fix typo in search your data docs 2020-08-25 17:00:45 -04:00
James Rodewig 915b353f36
[DOCS] Display point in time API docs (#61527) 2020-08-25 11:03:41 -04:00
Nhat Nguyen 879279c9b4
Introduce point in time APIs in x-pack basic (#61062)
This commit introduces a new API that manages point-in-times in x-pack 
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.

A point in time must be opened before being used in search requests. The 
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes an `id`, which should be
passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
            "id":  "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
            "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` has
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```
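
The open, search, close lifecycle described above can be sketched in Python as plain request builders. This is a minimal sketch, not a client API: the helper names (`open_pit`, `search_with_pit`, `close_pit`) are hypothetical, and each one only returns a `(method, path, body)` tuple mirroring the REST snippets in this message. Nothing is sent over the network.

```python
from typing import Any, Dict, Optional, Tuple

# (HTTP method, path, optional JSON body)
Request = Tuple[str, str, Optional[Dict[str, Any]]]


def open_pit(index: str, keep_alive: str = "1m") -> Request:
    """Build the request that opens a point in time on `index`."""
    return ("POST", f"/{index}/_pit?keep_alive={keep_alive}", None)


def search_with_pit(pit_id: str, query: Dict[str, Any], keep_alive: str = "1m") -> Request:
    """Build a search request that runs against an already opened point in time."""
    body = {"query": query, "pit": {"id": pit_id, "keep_alive": keep_alive}}
    # As in the snippet above, the path names no index; the `pit` object carries the id.
    return ("POST", "/_search", body)


def close_pit(pit_id: str) -> Request:
    """Build the request that closes a point in time once it is no longer needed."""
    return ("DELETE", "/_pit", {"id": pit_id})
```

A caller would pass the `id` from the open response to every subsequent search and then to `close_pit`, rather than waiting for `keep_alive` to elapse.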

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
2020-08-24 20:24:35 -04:00
James Rodewig d03012fbd1
[DOCS] Fix typo in profile API docs (#61445) (#61502)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: shashikumarec088 <shashikumarec088@gmail.com>
2020-08-24 15:30:25 -04:00
James Rodewig fdc4e83050
[DOCS] Combine `Search your data` files (#61477)
No-op changes to:

* Move `Search your data` source files into the same directory
* Rename `Search your data` source files based on page ID
* Remove unneeded includes
* Remove the `Request` dir
2020-08-24 11:22:56 -04:00
James Rodewig c688cb6bfd
[DOCS] Fix hyphenation for "time series" (#61472) 2020-08-24 10:34:41 -04:00
James Rodewig d46931840b
[DOCS] Prune `Search your data` content (#61303)
Changes:
* Removes narrative around URI searches. These aren't commonly used in production. The `q` param is already covered in the search API docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-search.html#search-api-query-params-q
* Adds a common options section that highlights narrative docs for query DSL, aggregations, multi-index search, search fields, pagination, sorting, and async search.
* Adds a `Search shard routing` page. Moves narrative docs for adaptive replica selection, preference, routing, and shard limits to that section.
* Moves search timeout and cancellation content to the `Search your data` page.
* Creates a `Search multiple data streams and indices` page. Moves related narrative docs for multi-target syntax searches and `indices_boost` to that page.
* Removes narrative examples for the `search_type` parameters. Moves documentation for this parameter to the search API docs.
2020-08-24 08:38:20 -04:00
Nhat Nguyen 2a3a8dd296 Fix anchor doc for msearch cancellation paragraph
Relates #61418
2020-08-21 12:11:00 -04:00
James Rodewig c21930e4ce
[DOCS] Remove URI search examples from API reference (#61423) 2020-08-21 10:57:35 -04:00
Nhat Nguyen 35ccd06918
Add cancellation doc for multi search (#61418)
Relates #61337
2020-08-21 10:10:05 -04:00
James Rodewig a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
James Rodewig 39f92f2a02
[DOCS] Fix typo in suggester docs (#61077) (#61204)
Co-authored-by: Arash Layeghi <arashlayeghi57@gmail.com>
2020-08-17 09:14:37 -04:00
James Rodewig ea2836275c
[DOCS] Fix index boost snippet (#61023)
Updates the `indices_boost` snippet to use the `my-index-000001` index.

Removes a related REST test.
2020-08-12 09:07:57 -04:00
James Rodewig 9b9e0b7b16
[DOCS] Remove search request body page (#60972) 2020-08-11 12:05:54 -04:00
James Rodewig 00881d9f0d
[DOCS] Move post filter/rescore content to new page (#60903) 2020-08-11 08:51:22 -04:00
James Rodewig 50ab1b22aa
[DOCS] Move `min_score` docs to search API page (#60895)
Reformats the `min_score` docs as a param definition on the
search API reference page.
2020-08-10 09:27:23 -04:00
James Rodewig 0dc3364f84
[Docs] Combine highlighting docs files (#60849) 2020-08-10 08:24:48 -04:00
James Rodewig ba88f0bd6a
[DOCS] Move inner hits content to separate page (#60840)
Moves inner hits content from the deprecated 'Request Body Search'
chapter to a separate page.
2020-08-06 13:47:06 -04:00
James Rodewig 6b9b8c5e31
[DOCS] Move script and stored fields content to search fields page (#60826)
Changes:

* Moves `Retrieve selected fields` to its own page and adds a title abbreviation.
* Adds existing script and stored fields content to `Retrieve selected fields`
* Adds a xref for `Retrieve selected fields` to `Search your data`
* Adds related redirects and updates existing xrefs
2020-08-06 12:45:03 -04:00
James Rodewig 929033f9dd
[DOCS] Move named query content to bool query (#60748) 2020-08-05 13:27:10 -04:00
James Rodewig 56c778235c
[DOCS] Fix metadata field refs (#60764) 2020-08-05 13:21:00 -04:00
James Rodewig 4407402924
[DOCS] Refactor snippets for `Search your data` (#60701)
Changes:
* Moves sample data to reusable REST test
* Add xref to pagination docs
* Removes duplicated results
* Updates the wildcard example
2020-08-05 09:32:11 -04:00
James Rodewig c375df3c48
[DOCS] Add soft redirect for sliced scroll (#60699) 2020-08-05 09:00:29 -04:00
James Rodewig a4dc336c16
[DOCS] Replace `twitter` dataset in search/agg docs (#60667) 2020-08-04 13:31:52 -04:00
Alexander Reelsen c7ac9e7073
[DOCS] http -> https, remove outdated plugin docs (#60380)
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.

Because the instructions used cleartext HTTP to install
packages, this commit uses HTTPS links instead of HTTP where possible.

In addition a few community links have been removed, as they do not seem
to exist anymore.
2020-07-31 15:58:38 -04:00
James Rodewig aec26b1a23
[DOCS] Move search pagination content to one page (#60515) 2020-07-31 11:43:06 -04:00
James Rodewig f320567e5a
[DOCS] Merge search topic and overview pages (#60459) 2020-07-30 15:01:21 -04:00
James Rodewig af73504865
[DOCS] Move field collapse content to separate page (#60424) 2020-07-30 09:03:50 -04:00
Julie Tibshirani 8a89d95372
Add search `fields` parameter to support high-level field retrieval. (#60100)
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.

Addresses #49028 and #55363.
2020-07-27 13:25:55 -07:00
James Rodewig 441c3a21b1
[DOCS] Update my-index examples (#60132)
Changes the following example index names to `my-index-000001` for consistency:

* `my-index`
* `my_index`
* `myindex`
2020-07-27 14:46:39 -04:00
James Rodewig d5b03f668b
[DOCS] Move search sort docs to separate page (#60123)
Moves the search sort docs from the deprecated 'Request Body Search'
page to a new subpage of 'Run a search'.

No substantive changes were made to the content.
2020-07-23 12:58:57 -04:00
James Rodewig 2774cd6938
[DOCS] Swap `[float]` for `[discrete]` (#60124)
Changes instances of `[float]` in our docs for `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 11:48:22 -04:00
bellengao a9e52194ad
[DOCS] Correct the default value of `ignore_throttled` param (#60036) 2020-07-22 16:34:50 -04:00
James Rodewig c05c8bde81
[DOCS] Update search docs to use `my-index` dataset (#60005) 2020-07-21 15:52:00 -04:00
James Rodewig 3113f9495d
[DOCS] Introduce basic ECS logs test (#59713)
Adds a new `my-index-000001` REST test for docs snippets.

This test can serve as a lightweight replacement for
our existing `twitter` REST tests.

The new dataset is:

* Based on Apache logs, which is better aligned with Elastic use cases
* Compliant with ECS
* Similar to the existing `twitter` data set, containing the same field data types
* Lightweight, which should keep existing test runtimes roughly the same

Also updates the search API reference docs to use the new test.
2020-07-21 12:55:51 -04:00
James Rodewig 80b674fb25
[DOCS] Reformat snippets to use two-space indents (#59973) 2020-07-21 12:24:26 -04:00
James Rodewig 8170cb9cf0
[DOCS] Remove collapsible examples (#59820)
Snippets are now visible without additional clicks.
2020-07-20 08:42:56 -04:00
James Rodewig aa3ddfeefb
[DOCS] Move highlighting docs to separate page (#59768)
Moves the highlighting docs from the deprecated 'Request Body Search'
chapter to a new subpage of the 'Run a search' section.

No substantive changes were made to the content.
2020-07-17 10:15:20 -04:00
James Rodewig 69899dc2cc
[DOCS] Add data streams to validate query API (#59420) 2020-07-13 12:30:54 -04:00
James Rodewig aa6cb874b9
[DOCS] Add data streams to field caps API docs (#59326) 2020-07-09 16:41:10 -04:00
James Rodewig 2be9db01c8
[DOCS] Replace `datatype` with `data type` (#58972) 2020-07-07 13:52:10 -04:00
James Rodewig 03d90c4945
[DOCS] Add data streams to rank eval API docs (#59069) 2020-07-07 13:16:53 -04:00
James Rodewig 9af4c1aa0e [DOCS] Fix `scroll` param typo 2020-07-02 08:43:45 -04:00
debadair 92851b422f
[DOCS] Fix cannot must typo. (#58884) 2020-07-01 17:44:35 -07:00
James Rodewig c7ca1d5941 [DOCS] Make `<target>` defs consistent 2020-06-30 15:53:32 -04:00
James Rodewig 3d77914db7
[DOCS] Add data streams to count API (#58771) 2020-06-30 15:01:37 -04:00
James Rodewig a7aa3da3bf
[DOCS] Add data streams to multi search API docs (#58610)
Makes the existing multi search API docs aware of data streams.
2020-06-26 17:06:58 -04:00
markharwood cdc1be144b
Field capabilities - make `keyword` a family of field types (#58315)
Introduces a new method on `MappedFieldType` to return a family type name which defaults to the field type.
Changes `wildcard` and `constant_keyword` field types to return `keyword` for field capabilities.

Relates to #53175
2020-06-24 11:37:16 +01:00
James Rodewig f00c8abe20
[DOCS] Add data streams to search docs (#58278)
Changes:

* Adds additional examples to the `Search a data stream` section of
  `Use a data stream`
* Updates existing search docs to make them aware of data streams
2020-06-18 08:43:02 -04:00
Jim Ferenczi 90c9b95ca0
Allow index filtering in field capabilities API (#57276)
* Add index filtering in field capabilities API

This change allows the use of an `index_filter` in the
field capabilities API. Indices are filtered from
the response if the provided query rewrites to `match_none`
on every shard:

````
GET metrics-*/_field_caps?fields=*
{
  "index_filter": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gt": "2019"
            }
          }
        }
      ]
    }
  }
}
````

The filtering is done on a best-effort basis: it uses the can match phase
to rewrite queries to `match_none` instead of fully executing the request.
The first shard that can match the filter is used to create the field
capabilities response for the entire index.
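
To illustrate the filtering rule above, here is a toy Python model, not the actual implementation. Each shard is reduced to a hypothetical `(min, max)` range of `@timestamp` values (expressed as years for brevity), and an index is dropped only when a `gt` range filter can match none of its shards.

```python
from typing import Dict, List, Tuple

# Hypothetical per-shard statistics: (min, max) of the `@timestamp` field, as years.
ShardRange = Tuple[int, int]


def shard_can_match(shard: ShardRange, gt: int) -> bool:
    """A shard can match `@timestamp > gt` only if its maximum exceeds `gt`."""
    _, max_ts = shard
    return max_ts > gt


def filter_indices(indices: Dict[str, List[ShardRange]], gt: int) -> List[str]:
    """Keep an index if at least one of its shards can match the filter."""
    return [
        name
        for name, shards in indices.items()
        if any(shard_can_match(shard, gt) for shard in shards)
    ]


indices = {
    "metrics-2018": [(2018, 2018), (2018, 2018)],
    "metrics-2019": [(2019, 2019)],
    "metrics-2020": [(2019, 2020), (2020, 2020)],
}
print(filter_indices(indices, gt=2019))  # ['metrics-2020']
```

The model also shows why the result is best-effort: a shard whose range overlaps the filter is kept even if none of its documents would actually match.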

Closes #56195
2020-06-17 22:53:53 +02:00
James Rodewig c0191eb84c
[DOCS] Fix routing param in search API docs (#58267) 2020-06-17 15:05:20 -04:00
Adam Locke 961a85e1e6
[DOCS] Add documentation for near real-time search (#57560)
* Adding documentation for near real-time search.

* Adding link to NRT topic and clarifying some text.

* Adding diagrams and incorporating changes from David T.
2020-06-15 14:49:33 -04:00
James Rodewig 7826bbee87
[DOCS] Move search API's `docvalue_fields` examples (#57760)
Changes:

* Condenses and relocates the `docvalue_fields` example to the 'Run a search' 
   page.
* Adds docs for the `docvalue_fields` request body parameter.
* Updates several related xrefs.

Co-authored-by: debadair <debadair@elastic.co>
2020-06-11 10:57:15 -04:00
James Rodewig 0d081de7ad
[DOCS] Fix source-related search API params (#57691)
Cleans up the reference documentation for the following
search API parameters:

* `_source` query parameter
* `_source_excludes` query parameter
* `_source_includes` query parameter
* `_source` request body parameter
* `hits._source` response property
2020-06-09 12:44:57 -04:00
James Rodewig 51e3d5ab63
[DOCS] Fix source filtering xrefs (#57720) 2020-06-05 08:46:26 -04:00
James Rodewig 7f201d7f4f
[DOCS] Move source filtering examples (#57689)
Moves the source filtering example snippets from the "Request body
search" API docs page to the "Return fields in a search" section of the
"Run a search" page.
2020-06-04 15:10:18 -04:00
James Rodewig 09980ca517
[DOCS] Reformat whitespace in search API docs (#57667)
Changes the search API docs to use:

* Consistent indentation in param definitions
* Two-space indentation in JSON snippets
2020-06-04 09:47:18 -04:00
Julie Tibshirani de9b91fe48
Add a reference on returning fields during a search. (#57500)
This PR adds a section to the new 'run a search' reference that explains
the options for returning fields. Previously each option was only listed as a
separate request parameter and it was hard to know what was available.
2020-06-03 09:33:26 -07:00
James Rodewig 69b79d21fe
[DOCS] Add clear scroll API reference docs (#57367) 2020-06-03 11:42:43 -04:00
James Rodewig 3bb11cf269
[DOCS] Refactor admons for multi-parameter options (#57491)
Several APIs support options that can be specified as a query parameter or a
request body parameter.

Currently, this is documented using notes, which can get rather lengthy. This
replaces those multiple notes with a single note and a footnote.
2020-06-02 11:58:46 -04:00
James Rodewig 0496f9ab3b
[DOCS] Add scroll API reference docs (#57153)
Changes:

* Adds API reference docs for the scroll API
* Documents several related parameters in the search API docs
2020-06-02 09:33:11 -04:00
Julie Tibshirani f46b04956b
Avoid unnecessary use of stored_fields in our docs. (#57488)
Generally we don't advocate for using `stored_fields`, and we're interested in
eventually removing the need for this parameter. So it's best to avoid using
stored fields in our docs examples when it's not actually necessary.

Individual changes:
* Avoid using 'stored_fields' in our docs.
* When defining script fields in top-hits, de-emphasize stored fields.
2020-06-01 17:29:48 -07:00
Lisa Cawley 8b9293b3bf
[DOCS] Replace docdir attribute with es-repo-dir (#57489) 2020-06-01 15:55:05 -07:00
Christoph Büscher 3d4f9fedaf
Check for negative "from" values in search request body (#54953)
Today we already disallow negative values for the "from" parameter in the search
API when it is set as a request parameter and setting it on the
SearchSourceBuilder, but it is still parsed without complaint from a search
body, leading to differing exceptions later. This PR changes this behavior to be
the same regardless of setting the value directly, as url parameter or in the
search body. While we silently accepted "-1" as meaning "unset" and used the
default value of 0 so far, any negative from-value is now disallowed.
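
The new rule can be summarized with a short Python sketch. It is only an illustration of the behavior described above: the function name `validate_from` and the error wording are hypothetical and are not taken from the Elasticsearch code base.

```python
from typing import Any, Dict


def validate_from(search_body: Dict[str, Any]) -> int:
    """Return the `from` offset of a search body, rejecting negative values."""
    value = search_body.get("from", 0)  # an absent `from` falls back to the default of 0
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"[from] must be an integer, got {value!r}")
    if value < 0:
        # -1 used to be silently read as "unset"; any negative value is now an error.
        raise ValueError(f"[from] cannot be negative, got {value}")
    return value
```

With this check, `{"from": -1}` fails in the same way whether the value arrives in the body, as a URL parameter, or through a builder.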

Closes #54897
2020-05-28 16:25:19 +02:00
Nik Everett 9aaab6efdd
Save memory on numeric sig terms when not top (#56789)
This saves memory when running numeric significant terms which are not
at the top level by merging its collection into numeric terms and relying
on the optimization that we made in #55873.
2020-05-27 10:53:09 -04:00
James Rodewig 68ed00e2d2 [DOCS] Fix deep paging recommendations
Corrects recommendation to reference the `search_after` parameter,
not API.
2020-05-21 14:25:04 -04:00
Théophile Helleboid - chtitux 309d86df97
[DOCS] Fix typo in search API `explain` param def (#56991)
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-05-20 09:21:28 -04:00
James Rodewig 59fa4ccba3 [DOCS] Fix JS client attribute in docs
The Elasticsearch-JS client only produces documentation for major
versions (e.g., n.x, 7.x, 6.x).

However, the `{jsclient}` attribute uses the current `{branch}`, which
can result in broken links in minor version docs.

This swaps the `{jsclient}` attribute for `{jsclient-current}`,
which is less likely to break across versions.
2020-05-19 16:55:07 -04:00
Tomas Della Vedova 0e98652ed1
[DOCS] Add JS client helper links to docs (#55216)
Adds links for the Elasticsearch-js client to the bulk and scroll docs.
2020-05-19 16:17:24 -04:00
James Rodewig 56d7af09e7
[DOCS] Add search pagination docs (#56785)
Reworks the `from / size` content to `Paginate search results`.

Moves those docs from the request body search API page (slated for
deletion) to the `Run a search` tutorial docs.

Also adds some notes to the `from` and `size` param docs.

Co-authored-by: debadair <debadair@elastic.co>
2020-05-15 17:22:40 -04:00
James Rodewig 34dd9d1772 [DOCS] Correct typo in "Run a search" docs 2020-05-13 10:14:49 -04:00
Nik Everett 4a8d93f55b
Add list of deferred aggregations to the profiler (#56208)
This adds a few things to the `breakdown` of the profiler:
* `histogram` aggregations now contain `total_buckets` which is the
  count of buckets that they collected. This could be useful when
  debugging a histogram inside of another bucketing agg that is fairly
  selective.
* All bucketing aggs that can delay their sub-aggregations will now add
  a list of delayed sub-aggregations. This is useful because we
  sometimes have fairly involved logic around which sub-aggregations get
  delayed and this will save you from having to guess.
* Aggregations wrapped in the `MultiBucketAggregatorWrapper` can't
  accurately add anything to the breakdown. Instead the wrapper
  adds a marker entry `"multi_bucket_aggregator_wrapper": true` so we
  can quickly pick out such aggregations when debugging.

It also fixes a bug where `_count` breakdown entries were contributing
to the overall `time_in_nanos`. They didn't add a large amount of time
so it is unlikely that this caused a big problem, but I was there.

To support the arbitrary breakdown data this reworks the profiler so
that the `breakdown` can contain any data that is supported by
`StreamOutput#writeGenericValue(Object)` and
`XContentBuilder#value(Object)`.
2020-05-13 08:30:38 -04:00
James Rodewig 7c449319a1
[DOCS] Relocate `shard allocation` module content (#56535) 2020-05-12 08:55:57 -04:00
James Rodewig c0e8c088fc
[DOCS] Relocate request body param docs to search API docs (#56436)
Moves documentation for the following request body parameters to the
search API reference docs:

* `explain`
* `query`
* `seq_no_primary_term`
* `version`

Removes documentation for these parameters from the Request body search
page[0].

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-body.html
2020-05-11 11:10:44 -04:00
James Rodewig b5f219cc6f
[DOCS] Relocate search API's request body parameters (#56304)
Changes:
* Moves the document request body parameters for the search API
  from the Request body search page to the Search API reference page.

* Relocates a search request body example from the Request body search
  page to the Search API reference page.

* Adds a note to any duplicated query and request body parameters.
2020-05-07 09:50:22 -04:00
Luca Cavanna 6a69ab4a3e
[DOCS] Async search: clarify behaviour when submit returns final results (#55934)
* [DOCS] Async search: clarify behaviour when submit returns final results

Closes #55636

* reword

* iter
2020-05-06 10:00:30 +02:00
James Rodewig 7bcb4b78f7 [DOCS] Correct `track_total_hits` param default value 2020-05-05 10:50:01 -04:00
James Rodewig 759752e6f2 [DOCS] Correct `track_total_hits` param def formatting 2020-05-05 10:43:45 -04:00
James Rodewig e6542c0823
[DOCS] Combine search API and URI search API reference docs (#55884)
The search API and URI search pages document the same `_search` API.
This combines the documentation from each page under the search API
docs.

Changes:

* Adds an abbreviated title for the search API page.
* Removes the following invalid query parameters:
  * `analyzer`
  * `analyze_wildcard`
  * `default_operator`
  * `df`
  * `lenient`
  * `suggest_mode`
  * `suggest_size`
* Removes the URI search docs page and adds a related redirect.
* Updates the headings of several examples

Co-authored-by: debadair <debadair@elastic.co>
2020-05-05 10:28:13 -04:00
James Rodewig a73fef3d62
[DOCS] Create top-level "Search your data" page (#56058)
* [DOCS] Create top-level "Search your data" page

**Goal**

Create a top-level search section. This will let us clean up our search
API reference docs, particularly content from [`Request body search`][0].

**Changes**

* Creates a top-level `Search your data` page. This page is designed to
  house concept and tutorial docs related to search.

* Creates a `Run a search` page under `Search your data`. For now, this
  contains a basic search tutorial. The goal is to add content from
  [`Request body search`][0] to this in the future.

* Relocates `Long-running searches` and `Search across clusters` under
  `Search your data`. Increments several headings in that content.

* Reorders the top-level TOC to move `Search your data` higher. Also
  moves the `Query DSL`, `EQL`, and `SQL access` chapters immediately
  after.

Relates to #48194

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-body.html
2020-05-05 09:46:29 -04:00
James Rodewig 5572ccebae
[DOCS] Add collapsible sections to search API response (#55887) 2020-05-04 16:32:19 -04:00
Luca Cavanna c04bf6e14a
[DOCS] Clarify async search response flags (#55574)
Relates to #55572
2020-04-29 15:21:26 +02:00
James Rodewig 26f1851ce0
[DOCS] Correct search API's timeout param default (#55855) 2020-04-28 09:43:52 -04:00
Adrien Grand b614412569
Repurpose `ignore_throttled` to be only about frozen indices. (#55047)
This has no practical impact on users since frozen indices are the only
throttled indices today. However this has an impact on upcoming features
that would use search throttling.

Filtering out throttled indices made sense a couple years ago, but as
we're now improving support for slow requests with `_async_search` and
exploring ways to reduce storage costs, this feature has most likely
become a trap, that we'd like to not have with upcoming features that
would use search throttling.

Relates #54058
2020-04-28 13:43:35 +02:00
James Rodewig 4980ea7596
[DOCS] Document `max_concurrent_searches` default (#55116) 2020-04-15 10:02:33 -04:00
Julie Tibshirani 13053c6ad9
Remove the object format for indices_boost. (#55078)
This format has been deprecated since version 5.2.
2020-04-14 21:01:07 -07:00
Christoph Büscher 7b199dbcec
[Test] Don't expect specific scores in docs tests (#54297)
The failing suggester documentation test was expecting specific scores in the
test response. Scores are fragile implementation details that can change with,
for example, different Lucene versions, and documentation tests generally
shouldn't assert on them.
Instead we usually replace the float values in the output response by the ones
in the actual response.

Closes #54257
2020-03-27 10:23:16 +01:00
David Turner 10f19703e8
Mute test failing in #54257 (#54258) 2020-03-26 11:10:00 +00:00
Luca Cavanna 1c482141ee
Async search: rename REST parameters (#54198)
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search.
Also it renames clean_on_completion to keep_on_completion and turns around its behaviour.
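
For callers that still build requests with the old names, the rename could be handled by a small translation step. The sketch below is hypothetical (the helper `migrate_async_search_params` is not part of any client) and assumes that "turns around its behaviour" means the boolean is simply inverted.

```python
from typing import Any, Dict


def migrate_async_search_params(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate pre-rename async search parameters to the new names."""
    new = dict(old)  # leave the caller's dictionary untouched
    if "wait_for_completion" in new:
        new["wait_for_completion_timeout"] = new.pop("wait_for_completion")
    if "clean_on_completion" in new:
        # Assumption: keep_on_completion is the logical negation of clean_on_completion.
        new["keep_on_completion"] = not new.pop("clean_on_completion")
    return new
```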

Closes #54069
2020-03-26 09:40:05 +01:00
Luca Cavanna 8c29035635
Async search: prevent users from overriding pre_filter_shard_size (#54088)
Submit async search forces pre_filter_shard_size for the underlying search that it creates.
With this commit we also prevent users from overriding such default as part of request validation.
2020-03-24 17:04:38 +01:00
Jim Ferenczi 04bd154037
Add heuristics to compute pre_filter_shard_size when unspecified (#53873)
This commit changes the pre_filter_shard_size default from 128 to unspecified.
This allows heuristics to be applied based on the request and the target indices when deciding
whether the can match phase should run or not. When unspecified, this PR runs the can match phase
automatically if one of these conditions is met:
  * The request targets more than 128 shards.
  * The request contains read-only indices.
  * The primary sort of the query targets an indexed field.
Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value.
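
The decision described above can be sketched as a single predicate. This is a simplified Python model with hypothetical names, not the actual implementation; in particular, treating an explicit `pre_filter_shard_size` as a plain shard-count threshold is an assumption.

```python
from typing import Optional

DEFAULT_SHARD_THRESHOLD = 128  # the "more than 128 shards" condition from the list above


def should_run_can_match(
    num_shards: int,
    has_read_only_indices: bool,
    primary_sort_on_indexed_field: bool,
    pre_filter_shard_size: Optional[int] = None,
) -> bool:
    """Decide whether the can match phase runs for a request."""
    if pre_filter_shard_size is not None:
        # Assumption: a static value disables the heuristics and acts as a threshold.
        return num_shards > pre_filter_shard_size
    # Unspecified: any one of the three conditions is enough.
    return (
        num_shards > DEFAULT_SHARD_THRESHOLD
        or has_read_only_indices
        or primary_sort_on_indexed_field
    )
```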

Closes #39835
2020-03-23 19:06:32 +01:00
Luca Cavanna aa56f91fba
[DOCS] address timing issue in async search docs tests (#53910)
The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final.

With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response.

Closes #53887
Closes #53891
2020-03-23 14:17:53 +01:00
Luca Cavanna 1af04175a1
Async search: remove version from response (#53960)
The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).

That said this commit clarifies this in the docs and removes the version field from the async search response
2020-03-23 13:42:10 +01:00
Mark Vieira 3cf3f60f93
Mute submit-async-search-date-histogram-example test 2020-03-20 11:19:03 -07:00
Luca Cavanna fc083493d2 [DOCS] correct async search note
The sort optimization kicks in whenever results are sorted by field.
2020-03-20 15:57:43 +01:00
Luca Cavanna 0a93a93069
[DOCS] add docs for async search (#53675)
Relates to #49091

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-03-20 14:04:33 +01:00
Julie Tibshirani c0c53b8724
Small corrections to stored_fields docs. (#53247)
* Fix a reference to the 'field' option.
* Remove claim about detecting script fields.
* Specify that object fields will just be ignored.
2020-03-09 10:57:23 -07:00
James Rodewig fe08296c31
[DOCS] Correct `hits.total.relation` response param def (#52847)
Fixes a partially completed definition for the `hits.total.relation`
response parameter in the search API docs.
2020-03-04 08:22:40 -05:00
Josh Devins 4ff5e03c70
Adds recall@k metric to rank eval API (#52577)
This change adds the recall@k metric and refactors precision@k to match
the new metric.

Recall@k is an important metric to use for learning to rank (LTR)
use-cases. Candidate generation or first ranking phase ranking functions
are often optimized for high recall, in order to generate as many
relevant candidates in the top-k as possible for a second phase of
ranking. Adding this metric allows tuning that base query for LTR.
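
As a reminder of what the two metrics measure, here is a minimal Python sketch using the textbook definitions. It is not the rank eval implementation: the function names are hypothetical, and details such as rating thresholds or unlabeled documents are not modeled.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    top_k = ranked_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / len(top_k)


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
print(precision_at_k(ranked, relevant, k=4))  # 0.5
print(recall_at_k(ranked, relevant, k=2))     # 0.3333333333333333
```

A first-phase query tuned for recall tries to raise the second number, even at the cost of the first.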

See: https://github.com/elastic/elasticsearch/issues/51676
2020-02-27 10:43:42 +01:00
James Rodewig 7f1d05c453
[DOCS] Correct multi search API docs (#52523)
* Adds an example request to the top of the page.
* Relocates several parameters erroneously listed under "Request body"
to the appropriate "Query parameters" section.
* Updates the "Request body" section to better document the NDJSON
  structure of msearch requests.
2020-02-24 07:41:53 -05:00
Marios Trivyzas 2eb986488a
[Docs] Clarify default value for `allow_no_indices` (#52635)
Add default value to each one of the usages of `allow_no_indices`
since it differs between different APIs.

Relates to: #52534
2020-02-24 11:37:29 +01:00
debadair c93b8b91c3
[DOCS] Fixed typo. (#52071) 2020-02-07 11:03:56 -08:00
Jess 97b12c11db [Docs] Small edits to Ranking Evaluation API docs (#51116)
Small updates to grammar, syntax, and unclear wordings.
2020-01-20 10:30:54 +01:00
James Rodewig cfddddda0b
[DOCS] Fix search request body links (#50500)
PR #44238 changed several links related to the Elasticsearch search request body API. This updates several places still using outdated links or anchors.

This will ultimately let us remove some redirects related to those link changes.
2019-12-26 14:20:51 -05:00
Xiang Dai 432bd0e92c Fix docs typos (#50365)
Fixes a few typos in the docs.

Signed-off-by: Xiang Dai 764524258@qq.com
2019-12-23 10:35:14 -05:00
James Rodewig a311018fbc
[DOCS] Remove outdated file scripts reference (#50437)
File scripts were removed in 6.0 with #24627.

This removes an outdated file scripts reference from the conditional clauses section of the search templates docs.
2019-12-20 14:02:42 -05:00
Adrien Grand 2d627ba757
Add per-field metadata. (#49419)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.
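
The three limits above translate directly into a validation routine. The following Python sketch is illustrative only: the function name and error messages are hypothetical, and rejecting booleans is an assumption based on "numbers or strings".

```python
from typing import Any, Dict

MAX_KEYS = 5
MAX_KEY_LENGTH = 20
MAX_VALUE_LENGTH = 50


def validate_field_meta(meta: Dict[str, Any]) -> None:
    """Raise ValueError if `meta` breaks one of the size limits listed above."""
    if len(meta) > MAX_KEYS:
        raise ValueError(f"meta can't have more than {MAX_KEYS} keys, got {len(meta)}")
    for key, value in meta.items():
        if len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"meta key [{key}] is longer than {MAX_KEY_LENGTH} chars")
        # Assumption: booleans are rejected even though Python treats them as ints.
        if isinstance(value, bool) or not isinstance(value, (int, float, str)):
            raise ValueError(f"meta value for [{key}] must be a number or a string")
        if isinstance(value, str) and len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"meta value for [{key}] is longer than {MAX_VALUE_LENGTH} chars")
```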

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices: the union of
all field metadata is returned.

Here is what the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggregatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggregatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```
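
The merge behavior shown in the two responses above (one value when indices agree, all unique values when they conflict) can be sketched as follows. This is a hypothetical Python illustration, not the actual merge code; keeping values in first-seen order is an assumption.

```python
from typing import Dict, List


def merge_field_meta(per_index_meta: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Union the metadata of one field across indices, keeping unique values per key."""
    merged: Dict[str, List[str]] = {}
    for meta in per_index_meta:
        for key, value in meta.items():
            values = merged.setdefault(key, [])
            if value not in values:
                values.append(value)
    return merged


print(merge_field_meta([{"unit": "ms"}, {"unit": "ms"}]))  # {'unit': ['ms']}
print(merge_field_meta([{"unit": "ms"}, {"unit": "ns"}]))  # {'unit': ['ms', 'ns']}
```

As in the real response, the merged result no longer records which index contributed which value.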

Closes #33267
2019-12-18 17:27:38 +01:00
Adrien Grand 1329acc094
Upgrade to lucene 8.4.0-snapshot-662c455. (#50016)
Lucene 8.4 is about to be released so we should check it doesn't cause problems
with Elasticsearch.
2019-12-10 17:09:36 +01:00
Mayya Sharipova fa8b48deef Optimize sort on numeric long and date fields.
This rewrites long sort as a `DistanceFeatureQuery`, which can
efficiently skip non-competitive blocks and segments of documents.
Depending on the dataset, the speedups can be 2 - 10 times.

The optimization can be disabled by setting the system property
`es.search.rewrite_sort` to `false`.

Optimization is skipped when an index has 50% or more data with
the same value.
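
The 50% rule can be illustrated with a simplified check over raw values. This is only a sketch of the condition: the name `has_duplicate_data` is hypothetical, and scanning every value is an assumption made for clarity, not a description of how Elasticsearch evaluates the condition on an index.

```python
from collections import Counter
from typing import Hashable, Sequence


def has_duplicate_data(values: Sequence[Hashable], threshold: float = 0.5) -> bool:
    """Return True when one value accounts for `threshold` or more of the data.

    Hypothetical helper for illustration; it counts raw values directly.
    """
    if not values:
        return False
    # most_common(1) yields the single most frequent value and its count.
    _, count = Counter(values).most_common(1)[0]
    return count / len(values) >= threshold
```

A field where half of the documents share one value, such as `[1, 1, 2, 3]`, meets the threshold, so the sort rewrite would be skipped for it.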

Optimization is done through:
1. Rewriting sort as `DistanceFeatureQuery` which can
efficiently skip non-competitive blocks and segments of documents.

2. Sorting segments according to the primary numeric sort field (#44021).
This allows skipping non-competitive segments.

3. Using collector manager.
When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
We use collectorManager, where for every segment a dedicated collector
will be created.

4. Using Lucene's shared TopFieldCollector manager
This collector manager is able to exchange the minimum competitive
score between collectors, which allows us to efficiently skip
whole segments that don't contain competitive scores.

5. When index is force merged to a single segment, #48533 interleaving
old and new segments allows for this optimization as well,
as blocks with non-competitive docs can be skipped.
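
The idea behind steps 2 to 4 (visit segments in order of their min/max value and skip those that cannot contribute to the top hits) can be sketched for a descending sort. This is a simplified model, not Lucene or Elasticsearch code: segments are plain lists of values, `top_k_desc` is a hypothetical name, and the per-segment maximum is computed on the fly here, whereas a real index would read it from segment metadata.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_desc(segments: Sequence[Sequence[int]], k: int) -> Tuple[List[int], int]:
    """Return the k largest values and the number of segments skipped.

    Hypothetical helper for illustration only.
    """
    if k <= 0:
        return [], 0
    # Visit segments with the largest maximum first, so strong candidates
    # are collected early.
    ordered = sorted((s for s in segments if s), key=max, reverse=True)
    heap: List[int] = []  # min-heap holding the current top k values
    skipped = 0
    for segment in ordered:
        # Once k hits are collected, a segment whose maximum cannot beat the
        # weakest collected hit is non-competitive and is skipped entirely.
        if len(heap) == k and max(segment) <= heap[0]:
            skipped += 1
            continue
        for value in segment:
            if len(heap) < k:
                heapq.heappush(heap, value)
            elif value > heap[0]:
                heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True), skipped
```

With segments `[[1, 2, 3], [100, 101, 102], [50, 51], [4, 5]]` and `k=2`, only the segment holding 100 to 102 is scanned; the other three are skipped because their maximum is below the weakest collected hit.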

Closes #37043

Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
2019-11-26 09:24:25 -05:00
Mayya Sharipova e9ba252176 Revert "Optimize sort on long field (#48804)"
This reverts commit 79d9b365c4.
2019-11-26 09:23:27 -05:00
Mayya Sharipova 79d9b365c4
Optimize sort on long field (#48804)
* Optimize sort on numeric long and date fields (#39770)

Optimize sort on numeric long and date fields, when 
the system property `es.search.long_sort_optimized` is true.

* Skip optimization if the index has duplicate data (#43121)

Skip sort optimization if the index has 50% or more data
with the same value.
When an index has a lot of docs with the same value, sort
optimization doesn't make sense, as DistanceFeatureQuery
will produce the same scores for these docs, and Lucene
will use the second sort to tie-break. This could be slower
than usual sorting.

* Sort leaves on search according to the primary numeric sort field (#44021)

This change pre-sorts the index reader leaves (segments) prior to search
when the primary sort is a numeric field eligible for the distance feature
optimization. It also adds a tie breaker on `_doc` to the rewritten sort
in order to bypass the fact that leaves will be collected in a random order.
I ran this patch on the http_logs benchmark and the results are very promising:

```
|                                       50th percentile latency | desc_sort_timestamp |    220.706 |      136544 |   136324 |     ms |
|                                       90th percentile latency | desc_sort_timestamp |    244.847 |      162084 |   161839 |     ms |
|                                       99th percentile latency | desc_sort_timestamp |    316.627 |      172005 |   171688 |     ms |
|                                      100th percentile latency | desc_sort_timestamp |    335.306 |      173325 |   172989 |     ms |
|                                  50th percentile service time | desc_sort_timestamp |    218.369 |     1968.11 |  1749.74 |     ms |
|                                  90th percentile service time | desc_sort_timestamp |    244.182 |      2447.2 |  2203.02 |     ms |
|                                  99th percentile service time | desc_sort_timestamp |    313.176 |     2950.85 |  2637.67 |     ms |
|                                 100th percentile service time | desc_sort_timestamp |    332.924 |     2959.38 |  2626.45 |     ms |
|                                                    error rate | desc_sort_timestamp |          0 |           0 |        0 |      % |
|                                                Min Throughput |  asc_sort_timestamp |   0.801824 |    0.800855 | -0.00097 |  ops/s |
|                                             Median Throughput |  asc_sort_timestamp |   0.802595 |    0.801104 | -0.00149 |  ops/s |
|                                                Max Throughput |  asc_sort_timestamp |   0.803282 |    0.801351 | -0.00193 |  ops/s |
|                                       50th percentile latency |  asc_sort_timestamp |    220.761 |     824.098 |  603.336 |     ms |
|                                       90th percentile latency |  asc_sort_timestamp |    251.741 |     853.984 |  602.243 |     ms |
|                                       99th percentile latency |  asc_sort_timestamp |    368.761 |     893.943 |  525.182 |     ms |
|                                      100th percentile latency |  asc_sort_timestamp |    431.042 |      908.85 |  477.808 |     ms |
|                                  50th percentile service time |  asc_sort_timestamp |    218.547 |     820.757 |  602.211 |     ms |
|                                  90th percentile service time |  asc_sort_timestamp |    249.578 |     849.886 |  600.308 |     ms |
|                                  99th percentile service time |  asc_sort_timestamp |    366.317 |     888.894 |  522.577 |     ms |
|                                 100th percentile service time |  asc_sort_timestamp |    430.952 |     908.401 |   477.45 |     ms |
|                                                    error rate |  asc_sort_timestamp |          0 |           0 |        0 |      % |
```

So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note
that I indexed the http_logs with a single client in order to simulate real time-based indices
where documents are indexed in their timestamp order.

Relates #37043

* Remove nested collector in docs response

As we don't use cancellableCollector anymore, it should be removed from
the expected docs response.

* Use collector manager for search when necessary (#45829)

When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
Thus for such a case, we use collectorManager,
where for every segment a dedicated collector will be created.

* Use shared TopFieldCollector manager

Use shared TopFieldCollector manager for sort optimization.
This collector manager is able to exchange minimum competitive
score between collectors.

* Correct calculation of avg value to avoid overflow

* Optimize calculating if index has duplicate data
2019-11-26 09:07:39 -05:00
James Rodewig 1e45db49ec
[DOCS] Document `script_score` float precision limit (#49402)
All document scores are positive 32-bit floating point numbers. However, this
wasn't previously documented.

This can result in surprising behavior, such as precision loss, for users when
customizing scores using the function score query.

This commit updates an existing admonition in the function score query docs to
document the 32-bits precision limit. It also updates the search API reference
docs to note that `_score` is a 32-bit float.
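
The precision loss mentioned above can be reproduced outside Elasticsearch. The sketch below only demonstrates 32-bit float rounding with Python's standard `struct` module; `to_float32` is a hypothetical helper and is not how Elasticsearch computes scores.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# A 32-bit float has a 24-bit significand, so integers above 2**24
# can no longer all be represented exactly.
exact = 2 ** 24 + 1             # 16777217
as_score = to_float32(exact)    # 16777216.0
```

A script that returns 16777217 as a score would therefore be stored as 16777216.0, which makes two documents with those scores indistinguishable when sorting by `_score`.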
2019-11-21 08:53:56 -05:00
Orhan Toy 53b1bc3933 [Docs] Fix _count HTTP method (#48979) 2019-11-12 15:44:57 +01:00
Patrick Maynard 1ab63cc0d7 [DOCS] Fix typo in search type docs (#48868) 2019-11-11 09:39:46 -05:00
Christoph Büscher 51f89a7184
Remove Ranking Evaluation API experimental status (#48603)
The API has been released long enough to remove the experimental status.
2019-10-29 20:55:48 +01:00
Ian Danforth 6717343b47 [Docs] Fix typo in suggesters search API doc (#48477) 2019-10-29 09:57:17 +01:00
James Rodewig e7e45c5c20
[DOCS] Fix note format in index suggestion docs (#48536) 2019-10-25 10:30:52 -05:00
Christoph Büscher a1ae813410
[Docs] Mention reserved completion suggestion characters (#48445)
We currently don't mention the three reserved characters anywhere. This change
adds a short note mentioning them.

Closes #48341
2019-10-25 16:57:51 +02:00
James Rodewig f53eba024b
[DOCS] Remove binary gendered language (#48362) 2019-10-23 09:36:31 -05:00
Jim Ferenczi 8f9e77e6f1
Fix tag in the search request timeout option docs (#47776)
and add missing parentheses to the `search_timeout` param
2019-10-10 10:35:09 +02:00
James Rodewig e7ffacf8c0
[DOCS] Correct callouts in search template docs (#47655) 2019-10-07 09:25:03 -04:00
James Rodewig 2fd051497e
[DOCS] Add response body parms to search API docs (#47042) 2019-09-30 11:41:14 -04:00
István Zoltán Szabó d0faf354c6
[DOCS] Reformats Profile API (#47168)
* [DOCS] Reformats Profile API.

* [DOCS] Fixes failing docs test.
2019-09-27 10:34:30 +02:00
István Zoltán Szabó 36502b2460
[DOCS] Reformats ranking evaluation API (#46974)
* [DOCS] Reformats ranking evaluation API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-25 14:55:09 +02:00
István Zoltán Szabó 69422b97cf
[DOCS] Reformat suggesters page. (#47010) 2019-09-25 14:38:47 +02:00
Alan Woodward c1f99e2d75
Remove `_type` from SearchHit (#46942)
This commit removes the `_type` field from all search hit responses.

Relates to #41059
2019-09-23 19:14:54 +01:00
Alan Woodward b733f9e803
Remove types from explain API (#46926)
We no longer need a type to get the source of a document, so we can remove it from
the explain API as well.

Relates to #41059
2019-09-23 17:55:09 +01:00
István Zoltán Szabó 5dc4dc6e2e
[DOCS] Reformats Field capabilities API (#46866)
* [DOCS] Reformats Field capabilities API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-20 11:24:44 +02:00
István Zoltán Szabó b256462bef
[DOCS] Reformats explain API (#46857)
* [DOCS] Reformats explain API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-20 10:59:11 +02:00
James Rodewig 9f65af989d
[DOCS] Remove `lowercase_terms` parm from term suggester docs (#46879) 2019-09-19 15:56:24 -04:00
Takumasa Ochi 8b764a5209 Fix typos in `match` in profile API (#46723)
* Replace `matches` with correct `match`
* Use present tense consistently
* Replace `metric` with correct `match`
2019-09-19 16:05:46 +02:00
István Zoltán Szabó e0b19a8ae0
[DOCS] Reformats validate API (#46389)
* [DOCS] Reformats validate API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-18 14:29:48 +02:00
István Zoltán Szabó 4e11a19371
[DOCS] Reformats count API (#46377)
* [DOCS] Reformats count API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-17 09:53:03 +02:00
James Rodewig 5c78f606c2
[DOCS] Change // CONSOLE comments to [source,console] (#46440) 2019-09-09 10:45:37 -04:00
James Rodewig e43be90e6c
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) 2019-09-06 14:05:36 -04:00
James Rodewig 466c59a4a7
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295) 2019-09-05 16:47:18 -04:00
James Rodewig f5827ba0ae
[DOCS] Replace "// CONSOLE" comments with [source,console] (#46159) 2019-09-04 12:51:02 -04:00
István Zoltán Szabó 4a0713aa0b
[DOCS] Reformats search template and multi search template APIs (#46236)
* [DOCS] Reformats search template and multi search template APIs.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 15:12:49 +02:00
István Zoltán Szabó ded27911dd
[DOCS] Reformats search shards API (#46240)
* [DOCS] Reformats search shards API
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 11:34:30 +02:00
István Zoltán Szabó c5c033cc1f
[DOCS] Reformats request body search API (#46254)
* [DOCS] Reformats request body search API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 10:52:17 +02:00
István Zoltán Szabó f6466f4840
[DOCS] Reformats multi search API (#46256)
* [DOCS] Reformats multi search API.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 10:14:30 +02:00
István Zoltán Szabó a6e915b05a
[DOCS] Reformats URI search request (#45844)
* [DOCS] Reformats URI search request.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

Co-Authored-By: debadair <debadair@elastic.co>
2019-08-29 10:04:49 +02:00
James Rodewig 46d7849032
Change `{var}` convention to `<var>` (#45904) 2019-08-23 10:57:20 -04:00
Nathan Howard df51be533f Adding a warning to from-size.asciidoc
Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers about this behavior while still highlighting alternative solutions.
2019-08-22 19:07:14 -07:00
István Zoltán Szabó 912d740802
[DOCS] Reformats search API (#45786)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-08-22 15:04:20 +02:00
James Rodewig 26323f0db3
[DOCS] Add template docs to scripts. Reorder template examples. (#45817)
* [DOCS] Add template docs to scripts. Reorder template examples.

* Adds a 'Search template' section to the 'How to use scripts' chapter.
  This links to the 'Search template' chapter for detailed info and
  examples.

* Reorders and retitles several examples in the 'Search template'
  chapter. This is primarily to make examples for storing, deleting, and
  using search templates more prominent.

* Change <templatename> to <templateid>
2019-08-22 08:40:09 -04:00
Jonathan Hult 1930267809 [DOCS] Fix typo in highlighting doc (#45707) 2019-08-20 07:27:27 -04:00
James Rodewig 66b8261e1b
[DOCS] Add diagrams to cross-cluster search documentation (#45569) 2019-08-15 10:59:58 -04:00
Emmanuel DEMEY 4e8a15ddfa Add snippet for the search_type query parameter (#43540) 2019-08-11 18:33:42 -04:00
Jesse Wright 3e7df14fc1 [Docs] Fix typo in rank-eval.asciidoc (#44978) 2019-07-31 12:38:26 +02:00