elasticsearch

Commit Graph

Author	SHA1	Message	Date
Adrien Grand	1329acc094	Upgrade to lucene 8.4.0-snapshot-662c455. (#50016 ) Lucene 8.4 is about to be released so we should check it doesn't cause problems with Elasticsearch.	2019-12-10 17:09:36 +01:00
Mayya Sharipova	fa8b48deef	Optimize sort on numeric long and date fields. This rewrites long sort as a `DistanceFeatureQuery`, which can efficiently skip non-competitive blocks and segments of documents. Depending on the dataset, the speedup can be 2 to 10 times. The optimization can be disabled by setting the system property `es.search.rewrite_sort` to `false`. The optimization is skipped when an index has 50% or more data with the same value. Optimization is done through: 1. Rewriting sort as `DistanceFeatureQuery`, which can efficiently skip non-competitive blocks and segments of documents. 2. Sorting segments according to the primary numeric sort field (#44021). This allows skipping non-competitive segments. 3. Using collector manager. When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we cannot use a single collector for sorted segments. We use collectorManager, where a dedicated collector is created for every segment. 4. Using Lucene's shared TopFieldCollector manager. This collector manager is able to exchange the minimum competitive score between collectors, which allows us to efficiently skip whole segments that don't contain competitive scores. 5. When an index is force merged to a single segment, interleaving old and new segments (#48533) allows for this optimization as well, as blocks with non-competitive docs can be skipped. Closes #37043. Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>	2019-11-26 09:24:25 -05:00
Mayya Sharipova	e9ba252176	Revert "Optimize sort on long field (#48804 )" This reverts commit `79d9b365c4`.	2019-11-26 09:23:27 -05:00
Mayya Sharipova	79d9b365c4	Optimize sort on long field (#48804 ) * Optimize sort on numeric long and date fields (#39770) Optimize sort on numeric long and date fields, when the system property `es.search.long_sort_optimized` is true. * Skip optimization if the index has duplicate data (#43121) Skip sort optimization if the index has 50% or more data with the same value. When an index has a lot of docs with the same value, sort optimization doesn't make sense, as DistanceFeatureQuery will produce the same scores for these docs, and Lucene will use the second sort to tie-break. This could be slower than usual sorting. * Sort leaves on search according to the primary numeric sort field (#44021) This change pre-sorts the index reader leaves (segments) prior to search when the primary sort is a numeric field eligible for the distance feature optimization. It also adds a tie breaker on `_doc` to the rewritten sort in order to bypass the fact that leaves will be collected in a random order. I ran this patch on the http_logs benchmark and the results are very promising: ``` \| 50th percentile latency \| desc_sort_timestamp \| 220.706 \| 136544 \| 136324 \| ms \| \| 90th percentile latency \| desc_sort_timestamp \| 244.847 \| 162084 \| 161839 \| ms \| \| 99th percentile latency \| desc_sort_timestamp \| 316.627 \| 172005 \| 171688 \| ms \| \| 100th percentile latency \| desc_sort_timestamp \| 335.306 \| 173325 \| 172989 \| ms \| \| 50th percentile service time \| desc_sort_timestamp \| 218.369 \| 1968.11 \| 1749.74 \| ms \| \| 90th percentile service time \| desc_sort_timestamp \| 244.182 \| 2447.2 \| 2203.02 \| ms \| \| 99th percentile service time \| desc_sort_timestamp \| 313.176 \| 2950.85 \| 2637.67 \| ms \| \| 100th percentile service time \| desc_sort_timestamp \| 332.924 \| 2959.38 \| 2626.45 \| ms \| \| error rate \| desc_sort_timestamp \| 0 \| 0 \| 0 \| % \| \| Min Throughput \| asc_sort_timestamp \| 0.801824 \| 0.800855 \| -0.00097 \| ops/s \| \| Median Throughput \| asc_sort_timestamp \| 0.802595 \| 0.801104 \| -0.00149 \| ops/s \| \| Max Throughput \| asc_sort_timestamp \| 0.803282 \| 0.801351 \| -0.00193 \| ops/s \| \| 50th percentile latency \| asc_sort_timestamp \| 220.761 \| 824.098 \| 603.336 \| ms \| \| 90th percentile latency \| asc_sort_timestamp \| 251.741 \| 853.984 \| 602.243 \| ms \| \| 99th percentile latency \| asc_sort_timestamp \| 368.761 \| 893.943 \| 525.182 \| ms \| \| 100th percentile latency \| asc_sort_timestamp \| 431.042 \| 908.85 \| 477.808 \| ms \| \| 50th percentile service time \| asc_sort_timestamp \| 218.547 \| 820.757 \| 602.211 \| ms \| \| 90th percentile service time \| asc_sort_timestamp \| 249.578 \| 849.886 \| 600.308 \| ms \| \| 99th percentile service time \| asc_sort_timestamp \| 366.317 \| 888.894 \| 522.577 \| ms \| \| 100th percentile service time \| asc_sort_timestamp \| 430.952 \| 908.401 \| 477.45 \| ms \| \| error rate \| asc_sort_timestamp \| 0 \| 0 \| 0 \| % \| ``` So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note that I indexed the http_logs with a single client in order to simulate real time-based indices where documents are indexed in their timestamp order. Relates #37043 * Remove nested collector in docs response As we don't use cancellableCollector anymore, it should be removed from the expected docs response. * Use collector manager for search when necessary (#45829) When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we cannot use a single collector for sorted segments. Thus for such a case, we use collectorManager, where a dedicated collector is created for every segment. * Use shared TopFieldCollector manager Use shared TopFieldCollector manager for sort optimization. This collector manager is able to exchange the minimum competitive score between collectors. * Correct calculation of avg value to avoid overflow * Optimize calculating if index has duplicate data	2019-11-26 09:07:39 -05:00
James Rodewig	1e45db49ec	[DOCS] Document `script_score` float precision limit (#49402 ) All document scores are positive 32-bit floating point numbers. However, this wasn't previously documented. This can result in surprising behavior, such as precision loss, for users when customizing scores using the function score query. This commit updates an existing admonition in the function score query docs to document the 32-bit precision limit. It also updates the search API reference docs to note that `_score` is a 32-bit float.	2019-11-21 08:53:56 -05:00
Orhan Toy	53b1bc3933	[Docs] Fix _count HTTP method (#48979 )	2019-11-12 15:44:57 +01:00
Patrick Maynard	1ab63cc0d7	[DOCS] Fix typo in search type docs (#48868 )	2019-11-11 09:39:46 -05:00
Christoph Büscher	51f89a7184	Remove Ranking Evaluation API experimental status (#48603 ) The API has been released long enough to remove the experimental status.	2019-10-29 20:55:48 +01:00
Ian Danforth	6717343b47	[Docs] Fix typo in suggesters search API doc (#48477 )	2019-10-29 09:57:17 +01:00
James Rodewig	e7e45c5c20	[DOCS] Fix note format in index suggestion docs (#48536 )	2019-10-25 10:30:52 -05:00
Christoph Büscher	a1ae813410	[Docs] Mention reserved completion suggestion characters (#48445 ) We currently don't mention the three reserved characters anywhere. This change adds a short note mentioning them. Closes #48341	2019-10-25 16:57:51 +02:00
James Rodewig	f53eba024b	[DOCS] Remove binary gendered language (#48362 )	2019-10-23 09:36:31 -05:00
Jim Ferenczi	8f9e77e6f1	Fix tag in the search request timeout option docs (#47776 ) and add missing parentheses `search_timeout` param	2019-10-10 10:35:09 +02:00
James Rodewig	e7ffacf8c0	[DOCS] Correct callouts in search template docs (#47655 )	2019-10-07 09:25:03 -04:00
James Rodewig	2fd051497e	[DOCS] Add response body parms to search API docs (#47042 )	2019-09-30 11:41:14 -04:00
István Zoltán Szabó	d0faf354c6	[DOCS] Reformats Profile API (#47168 ) * [DOCS] Reformats Profile API. * [DOCS] Fixes failing docs test.	2019-09-27 10:34:30 +02:00
István Zoltán Szabó	36502b2460	[DOCS] Reformats ranking evaluation API (#46974 ) * [DOCS] Reformats ranking evaluation API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-25 14:55:09 +02:00
István Zoltán Szabó	69422b97cf	[DOCS] Reformat suggesters page. (#47010 )	2019-09-25 14:38:47 +02:00
Alan Woodward	c1f99e2d75	Remove `_type` from SearchHit (#46942 ) This commit removes the `_type` field from all search hit responses. Relates to #41059	2019-09-23 19:14:54 +01:00
Alan Woodward	b733f9e803	Remove types from explain API (#46926 ) We no longer need a type to get the source of a document, so we can remove it from the explain API as well. Relates to #41059	2019-09-23 17:55:09 +01:00
István Zoltán Szabó	5dc4dc6e2e	[DOCS] Reformats Field capabilities API (#46866 ) * [DOCS] Reformats Field capabilities API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-20 11:24:44 +02:00
István Zoltán Szabó	b256462bef	[DOCS] Reformats explain API (#46857 ) * [DOCS] Reformats explain API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-20 10:59:11 +02:00
James Rodewig	9f65af989d	[DOCS] Remove `lowercase_terms` parm from term suggester docs (#46879 )	2019-09-19 15:56:24 -04:00
Takumasa Ochi	8b764a5209	Fix typos in `match` in profile API (#46723 ) * Replace `matches` with correct `match` * Use present tense consistently * Replace `metric` with correct `match`	2019-09-19 16:05:46 +02:00
István Zoltán Szabó	e0b19a8ae0	[DOCS] Reformats validate API (#46389 ) * [DOCS] Reformats validate API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-18 14:29:48 +02:00
István Zoltán Szabó	4e11a19371	[DOCS] Reformats count API (#46377 ) * [DOCS] Reformats count API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-17 09:53:03 +02:00
James Rodewig	5c78f606c2	[DOCS] Change // CONSOLE comments to [source,console] (#46440 )	2019-09-09 10:45:37 -04:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449 )	2019-09-06 14:05:36 -04:00
James Rodewig	466c59a4a7	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295 )	2019-09-05 16:47:18 -04:00
James Rodewig	f5827ba0ae	[DOCS] Replace "// CONSOLE" comments with [source,console] (#46159 )	2019-09-04 12:51:02 -04:00
István Zoltán Szabó	4a0713aa0b	[DOCS] Reformats search template and multi search template APIs (#46236 ) * [DOCS] Reformats search template and multi search template APIs. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-04 15:12:49 +02:00
István Zoltán Szabó	ded27911dd	[DOCS] Reformats search shards API (#46240 ) * [DOCS] Reformats search shards API Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-04 11:34:30 +02:00
István Zoltán Szabó	c5c033cc1f	[DOCS] Reformats request body search API (#46254 ) * [DOCS] Reformats request body search API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-04 10:52:17 +02:00
István Zoltán Szabó	f6466f4840	[DOCS] Reformats multi search API (#46256 ) * [DOCS] Reformats multi search API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-04 10:14:30 +02:00
István Zoltán Szabó	a6e915b05a	[DOCS] Reformats URI search request (#45844 ) * [DOCS] Reformats URI search request. Co-Authored-By: James Rodewig <james.rodewig@elastic.co> Co-Authored-By: debadair <debadair@elastic.co>	2019-08-29 10:04:49 +02:00
James Rodewig	46d7849032	Change `{var}` convention to `<var>` (#45904 )	2019-08-23 10:57:20 -04:00
Nathan Howard	df51be533f	Adding a warning to from-size.asciidoc Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers about this behavior while still highlighting alternative solutions.	2019-08-22 19:07:14 -07:00
István Zoltán Szabó	912d740802	[DOCS] Reformats search API (#45786 ) Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-08-22 15:04:20 +02:00
James Rodewig	26323f0db3	[DOCS] Add template docs to scripts. Reorder template examples. (#45817 ) * [DOCS] Add template docs to scripts. Reorder template examples. * Adds a 'Search template' section to the 'How to use scripts' chapter. This links to the 'Search template' chapter for detailed info and examples. * Reorders and retitles several examples in the 'Search template' chapter. This is primarily to make examples for storing, deleting, and using search templates more prominent. * Change <templatename> to <templateid>	2019-08-22 08:40:09 -04:00
Jonathan Hult	1930267809	[DOCS] Fix typo in highlighting doc (#45707 )	2019-08-20 07:27:27 -04:00
James Rodewig	66b8261e1b	[DOCS] Add diagrams to cross-cluster search documentation (#45569 )	2019-08-15 10:59:58 -04:00
Emmanuel DEMEY	4e8a15ddfa	Add snippet for the search_type query parameter (#43540 )	2019-08-11 18:33:42 -04:00
Jesse Wright	3e7df14fc1	[Docs] Fix typo in rank-eval.asciidoc (#44978 )	2019-07-31 12:38:26 +02:00
James Rodewig	fab98dfa55	[DOCS] Remove heading offsets for REST APIs (#44568 ) Several files in the REST APIs nav section are included using :leveloffset: tags. This increments headings (h2 -> h3, h3 -> h4, etc.) in those files and removes the :leveloffset: tags. Other supporting changes: * Alphabetizes top-level REST API nav items. * Change 'indices APIs' heading to 'index APIs.' * Changes 'Snapshot lifecycle management' heading to sentence case.	2019-07-19 14:35:36 -04:00
James Rodewig	ea1adb61c2	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:16:35 -04:00
James Rodewig	724769071d	[DOCS] Move Elasticsearch APIs to REST APIs section. (#44238 ) (#44372 ) Moves the following API sections under the REST APIs navigation: - API Conventions - Document APIs - Search APIs - Index APIs (previously named Indices APIs) - cat APIs - Cluster APIs Other supporting changes: - Removes the previous index APIs page under REST APIs. Adds a redirect for the removed page. - Removes several [partintro] macros so the docs build correctly. - Changes anchors for pages that become sections of a parent page. - Adds several redirects for existing pages that become sections of a parent page. This commit re-applies changes from #44238. Changes from that PR were reverted due to broken links in several repos. This commit adds redirects for those broken links.	2019-07-17 08:49:22 -04:00
Julie Tibshirani	af0d951993	Correct a formatting mistake in the _field_caps docs. (#44303 ) The 'indices' block that was recently added should appear in the top-level of the response, as opposed to being nested under 'fields'.	2019-07-15 09:44:25 -07:00
John Murphy	8a5a01fc12	[DOCS] Add `lowercase` filter to phrase suggester example so searches are case insensitive (#44186 )	2019-07-11 15:08:22 -04:00
Jim Ferenczi	a614415838	Remove deprecated sort options: nested_path and nested_filter (#42809 ) This commit removes the nested_path and nested_filter options deprecated in 6.x. This change also checks that the sort field has a [nested] option if it is under a nested object and throws an exception if it's not the case. Closes #27098	2019-06-27 17:30:02 +02:00
Tal Levy	13dde65e75	specifies which index to search in docs for various queries (#43307 ) the geo-bounding-box and phrase-suggest docs were susceptible to failing due to other indices in the cluster. This change restricts the queries to the index that is set up for the test. relates to #43271.	2019-06-18 08:18:50 -07:00
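
The segment-skipping idea described in commits fa8b48deef and 79d9b365c4 (sort segments by their min/max value, then skip segments that cannot contribute to the top hits) can be illustrated with a small sketch. This is not the Lucene or Elasticsearch implementation: the `Segment` class and `top_k_desc` function are hypothetical names invented for this example, and a plain in-memory heap stands in for the collectors.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """Hypothetical stand-in for an index segment holding one numeric field."""
    name: str
    values: List[int]

    @property
    def max_value(self) -> int:
        return max(self.values)


def top_k_desc(segments: List[Segment], k: int) -> Tuple[List[int], List[str]]:
    """Return the k largest values and the names of the segments that were skipped."""
    # Visit segments with the largest maximum first, so the top-k fills quickly.
    ordered = sorted(segments, key=lambda s: s.max_value, reverse=True)
    heap: List[int] = []  # min-heap holding the current top-k values
    skipped: List[str] = []
    for seg in ordered:
        # Once k hits are collected, a segment whose maximum cannot beat the
        # weakest collected hit is non-competitive and is skipped entirely.
        if len(heap) == k and seg.max_value <= heap[0]:
            skipped.append(seg.name)
            continue
        for v in seg.values:
            if len(heap) < k:
                heapq.heappush(heap, v)
            elif v > heap[0]:
                heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True), skipped


segments = [
    Segment("old", [1, 2, 3, 4]),
    Segment("mid", [10, 11, 12]),
    Segment("new", [20, 21, 22, 23]),
]
top, skipped = top_k_desc(segments, k=3)
print(top, skipped)
```

In this toy data the newest segment alone fills the top 3, so the other two segments are never scanned. That mirrors why the benchmark in 79d9b365c4 indexed documents in timestamp order.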

863 Commits
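
The 32-bit float precision limit documented in commit 1e45db49ec can be seen with a short sketch. It assumes only that a score is stored as an IEEE 754 single-precision value, as the commit message states. The helper `to_float32` is a hypothetical name for this example and uses Python's standard `struct` module to round a 64-bit float to 32 bits.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


# 2**24 + 1 needs 25 significant bits; a 32-bit float keeps only 24.
exact = 16_777_217
as_score = to_float32(exact)
print(exact, as_score)
```

A script that returns a large integer-like value, such as a timestamp in milliseconds, would lose its low-order digits in the same way once it becomes a `_score`.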