elasticsearch

Commit Graph

Author	SHA1	Message	Date
Lisa Cawley	362ce41eaf	[DOCS] Updates ML links (#50387 )	2019-12-19 14:47:28 -08:00
István Zoltán Szabó	b8cae37374	[DOCS] Adds inference processor documentation (#50204 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-12-19 12:19:44 +01:00
Lee Hinman	5adbf67c08	Add ILM history store index (#50287 ) * Add ILM history store index This commit adds an ILM history store that tracks the lifecycle execution state as an index progresses through its ILM policy. ILM history documents store output similar to what the ILM explain API returns. An example document with ALL fields (not all documents will have all fields) would look like: ```json { "@timestamp": 1203012389, "policy": "my-ilm-policy", "index": "index-2019.1.1-000023", "index_age":123120, "success": true, "state": { "phase": "warm", "action": "allocate", "step": "ERROR", "failed_step": "update-settings", "is_auto-retryable_error": true, "creation_date": 12389012039, "phase_time": 12908389120, "action_time": 1283901209, "step_time": 123904107140, "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}", "step_info": "{... etc step info here as json ...}" }, "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace" } ``` These documents go into the `ilm-history-1-00000N` index to provide an audit trail of the operations ILM has performed. This history storage is enabled by default but can be disabled by setting `index.lifecycle.history_index_enabled` to `false`. Resolves #49180	2019-12-18 16:09:59 -07:00
James Rodewig	b8a62ce8f7	[DOCS] Document `thread_pool` node stats (#50330 )	2019-12-18 16:57:38 -05:00
lcawl	d8a94f0397	[DOCS] Fixes security links	2019-12-18 11:51:03 -08:00
Lisa Cawley	68e02a19d8	[DOCS] Move machine learning results definitions into APIs (#50257 )	2019-12-18 09:50:31 -08:00
Igor Motov	a26e4d1e5e	Geo: Switch generated WKT to upper case (#50285 ) Switches generated WKT to upper case to conform to the standard recommendation. Relates #49568	2019-12-18 07:28:56 -10:00
James Rodewig	a762c29dcf	[DOCS] Clarify frozen indices are read-only (#50318 ) The freeze index API docs state that frozen indices are blocked for write operations. While this implies frozen indices are read-only, it does not explicitly use the term "read-only", which is found in other docs, such as the force merge docs. This adds the "read-only" term to the freeze index API docs as well as other clarification.	2019-12-18 12:17:41 -05:00
Christoph Büscher	7f90ff64a3	[Docs] Remove `intervals` filter rule from allowed top-level rules (#50320 ) The `filter` rule is not allowed on the top-level of the query, so removing it from the list of allowed rules. Where it can be nested inside other rules, those rules already mention it.	2019-12-18 17:35:35 +01:00
Adrien Grand	2d627ba757	Add per-field metadata. (#49419 ) This PR adds per-field metadata that can be set in the mappings and is later returned by the field capabilities API. This metadata is completely opaque to Elasticsearch but may be used by tools that index data in Elasticsearch to communicate metadata about fields with tools that then search this data. A typical example that has been requested in the past is the ability to attach a unit to a numeric field. In order to not bloat the cluster state, Elasticsearch requires that this metadata be small: - keys can't be longer than 20 chars, - values can only be numbers or strings of no more than 50 chars - no inner arrays or objects, - the metadata can't have more than 5 keys in total. Given that metadata is opaque to Elasticsearch, field capabilities don't try to do anything smart when merging metadata about multiple indices, the union of all field metadatas is returned. Here is what the meta might look like in mappings: ```json { "properties": { "latency": { "type": "long", "meta": { "unit": "ms" } } } } ``` And then in the field capabilities response: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms" ] } } } } ``` When there are no conflicts, values are arrays of size 1, but when there are conflicts, Elasticsearch includes all unique values in this array, without giving ways to know which index has which metadata value: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms", "ns" ] } } } } ``` Closes #33267	2019-12-18 17:27:38 +01:00
Kevin Woblick	77d94caa70	[DOCS] Add warning about Docker port exposure (#50169 ) Docker bypasses the Uncomplicated Firewall (UFW) on Linux by editing the `iptables` config directly, which leads to the exposure of port 9200, even if you blocked it via UFW. This adds a warning along with work-arounds to the docs. Signed-off-by: Kovah <mail@kovah.de>	2019-12-18 09:03:44 -05:00
István Zoltán Szabó	50e26d40a2	[DOCS] Adds GET, GET stats and DELETE inference APIs (#50224 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-12-18 09:10:12 +01:00
Lisa Cawley	6d608e6a0d	[DOCS] Move transform resource definitions into APIs (#50108 )	2019-12-17 09:01:31 -08:00
James Rodewig	230d4765d3	[DOCS] Add identifier mapping tip to numeric and keyword datatype docs (#49933 ) Users often mistakenly map numeric IDs to numeric datatypes. However, this is often slow for the `term` and other term-level queries. The "Tune for search speed" docs includes advice for mapping numeric IDs to `keyword` fields. However, this tip is not included in the `numeric` or `keyword` field datatype doc pages. This rewords the tip in the "Tune for search speed" docs, relocates it to the `numeric` field docs, and reuses it using tagged regions.	2019-12-17 09:31:07 -05:00
Jim Ferenczi	804a5042e7	Optimize composite aggregation based on index sorting (#48399 ) Co-authored-by: Daniel Huang <danielhuang@tencent.com> This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification. In such case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue. The optimization is also applicable when a query other than match_all is provided. However the optimization is deactivated for sources that match the index sort in the following cases: * Multi-valued source, in such case early termination is not possible. * missing_bucket is set to true	2019-12-17 14:02:06 +01:00
Lisa Cawley	207094cd67	[DOCS] Moves model snapshot resource definitions into APIs (#50157 ) Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>	2019-12-16 10:42:30 -08:00
Ignacio Vera	7c559be31c	"CONTAINS" support for BKD-backed geo_shape and shape fields (#50141 ) Lucene 8.4 added support for "CONTAINS", therefore in this commit those changes are integrated in Elasticsearch. This commit contains as well a bug fix when querying with a geometry collection with "DISJOINT" relation.	2019-12-16 07:43:42 +01:00
James Rodewig	6749b63ace	[DOCS] Add `index-extra-title-page.html` for direct HTML migration (#50189 )	2019-12-13 12:44:12 -05:00
Tim Brooks	0cedb9e251	Update remote cluster stats to support simple mode (#49961 ) Remote cluster stats API currently only returns useful information if the strategy in use is the SNIFF mode. This PR modifies the API to provide relevant information if the user is in the SIMPLE mode. This information is the configured addresses, max socket connections, and open socket connections.	2019-12-13 09:16:53 -07:00
James Rodewig	9907b0aab8	[DOCS] Reformat token count limit filter docs (#49835 )	2019-12-13 08:43:35 -05:00
James Rodewig	ff2259bf81	[DOCS] Document JVM node stats (#49500 ) * [DOCS] Document JVM node stats Documents the `jvm` parameters returned by the `_nodes/stats` API. Co-Authored-By: James Baiera <james.baiera@gmail.com>	2019-12-12 15:41:20 -05:00
Lisa Cawley	d442ff9223	[DOCS] Updates transform screenshots and text (#50059 )	2019-12-12 08:20:39 -08:00
James Rodewig	4dfc07c922	[DOCS] Reformat lowercase token filter docs (#49935 )	2019-12-12 09:39:06 -05:00
James Rodewig	2d9ee5ddfe	[DOCS] Correct percentile rank agg example response (#50052 ) The example snippets in the percentile rank agg docs use a test dataset named `latency`, which is generated from docs/gradle.build. At some point the dataset and example snippets were updated, but the text surrounding the snippets was not. This means the text and the example snippets shown no longer match up. This corrects that by changing the snippets using /TESTRESPONSE magic comments.	2019-12-12 08:38:48 -05:00
István Zoltán Szabó	3857e3d94f	[DOCS] Moves data frame analytics job resource definitions into APIs (#50021 )	2019-12-12 10:59:37 +01:00
Lisa Cawley	ca482127fa	[DOCS] Move job count resource definitions into API (#50057 ) Co-Authored-By: Przemysław Witek <przemyslaw.witek@elastic.co> Co-Authored-By: David Roberts <dave.roberts@elastic.co> Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>	2019-12-11 11:17:15 -08:00
Lisa Cawley	3d96e6b68e	[DOCS] Move datafeed resource definitions into APIs (#50005 ) Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>	2019-12-11 09:50:41 -08:00
Przemko Robakowski	64e1a774fc	CSV ingest processor (#49509 ) * CSV Processor for Ingest This change adds new ingest processor that breaks line from CSV file into separate fields. By default it conforms to RFC 4180 but can be tweaked. Closes #49113	2019-12-11 14:52:04 +01:00
Patryk Krawaczyński	de4f701a19	[DOCS] Document `index.queries.cache.enabled` as a static setting (#49886 )	2019-12-10 14:23:14 -05:00
Peter Johnson	263f5bd6b6	[Docs] Fix typo in function-score-query.asciidoc (#50030 )	2019-12-10 17:33:36 +01:00
Adrien Grand	1329acc094	Upgrade to lucene 8.4.0-snapshot-662c455. (#50016 ) Lucene 8.4 is about to be released so we should check it doesn't cause problems with Elasticsearch.	2019-12-10 17:09:36 +01:00
Lisa Cawley	3e6dc03de6	[DOCS] Removes realm type security setting (#50001 )	2019-12-10 08:03:43 -08:00
James Rodewig	0062d5f301	[DOCS] Remove shadow replica reference (#50029 ) Removes a reference to shadow replicas from the cat shards API docs and a comment in cluster/routing/UnassignedInfo.java. Shadow replicas were removed with #23906.	2019-12-10 09:30:04 -05:00
Dimitris Athanasiou	269425b54d	[ML] Introduce randomize_seed setting for regression and classification (#49990 ) This adds a new `randomize_seed` for regression and classification. When not explicitly set, the seed is randomly generated. One can reuse the seed in a similar job in order to ensure the same docs are picked for training.	2019-12-10 10:22:53 +02:00
James Rodewig	e520b85675	[DOCS] Skip synced flush docs tests (#49986 ) The current snippets in the synced flush docs can cause conflicts with other background syncs, such as the global checkpoint sync or retention lease sync, in the docs tests. This skips tests for those snippets to avoid conflicts.	2019-12-09 13:16:16 -05:00
Artur Carvalho	c21eb986b2	[Docs] Fix typo in getting-started.asciidoc (#49985 )	2019-12-09 16:25:21 +01:00
James Rodewig	4415f1a536	[DOCS] Correct inline shape snippets in shape query docs (#49921 ) In the shape query docs, the index mapping snippet uses the "geometry" shape field mapping. However, the doc index snippet uses the "location" property. This changes the "location" property to "geometry". It also adds a comment containing the search result snippet. This should prevent similar issues in the future.	2019-12-09 08:39:17 -05:00
Ryan Ernst	59a571edd5	Fix incorrect use of multiline NOTE in rpm docs (#49962 ) This was a copy/paste error from #49893. This commit converts the NOTE to use inline style instead of one needing closing linebreak.	2019-12-06 17:43:12 -08:00
Ryan Ernst	16a7a04664	Disable repo configuration for rpm based systems (#49893 ) This commit changes the recommended repository file for rpm based systems to be disabled by default. This is a safer practice so upgrades of the system do not accidentally upgrade elasticsearch itself. closes #30660	2019-12-06 15:54:30 -08:00
Lisa Cawley	0f51bc2f72	[DOCS] Move anomaly detection job resource definitions into APIs (#49700 ) Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>	2019-12-06 15:32:07 -08:00
Przemko Robakowski	c57032f622	Allow list of IPs in geoip ingest processor (#49573 ) * Allow list of IPs in geoip ingest processor This change lets you use array of IPs in addition to string in geoip processor source field. It will set array containing geoip data for each element in source, unless first_only parameter option is enabled, then only first found will be returned. Closes #46193	2019-12-06 21:57:06 +01:00
István Zoltán Szabó	e5d512a8ed	[DOCS] Fixes classification evaluation example response. (#49905 )	2019-12-06 13:24:22 +01:00
István Zoltán Szabó	37cc0b6c9e	[DOCS] Fixes attribute in transforms overview. (#49898 )	2019-12-06 10:23:01 +01:00
Hendrik Muhs	25474c62f1	[Transform][DOCS]rewrite client ip example to use continuous transform (#49822 ) adapt the transform example for suspicious client ips to use continuous transform	2019-12-06 08:19:21 +01:00
Orhan Toy	f002cd1b6f	[DOCS] Minor typo fixes in reindex.asciidoc (#49863 )	2019-12-05 20:24:22 +01:00
István Zoltán Szabó	f7a5b73972	[DOCS] Adds an example of preprocessing actions to the PUT DFA API docs (#49831 )	2019-12-05 14:15:19 +01:00
István Zoltán Szabó	c793e80d3b	[DOCS] Fixes typo in the ML anomaly detection time functions docs. (#49834 )	2019-12-05 09:57:01 +01:00
James Rodewig	72dd49ddcc	[DOCS] Document `minimum_should_match` defaults for `bool` query (#48865 ) Adds documentation for the `minimum_should_match` parameter to the `bool` query docs. Includes docs for the default values: - `1` if the `bool` query includes at least one `should` clause and no `must` or `filter` clauses - `0` otherwise	2019-12-04 12:44:13 -05:00
James Rodewig	e964a97005	[DOCS] Reformat length token filter docs (#49805 ) * Adds a title abbreviation * Updates the description and adds a Lucene link * Reformats the parameters section * Adds analyze, custom analyzer, and custom filter snippets Relates to #44726.	2019-12-04 09:58:19 -05:00
Alexander Reelsen	062f9f03bf	Docs: Fix & test more grok processor documentation (#49447 ) The documentation contained a small error, as bytes and duration was not properly converted to a number and thus remained a string. The documentation is now also properly tested by providing a full blown simulate pipeline example.	2019-12-03 11:47:27 +01:00
James Rodewig	1a574115c1	[DOCS] Document CCR compatibility requirements (#49776 ) * Creates a prerequisites section in the cross-cluster replication (CCR) overview. * Adds concise definitions for local and remote cluster in a CCR context. * Documents that the ES version of the local cluster must be the same or a newer compatible version as the remote cluster.	2019-12-02 15:52:13 -05:00
James Rodewig	6ea54eecf0	[DOCS] Reformat keep types and keep words token filter docs (#49604 ) * Adds title abbreviations * Updates the descriptions and adds Lucene links * Reformats parameter definitions * Adds analyze and custom analyzer snippets * Adds explanations of token types to keep types token filter and tokenizer docs	2019-12-02 09:22:21 -05:00
James Rodewig	37baa50815	[DOCS] Explicitly document enrich `target_field` includes `match_field` (#49407 ) When the enrich processor appends enrich data to an incoming document, it adds a `target_field` to contain the enrich data. This `target_field` contains both the `match_field` AND `enrich_fields` specified in the enrich policy. Previously, this was reflected in the documented example but not explicitly stated. This adds several explicit statements to the docs.	2019-12-02 09:12:21 -05:00
Andrei Stefan	ce727615c0	SQL: handle NULL arithmetic operations with INTERVALs (#49633 )	2019-12-02 16:05:05 +02:00
David Turner	69e0b1a0f4	Drop snapshot instructions for autobootstrap fix (#49755 ) The "Restore any snapshots as required" step is a trap: it's somewhere between tricky and impossible to restore multiple clusters into a single one. Also add a note about configuring discovery during a rolling upgrade to proscribe any rare cases where you might accidentally autobootstrap during the upgrade.	2019-12-02 12:43:18 +00:00
Tugberk Ugurlu	ac07248402	[Docs] Fix typo in templates.asciidoc (#49726 )	2019-11-29 18:43:45 +01:00
Henning Andersen	5b56a990b0	Deprecate sorting in reindex (#49458 ) Reindex sort never gave a guarantee about the order of documents being indexed into the destination, though it could give a sense of locality of source data. It prevents us from doing resilient reindex and other optimizations and it has therefore been deprecated. Related to #47567	2019-11-29 17:46:44 +01:00
Dimitris Athanasiou	bad07b76f7	[ML] Add optional source filtering during data frame reindexing (#49690 ) This adds a `_source` setting under the `source` setting of a data frame analytics config. The new `_source` is reusing the structure of a `FetchSourceContext` like `analyzed_fields` does. Specifying includes and excludes for source allows selecting which fields will get reindexed and will be available in the destination index. Closes #49531	2019-11-29 14:20:31 +02:00
Marios Trivyzas	8ca11f54cd	[Docs] Enhance rolling upgrade guide (#49686 ) Add a couple of pointers for the user to check the overall cluster health and the version of ES running on every node. Fixes: #49670	2019-11-28 17:00:41 +01:00
Ignacio Vera	eade4f03f4	New Histogram field mapper that supports percentiles aggregations. (#48580 ) This commit adds a new histogram field mapper that consists in a pre-aggregated format of numerical data to be used in percentiles aggregations.	2019-11-28 13:58:20 +01:00
Ryan Ernst	6c54b38a1b	Remove legacy reference to file scripts (#49339 ) This commit removes outdated documentation about a path setting for file scripts which no longer exist. closes #45827	2019-11-27 10:42:15 -08:00
Ryan Ernst	0042500026	Add JAVA_HOME env override location to docs (#49565 ) This commit clarifies how to override JAVA_HOME from the bundled jdk for deb and rpm installs, which each have their own file that is sourced upon service startup. closes #49068	2019-11-27 10:39:54 -08:00
Xiang Dai	7a7d15ba0b	[DOCS] Clarify how to update max memory size in bootstrap checks (#48975 )	2019-11-27 09:39:34 -05:00
bellengao	00cef95a77	[DOCS] Correct the request path for flush API docs (#49615 )	2019-11-27 09:26:57 -05:00
Yannick Welsch	a5f23758a1	[DOCS] Correct request path for synced flush API docs (#49631 ) Fixes an incorrect request path added with #46634	2019-11-27 08:43:21 -05:00
glerb	815ea928b2	[Docs] Correct typo in log file name (#49620 )	2019-11-27 14:38:19 +01:00
Martijn van Groningen	88aea2107d	Add templating support to pipeline processor. (#49030 ) This commit adds templating support to the pipeline processor's `name` option. Closes #39955	2019-11-27 13:45:11 +01:00
Jim Ferenczi	c2deb287f1	Add a cluster setting to disallow loading fielddata on _id field (#49166 ) This change adds a dynamic cluster setting named `indices.id_field_data.enabled`. When set to `false` any attempt to load the fielddata for the `_id` field will fail with an exception. The default value in this change is set to `false` in order to prevent fielddata usage on this field for future versions but it will be set to `true` when backporting to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472 is implemented. Closes #43599	2019-11-27 13:38:09 +01:00
Dimitrios Liappis	1c9efba809	Clarify gid used by docker image process and bind-mount method Fix reference about the uid:gid that Elasticsearch runs as inside the Docker container and add a packaging test to ensure that bind mounting a data dir with a random uid and gid:0 works as expected. Relates #49529 Closes #47929	2019-11-27 10:36:30 +02:00
Martijn van Groningen	4013e814e8	Add templating support to enrich processor (#49093 ) Adds support for templating to `field` and `target_field` options.	2019-11-27 07:52:42 +01:00
lcawl	3b3f3ca925	[DOCS] Fixes typo in ML resources	2019-11-26 10:28:18 -08:00
lcawl	63b944c00f	[DOCS] Fixes data type formatting	2019-11-26 08:21:39 -08:00
Mayya Sharipova	fa8b48deef	Optimize sort on numeric long and date fields. This rewrites long sort as a `DistanceFeatureQuery`, which can efficiently skip non-competitive blocks and segments of documents. Depending on the dataset, the speedups can be 2 - 10 times. The optimization can be disabled with setting the system property `es.search.rewrite_sort` to `false`. Optimization is skipped when an index has 50% or more data with the same value. Optimization is done through: 1. Rewriting sort as `DistanceFeatureQuery` which can efficiently skip non-competitive blocks and segments of documents. 2. Sorting segments according to the primary numeric sort field(#44021) This allows to skip non-competitive segments. 3. Using collector manager. When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. We use collectorManager, where for every segment a dedicated collector will be created. 4. Using Lucene's shared TopFieldCollector manager This collector manager is able to exchange minimum competitive score between collectors, which allows us to efficiently skip the whole segments that don't contain competitive scores. 5. When index is force merged to a single segment, #48533 interleaving old and new segments allows for this optimization as well, as blocks with non-competitive docs can be skipped. Closes #37043 Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>	2019-11-26 09:24:25 -05:00
Mayya Sharipova	e9ba252176	Revert "Optimize sort on long field (#48804 )" This reverts commit `79d9b365c4`.	2019-11-26 09:23:27 -05:00
Mayya Sharipova	79d9b365c4	Optimize sort on long field (#48804 ) * Optimize sort on numeric long and date fields (#39770) Optimize sort on numeric long and date fields, when the system property `es.search.long_sort_optimized` is true. * Skip optimization if the index has duplicate data (#43121) Skip sort optimization if the index has 50% or more data with the same value. When index has a lot of docs with the same value, sort optimization doesn't make sense, as DistanceFeatureQuery will produce same scores for these docs, and Lucene will use the second sort to tie-break. This could be slower than usual sorting. * Sort leaves on search according to the primary numeric sort field (#44021) This change pre-sort the index reader leaves (segment) prior to search when the primary sort is a numeric field eligible to the distance feature optimization. It also adds a tie breaker on `_doc` to the rewritten sort in order to bypass the fact that leaves will be collected in a random order. I ran this patch on the http_logs benchmark and the results are very promising: ``` | 50th percentile latency | desc_sort_timestamp | 220.706 | 136544 | 136324 | ms | | 90th percentile latency | desc_sort_timestamp | 244.847 | 162084 | 161839 | ms | | 99th percentile latency | desc_sort_timestamp | 316.627 | 172005 | 171688 | ms | | 100th percentile latency | desc_sort_timestamp | 335.306 | 173325 | 172989 | ms | | 50th percentile service time | desc_sort_timestamp | 218.369 | 1968.11 | 1749.74 | ms | | 90th percentile service time | desc_sort_timestamp | 244.182 | 2447.2 | 2203.02 | ms | | 99th percentile service time | desc_sort_timestamp | 313.176 | 2950.85 | 2637.67 | ms | | 100th percentile service time | desc_sort_timestamp | 332.924 | 2959.38 | 2626.45 | ms | | error rate | desc_sort_timestamp | 0 | 0 | 0 | % | | Min Throughput | asc_sort_timestamp | 0.801824 | 0.800855 | -0.00097 | ops/s | | Median Throughput | asc_sort_timestamp | 0.802595 | 0.801104 | -0.00149 | ops/s | | Max Throughput | asc_sort_timestamp | 0.803282 | 0.801351 | -0.00193 | ops/s | | 50th percentile latency | asc_sort_timestamp | 220.761 | 824.098 | 603.336 | ms | | 90th percentile latency | asc_sort_timestamp | 251.741 | 853.984 | 602.243 | ms | | 99th percentile latency | asc_sort_timestamp | 368.761 | 893.943 | 525.182 | ms | | 100th percentile latency | asc_sort_timestamp | 431.042 | 908.85 | 477.808 | ms | | 50th percentile service time | asc_sort_timestamp | 218.547 | 820.757 | 602.211 | ms | | 90th percentile service time | asc_sort_timestamp | 249.578 | 849.886 | 600.308 | ms | | 99th percentile service time | asc_sort_timestamp | 366.317 | 888.894 | 522.577 | ms | | 100th percentile service time | asc_sort_timestamp | 430.952 | 908.401 | 477.45 | ms | | error rate | asc_sort_timestamp | 0 | 0 | 0 | % | ``` So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note that I indexed the http_logs with a single client in order to simulate real time-based indices where document are indexed in their timestamp order. Relates #37043 * Remove nested collector in docs response As we don't use cancellableCollector anymore, it should be removed from the expected docs response. * Use collector manager for search when necessary (#45829) When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. Thus for such a case, we use collectorManager, where for every segment a dedicated collector will be created. * Use shared TopFieldCollector manager Use shared TopFieldCollector manager for sort optimization. This collector manager is able to exchange minimum competitive score between collectors * Correct calculation of avg value to avoid overflow * Optimize calculating if index has duplicate data	2019-11-26 09:07:39 -05:00
Martijn van Groningen	2ba00c8149	Introduce on_failure_pipeline ingest metadata inside on_failure block (#49076 ) In case an exception occurs inside a pipeline processor, the pipeline stack is kept around as header in the exception. Then in the on_failure processor the id of the pipeline the exception occurred is made accessible via the `on_failure_pipeline` ingest metadata. Closes #44920	2019-11-26 14:49:51 +01:00
Marios Trivyzas	f2aa7f0779	SQL: Add TRUNC alias for TRUNCATE (#49571 ) Add TRUNC as alias to already implemented TRUNCATE numeric function which is the flavour supported by Oracle and PostgreSQL. Relates to: #41195	2019-11-26 12:30:49 +01:00
Christoph Büscher	0162662eb8	[Docs] Correct `max_doc_freq` default value (#49536 ) The default is set to Integer.MAX_VALUE but is reported to be `0` in the docs. With the current implementation a value of 0 would mean all terms are filtered out, which is the opposite of "unbounded". Closes #49520	2019-11-26 10:46:44 +01:00
Lisa Cawley	9cc247d929	[DOCS] Fixes security links (#49563 )	2019-11-25 12:59:59 -08:00
James Rodewig	1471f34c54	[DOCS] Reformat delimited payload token filter docs (#49380 ) * Adds a title abbreviation * Relocates the older name deprecation warning * Updates the description and adds a Lucene link * Adds a note to explain payloads and how to store them * Adds analyze and custom analyzer snippets * Adds a 'Return stored payloads' example	2019-11-25 15:38:52 -05:00
James Rodewig	25ffce9391	[DOCS] Remove individual task retrieval from cat/tasks API (#49550 )	2019-11-25 10:31:13 -05:00
Kelly Campbell	1542a728c9	[DOCS] Correct GET path in cat tasks API docs (#49494 ) Previously, the request example included `GET _cat/_tasks`. However, the resource should be `tasks`, not `_tasks`.	2019-11-25 09:37:17 -05:00
Przemko Robakowski	04f6b6fdb2	[DOCS] IDs for doc snippets (#49008 ) * Ids for docs snippets * Ids for tests * Ids for docs snippets * ignoring build folder from idea * Ignoring build-eclipse	2019-11-25 15:30:00 +01:00
David Roberts	40c951d781	[ML] Add default categorization analyzer definition to ML info (#49545 ) The categorization job wizard in the ML UI will use this information when showing the effect of the chosen categorization analyzer on a sample of input.	2019-11-25 13:20:12 +00:00
Dimitris Athanasiou	5a6967af57	[ML][DOCS] Anomaly detection job retention days settings do not require restart (#49546 )	2019-11-25 15:12:41 +02:00
James Rodewig	642390c3a7	[DOCS] Fix edge n-gram tokenizer nav Adds a missing float tag to the edge n-gram tokenizer docs. This tag ensures the edge n-gram tokenizer docs display on the same page.	2019-11-22 15:51:52 -05:00
Jason Tedor	da20957e81	Replace required pipeline with final pipeline (#49470 ) This commit enhances the required pipeline functionality by changing it so that default/request pipelines can also be executed, but the required pipeline is always executed last. This gives users the flexibility to execute their own indexing pipelines, but also ensure that any required pipelines are also executed. Since such pipelines are executed last, we change the name of required pipelines to final pipelines.	2019-11-22 14:00:38 -05:00
Dimitris Athanasiou	0390ec3627	[ML] Explain data frame analytics API (#49455 ) This commit replaces the _estimate_memory_usage API with a new API, the _explain API. The API consolidates information that is useful before creating a data frame analytics job. It includes: - memory estimation - field selection explanation Memory estimation is moved here from what was previously calculated in the _estimate_memory_usage API. Field selection is a new feature that explains to the user whether each available field was selected to be included or not in the analysis. In the case it was not included, it also explains the reason why.	2019-11-22 20:08:14 +02:00
Lisa Cawley	a4efab6ab4	[DOCS] Merge rollup config details into API (#49412 )	2019-11-22 08:31:30 -08:00
James Rodewig	ddf5c0a76a	[DOCS] Reformat n-gram token filter docs (#49438 ) Reformats the edge n-gram and n-gram token filter docs. Changes include: * Adds title abbreviations * Updates the descriptions and adds Lucene links * Reformats parameter definitions * Adds analyze and custom analyzer snippets * Adds notes explaining differences between the edge n-gram and n-gram filters Additional changes: * Switches titles to use "n-gram" throughout. * Fixes a typo in the edge n-gram tokenizer docs * Adds an explicit anchor for the `index.max_ngram_diff` setting	2019-11-22 10:38:01 -05:00
István Zoltán Szabó	56888ff194	[DOCS] Removes the default size definition of thread pool types (#49442 ) Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-11-22 11:15:35 +01:00
István Zoltán Szabó	6dad2662ae	[DOCS] Removes data frame leftovers from transforms overview (#49434 )	2019-11-22 09:13:55 +01:00
James Rodewig	eca600326f	[DOCS] Document several missing thread pools (#48543 ) Adds documentation for the following thread pools: - fetch_shard_started - fetch_shard_store - flush - force_merge - management Closes #48524 Co-Authored-By: Jay Modi <jaymode@users.noreply.github.com>	2019-11-21 13:05:53 -05:00
Hendrik Muhs	65023eaf67	update the name of the audit index (#49432 ) small update to the name of the audit index changed in 7.5	2019-11-21 16:14:41 +01:00
James Rodewig	4db330d9e9	[DOCS] Replace cross-cluster search PNG images with SVGs (#49395 )	2019-11-21 09:05:33 -05:00
James Rodewig	1e45db49ec	[DOCS] Document `script_score` float precision limit (#49402 ) All document scores are positive 32-bit floating point numbers. However, this wasn't previously documented. This can result in surprising behavior, such as precision loss, for users when customizing scores using the function score query. This commit updates an existing admonition in the function score query docs to document the 32-bits precision limit. It also updates the search API reference docs to note that `_score` is a 32-bit float.	2019-11-21 08:53:56 -05:00
weizijun	22042cc199	Document all shard allocation filtering attributes (#46992 ) This commit adds coverage to the docs for some missing built-in shard allocation attributes.	2019-11-21 08:29:45 -05:00
Lisa Cawley	468dff79f6	[DOCS] Reformat rollup API docs (#49397 )	2019-11-20 10:43:53 -08:00
Julie Tibshirani	548fcd09fb	Stop ignoring types warnings in REST tests. (#49333 ) In 7.x we added logic to the REST test harness to ignore warnings related to types removal. This allowed us to continue to run mixed-cluster tests that included 6.x nodes. Now that master is on 8.x, we no longer need to include 6.x nodes in testing and have removed almost all typed calls. The logic to ignore warnings can therefore be removed.	2019-11-20 08:23:56 -08:00
Lisa Cawley	ff2072e698	[DOCS] Reformat ILM API docs (#49348 )	2019-11-20 08:19:33 -08:00

6463 Commits