elasticsearch

Commit Graph

Author	SHA1	Message	Date
David Kyle	7daed3b8af	Pipeline Inference Aggregation (#58193 ) Adds a pipeline aggregation that loads a model and performs inference on the input aggregation results.	2020-07-02 14:33:02 +01:00
Nik Everett	32bdf8549b	Fail variable_width_histogram that collects from many (#58619 ) Adds an explicit check to `variable_width_histogram` to stop it from trying to collect from many buckets because it can't. I tried to make it do so but that is more than an afternoon's project, sadly. So for now we just disallow it. Relates to #42035	2020-06-30 15:42:46 -04:00
Nik Everett	dda78ff760	Docs: Mark variable_width_histogram experimental (#58574 ) We're tracking this aggregation's experimental-progress in #58573. We'd like a little time to be able to make backwards incompatible changes to the aggregation because we're not 100% sure about the request and response format yet.	2020-06-25 16:54:37 -04:00
James Dorfman	e99d287fbb	Add Variable Width Histogram Aggregation (#42035 ) Implements a new histogram aggregation called `variable_width_histogram` which dynamically determines bucket intervals based on document groupings. These groups are determined by running a one-pass clustering algorithm on each shard and then reducing each shard's clusters using an agglomerative clustering algorithm. This PR addresses #9572. The shard-level clustering is done in one pass to minimize memory overhead. The algorithm was lightly inspired by [this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches a small number of documents to sample the data and determine initial clusters. Subsequent documents are then placed into one of these clusters, or a new one if they are an outlier. This algorithm is described in more detail in the aggregation's docs. At reduce time, a [hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304) continually merges the closest buckets from all shards (based on their centroids) until the target number of buckets is reached. The final values produced by this aggregation are approximate. Each bucket's min value is used as its key in the histogram. Furthermore, buckets are merged based on their centroids and not their bounds. So it is possible that adjacent buckets will overlap after reduction. Because each bucket's key is its min, this overlap is not shown in the final histogram. However, when such overlap occurs, we set the key of the bucket with the larger centroid to the midpoint between its minimum and the smaller bucket’s maximum: `min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to increase the accuracy of the clustering. Nodes are unable to share centroids during the shard-level clustering phase. In the future, resolving https://github.com/elastic/elasticsearch/issues/50863 would let us solve this issue. It doesn’t make sense for this aggregation to support the `min_doc_count` parameter, since clusters are determined dynamically. The `order` parameter is not supported here to keep this large PR from becoming too complex.	2020-06-23 09:26:54 -04:00
Cris da Rocha	b5de14d3f6	Missing comma between value types (#58383 ) This applies to all versions of this document (7.7, 7.8, 7.x, current and master).	2020-06-19 23:01:25 +02:00
Tal Levy	c765993d82	add geo_shape documentation for supported aggregations (#58284 ) This commit adds documentation for geo_shape fields in aggregations Closes #55495.	2020-06-18 10:17:49 -07:00
James Rodewig	7826bbee87	[DOCS] Move search API's `docvalue_fields` examples (#57760 ) Changes: * Condenses and relocates the `docvalue_fields` example to the 'Run a search' page. * Adds docs for the `docvalue_fields` request body parameter. * Updates several related xrefs. Co-authored-by: debadair <debadair@elastic.co>	2020-06-11 10:57:15 -04:00
andrewjohnson2	a791d6723d	Added standard deviation / variance sampling to extended stats (#49782 ) Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface. Closes #49554 Co-authored-by: Igor Motov <igor@motovs.org>	2020-06-10 15:00:50 -04:00
James Rodewig	51e3d5ab63	[DOCS] Fix source filtering xrefs (#57720 )	2020-06-05 08:46:26 -04:00
Igor Motov	29b5643c1a	Increase search.max_buckets to 65,535 (#57042 ) Increases the default search.max_buckets limit to 65,535, and only counts buckets during reduce phase. Closes #51731	2020-06-03 11:54:48 -04:00
Benjamin Trent	484de0cd02	Adding transform docs for geotile_grid (#57000 ) transforms and composite aggs support geotile_grid as a source. This adds documentation explaining that support.	2020-06-01 15:32:18 -04:00
Nik Everett	1e5e5e2da2	Update date_histogram docs (#56922 ) * Make it more clear that you can use `month` or `1M`. * Explain rounding rules * Consistently use "time zone" instead of "timezone". It looks like both are right but I see "time zone" much more. And the parameter in elasticsearch is `time_zone` so we may as well line up. Closes #56760 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-05-29 17:13:14 -04:00
Gabriel Petrovay	709ee956d7	Fixed calendar intervals documentation (#56666 ) - the 1-letter intervals are not parseable (`m`, `h`, `d`, `w`, `M`, `q`, `y`) - fixed formatting broken by new lines	2020-05-15 16:56:27 -04:00
Gil Raphaelli	f29c9ff652	[DOCS] Sort metric and pipeline agg docs (#56613 )	2020-05-15 16:34:47 -04:00
Tal Levy	79367e43da	Add Normalize Pipeline Aggregation (#56399 ) This aggregation will perform normalizations of metrics for a given series of data in the form of bucket values. The aggregation supports the following normalizations: rescale 0-1, rescale 0-100, percentage of sum, mean normalization, z-score normalization, and softmax normalization. To specify which normalization to use, set the normalize agg's `normalizer` field. For example: ``` { "normalize": { "buckets_path": <>, "normalizer": "percent" } } ``` Closes #51005.	2020-05-14 13:32:42 -07:00
Gabriel Petrovay	4029818c24	[Docs] Correct formatting in datehistogram-aggregation.asciidoc (#56664 )	2020-05-13 12:02:36 +02:00
Ignacio Vera	4e39184c38	Add moving percentiles pipeline aggregation (#55441 ) Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics	2020-05-12 10:30:52 +02:00
James Rodewig	af2d13144f	[DOCS] Add reference docs for `search.max_buckets` setting (#56449 ) Adds reference-style setting documentation for the `search.max_buckets` setting. This setting was previously only documented on the [bucket aggregations][0] page. [0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket.html	2020-05-11 08:35:24 -04:00
Christos Soulios	caf6c5ac19	Histogram field type support for ValueCount and Avg aggregations (#55933 ) Implements value_count and avg aggregations over Histogram fields as discussed in #53285: value_count returns the sum of the counts arrays of the histograms; avg computes a weighted average of the values array of the histogram by multiplying each value with its associated element in the counts array.	2020-05-04 10:24:35 +03:00
AB Prashanth	785527bb58	[DOCS] Remove approximate document counts example from term agg docs (#55442 ) Removes an example from the "Document counts are approximate" section of the terms agg documentation. As #52377 details, the example was no longer accurate in 7.x or 6.8. Document counts were more precise than the example presented. We've opened issue #56025 to discuss re-adding an example later. Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-04-30 09:49:32 -04:00
Christos Soulios	cefc6af25b	Histogram field type support for Sum aggregation (#55681 ) Implements Sum aggregation over Histogram fields by summing the value of each bucket multiplied by their count as requested in #53285	2020-04-29 11:09:25 +03:00
Zachary Tong	9f165bd44e	Aggs must specify a `field` or `script` (or both) (#52226 ) * Aggs must specify a `field` or `script` (or both) This adds a validation to VSParserHelper to ensure that a field or script or both are specified by the user. This is technically required today already, but throws an exception much deeper in the agg framework and has a very unintuitive error for the user (as well as eating more resources instead of failing early) * Fix StringStats test * Add yaml test * Skip test on older versions Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-04-23 14:26:38 -04:00
Igor Motov	6d28596ead	Add support for filters to T-Test aggregation (#54980 ) Adds support for filters to T-Test aggregation. The filters can be used to select populations based on some criteria and use values from the same or different fields. Closes #53692	2020-04-10 10:19:07 -04:00
Igor Motov	5fc9fc528d	Add Student's t-test aggregation support (#54469 ) Adds t_test metric aggregation that can perform paired and unpaired two-sample t-tests. In this PR support for filters in unpaired is still missing. It will be added in a follow-up PR. Relates to #53692	2020-04-03 11:31:13 -04:00
Gil Raphaelli	4090568797	[DOCS] Fix typos in top metrics agg docs (#54299 )	2020-03-27 10:48:01 -04:00
Paweł Krześniak	de1229cc2b	[DOCS] link fix (#53973 ) Fix bad link in top_metrics.	2020-03-23 13:28:43 -04:00
Zachary Tong	84a59f8447	Add scripting, supported-type tests to ValueCount (#53500 ) Also adds a few small notes to the documentation regarding potentially unintuitive behavior	2020-03-16 15:15:25 -04:00
Lisa Cawley	4a5feab88d	[DOCS] Add anchors for scripted metric aggregations (#53618 )	2020-03-16 12:14:01 -07:00
Nik Everett	230a9a8975	Improve top_metrics docs (#53521 ) * Removes experimental. * Replaces `"v"` (for value) with `"m"` (for metric). * Move the note about tiebreaking into the list of limitations of the sort. * Explain how you ask for `metrics`. * Clean up some wording. * Link to the docs from `top_metrics`. Closes #51813	2020-03-16 13:23:22 -04:00
Nik Everett	8410356c5b	Preserve metric types in top_metrics (#53288 ) This changes the `top_metrics` aggregation to return metrics in their original type. Since it only supports numerics, that means that dates, longs, and doubles will come back as stored, with their appropriate formatter applied.	2020-03-11 16:44:08 -04:00
Anton Dollmaier	e9c8c03fee	[DOCS] Fix parameter formatting for GeoHash grid agg docs (#53032 ) Adds missing colon (`:`) to the parameter definition list.	2020-03-09 08:17:57 -04:00
Nik Everett	56058ab6af	Support multiple metrics in `top_metrics` agg (#52965 ) This adds support for returning multiple metrics to the `top_metrics` agg. It looks like: ``` POST /test/_search?filter_path=aggregations { "aggs": { "tm": { "top_metrics": { "metrics": [ {"field": "v"}, {"field": "m"} ], "sort": {"s": "desc"} } } } } ```	2020-03-05 06:53:37 -05:00
Nik Everett	f4223b6a8f	Add size support to `top_metrics` (#52662 ) This adds support for returning the top "n" metrics instead of just the very top. Relates to #51813	2020-02-27 11:14:57 -05:00
István Zoltán Szabó	14555ca01e	[DOCS] Links transforms in aggregation docs (#52563 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-02-21 08:22:04 +01:00
Nik Everett	5b2266601b	Implement top_metrics agg (#51155 ) The `top_metrics` agg is kind of like `top_hits` but it only works on doc values so it should be faster. At this point it is fairly limited in that it only supports a single, numeric sort and a single, numeric metric. And it only fetches the "very topest" document worth of metric. We plan to support returning a configurable number of top metrics, requesting more than one metric and more than one sort. And, eventually, non-numeric sorts and metrics. The trick is doing those things fairly efficiently. Co-Authored by: Zachary Tong <zach@elastic.co>	2020-02-14 07:13:52 -05:00
Igor Motov	0898df4aac	Add histogram field type support to boxplot aggs (#52265 ) Add support for the histogram field type to boxplot aggs. Closes #52233 Relates to #33112	2020-02-13 08:59:44 -05:00
Igor Motov	c50cfa0668	Add Boxplot Aggregation (#51948 ) Adds a `boxplot` aggregation that calculates the min, max, median, and the first and third quartiles of the given data set. Closes #33112	2020-02-07 18:01:20 -05:00
Mark Tozzi	928c663ce0	Fix dangling 'either' in weighted average docs (#51748 )	2020-01-31 12:45:46 -05:00
Elvis Saravia	520da54e63	update pipeline.asciidoc typo	2020-01-24 14:03:01 +01:00
Igor Motov	23be11cf6c	Fix leftover mentions of method parameter in Percentile Aggs (#51272 ) The method parameter is not used in the percentile aggs, instead the method is determined by the presence of `hdr` or `tdigest` objects. Relates to #8324	2020-01-22 05:02:48 -10:00
Tal Levy	6c86606d2a	Adds support for geo-bounds filtering in geogrid aggregations (#50002 ) It is fairly common to filter the geo point candidates in geohash_grid and geotile_grid aggregations according to some viewable bounding box. This change introduces the option of specifying this filter directly in the tiling aggregation. This is even more relevant to `geo_shape`, where the bounds will restrict the shape to be within the bounds. This optional `bounds` parameter is parsed in an equivalent fashion to the bounds specified in the geo_bounding_box query.	2020-01-14 08:29:10 -08:00
Nik Everett	326d696d9a	Support offset in composite aggs (#50609 ) Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes #48757	2020-01-07 14:49:09 -05:00
James Rodewig	7f35bcdfc9	[DOCS] Warn about using `geo_centroid` as sub-agg to `geohash_grid` (#50038 ) If `geo_point` fields are multi-valued, using `geo_centroid` as a sub-agg to `geohash_grid` could result in centroids outside of bucket boundaries. This adds a related warning to the geo_centroid agg docs.	2020-01-06 07:45:49 -06:00
Nik Everett	a7cc0b0159	Docs: Refine note about `after_key` (#50475 ) * Docs: Refine note about `after_key` I was curious about composite aggregations, specifically I wanted to know how to write a composite aggregation that had all of its buckets filtered out so you had to use the `after_key`. Then I saw that we've declared composite aggregations not to work with pipelines in #44180. So I'm not sure you can do that any more. Which makes the note about `after_key` inaccurate. This rejiggers that section of the docs a little so it is more obvious that you send the `after_key` back to us. And so it is more obvious that you should only use the `after_key` that we give you rather than try to work it out for yourself. * Apply suggestions from code review Co-Authored-By: James Rodewig <james.rodewig@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-01-02 10:02:55 -05:00
James Rodewig	3460dc9542	[DOCS] Percentile aggs are non-deterministic (#50468 ) Percentile aggregations are non-deterministic. A percentile aggregation can produce different results even when using the same data. Based on [this discuss post][0], the non-deterministic property stems from processes in Lucene that can affect the order in which docs are provided to the aggregation. This adds a warning stating that the aggregation is non-deterministic and what that means. [0]: https://discuss.elastic.co/t/different-results-for-same-query/111757	2019-12-23 13:11:31 -05:00
Florian Kelbert	0778c34630	[DOCS] Fix typo in bucket sum aggregation docs (#50431 )	2019-12-20 08:47:24 -05:00
Lisa Cawley	6d608e6a0d	[DOCS] Move transform resource definitions into APIs (#50108 )	2019-12-17 09:01:31 -08:00
Jim Ferenczi	804a5042e7	Optimize composite aggregation based on index sorting (#48399 ) Co-authored-by: Daniel Huang <danielhuang@tencent.com> This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification. In such a case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue. The optimization is also applicable when a query other than match_all is provided. However, the optimization is deactivated for sources that match the index sort in the following cases: * Multi-valued source; in such a case early termination is not possible. * missing_bucket is set to true	2019-12-17 14:02:06 +01:00
James Rodewig	2d9ee5ddfe	[DOCS] Correct percentile rank agg example response (#50052 ) The example snippets in the percentile rank agg docs use a test dataset named `latency`, which is generated from docs/gradle.build. At some point the dataset and example snippets were updated, but the text surrounding the snippets was not. This means the text and the example snippets shown no longer match up. This corrects that by changing the snippets using /TESTRESPONSE magic comments.	2019-12-12 08:38:48 -05:00
Ignacio Vera	eade4f03f4	New Histogram field mapper that supports percentiles aggregations. (#48580 ) This commit adds a new histogram field mapper that consists of a pre-aggregated format of numerical data to be used in percentiles aggregations.	2019-11-28 13:58:20 +01:00
Przemko Robakowski	04f6b6fdb2	[DOCS] IDs for doc snippets (#49008 ) * Ids for docs snippets * Ids for tests * Ids for docs snippets * ignoring build folder from idea * Ignoring build-eclipse	2019-11-25 15:30:00 +01:00
Lisa Cawley	a4efab6ab4	[DOCS] Merge rollup config details into API (#49412 )	2019-11-22 08:31:30 -08:00
Christos Soulios	b0e12c936b	Implement stats aggregation for string terms (#47468 ) This PR adds a new metric aggregation called string_stats that operates on string terms of a document and returns the following: min_length: The length of the shortest term; max_length: The length of the longest term; avg_length: The average length of all terms; distribution: The probability distribution of all characters appearing in all terms; entropy: The total Shannon entropy value calculated for all terms. This aggregation has been implemented as an analytics plugin.	2019-11-14 16:07:54 +02:00
James Rodewig	f53eba024b	[DOCS] Remove binary gendered language (#48362 )	2019-10-23 09:36:31 -05:00
Ian Danforth	24cf883792	[DOCS] Fix typo in percentile rank aggregation docs (#47247 )	2019-10-15 15:56:32 -04:00
Alan Woodward	566e1b7d33	Remove type field from DocWriteRequest and associated Response objects (#47671 ) This commit removes the type field from index, update and delete requests, and their associated responses. Relates to #41059	2019-10-11 10:23:55 +01:00
Alan Woodward	7a622f024f	Remove types from BulkRequest (#46983 ) This commit removes types entirely from BulkRequest, both as a global parameter and as individual entries on update/index/delete lines. Relates to #41059	2019-10-07 13:29:12 +01:00
Mark Tozzi	c26ce1d7f5	DocValueFormat implementation for date range fields (#47472 )	2019-10-04 16:01:28 -04:00
Mark Tozzi	57a679fbbb	Documentation notes for Range field histograms (#46890 )	2019-10-01 10:46:04 -04:00
Alan Woodward	c1f99e2d75	Remove `_type` from SearchHit (#46942 ) This commit removes the `_type` field from all search hit responses. Relates to #41059	2019-09-23 19:14:54 +01:00
Javier Ruiz	e8dac62a4a	[DOCS] Fix calendar interval typos for date histo agg (#46911 )	2019-09-20 15:22:04 -04:00
James Rodewig	370e434986	[DOCS] Correct several [source,console-result] snippets (#46930 )	2019-09-20 11:23:15 -04:00
markharwood	dc0abec595	Remove Adjacency_matrix setting in favour of Lucene Boolean query clause setting (#46327 ) Closes #46324	2019-09-19 16:48:04 +01:00
Philipp Krenn	7c5adcc7c1	Minor improvement to the nested aggregation docs (#46475 ) * Minor improvement to the nested aggregation docs * The attributes name and resellers.name were rather confusing, especially since the first one was dynamically mapped and not shown in the documentation (you had to read the test to see it). This change introduces a unique name for the nested attribute and adds the example document to the documentation. * Change the index name from "index" to something more descriptive. * Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-11 11:23:39 -04:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-result] (#46449 )	2019-09-06 14:05:36 -04:00
James Rodewig	466c59a4a7	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295 )	2019-09-05 16:47:18 -04:00
James Rodewig	f5827ba0ae	[DOCS] Replace "// CONSOLE" comments with [source,console] (#46159 )	2019-09-04 12:51:02 -04:00
Zachary Tong	758f7999b7	Add CumulativeCard pipeline agg to pipeline index (#46279 ) The Cumulative Cardinality docs weren't linked from the pipeline index page	2019-09-03 12:10:34 -04:00
Zachary Tong	273c35f79c	Add Cumulative Cardinality agg (and Data Science plugin) (#43661 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 10:43:24 -04:00
LHearen	6dadce1112	[DOCS] Correct conditional clause in histogram agg docs (#45643 )	2019-08-19 10:09:10 -04:00
LHearen	d1c0ea7833	[DOCS] Fix a 'value' -> 'values' typo in histogram aggregation docs (#45642 )	2019-08-19 10:02:44 -04:00
Zachary Tong	ae7c071ec7	Allow pipeline aggs to select specific buckets from multi-bucket aggs (#44179 ) This adjusts the `buckets_path` parser so that pipeline aggs can select specific buckets (via their bucket keys) instead of fetching the entire set of buckets. This is useful for bucket_script in particular, which might want specific buckets for calculations. It's possible to work around this with `filter` aggs, but the workaround is hacky and probably less performant. - Adjusts documentation - Adds a barebones AggregatorTestCase for bucket_script - Tweaks AggTestCase to use getMockScriptService() for reductions and pipelines. Previously pipelines could just pass in a script service for testing, but this didn't work for regular aggs. The new getMockScriptService() method fixes that issue, but needs to be used for pipelines too. This had a knock-on effect of touching MovFn, AvgBucket and ScriptedMetric	2019-08-05 12:15:42 -04:00
Nikita Glashenko	ead4eb5209	Add more flexibility to MovingFunction window alignment (#44360 ) Introduce a shift field to the MovingFunction aggregation. By default, shift = 0, and the behavior is the same as before. Increasing shift by 1 moves the starting window position by 1 to the right. To simply include the current bucket in the window, use shift = 1. For center alignment (n/2 values before and after the current bucket), use shift = window / 2. For right alignment (n values after the current bucket), use shift = window.	2019-08-02 15:09:48 -04:00
Flavio Pompermaier	e66889635d	[DOCS] Correct sum_other_doc_count value in terms agg example (#45028 ) Closes issue #41902	2019-07-31 14:10:05 -04:00
Sandeep Kanabar	0e4be837db	[Docs] Update daterange-aggregation.asciidoc (#44730 ) Correcting the value to be the same as that specified for "missing".	2019-07-29 12:51:15 +02:00
James Rodewig	ea1adb61c2	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:16:35 -04:00
Zachary Tong	eac86c9bb8	Document that pipeline aggs are not compatible with composite agg (#44180 )	2019-07-12 12:34:34 -04:00
Zachary Tong	3e1f73ffa3	Link rare_terms docs from index page (#43882 ) Docs for rare_terms were added in #35718, but neglected to link it from the bucket index page	2019-07-02 13:10:46 -04:00
Zachary Tong	baf155dced	Add RareTerms aggregation (#35718 ) This adds a `rare_terms` aggregation. It is an aggregation designed to identify the long-tail of keywords, e.g. terms that are "rare" or have low doc counts. This aggregation is designed to be more memory efficient than the alternative, which is setting a terms aggregation to size: LONG_MAX (or worse, ordering a terms agg by count ascending, which has unbounded error). This aggregation works by maintaining a map of terms that have been seen. A counter associated with each value is incremented when we see the term again. If the counter surpasses a predefined threshold, the term is removed from the map and inserted into a cuckoo filter. If a future term is found in the cuckoo filter we assume it was previously removed from the map and is "common". The map keys are the "rare" terms after collection is done.	2019-07-01 10:02:36 -04:00
Paul Sanwald	6357857bba	Adds a minimum interval to `auto_date_histogram`. (#42814 ) Adds a minimum interval to `auto_date_histogram`. We do this by restricting the roundings passed into the aggregator.	2019-06-11 15:53:19 -04:00
Zachary Tong	0192fe7d7c	Add documentation for calendar/fixed intervals (#41919 ) Original PR missed documentation for the new calendar/fixed intervals. This adds the missing documentation	2019-05-10 15:27:41 -04:00
Zachary Tong	290c8b8256	Force selection of calendar or fixed intervals in date histo agg (#33727 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (i.e. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and Java clients.	2019-05-06 17:17:11 -04:00
James Rodewig	adf67053f4	[DOCS] Add anchors for Asciidoctor migration (#41648 )	2019-04-30 10:19:09 -04:00
Ignacio Vera	cc48427e05	Improve accuracy for Geo Centroid Aggregation (#41033 ) keeps the partial results as doubles and uses Kahan summation to help reduce floating point errors.	2019-04-25 08:06:55 +02:00
Zachary Tong	5ccc0b5a32	Disallow null/empty or duplicate composite sources (#41359 ) Adds some validation to prevent duplicate source names from being used in the composite agg. Also refactored to use a ConstructingObjectParser and removed the private ctor and setter for sources, making it mandatory.	2019-04-24 13:22:06 -04:00
Jason Tedor	656cc709b2	Fix intervals section of auto date-histogram docs (#41203 ) This section should be at the same sub-level as other sections in the auto date-histogram docs, otherwise it is rendered on to another page and is confusing for users to understand what it's in reference to.	2019-04-15 11:27:38 -04:00
Antonio Matarrese	badb8559fb	Use the breadth first collection mode for significant terms aggs. This helps avoid memory issues when computing deep sub-aggregations. Because it should be rare to use sub-aggregations with significant terms, we opted to always choose breadth first as opposed to exposing a `collect_mode` option. Closes #28652.	2019-04-11 15:38:25 -07:00
Lisa Cawley	13454376a4	[DOCS] Fixes callout for Asciidoctor migration (#41127 )	2019-04-11 12:04:04 -07:00
Zachary Tong	6f0f8ab4bc	Remove MovingAverage pipeline aggregation (#39328 ) This was deprecated in 6.4.0 and for the entirety of 7.0. Removed in 8.0	2019-03-19 15:31:05 -04:00
Ian	98bbb4176e	Correct date in daterange-aggregation.asciidoc (#39727 )	2019-03-06 11:30:17 +01:00
Samuel Cifuentes García	ff6ffe8ba1	Improved Terms Aggregation documentation (#38892 ) Added a note after the first query example talking about fielddata.	2019-03-05 10:17:01 -05:00
Hannes Van De Vreken	b76a380f18	Fix typo in DateRange docs (yyy → yyyy) (#38883 )	2019-02-15 10:20:16 -05:00
Alexander Reelsen	5f7168ea74	Remove joda time mentions in documentation (#38720 ) This is the forward port of #38720 (not containing the 7.0 migration docs)	2019-02-14 10:18:48 +01:00
Yuri Astrakhan	f133bf4ed8	add geotile_grid ref to asciidoc (#38632 )	2019-02-08 11:37:35 -05:00
Yuri Astrakhan	f3cde06a1d	geotile_grid implementation (#37842 ) Implements `geotile_grid` aggregation This patch refactors previous implementation https://github.com/elastic/elasticsearch/pull/30240 This code uses the same base classes as `geohash_grid` agg, but uses a different hashing algorithm to allow zoom consistency. Each grid bucket is aligned to Web Mercator tiles.	2019-01-31 19:11:30 -05:00
Julie Tibshirani	9ca26b7e63	Remove more references to type in docs. (#37946 ) * Update the top-level 'getting started' guide. * Remove custom types from the painless getting started documentation. * Fix an incorrect reference to '_doc' in the cardinality query docs. * Update the _update docs to use the typeless API format.	2019-01-29 10:51:07 -08:00
Jim Ferenczi	cb451edb01	Allow nested fields in the composite aggregation (#37178 ) This change adds support for handling `nested` fields in the `composite` aggregation. A `nested` aggregation can be used as the parent of a `composite` aggregation in order to target `nested` fields in the `sources`. Closes #28611	2019-01-25 14:00:39 +01:00
Christoph Büscher	967de04257	Uppercasing some docs section title (#37781 ) Section titles are mostly uppercase; only in the few cases where query DSL parameters or Java method names are used as the title should they be lowercase.	2019-01-24 22:54:55 +01:00
Christoph Büscher	95a6951f78	Use new bulk API endpoint in the docs (#37698 ) This change switches to using the typeless bulk API endpoint in the documentation snippets where possible	2019-01-23 09:46:28 +01:00
Boaz Leskes	52ba407931	Expose sequence number and primary terms in search responses (#37639 ) Users may require the sequence number and primary terms to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalue_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it. This commit adds a dedicated sub fetch phase to return both numbers that is connected to a new `seq_no_primary_term` parameter.	2019-01-23 09:01:58 +01:00
Christoph Büscher	34f2d2ec91	Remove remaining occurrences of "include_type_name=true" in docs (#37646 )	2019-01-22 15:13:52 +01:00
Christoph Büscher	3a96608b3f	Remove more include_type_name and types from docs (#37601 )	2019-01-18 14:11:18 +01:00
Christoph Büscher	25aac4f77f	Remove `include_type_name` in asciidoc where possible (#37568 ) The "include_type_name" parameter was temporarily introduced in #37285 to facilitate moving the default parameter setting to "false" in many places in the documentation code snippets. Most of the places can simply be reverted without causing errors. In this change I looked for asciidoc files that contained the "include_type_name=true" addition when creating new indices but didn't look like they made use of the "_doc" type for mappings. This is mostly the case e.g. in the analysis docs, where index creation often only contains settings. I manually corrected the use of types in some places where the docs still used an explicit type name and not the dummy "_doc" type.	2019-01-18 09:34:11 +01:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Josh Soref	edb48321ba	[DOCS] Various spelling corrections (#37046 )	2019-01-07 14:44:12 +01:00
Igor Motov	d6acd8e15f	Docs: add clarification about geohash use in geohashgrid agg (#36901 ) Adds an example on translating geohashes returned by geohashgrid agg as bucket keys into geo bounding box filters in elasticsearch as well as 3rd party applications. Closes #36413	2019-01-03 15:40:48 -05:00
Luca Cavanna	42ea644903	Remove single shard optimization when suggesting shard_size (#37041 ) When executing terms aggregations we set the shard_size, meaning the number of buckets to collect on each shard, to a value that's higher than the number of requested buckets, to guarantee some basic level of precision. We have an optimization in place so that we leave shard_size set to size whenever we are searching against a single shard, in which case maximum precision is guaranteed by definition. This optimization requires access to the total number of shards that the search is executing against. In the context of cross-cluster search, once we introduce multiple reduction steps (one per cluster), each cluster will only know the number of local shards, which is problematic as we should only optimize if we are searching against a single shard in a single cluster. It could be that we are searching against one shard per cluster, in which case the current code would optimize the number of terms, causing a loss of precision. While discussing how to address the CCS scenario, we decided that we do not want to introduce further complexity caused by this single shard optimization, as it benefits only a minority of cases, especially when the benefits are not so great. This commit removes the single shard optimization, meaning that we will always apply the heuristic for how many buckets to collect on the shards, even when searching against a single shard. This will cause more buckets to be collected when searching against a single shard compared to before. If that becomes a problem for some users, they can work around that by setting the shard_size equal to the size. Relates to #32125	2019-01-02 17:45:49 +01:00
João Barbosa	276726aea2	Added keyed response to pipeline percentile aggregations 22302 (#36392 ) Closes #22302	2018-12-14 16:22:54 -05:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Jeff Hajewski	49087f16f5	Adds deprecation logging to ScriptDocValues#getValues. (#34279 ) `ScriptDocValues#getValues` was added for backwards compatibility but no longer needed. Scripts using the syntax `doc['foo'].values` when `doc['foo']` is a list should be using `doc['foo']` instead. Closes #22919	2018-11-27 14:30:13 -05:00
William Desportes	a204d1cdff	[Docs] Fix typo in datehistogram-aggregation.asciidoc (#35855 )	2018-11-23 15:16:53 +01:00
Jim Ferenczi	d96202a282	[DOCS] Fix missing callouts	2018-11-08 15:40:01 +01:00
Dominik Stadler	d351422215	Add parent-aggregation to parent-join module (#34210 ) Add `parent` aggregation, a special single bucket aggregation that joins children documents to their parent.	2018-11-08 14:13:00 +01:00
Russ Cam	848847d8c7	[Docs] Section header preceded by blank line (#34340 )	2018-11-08 12:44:13 +01:00
Sue Gallagher	1ce3c92a2d	[DOCS] Add info on calendar vs fixed interval. (#31638 ) Extensive edit to add additional information on the difference between calendar intervals and fixed-length intervals.	2018-10-31 10:16:36 -04:00
Andy Bristol	b8280ea7cc	median absolute deviation agg (#34482 ) This commit adds a new single value metric aggregation that calculates the statistic called median absolute deviation, which is a measure of variability that works on more types of data than standard deviation. Our calculation of MAD is approximated using t-digests. In the collect phase, we collect each value visited into a t-digest. In the reduce phase, we merge all value t-digests, then create a t-digest of deviations using the first t-digest's median and centroids.	2018-10-30 07:22:52 -07:00
Gordon Brown	794d4fa879	Label required scripts in Scripted Metric Agg docs (#35051 ) When combine_script and reduce_script were made into required parameters for Scripted Metric aggregations in #33452, the docs were not updated to reflect that. This marks those parameters as required in the documentation.	2018-10-29 15:13:14 -06:00
Julie Tibshirani	f854330e06	Make sure to use the type _doc in the REST documentation. (#34662 ) * Replace custom type names with _doc in REST examples. * Avoid using two mapping types in the percolator docs. * Rename doc -> _doc in the main repository README. * Also replace some custom type names in the HLRC docs.	2018-10-22 11:54:04 -07:00
Zachary Tong	d981746142	[Docs] clarification about cardinality accuracy (#34616 ) Adds a bit more clarification about how accuracy is dependent on the dataset in question. Closes #18231	2018-10-22 13:15:45 -04:00
markharwood	fe623acf66	Docs - removed experimental/beta markers from adjacency matrix aggregation (#34599 )	2018-10-19 09:33:59 +01:00
markharwood	2a413abb0b	Docs - remove experimental marker from significant_text aggregation (#34598 )	2018-10-19 09:32:02 +01:00
Jim Ferenczi	36557469f6	[DOCS] Removes beta label from composite aggregation (#34329 )	2018-10-05 19:46:20 +02:00
Nik Everett	dc2cf28fde	Docs: Allow skipping response assertions (#34240 ) We generate tests from our documentation, including assertions about the responses returned by a particular API. But sometimes we can't assert that the response is correct because of some deficiency in our tooling. Previously we marked the response `// NOTCONSOLE` to skip it, but this is kind of odd because `// NOTCONSOLE` is really to mark snippets that are json but aren't requests or responses. This introduces a new construct to skip response assertions: ``` // TESTRESPONSE[skip:reason we skipped this] ```	2018-10-04 08:03:38 -04:00
Serge Populov	13af5d5d7f	Docs: Fix typo in field name in aggregations (#34223 )	2018-10-02 10:54:29 -04:00
ben5556	012b9c7539	Corrected aggregation name to match the example (#33786 )	2018-09-17 18:24:43 -07:00
Ryan Ernst	3046656ab1	Scripting: Rework joda time backcompat (#33486 ) This commit switches the joda time backcompat in scripting to use augmentation over ZonedDateTime. The augmentation methods provide compatibility with the missing methods between joda's DateTime and java's ZonedDateTime. Due to getDayOfWeek returning an enum in the java API, ZonedDateTime is wrapped so that the method can return int like the joda time does. The java time api version is renamed to getDayOfWeekEnum, which will be kept through 7.x for compatibility while users switch back to getDayOfWeek once joda compatibility is removed.	2018-09-16 19:18:00 -07:00
Christoph Büscher	fe478c23b7	[Docs] Fix heading in composite-aggregation.asciidoc (#33627 ) The heading for the "Missing buckets" section should be on the same level as the "Order" section.	2018-09-12 16:56:03 +02:00
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310 ) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Paul Sanwald	c303006e6b	Add interval response parameter to AutoDateInterval histogram (#33254 ) Adds the interval used to the aggregation response.	2018-09-05 07:35:59 -04:00
lipsill	b7c0d2830a	[Docs] Remove repeating words (#33087 )	2018-08-28 13:16:43 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938 ) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Ryan Ernst	478f6d6cf1	Scripting: Conditionally use java time api in scripting (#31441 ) This commit adds a boolean system property, `es.scripting.use_java_time`, which controls the concrete return type used by doc values within scripts. The return type of accessing doc values for a date field is changed to Object, essentially duck typing the type to allow co-existence during the transition from joda time to java time.	2018-08-01 08:58:49 -07:00
Colm O'Shea	97b379e0d4	fix no=>not typo (#32463 ) Found a tiny typo while reading the docs	2018-07-31 13:33:23 +01:00
Sandeep Kanabar	7ad16ffd84	Docs: Correcting a typo in tophits (#32359 )	2018-07-26 13:30:01 -04:00
Zachary Tong	6ba144ae31	Add WeightedAvg metric aggregation (#31037 ) Adds a new single-value metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents. These values can be extracted from specific numeric fields in the documents. When calculating a regular average, each datapoint has an equal "weight"; it contributes equally to the final value. In contrast, weighted averages scale each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the document, or provided by a script. As a formula, a weighted average is `∑(value * weight) / ∑(weight)`. A regular average can be thought of as a weighted average where every value has an implicit weight of `1`. Closes #15731	2018-07-23 18:33:15 -04:00
Paul Sanwald	feb07559aa	fix typo	2018-07-13 14:59:11 -04:00
Colin Goodheart-Smithe	0edb096eb4	Adds a new auto-interval date histogram (#28993 ) * Adds a new auto-interval date histogram This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time. This aggregation intentionally does not support min_doc_count, offset and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary. Closes #9572 * Adds documentation * Added sub aggregator test * Fixes failing docs test * Brings branch up to date with master changes * trying to get tests to pass again * Fixes multiBucketConsumer accounting * Collects more buckets than needed on shards This gives us more options at reduce time in terms of how we do the final merge of the buckets to produce the final result * Revert "Collects more buckets than needed on shards" This reverts commit `993c782d11`. * Adds ability to merge within a rounding * Fixes non-timezone doc test failure * Fix time zone tests * iterates on tests * Adds test case and documentation changes Added some notes in the documentation about the intervals that can be returned. Also added a test case that utilises the merging of consecutive buckets * Fixes performance bug The bug meant that getAppropriate rounding took a huge amount of time if the range of the data was large but also sparsely populated. In these situations the rounding would be very low so iterating through the rounding values from the min key to the max key took a long time (~120 seconds in one test). The solution is to add a rough estimate first which chooses the rounding based just on the long values of the min and max keys alone but selects the rounding one lower than the one it thinks is appropriate so the accurate method can choose the final rounding taking into account the fact that intervals are not always fixed length. The commit also adds more tests * Changes to only do complex reduction on final reduce * merge latest with master * correct tests and add a new test case for 10k buckets * refactor to perform bucket number check in innerBuild * correctly derive bucket setting, update tests to increase bucket threshold * fix checkstyle * address code review comments * add documentation for default buckets * fix typo	2018-07-13 13:08:35 -04:00
Jimi Ford	e955ffc38d	Docs: fix typo in datehistogram (#31972 )	2018-07-11 15:04:57 -04:00
Peter Evers	ea15284230	Docs: Match the examples in the description (#31710 ) Prose drifted from snippet.	2018-07-02 14:12:49 -04:00
Peter Evers	050fbc8f3d	Docs: Fix description of percentile ranks example (#31652 )	2018-06-28 09:29:56 -04:00
Sue Gallagher	357a07e7a2	[DOCS] Fix heading format errors (#31483 ) * [DOCS] Fix heading format errors. Closes #31327 * [DOCS] Fix heading format errors. Closes #31327	2018-06-25 17:25:32 -07:00
Jonathan Little	8e4768890a	Migrate scripted metric aggregation scripts to ScriptContext design (#30111 ) * Migrate scripted metric aggregation scripts to ScriptContext design #29328 * Rename new script context container class and add clarifying comments to remaining references to params._agg(s) * Misc cleanup: make mock metric agg script inner classes static * Move _score to an accessor rather than an arg for scripted metric agg scripts This causes the score to be evaluated only when it's used. * Documentation changes for params._agg -> agg * Migration doc addition for scripted metric aggs _agg object change * Rename "agg" Scripted Metric Aggregation script context variable to "state" * Rename a private base class from ...Agg to ...State that I missed in my last commit * Clean up imports after merge	2018-06-25 12:01:33 +01:00
Colin Goodheart-Smithe	58e9446e00	Removes experimental tag from scripted_metric aggregation (#31298 )	2018-06-13 17:24:32 +01:00
Christoph Büscher	4777d8a2df	[Docs] Fix typo in Min Aggregation reference (#30899 )	2018-05-31 15:05:03 +02:00
Jim Ferenczi	e33d107f84	Add missing_bucket option in the composite agg (#29465 ) This change adds a new option to the composite aggregation named `missing_bucket`. This option can be set by source and dictates whether documents without a value for the source should be ignored. When set to true, documents without a value for a field emit an explicit `null` value which is then added in the composite bucket. The `missing` option that allows setting an explicit value (instead of `null`) is deprecated in this change and will be removed in a follow up (only in 7.x). This commit also changes how the big arrays are allocated: instead of reserving the provided `size` for all sources, they are created with a small initial size and they grow depending on the number of buckets created by the aggregation. Closes #29380	2018-05-30 09:48:40 +02:00
Julie Tibshirani	638a719370	Ensure that ip_range aggregations always return bucket keys. (#30701 )	2018-05-24 08:55:14 -07:00
Piotr Prądzyński	a0a8c4f186	filters agg docs duplicated 'bucket' word removal (#30677 ) In one place word 'bucket' was duplicated.	2018-05-17 15:21:50 +01:00
Piotr Prądzyński	cefbd29db3	top_hits doc example description update (#30676 ) Example description does not fit example code.	2018-05-17 15:21:25 +01:00
Zachary Tong	df853c49c0	Add a MovingFunction pipeline aggregation, deprecate MovingAvg agg (#29594 ) This pipeline aggregation gives the user the ability to script functions that "move" across a window of data, instead of single data points. It is the scripted version of MovingAvg pipeline agg. Through custom script contexts, we expose a number of convenience methods: - MovingFunctions.max() - MovingFunctions.min() - MovingFunctions.sum() - MovingFunctions.unweightedAvg() - MovingFunctions.linearWeightedAvg() - MovingFunctions.ewma() - MovingFunctions.holt() - MovingFunctions.holtWinters() - MovingFunctions.stdDev() The user can also define any arbitrary logic via their own scripting, or combine with the above methods.	2018-05-16 10:57:00 -04:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00

493 Commits