Commit Graph

97 Commits

Author SHA1 Message Date
Armin Braun 484a0780f8
Remove Dead NamedWritableRegistry Fields in Aggs/Search Code (#76743)
No need for the registry in these places anymore => we can simplify things here and there.
2021-08-20 17:45:38 +02:00
Igor Motov e35021132e
Fix docCountError calculation for multiple reduces (#76391)
Fix the docCountError calculation in the case of multiple reduces. It fixes two mistakes
in #43874. The first error was introduced in the original PR, where unknown doc
count errors were initialized to 0; the second was introduced later, while trying
to fix the first one, by ignoring these 0s, which essentially disabled the
original fix.

Fixes #75667
2021-08-12 11:50:17 -10:00
Stuart Tettemer 6c02a6c657
Script: Fields API for Sort and Score scripts (#75863)
Adds minimal fields API support to sort and score scripts.

Example: `field('myfield').getValue(123)` where `123` is the default if the field has no values.

Refs: #61388
2021-08-04 10:11:12 -05:00
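A minimal Java-style mirror of the API shape described above; the `Field` interface here is an illustrative stand-in for the script-time object, not the actual Painless class.
```
// Hypothetical sketch: a script-time field handle whose getValue(default)
// returns the field's value, or the supplied default when the document has none.
interface Field<T> {
    boolean isEmpty();
    T getValue(T defaultValue);   // e.g. field("myfield").getValue(123) in a sort/score script
}
```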
Nikita Glashenko 1db17ada95
Fix wrong error upper bound when performing incremental reductions (#43874)
When performing incremental reductions, a docCountError value of 0 may mean that
the error was not previously calculated, or that the error was indeed previously
calculated and its value was 0. We end up rejecting true values set to 0 this
way, which may lead to a wrong upper bound of the error in the result. To fix it, this PR
makes docCountError nullable; null values mean that the error was not calculated
yet.

Fixes #40005

Co-authored-by: Igor Motov <igor@motovs.org>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-07-22 08:18:24 -10:00
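A sketch of the null-vs-zero distinction introduced here, using illustrative names rather than the real terms-aggregation classes:
```
// Hedged sketch (not the real reduce logic): a null docCountError means "not calculated
// yet"; a 0 means the error really was computed and is 0.
static Long combineDocCountError(Long a, Long b) {
    if (a == null || b == null) {
        return null;      // unknown stays unknown instead of being treated as 0
    }
    return a + b;         // illustrative combination of two known errors
}
```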
Christos Soulios df941367df
Add dimension mapping parameter (#74450)
Added the dimension parameter to the following field types:

    keyword
    ip
    Numeric field types (integer, long, byte, short)

The dimension parameter is of type boolean (default: false) and is used 
to mark that a field is a time series dimension field.

Relates to #74014
2021-06-24 20:16:27 +03:00
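A sketch of what a mapping using the new parameter might look like, embedded here as a Java text block; the field names are illustrative.
```
// Hypothetical mapping marking a keyword and an ip field as time series dimensions.
String mapping = """
    {
      "properties": {
        "host.name": { "type": "keyword", "dimension": true },
        "host.ip":   { "type": "ip",      "dimension": true },
        "cpu.usage": { "type": "double" }
      }
    }
    """;
```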
Rory Hunter a5d2251064
Order imports when reformatting (#74059)
Change the formatter config to sort / order imports, and reformat the
codebase. We already had a config file for Eclipse users, so Spotless now
uses that.

The "Eclipse Code Formatter" plugin ought to be able to use this file as
well for import ordering, but in my experiments the results were poor.
Instead, use IntelliJ's `.editorconfig` support to configure import
ordering.

I've also added a config file for the formatter plugin.

Other changes:
   * I've quietly enabled the `toggleOnOff` option for Spotless. It was
     already possible to disable formatting for sections using the markers
     for docs snippets, so enabling this option just accepts this reality
     and makes it possible via `formatter:off` and `formatter:on` without
     the restrictions around line length. It should still only be used as
     a very last resort and with good reason.
   * I've removed mention of the `paddedCell` option from the contributing
     guide, since I haven't had to use that option for a very long time. I
     moved the docs to the spotless config.
2021-06-16 09:22:22 +01:00
Ryan Ernst ab1a2e4a84
Add precommit task for detecting split packages (#73784)
Modularization of the JDK has been ongoing for several years. Recently
in Java 16 the JDK began enforcing module boundaries by default. While
Elasticsearch does not yet use the module system directly, there are
some side effects even for those projects not modularized (eg #73517).
Before we can even begin to think about how to modularize, we must
Prepare The Way by enforcing that each package exists in only a single jar file,
since the module system does not allow packages to coexist in multiple
modules.

This commit adds a precommit check to the build which detects split
packages. The expectation is that we will add the existing split
packages to the ignore list so that any new classes will not exacerbate
the problem, and the work to cleanup these split packages can be
parallelized.

relates #73525
2021-06-08 15:04:23 -07:00
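A self-contained sketch of the core idea behind such a check; the real precommit task lives in the build plugin and also consults an ignore list, which is omitted here.
```
import java.io.IOException;
import java.util.*;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Illustrative split-package detector: record which jars contain each package and
// flag any package that shows up in more than one jar.
public class SplitPackageCheck {
    public static void main(String[] args) throws IOException {
        Map<String, Set<String>> packageToJars = new HashMap<>();
        for (String jarPath : args) {
            try (JarFile jar = new JarFile(jarPath)) {
                for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements(); ) {
                    String name = e.nextElement().getName();
                    if (name.endsWith(".class") && name.contains("/")) {
                        String pkg = name.substring(0, name.lastIndexOf('/')).replace('/', '.');
                        packageToJars.computeIfAbsent(pkg, k -> new TreeSet<>()).add(jarPath);
                    }
                }
            }
        }
        packageToJars.forEach((pkg, jars) -> {
            if (jars.size() > 1) {
                System.out.println("Split package " + pkg + " found in " + jars);
            }
        });
    }
}
```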
Ryan Ernst 68817d7ca2
Rename o.e.common in libs/core to o.e.core (#73909)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
2021-06-08 09:53:28 -07:00
Luca Cavanna 05ca9cf876
Remove getMatchingFieldTypes method (#73655)
FieldTypeLookup and MappingLookup expose the getMatchingFieldTypes method to look up matching field type by a string pattern. We have migrated ExistsQueryBuilder to instead rely on getMatchingFieldNames, hence we can go ahead and remove the remaining usages and the method itself.

The remaining usages are to find specific field types from the mappings, specifically to eagerly load global ordinals and for the join field type. These are operations that are performed only once when loading the mappings, and may be refactored to work differently in the future. For now, we remove getMatchingFieldTypes and rather call for the two mentioned scenarios getMatchingFieldNames(*) and then getFieldType for each of the returned field name. This is a bit wasteful but performance can be sacrificed for these scenarios in favour of less code to maintain.
2021-06-03 10:01:22 +02:00
Nik Everett 4b5aebe8b0
Add setting to disable aggs optimization (#73620)
Sometimes our fancy "run this agg as a Query" optimizations end up
slower than running the aggregation in the old way. We know that and use
heuristics to disable the optimization in that case. But it turns out
that the process of running the heuristics itself can be slow, depending
on the query. Worse, changing the heuristics requires an upgrade, which
means waiting. If the heuristics make a terrible choice folks need a
quick way out. This adds such a way: a cluster level setting that
contains a list of queries that are considered "too expensive" to try
and optimize. If the top level query contains any of those queries we'll
disable the "run as Query" optimization.

The default for this setting is wildcard and term-in-set queries, which
is fairly conservative. There are certainly wildcard and term-in-set
queries that the optimization works well with, but there are other queries
of that type that it works very badly with. So we're being careful.

Better, you can modify this setting in a running cluster to disable the
optimization if we find a new type of query that doesn't work well.

Closes #73426
2021-06-02 09:12:54 -04:00
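A sketch of the decision being added, with an illustrative query-tree type rather than the real QueryBuilder classes; the set contents stand in for the cluster setting's value.
```
// Hypothetical query tree node: skip the "run as Query" optimization when the top level
// query, or any clause below it, is one of the "too expensive" query types.
record QueryNode(String type, java.util.List<QueryNode> clauses) {
    boolean containsAnyOf(java.util.Set<String> tooExpensive) {
        return tooExpensive.contains(type)
            || clauses.stream().anyMatch(clause -> clause.containsAnyOf(tooExpensive));
    }
}
// e.g. topLevelQuery.containsAnyOf(Set.of("wildcard", "terms")) -> use the old collection path
```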
Alan Woodward 3bd594ebe8
Replace simpleMatchToFullName (#72674)
MappingLookup has a method simpleMatchToFullName that attempts
to return all field names that match a given pattern; if no patterns match,
then it returns a single-valued collection containing just the pattern that
was originally passed in. This is a fairly confusing semantic.

This PR replaces simpleMatchToFullName with two new methods:

* getMatchingFieldNames(), which returns a set of all mapped field names
  that match a pattern. Calling getFieldType() with a name returned by
  this method is guaranteed to return a non-null MappedFieldType
* getMatchingFieldTypes, that returns a collection of all MappedFieldTypes
  in a mapping that match the passed-in pattern.

This allows us to clean up several call-sites because we know that
MappedFieldTypes returned from these calls will never be null. It also
simplifies object field exists query construction.
2021-05-13 11:35:23 +01:00
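A short usage fragment for the new pair of methods; the method names come from the description above, while the surrounding code and pattern are illustrative.
```
// Every name returned by getMatchingFieldNames is guaranteed to resolve to a
// non-null MappedFieldType, so the null checks at the call sites go away.
for (String fieldName : mappingLookup.getMatchingFieldNames("user.*")) {
    MappedFieldType fieldType = mappingLookup.getFieldType(fieldName);  // never null here
    // ... build the exists/term query for fieldType ...
}
```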
Nik Everett fad5e44b99
update benchmark readme (#72620)
Documents that version 2.0 of the async profiler doesn't seem to work
with jmh. Fixes some syntax in another profiling example.
2021-05-03 11:30:50 -04:00
Ryan Ernst d933ecd26c
Convert path.data to String setting instead of List (#72282)
Since multiple data path support has been removed, the Setting no longer
needs to support multiple values. This commit converts the
PATH_DATA_SETTING to a String setting from List<String>.

relates #71205
2021-04-27 08:29:12 -07:00
Rene Groeschke 5bcd02cb4d
Restructure build tools java packages (#72030)
Related to #71593 we move all build logic that is for elasticsearch build only into
the org.elasticsearch.gradle.internal* packages

This makes it clearer if build logic is considered to be used by external projects.
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.

This is a very first step towards that direction.
2021-04-26 14:53:55 +02:00
Nik Everett 57e6c78a52
Fix profiled global agg (#71575)
This fixes the `global` aggregator when `profile` is enabled. It does so
by removing all of the special case handling for `global` aggs in
`AggregationPhase` and having the global aggregator itself perform the
scoped collection using the same trick that we use in filter-by-filter
mode of the `filters` aggregation.

Closes #71098
2021-04-13 08:36:51 -04:00
Rory Hunter ac371b070a
Refresh formatter config (#71588)
Write out the formatter config using the latest Eclipse. This has the
effect of configuring assertion formatting properly, which has improved
how some of our assertion messages are formatted. Also reconfigure how
annotations are formatted, so that they are correctly line-wrapped.
2021-04-13 09:33:41 +01:00
Armin Braun b583ea82ad
Use ByteBufferStreamInput to Stream Byte Arrays (#71538)
For bulk operations that fall back to hotspot intrinsic code (reading short, int, long),
using this stream brings a massive speedup. The added benchmark for reading `long` values
sees a ~100x speedup in local benchmarking and the vLong read benchmark still sees a slightly
under ~10x speedup.
Also, this PR moves creation of the `StreamInput` out of the hot benchmark loop for all the bytes
reference benchmarks to make the benchmark less noisy and more practically useful.
(the `readLong` case using intrinsic operations is so fast that it wouldn't even show up in a profile relative to
instantiating the stream otherwise).

Relates work in #71181
2021-04-12 22:25:06 +02:00
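A self-contained sketch of why a ByteBuffer-backed stream helps: a long can be read with one bulk, intrinsic-friendly getLong() instead of being assembled from eight single-byte reads. The loop and sizes are illustrative.
```
import java.nio.ByteBuffer;

public class LongReadSketch {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[8 * 1024]);   // big-endian by default
        long sum = 0;
        while (buffer.remaining() >= Long.BYTES) {
            sum += buffer.getLong();                               // bulk, intrinsic-friendly read
        }
        // The byte-at-a-time path this replaces looks like:
        // value = (b0 & 0xFFL) << 56 | (b1 & 0xFFL) << 48 | ... | (b7 & 0xFFL);
        System.out.println(sum);
    }
}
```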
Armin Braun 2540c7489b
Optimize Reading vInt and vLong from BytesReference (#71522)
Same optimization as in #71181 (also used by buffering Lucene DataInput implementations) but for the variable length encodings.
Benchmarks show a ~50% speedup for the benchmarked mix of values for `vLong`. Generally this change helps the most with large values but shows a slight speedup even for the 1 byte length case by avoiding some indirection and bounds checking.
2021-04-09 21:12:52 +02:00
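A sketch of the variable-length decoding being optimized, over a flat byte array; bounds checks and the multi-page case are omitted.
```
// Seven payload bits per byte, high bit set on all but the last byte.
static long readVLong(byte[] bytes, int offset) {
    long value = 0;
    for (int shift = 0; ; shift += 7) {
        byte b = bytes[offset++];
        value |= (b & 0x7FL) << shift;
        if ((b & 0x80) == 0) {
            return value;
        }
    }
}
```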
Armin Braun afaf26fbd5
Add Benchmark for Long Reads from BytesReference Stream (#71310)
Relates #70800

This reproduces the issue reported in #70800 and demonstrates that the fix in  #71181 brings
about a 5x speedup for reading `long` from a bytes reference stream when backed by a paged bytes
reference.
2021-04-06 09:20:53 +02:00
Alan Woodward 1653f2fe91
Add script parameter to long and double field mappers (#69531)
This commit adds a script parameter to long and double fields that makes
it possible to calculate a value for these fields at index time. It uses the same
script context as the equivalent runtime fields, and allows for multiple index-time
scripted fields to cross-refer while still checking for indirection loops.
2021-03-31 11:14:11 +01:00
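A sketch (as a Java text block) of an index-time scripted long field; the field names and the script body are illustrative, and the exact script syntax is an assumption based on the matching runtime-field context.
```
// Hypothetical mapping: duration_ms is computed at index time from two other mapped
// fields, using the same emit()-style context as the equivalent runtime field.
String mapping = """
    {
      "properties": {
        "start_ms": { "type": "long" },
        "end_ms":   { "type": "long" },
        "duration_ms": {
          "type": "long",
          "script": "emit(doc['end_ms'].value - doc['start_ms'].value)"
        }
      }
    }
    """;
```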
Jim Ferenczi ff50da5a77
Remove the _parent_join metadata field (#70143)
This commit removes the metadata field _parent_join
that was needed to ensure that only one join field is used in a mapping.
It is replaced with a validation at the field level.
This change also fixes a [bug](https://github.com/elastic/kibana/issues/92960) in the handling of parent join fields in _field_caps.
This metadata field throws an unexpected exception in [7.11](https://github.com/elastic/elasticsearch/pull/63878)
when checking if the field is aggregatable.
That's now fixed since this unused field has been removed.
2021-03-10 09:19:30 +01:00
Nik Everett 9545dafdd5
Modest memory savings in date_histogram>terms (#68592)
This saves 16 bytes of memory per bucket for some aggregations.
Specifically, it kicks in when there is a parent bucket and we have a good
estimate of its upper bound cardinality, and we have a good estimate of the
per-bucket cardinality of this aggregation, and both those upper bounds
will fit into a single `long`.

That sounds unlikely, but there is a fairly common case where we have it:
a `date_histogram` followed by a `terms` aggregation powered by global
ordinals. This is common enough that we already had at least two rally
operations for it:
* `date-histo-string-terms-via-global-ords`
* `filtered-date-histo-string-terms-via-global-ords`

Running those rally tracks shows that the space savings yields a small
but statistically significant performance bump. The 90th percentile service
time drops by about 4% in the unfiltered case and 1% for the filtered
case. That's not great but it's good to know saving 16 bytes doesn't slow
us down.

```
|       50th percentile latency |          date-histo | 3185.77 | 3028.65 | -157.118 | ms |
|       90th percentile latency |          date-histo | 3237.07 | 3101.32 | -135.752 | ms |
|      100th percentile latency |          date-histo | 3270.53 |  3178.7 | -91.8319 | ms |
|  50th percentile service time |          date-histo | 3181.55 | 3024.32 | -157.238 | ms |
|  90th percentile service time |          date-histo | 3232.91 | 3097.67 | -135.238 | ms |
| 100th percentile service time |          date-histo | 3266.63 | 3175.08 | -91.5494 | ms |
|       50th percentile latency | filtered-date-histo | 1349.22 | 1331.94 | -17.2717 | ms |
|       90th percentile latency | filtered-date-histo | 1402.71 |  1383.7 | -19.0131 | ms |
|      100th percentile latency | filtered-date-histo | 1412.41 |  1397.7 | -14.7139 | ms |
|  50th percentile service time | filtered-date-histo | 1345.18 |  1326.2 | -18.9806 | ms |
|  90th percentile service time | filtered-date-histo | 1397.24 | 1378.14 | -19.1031 | ms |
| 100th percentile service time | filtered-date-histo | 1406.69 | 1391.63 | -15.0529 | ms |
```

The microbenchmarks for `LongKeyedBucketOrds`, the interface we're
targeting, show a performance boost on the method in the hot path of about
13%. This is obviously not the entire hot path, given that the 13% savings
translated to a 4% performance savings over the whole agg. But it's
something.

```
Benchmark                                            Mode  Cnt   Score   Error  Units
multiBucketMany                                      avgt    5  10.038 ± 0.009  ns/op
multiBucketManySmall                                 avgt    5   8.738 ± 0.029  ns/op
singleBucketIntoMulti                                avgt    5   7.701 ± 0.073  ns/op
singleBucketIntoSingleImmutableBimorphicInvocation   avgt    5   6.160 ± 0.029  ns/op
singleBucketIntoSingleImmutableMonmorphicInvocation  avgt    5   6.571 ± 0.043  ns/op
singleBucketIntoSingleMutableBimorphicInvocation     avgt    5   7.714 ± 0.010  ns/op
singleBucketIntoSingleMutableMonmorphicInvocation    avgt    5   7.459 ± 0.017  ns/op
```

While I was touching the JMH benchmarks for `LongKeyedBucketOrds` I took
the opportunity to try and make the runs that collect from a single
bucket more comparable to the ones that collect from many buckets. It
only seemed fair.
2021-02-19 15:59:23 -05:00
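A sketch of the packing that buys the 16 bytes per bucket; the method names are illustrative, not the real LongKeyedBucketOrds code.
```
// When both ordinals have known upper bounds that together fit in 64 bits, the pair
// (owning bucket ordinal, ordinal within that bucket) can live in a single long.
static long pack(long owningBucketOrd, long ordInOwningBucket, int bitsForOrdInOwningBucket) {
    return (owningBucketOrd << bitsForOrdInOwningBucket) | ordInOwningBucket;
}

static long owningBucketOrd(long packed, int bitsForOrdInOwningBucket) {
    return packed >>> bitsForOrdInOwningBucket;
}

static long ordInOwningBucket(long packed, int bitsForOrdInOwningBucket) {
    return packed & ((1L << bitsForOrdInOwningBucket) - 1);
}
```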
Nik Everett 58a5e653e1
Add benchmark racing scripts (#68369)
This adds a microbenchmark running our traditional `ScriptScoreQuery`
race. This races Lucene Expressions, Painless, and a hand rolled
implementation of `ScoreScript`. Through the magic of the async
profiler, this revealed a few bottlenecks that hit painless that we
likely can fix! Happy times.

Co-authored-by: Rene Groeschke <rene@breskeby.com>
2021-02-03 12:18:05 -05:00
Mark Vieira a92a647b9f Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

 - Updating LICENSE and NOTICE files throughout the code base, as well
   as those packaged in our published artifacts
 - Update IDE integration to now use the new license header on newly
   created source files
 - Remove references to the "OSS" distribution from our documentation
 - Update build time verification checks to no longer allow Apache 2.0
   license header in Elasticsearch source code
 - Replace all existing Apache 2.0 license headers for non-xpack code
   with updated header (vendored code with Apache 2.0 headers obviously
   remains the same).
 - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 16:10:53 -08:00
Nik Everett a5f3787be4
It's flame graph time! (#68312)
Upgrade JMH to the latest version (1.26) to pick up its async profiler integration
and update the documentation to include instructions for running the
async profiler and making pretty pretty flame graphs.
2021-02-02 11:11:16 -05:00
Nik Everett de8b39e52d
Lower contention on requests with many aggs (#66895)
This lowers the contention on the `REQUEST` circuit breaker when building
many aggregations on many threads by preallocating a chunk of breaker
up front. This cuts down on the number of times we enter the busy loop
in `ChildMemoryCircuitBreaker.limit`. Now we hit it one time when building
aggregations. We still hit the busy loop if we collect many buckets.

We let the `AggregationBuilder` pick the size of the "chunk" that we
preallocate, but it doesn't have much to go on - not even the field types.
But it is available in a convenient spot and the estimates don't have to
be particularly accurate.

The benchmarks on my 12 core desktop are interesting:
```
Benchmark         (breaker)  Mode  Cnt    Score    Error  Units
sum                    noop  avgt   10    1.672 ±  0.042  us/op
sum                    real  avgt   10    4.100 ±  0.027  us/op
sum             preallocate  avgt   10    4.230 ±  0.034  us/op
termsSixtySums         noop  avgt   10   92.658 ±  0.939  us/op
termsSixtySums         real  avgt   10  278.764 ± 39.751  us/op
termsSixtySums  preallocate  avgt   10  120.896 ± 16.097  us/op
termsSum               noop  avgt   10    4.573 ±  0.095  us/op
termsSum               real  avgt   10    9.932 ±  0.211  us/op
termsSum        preallocate  avgt   10    7.695 ±  0.313  us/op
```

They show pretty clearly that not using the circuit breaker at all is
faster. But we can't do that because we don't want to bring the node
down on bad aggs. When there are many aggs (termsSixtySums) the
preallocation claws back much of the performance. It even helps
marginally when there are two aggs (termsSum). For a single agg (sum)
we see a 130 nanosecond hit. Fine.

But these values are all pretty small. At best we're seeing a 160
microsecond savings. Not so on a 160 vCPU machine:

```
Benchmark         (breaker)  Mode  Cnt      Score       Error  Units
sum                    noop  avgt   10     44.956 ±     8.851  us/op
sum                    real  avgt   10    118.008 ±    19.505  us/op
sum             preallocate  avgt   10    241.234 ±   305.998  us/op
termsSixtySums         noop  avgt   10   1339.802 ±    51.410  us/op
termsSixtySums         real  avgt   10  12077.671 ± 12110.993  us/op
termsSixtySums  preallocate  avgt   10   3804.515 ±  1458.702  us/op
termsSum               noop  avgt   10     59.478 ±     2.261  us/op
termsSum               real  avgt   10    293.756 ±   253.854  us/op
termsSum        preallocate  avgt   10    197.963 ±    41.578  us/op
```

All of these numbers are larger because we're running all the CPUs
flat out and we're seeing more contention everywhere. Even the "noop"
breaker sees some contention, but I think it is mostly around memory
allocation. Anyway, with many many (termsSixtySums) aggs we're looking
at 8 milliseconds of savings by preallocating. Just by dodging the busy
loop as much as possible. The error in the measurements there is
substantial. Here are the runs:
```
real:
Iteration   1: 8679.417 ±(99.9%) 273.220 us/op
Iteration   2: 5849.538 ±(99.9%) 179.258 us/op
Iteration   3: 5953.935 ±(99.9%) 152.829 us/op
Iteration   4: 5763.465 ±(99.9%) 150.759 us/op
Iteration   5: 14157.592 ±(99.9%) 395.224 us/op
Iteration   1: 24857.020 ±(99.9%) 1133.847 us/op
Iteration   2: 24730.903 ±(99.9%) 1107.718 us/op
Iteration   3: 18894.383 ±(99.9%) 738.706 us/op
Iteration   4: 5493.965 ±(99.9%) 120.529 us/op
Iteration   5: 6396.493 ±(99.9%) 143.630 us/op
preallocate:
Iteration   1: 5512.590 ±(99.9%) 110.222 us/op
Iteration   2: 3087.771 ±(99.9%) 120.084 us/op
Iteration   3: 3544.282 ±(99.9%) 110.373 us/op
Iteration   4: 3477.228 ±(99.9%) 107.270 us/op
Iteration   5: 4351.820 ±(99.9%) 82.946 us/op
Iteration   1: 3185.250 ±(99.9%) 154.102 us/op
Iteration   2: 3058.000 ±(99.9%) 143.758 us/op
Iteration   3: 3199.920 ±(99.9%) 61.589 us/op
Iteration   4: 3163.735 ±(99.9%) 71.291 us/op
Iteration   5: 5464.556 ±(99.9%) 59.034 us/op
```

That variability from 5.5ms to 25ms is terrible. It makes me not
particularly trust the 8ms savings from the report. But still,
the preallocating method has much less variability between runs
and almost all the runs are faster than all of the non-preallocated
runs. Maybe the savings is more like 2 or 3 milliseconds, but still.
Or maybe we should think of the savings as worst vs worst? If so it's
19 milliseconds.

Anyway, it's hard to measure how much this helps. But, certainly some.

Closes #58647
2021-01-04 11:26:42 -05:00
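A hedged sketch of the preallocation idea, with an illustrative breaker interface rather than the real org.elasticsearch CircuitBreaker: reserve one chunk from the contended shared breaker up front, then serve small per-aggregator requests from the local chunk without touching the shared breaker again.
```
interface Breaker {
    void addEstimateBytesAndMaybeBreak(long bytes, String label);
}

class PreallocatedBreaker {
    private final Breaker shared;     // the contended, shared REQUEST breaker
    private long localBytesFree;

    PreallocatedBreaker(Breaker shared, long chunkBytes) {
        this.shared = shared;
        shared.addEstimateBytesAndMaybeBreak(chunkBytes, "preallocate[aggregations]");
        this.localBytesFree = chunkBytes;
    }

    void request(long bytes, String label) {
        if (bytes <= localBytesFree) {
            localBytesFree -= bytes;                            // no contention here
        } else {
            shared.addEstimateBytesAndMaybeBreak(bytes, label); // fall back to the shared breaker
        }
    }
}
```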
Ioannis Kakavas 6e5915dadb
Dependency Graph gradle Task (#63641)
This change adds a gradle task that builds a simplified dependency graph
of our runtime dependencies and pushes that to be monitored by a
software composition analysis service.
2020-11-16 11:23:04 +02:00
Rene Groeschke 810e7ff6b0
Move tasks in build scripts to task avoidance api (#64046)
- Some trivial cleanup on build scripts
- Change task referencing in build scripts to use task avoidance api
where replacement is trivial.
2020-11-12 12:04:15 +01:00
Yannick Welsch 2afec0d916
Determine shard size before allocating shards recovering from snapshots (#61906)
Determines the size of shards before allocating shards that are
recovering from snapshots. It ensures during shard allocation that the
node selected as the recovery target will have enough free
disk space for the recovery. This applies to regular restores,
CCR bootstrap from remote, as well as mounting searchable snapshots.

The InternalSnapshotInfoService is responsible for fetching snapshot 
shard sizes from repositories. It provides a getShardSize() method 
to other components of the system that can be used to retrieve the 
latest known shard size. If the latest snapshot shard size retrieval
failed, getShardSize() returns
ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While
we'd like a better way to handle such failures, returning this value
allows us to keep the existing behavior for now.

Note that this PR does not address an issue (that we already have today)
where a replica is being allocated without knowing how much disk
space is being used by the primary.
2020-10-06 17:29:42 +02:00
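A fragment sketching how an allocation-time check might consume the service described above; getShardSize() and UNAVAILABLE_EXPECTED_SHARD_SIZE are named in the commit, the rest is illustrative.
```
long expectedShardSize = snapshotShardSizeInfo.getShardSize(shardRouting);
if (expectedShardSize == ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE) {
    // size lookup failed: keep the existing behaviour and skip the disk-space check
} else if (expectedShardSize > freeBytesOnCandidateNode) {
    // not enough free disk space on this node for the recovery, try another node
}
```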
Jim Ferenczi fbed2a1709
Request-level circuit breaker support on coordinating nodes (#62223)
This commit allows the coordinating node to account for the memory used to perform partial and final reduces of
aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save
and reduce the results of shard aggregations to the request circuit breaker. Before any partial or final
reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException is thrown
if it exceeds the maximum memory allowed by this breaker.
This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced.
This estimation can be completely off for some aggregations but it is corrected with the real size after
the reduce completes.
If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations
and replace the estimation with the serialized size of the newly reduced result.

As a follow up we could trigger partial reduces based on the memory accounted in the circuit breaker instead
of relying on a static number of shard responses. A simpler follow up that could be done in the meantime is
to [reduce the default batch reduce size](https://github.com/elastic/elasticsearch/issues/51857) of blocking
search requests to a more sane number.

Closes #37182
2020-09-24 14:02:49 +02:00
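A sketch of the estimate described above (roughly 1.5 times the serialized size), kept deliberately simple; the surrounding accounting and the later correction to the real size are omitted.
```
// Reserve ~1.5x the serialized size of the shard aggregations before reducing; after a
// successful reduce the estimate is replaced by the real serialized size of the result.
static long estimatedReduceBytes(long serializedAggsBytes) {
    return serializedAggsBytes + (serializedAggsBytes >> 1);   // serialized size * ~1.5
}
```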
Nik Everett dfc45396e7
Speed up writeVInt (#62345)
This speeds up `StreamOutput#writeVInt` quite a bit which is nice
because it is *very* commonly called when serializing aggregations. Well,
when serializing anything. All "collections" serialize their size as a
vint. Anyway, I was examining the serialization speeds of `StringTerms`
and this saves about 30% of the write time for that. I expect it'll be
useful in other places.
2020-09-15 14:20:53 -04:00
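The encode side of the vInt format, as a sketch over a flat output array; the real method writes to a StreamOutput and handles page boundaries.
```
// Seven payload bits per byte, high bit set on every byte except the last; small values
// (the common case for collection sizes) take a single byte.
static int writeVInt(byte[] out, int offset, int value) {
    while ((value & ~0x7F) != 0) {
        out[offset++] = (byte) ((value & 0x7F) | 0x80);
        value >>>= 7;
    }
    out[offset++] = (byte) value;
    return offset;   // position after the last byte written
}
```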
Nik Everett 1af8d9f228
Rework checking if a year is a leap year (#60585)
This way is faster, saving about 8% on the microbenchmark that rounds to
the nearest month. That is in the hot path for `date_histogram` which is
a very popular aggregation so it seems worth it to at least try and
speed it up a little.
2020-08-05 16:09:51 -04:00
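For reference, a compact leap-year test equivalent to the Gregorian rule; the exact rewrite used in the commit may differ, but the shape is this.
```
// Divisible by 4, except century years unless they are also divisible by 400.
static boolean isLeapYear(long year) {
    return (year & 3) == 0 && (year % 100 != 0 || year % 400 == 0);
}
```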
Nik Everett e84a501f00
Add microbenchmark for LongKeyedBucketOrds (#58608)
I've always been confused by the strange behavior that I saw when
working on #57304. Specifically, I saw switching from a bimorphic
invocation to a monomorphic invocation to give us a 7%-15% performance
bump. This felt *bonkers* to me. And, it also made me wonder whether
it'd be worth looking into doing it everywhere.

It turns out that, no, it isn't needed everywhere. This benchmark shows
that a bimorphic invocation like:
```
LongKeyedBucketOrds ords = new LongKeyedBucketOrds.ForSingle();
ords.add(0, 0); <------ this line
```

is 19% slower than a monomorphic invocation like:
```
LongKeyedBucketOrds.ForSingle ords = new LongKeyedBucketOrds.ForSingle();
ords.add(0, 0); <------ this line
```

But *only* when the reference is mutable. In the example above, if
`ords` is never changed then both perform the same. But if the `ords`
reference is assigned twice then we start to see the difference:
```
immutable bimorphic    avgt   10   6.468 ± 0.045  ns/op
immutable monomorphic  avgt   10   6.756 ± 0.026  ns/op
mutable   bimorphic    avgt   10   9.741 ± 0.073  ns/op
mutable   monomorphic  avgt   10   8.190 ± 0.016  ns/op
```

So the conclusion from all this is that we've done the right thing:
`auto_date_histogram` is the only aggregation in which `ords` isn't final
and it is the only aggregation that forces monomorphic invocations. All
other aggregations use an immutable bimorphic invocation. Which is fine.

Relates to #56487
2020-07-13 16:01:20 -04:00
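A self-contained sketch of the two call shapes being compared; the classes are illustrative stand-ins, not the real LongKeyedBucketOrds hierarchy. As the commit describes, the declared type of the reference (and whether it is reassigned) is what separates the bimorphic case from the monomorphic one.
```
public class InvocationShapes {
    abstract static class BucketOrds { abstract long add(long owningBucketOrd, long value); }

    static final class ForSingle extends BucketOrds {
        private long nextOrd;
        @Override long add(long owningBucketOrd, long value) { return nextOrd++; }
    }

    public static void main(String[] args) {
        BucketOrds bimorphic = new ForSingle();   // call through the abstract declared type
        ForSingle monomorphic = new ForSingle();  // call through the concrete declared type
        System.out.println(bimorphic.add(0, 0) + " " + monomorphic.add(0, 0));
    }
}
```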
Rene Groeschke ef6eb3af3c
Fix dependency related deprecations (#58892) 2020-07-07 11:29:26 +02:00
Rene Groeschke 9526c7a4b3
Replace compile configuration usage with api (#58451)
- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make the test runtime classpath an input for the testing convention
  - required as a java library will by default not build the jar file
  - the jar file is now an explicit input of the task and gradle will ensure it is properly built
2020-06-30 09:37:09 +02:00
Rene Groeschke 5f9d1f1d7c
Unify dependency licenses task configuration (#58116)
- Remove duplicate dependency configuration
- Use task avoidance api across the build
- Remove redundant licensesCheck config
2020-06-17 18:27:16 +02:00
Nik Everett 8478ee65ff
Speed up time interval rounding around dst (#56371)
When an index spans a daylight savings time transition we can't use our
optimization that rewrites the requested time zone to a fixed time zone
and instead we used to fall back to a java.util.time based rounding
implementation. In #55559 we optimized "time unit" rounding. This
optimizes "time interval" rounding.

The java.util.time based implementation is about 1650% slower than the
rounding implementation for a fixed time zone. This replaces it with a
similar optimization that is only about 30% slower than the fixed time
zone. The java.util.time implementation allocates a ton of short lived
objects but the optimized implementation doesn't. So it *might* end up
being faster than the microbenchmarks imply.
2020-05-07 17:45:50 -04:00
Nik Everett 0097a86d53
Optimize date_histograms across daylight savings time (#55559)
Rounding dates on a shard that contains a daylight savings time transition
is currently something like 1400% slower than when a shard contains dates
only on one side of the DST transition. And it makes a ton of short lived
garbage. This replaces that implementation with one that benchmarks to
having around 30% overhead instead of the 1400%. And it doesn't generate
any garbage per search hit.

Some background:
There are two ways to round in ES:
* Round to the nearest time unit (Day/Hour/Week/Month/etc)
* Round to the nearest time *interval* (3 days/2 weeks/etc)

I'm only optimizing the first one in this change and plan to do the second
in a follow up. It turns out that rounding to the nearest unit really *is*
two problems: when the unit rounds to midnight (day/week/month/year) and
when it doesn't (hour/minute/second). Rounding to midnight is consistently
about 25% faster than rounding to individual hours or minutes.

This optimization relies on being able to *usually* figure out what the
minimum and maximum dates are on the shard. This is similar to an existing
optimization where we rewrite time zones that aren't fixed
(think America/New_York and its daylight savings time transitions) into
fixed time zones so long as there isn't a daylight savings time transition
on the shard (UTC-5 or UTC-4 for America/New_York). Once I implement
time interval rounding the time zone rewriting optimization *should* no
longer be needed.

This optimization doesn't come into play for `composite` or
`auto_date_histogram` aggs because neither have been migrated to the new
`DATE` `ValuesSourceType` which is where that range lookup happens. When
they are they will be able to pick up the optimization without much work.
I expect this to be substantial for `auto_date_histogram` but less so for
`composite` because it deals with fewer values.

Note: My 30% overhead figure comes from small numbers of daylight savings
time transitions. That overhead gets higher when there are more
transitions in logarithmic fashion. When there are two thousand years
worth of transitions my algorithm ends up being 250% slower than rounding
without a time zone, but java time is 47000% slower at that point,
allocating memory as fast as it possibly can.
2020-05-07 07:22:32 -04:00
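A sketch of the shape of the optimization, not the real Rounding implementation: precompute the unit boundaries, already shifted by the zone's transitions, for the shard's min..max range, then rounding becomes an array lookup instead of a java.time computation.
```
import java.util.Arrays;

// Assumes utcMillis is >= the first precomputed boundary; real code handles the edges
// and falls back to java.time outside the precomputed range.
static long roundDown(long[] boundariesUtcMillis, long utcMillis) {
    int idx = Arrays.binarySearch(boundariesUtcMillis, utcMillis);
    if (idx < 0) {
        idx = -2 - idx;   // insertion point minus one: the boundary at or before the value
    }
    return boundariesUtcMillis[idx];
}
```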
Ryan Ernst 842ce32870
Use task avoidance with forbidden apis (#55034)
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our forbidden apis
usages, in preparation for upgrading to 3.0 when it is released.
2020-04-15 13:23:55 -07:00
Tanguy Leroux f6feb6c2c8
Merge feature/searchable-snapshots branch into master (#54803)
This commit merges the searchable-snapshots feature branch into master.
See #54803 for the complete list of squashed commits.

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-06 15:51:05 +02:00
Jason Tedor 95a7eed9aa
Rename MetaData to Metadata in all of the places (#54519)
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
2020-03-31 15:52:01 -04:00
Rory Hunter d32b6671f6
Exclude generated source from benchmarks formatting (#52968)
IDEs can sometimes run annotation processors that leave files in
`src/main/generated/**/*.java`, causing Spotless to complain. Even
though this path ought not to exist, exclude it anyway in order to avoid
spurious failures.
2020-02-28 20:54:47 +00:00
Rory Hunter a93d98b2c7
Opt-in :benchmarks to automatic formatting (#52756)
Also, in the Allocators class, a number of methods declared exceptions that IntelliJ reported were never thrown, and as far as I could see this is true, so I removed the exception declarations.
2020-02-26 10:37:32 +00:00
Maria Ralli 3473987fdf
Remove Xlint exclusions from gradle files (#52542)
This commit is part of issue #40366 to remove disabled Xlint warnings
from gradle files. In particular, it removes the Xlint exclusions from
the following files:

- benchmarks/build.gradle
- client/client-benchmark-noop-api-plugin/build.gradle
- x-pack/qa/rolling-upgrade/build.gradle
- x-pack/qa/third-party/active-directory/build.gradle
- modules/transport-netty4/build.gradle

For the first three files no code adjustments were needed. For
x-pack/qa/third-party/active-directory, move the suppression to the code
level. For transport-netty4, replace the variable arguments with
ArrayLists and remove any redundant casts.
2020-02-20 14:06:45 +00:00
Rory Hunter 3a3e5f6176
Apply 2-space indent to all gradle scripts (#48849)
Closes #48724. Update `.editorconfig` to make the Java settings the default
for all files, and then apply a 2-space indent to all `*.gradle` files.
Then reformat all the files.
2019-11-13 10:14:04 +00:00
Mark Vieira af6af346f7
Introduce type-safe and consistent pattern for handling build globals (#48778)
This commit introduces a consistent, and type-safe manner for handling
global build parameters through out our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces an explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.

Closes #42042
2019-11-01 09:54:22 -07:00
Jason Tedor aa12af8a3c
Enable node roles to be pluggable (#43175)
This commit introduces the possibility for a plugin to introduce
additional node roles.
2019-06-13 14:43:14 -04:00
Mark Vieira 12d583dbf6
Remove unnecessary usage of Gradle dependency substitution rules (#42773) 2019-06-03 16:18:45 -07:00
Mark Vieira 323f312bbc
Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
2019-04-08 14:13:59 -07:00
Martijn van Groningen cf55ba54cb
Make -try xlint warning disabled by default. (#40833)
Many gradle projects specifically use the -try exclude flag, because
there are many cases where an auto-closeable resource is never
referenced in the body of the corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.

This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that are already specified in the build plugin (for example -deprecation).

Relates to #40366
2019-04-05 08:01:56 +02:00