elasticsearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	4b5aebe8b0	Add setting to disable aggs optimization (#73620 ) Sometimes our fancy "run this agg as a Query" optimizations end up slower than running the aggregation in the old way. We know that and use heuristics to disable the optimization in that case. But it turns out that the process of running the heuristics itself can be slow, depending on the query. Worse, changing the heuristics requires an upgrade, which means waiting. If the heuristics make a terrible choice folks need a quick way out. This adds such a way: a cluster level setting that contains a list of queries that are considered "too expensive" to try and optimize. If the top level query contains any of those queries we'll disable the "run as Query" optimization. The default for this setting is wildcard and term-in-set queries, which is fairly conservative. There are certainly wildcard and term-in-set queries that the optimization works well with, but there are other queries of that type that it works very badly with. So we're being careful. Better, you can modify this setting in a running cluster to disable the optimization if we find a new type of query that doesn't work well. Closes #73426	2021-06-02 09:12:54 -04:00
Armin Braun	31643a59e5	Remove Dead Branch from IndexMetadataGenerations#withAddedSnapshot (#73658 ) We never do (nor should) call this method to overwrite a snapshots existing entry, checks and assertions upstream make sure of that. => simplified the code accordingly	2021-06-02 14:29:25 +02:00
Ignacio Vera	b2619ec24c	Specialise ScriptDocValues#GeoPoints for single values (#73656 )	2021-06-02 13:06:00 +02:00
David Turner	ceb3099511	Rename AsyncLucenePersistedState to AsyncPersistedState (#73652 ) This class actually has nothing to do with Lucene. This commit adjusts the name to match.	2021-06-02 11:44:53 +01:00
David Turner	92fb60d154	Use NIOFSDirectory in PersistedClusterStateService (#73654 ) Today we use `SimpleFSDirectory` in `PersistedClusterStateService` since we don't need anything fancy. `SimpleFSDirectory` is today deprecated since it's strictly worse than `NIOFSDirectory` so this commit removes usages of the deprecated class.	2021-06-02 11:33:37 +01:00
David Turner	f1abcf1531	Write next cluster state fully on all failures (#73631 ) Today we do not set the `LucenePersistedState#writeNextStateFully` flag on all failures, notably on an `OutOfMemoryError`. Since we don't exit immediately on an OOME we may have failed part-way through writing a full state but still proceed with another apparently-incremental write. With this commit we ensure `LucenePersistedState#writeNextStateFully` is set unless the previous write was successful.	2021-06-02 11:18:34 +01:00
Martijn van Groningen	27dfc58bd6	Take include_aliases flag into account when restoring data stream aliases (#73595 ) Take RestoreSnapshotRequest#includeAliases() into account when restoring data stream aliases from a snapshot into a cluster. Relates to #66163	2021-06-02 11:02:24 +02:00
Martijn van Groningen	6b2322f827	Also rename write data stream alias during a restore. (#73588 ) Rename during a restore should also rename the write data stream in a data stream alias. Relates to #66163	2021-06-02 11:01:59 +02:00
Armin Braun	aaa45cef37	Cache RepositoryData Outright instead of Serialized (#73190 ) Serializing and compressing `RepositoryData` seems to have been the wrong trade-off in hindsight. While saving some heap on a quiet master it makes every repository operation cost heap for a newly instantiated `RepositoryData`. Concurrent repository operations and snapshot API requests can thus easily lead to many duplicate instances on heap causing memory pressure. Limiting caching to smaller instances also appears to have been the wrong choice in hindsight. While duplication of a few 100kb instances of `RepositoryData` is mostly not a big deal, duplicating a `5MB` instance a couple of times (e.g. seen during heavily concurrent get snapshots requests) eventually becomes a problem.	2021-06-02 10:01:13 +02:00
Luca Cavanna	4ca2e0300b	ExistsQueryBuilder to no longer rely on getMatchingFieldTypes (#73617 ) We've been discussing possibly removing `FieldTypeLookup#getMatchingFieldTypes`, or at least its `SearchExecutionContext` variant that applies runtime mappings. This is another step in that direction: the exists query can rely on getMatchingFieldNames instead, and look up field types by name.	2021-06-02 08:33:01 +02:00
William Brafford	80ea64cd2e	[8.x] OsStats must be lenient with bad data from older nodes (#73610 ) We've had a series of bug fixes for cases where an OsProbe gives negative values, most often just -1, to the OsStats class. We added assertions to catch cases where we were initializing OsStats with bad values. Unfortunately, these fixes turned out not to be backwards-compatible. In this commit, we simply coerce bad values to 0 when data is coming from nodes that don't have the relevant bug fixes. Relevant PRs: * #42725 * #56435 * #57317 Fixes #73459	2021-06-01 18:52:55 -04:00
James Baiera	c13384ce01	Add X-Elastic-Product header on all http responses (#73434 ) * Add product response header to all responses * share header value privately across package * Make the product header lowercase. * Do not expose the product header if request is unauthenticated. * Fix checkstyle Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2021-06-01 16:46:03 -04:00
David Turner	e027ce977c	Extend version barrier to all upgrades (#73358 ) Today when upgrading to the next major version we have a so-called _major version barrier_: once the cluster comprises nodes of the new major version then nodes of the previous major version are prevented from joining the cluster. This means we can be certain that `clusterState.nodes().getMinNodeVersion().major` will never decrease, so we can implement upgrade logic that relies on the cluster remaining in its wholly-upgraded state. This commit generalises this behaviour to apply to all upgrades, so that we can be certain that `clusterState.nodes().getMinNodeVersion()` will never decrease in a running cluster. Closes #72911	2021-06-01 09:07:23 +01:00
Tanguy Leroux	3336bb8c7c	Fix small inconsistency in doc_stats search source (#73372 ) This commit fixes a small inconsistency in the FrozenEngine and Engine classes when they refer to the source used to open a Searcher for loading doc stats.	2021-06-01 09:35:11 +02:00
David Turner	a4015fde9b	Expand Javadoc for AllocationService#reroute (#73512 ) Notes that this method can be expensive and recommends the preferred way to call it.	2021-06-01 07:32:03 +01:00
Nhat Nguyen	5efb6eaba6	Update Lucene to 8.9.0-snapshot-ddc238e5df8 (#73568 ) Just include LUCENE-9980, which fixes #39591. Closes #39591	2021-05-31 13:49:29 -04:00
Martijn van Groningen	afc17bdb74	Add support for is_write_index flag to data stream aliases. (#73462 ) This allows indexing documents into a data stream alias. The ingestion is then forwarded to the write index of the data stream that is marked as write data stream. The `is_write_index` parameter can be used to indicate what the write data stream is, when updating / adding a data stream alias. Relates to #66163	2021-05-31 15:08:39 +02:00
Ignacio Vera	fcb523c833	Change rest status code for TaskCancelledException to 400 (#73524 )	2021-05-31 07:26:18 +02:00
Julie Tibshirani	b47d81d2b0	Replace remaining references to 'json' with 'flattened' We missed these when renaming the flattened field type.	2021-05-28 14:54:40 -07:00
Luca Cavanna	78cdd009ed	DynamicFieldType to expose its known subfields names (#73530 ) We use DynamicFieldType to dynamically resolve field names at their lookup. This is currently used by the flattened field mapper and we intend to use it to expose emitting multiple fields from a runtime field script. Those multiple sub-fields will be resolved dynamically, but we need to make a distinction between sub-fields that are known in advance, which we would like to be discoverable e.g. through field_caps at all times, and additional sub-fields that may be resolved at lookup time but are not necessarily known in advance. For this we are introducing a new method to DynamicFieldType that FieldTypeLookup consults in its getMatchingFieldNames and getMatchingFieldTypes methods when the argument is a wildcard pattern.	2021-05-28 18:28:04 +02:00
Luca Cavanna	313d832283	FieldTypeLookup to support dynamic runtime fields (#73519 ) We recently streamlined support for dynamic field lookups in FieldTypeLookup. That is now used by the flattened field mapper. We would also like to use it for runtime fields, hence this commit adds support for dynamic runtime fields. This will be useful to support emitting multiple fields from a single runtime field script, as the sub-fields will be dynamically emitted.	2021-05-28 16:26:10 +02:00
David Turner	2b521e3d9e	Fix illegal access on PIT creation for frozen index (#73517 ) Closes #73514	2021-05-28 12:45:30 +01:00
Armin Braun	14a31b9813	Fix Bug with Concurrent Snapshot and Index Delete (#73456 ) Fixes the incorrect assumption in the snapshot state machine that a finished snapshot delete could only start shard snapshots: in fact it can also move snapshots to a completed state.	2021-05-27 13:26:26 +01:00
Martijn van Groningen	bbb25a01ce	Add more validation for data stream aliases. (#73416 ) Currently when attempting to add an alias that points to both data streams and regular indices the alias does get created, but points only to data streams. With this change attempting to add such aliases results in a client error. Currently when adding data stream aliases with unsupported parameters (e.g. filter or routing) the alias does get created, but without the unsupported parameters. With this change attempting to create aliases to point to data streams with unsupported parameters will result in a client error. Relates to #66163	2021-05-27 13:22:01 +02:00
Armin Braun	b8fb1eb33d	Make ShardGenerations Immutable (#73452 ) This object should be completely immutable. Also added a useful assertion that makes sure we don't accidentally overwrite a valid generation with `null` when dealing with failed status updates.	2021-05-27 10:57:48 +02:00
Julie Tibshirani	aa94cd5212	Make sure to return total hits when field collapsing (#73298 ) Previously, we would always return 0 total hits when there were no groups. Now that collapsing supports search_after, it's possible for total hits to be greater than 0 but no groups to return. This PR also fixes a test bug where we set the wrong missing value for sorting on doubles. Fixes #73270.	2021-05-26 15:46:54 -07:00
David Turner	9f4d5f85f8	Improve failure logging in testCorruptTranslogFiles (#73431 ) Includes the full cluster allocation explain output in the assertion failure message so we can see the state of the world on a failure. Relates #72849	2021-05-26 19:23:49 +01:00
Julie Tibshirani	9e52b290ab	Remove duplicate XCombinedFieldQuery (#73183 ) This query was copied from Lucene and can be removed now that we've upgraded to a Lucene 8.9 snapshot.	2021-05-26 11:04:13 -07:00
Ignacio Vera	409b6cefe3	Add painless script support for geo_shape field (#72886 ) Users can access the centroid, bounding box and dimensional type of the shape.	2021-05-26 18:55:45 +02:00
Alan Woodward	1b060a2de5	Search analyzer should default to configured index analyzer over default (#73359 ) When a search or search_quote analyzer on a text mapper is not defined, we fallback to a configured default search/search_quote analyzer if it exists. However, if an index analyzer has been configured on the mapper then we should first fall back to that. Fixes #73333	2021-05-26 17:03:38 +01:00
Przemysław Witek	7e3f098dcf	[Transform] Revamp transform config and query validation code (#72526 )	2021-05-26 13:49:54 +02:00
Armin Braun	ddc3744b16	Speed up Maps.copyMapWithAddedEntry to Speed up ITs (#73308 ) This method is taking about 4% of CPU time with internal cluster tests for me. 80% of that were coming from the slow immutability assertion, the rest was due to the slow way we were building up the new map. The CPU time slowness likely translates into outright test slowness, because this was mainly hit through adding transport handlers when starting nodes (which happens on the main test thread). Fixed both to save a few % of test runtime.	2021-05-26 12:54:26 +02:00
Martijn van Groningen	628980c1e0	Data stream aliases and action request's includeDataStreams flag. (#73266 ) When data stream aliases are resolved then the includeDataStreams flag of an action request should be taken into account, so that data stream aliases aren't resolved to backing indices for apis that don't support data streams. Closes #73195	2021-05-26 10:07:12 +02:00
William Brafford	584974ef13	Validate that system indices aren't also hidden indices (#72768 ) * Validate that system indices aren't also hidden indices * Remove hidden from ingest geo system index * Add test coverage * Remove hidden setting from system index even if not upgrading	2021-05-25 16:45:49 -04:00
Mark Vieira	6cf3d65388	Add 7.13.1 version constant	2021-05-25 08:26:25 -07:00
Lee Hinman	95bccda599	Remove deprecated ._tier allocation filtering settings (#73074 ) These settings were deprecated in 7.13+ in #72835 and are now removed by this commit. This commit also ensures that the settings are removed from index metadata when the metadata is loaded. The reason for this is that if we allow the settings to remain (because they are not technically "invalid"), then the index will not be able to be allocated, because the FilterAllocationDecider will be looking for nodes with the _tier attribute.	2021-05-24 14:38:34 -06:00
Nik Everett	6aa47a93d7	Fix spurious error in test (#73336 ) Fix a test that was failing to correctly identify that we can't optimize the `exist` filter on keyword fields. We can't optimize it most of the time, but it thought we could sometimes because we can optimize it sometimes. Specifically, when there are no values for that field in the segment at all. The test bumped into segments like that and, correctly, optimized the filter. This changes the test to make sure we never bump into segments like that when we're asserting that we can't optimize the agg. Closes #73185	2021-05-24 13:57:23 -04:00
Armin Braun	f5aa82427d	Stricter Parsing Shard Level Repository Metadata (#73269 ) Similar to #73268 we should be stricter here, especially when we are super-strict about additional fields anyway. Also, use our parser exception utils to get better exceptions if parsing fails.	2021-05-20 21:25:44 +02:00
Armin Braun	b13f43b24e	Refactor RestoreService Restore Path (#73258 ) Make the restore path a little easier to follow by splitting it up into the cluster state update and the steps that happen before the CS update. Also, document more pieces of it and remove some confusing redundant code.	2021-05-20 15:51:58 +02:00
Armin Braun	25dcc62459	Fix SnapshotInfo.fromXContentInternal not Fully Consuming Parser (#73268 ) The parsing here was causing trouble with the new streaming deserialization because it did not fully consume the parser so if the internal buffer of the parser was just enough to finish reading the `"snapshot"` field but missed the closing bracket, then the stream behind the parser would not have been consumed fully. Also it was strangely lenient and would just read a broken in-progress `SnapshotInfo` if it ran into SMILE that contained any object field under any key that isn't "snapshot". I made it a little stricter now to enforce that we have a "snapshot" field and not just an object field by any name.	2021-05-20 14:16:12 +02:00
Julie Tibshirani	f85a9dddb9	Support field collapsing with search_after (#73023 ) This change adds support for using `search_after` with field collapsing. When using these in conjunction, the same field must be used for both sorting and field collapsing. This helps keep the behavior simple and predictable. Otherwise it would be possible for a group to appear on multiple pages of results. Currently search after is handled directly in `CollapsingTopDocsCollector`. As a follow-up, we could generalize the logic and move support to the Lucene grouping framework. Closes #53115.	2021-05-19 14:21:18 -07:00
Mark Tozzi	9b99234b4a	Javadoc for how aggs work (#73214 ) Based on a tech talk Nik gave, I just typed up the notes.	2021-05-19 16:04:36 -04:00
Gordon Brown	4162a5710e	Handle the existence of system data streams in Get Aliases API (#73244 ) This commit adjusts the behavior of the Get Aliases API to more thoroughly prevent errors and warnings from being emitted unnecessarily from the Get Aliases API by retrieving all indices including system ones and only warning in the post processing of the action. Additionally, the IndexAbstractionResolver has been updated to properly handle system data streams when evaluating visibility. Closes #73218 Co-authored-by: jaymode <jay@elastic.co>	2021-05-19 13:09:35 -06:00
Joe Gallo	70cfcf83eb	Remove obsolete datastream BWC checks (#73247 )	2021-05-19 12:18:26 -04:00
Przemyslaw Gomulka	35460a5f8a	[Rest Api Compatibility] REST Terms vector typed response (#73117 ) Enables the tests and adds a type field to the termvector response. The commit that enabled typed endpoints (#72155) missed updating the response.	2021-05-19 13:23:47 +02:00
Martijn van Groningen	4b2c3ab0b7	The get aliases api should not return entries for data streams with no aliases (#72953 ) The get alias api should take into account the aliases parameter when returning aliases that refer to data streams and should not return entries for data streams that don't have any aliases pointing to them. Relates to #66163	2021-05-19 10:07:11 +02:00
David Turner	7a0eaabe39	Improve BlobStoreFormatTests#randomCorruption (#73201 ) This method today corrupts bytes until the checksum changes, but (a) it's comparing the checksum vs one computed before even reading the file, and (b) changing a single byte will always invalidate a CRC-32 checksum so the loop is unnecessary as is the checksum calculation. It also doesn't ever try truncating the file which is a realistic kind of corruption that we must be able to detect. This commit addresses all that.	2021-05-18 16:59:45 +01:00
Armin Braun	06fc62f256	Fix UpdateThreadPoolSettingsTests (#73199 ) Small and obvious oversight from #73172	2021-05-18 15:43:34 +02:00
Armin Braun	da242856fd	Introduce SNAPSHOT_META Threadpool for Fetching Repository Metadata (#73172 ) Adds new snapshot meta pool that is used to speed up the get snapshots API by making `SnapshotInfo` load in parallel. Also use this pool to load `RepositoryData`. A follow-up to this would expand the use of this pool to the snapshot status API and make it run in parallel as well.	2021-05-18 14:40:39 +02:00
Ryan Ernst	77d756b534	Deprecate shared and index data path settings (#73178 ) This commit adds deprecation warnings for use of the path.shared_data setting as well as the index setting index.data_path. relates #73168	2021-05-18 05:38:35 -07:00
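
The BlobStoreFormatTests#randomCorruption entry above (7a0eaabe39) rests on two observations: changing a single byte always changes a CRC-32 checksum, and truncation is a separate kind of corruption that must also be detected. The following is a minimal sketch of both checks using only `java.util.zip.CRC32`. The class `CorruptionCheck` and its methods are hypothetical names chosen for illustration and are not taken from the Elasticsearch code base; the length check stands in for whatever footer validation the real format performs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration only; not the Elasticsearch test code.
public class CorruptionCheck {

    // Compute the CRC-32 checksum of the whole byte array.
    public static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Return a copy in which exactly one byte differs from the input.
    public static byte[] flipByte(byte[] data, int index) {
        byte[] copy = data.clone();
        copy[index] ^= 0x01; // XOR with a non-zero mask always changes the byte
        return copy;
    }

    // Return a copy cut down to newLength bytes.
    public static byte[] truncate(byte[] data, int newLength) {
        return Arrays.copyOf(data, newLength);
    }

    // A blob is considered intact only if both its length and its CRC-32 match
    // the values recorded when it was written (assumed to be stored elsewhere).
    public static boolean isIntact(byte[] data, int expectedLength, long expectedCrc) {
        return data.length == expectedLength && crc32(data) == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] blob = "hypothetical blob contents".getBytes(StandardCharsets.UTF_8);
        long crc = crc32(blob);

        // One changed byte is enough: no retry loop is needed to alter the checksum.
        for (int i = 0; i < blob.length; i++) {
            if (isIntact(flipByte(blob, i), blob.length, crc)) {
                throw new AssertionError("undetected single-byte corruption at " + i);
            }
        }

        // Truncation is caught by the length check in this sketch.
        if (isIntact(truncate(blob, blob.length - 1), blob.length, crc)) {
            throw new AssertionError("undetected truncation");
        }
        System.out.println("all corruptions detected");
    }
}
```

A single-byte change is an error burst of at most 8 bits, and CRC-32 detects every burst of up to 32 bits, which is why the loop over corrupted copies never finds a collision.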


7143 Commits