It is beneficial to sort segments within a data stream's index
in descending order of their max timestamp field, so
that the most recent (in terms of timestamp) segments
come first.
This speeds up sort queries on the @timestamp field in descending order,
which is the most common type of query for data streams,
as we are mostly interested in the recent data.
This patch addresses this for writable indices.
The segment sorter is different from index sorting.
An index sort by itself is concerned with the order of docs
within an individual segment (and not how the segments are organized),
while the segment sorter is only used during search and allows
doc collection to start with the "right" segment,
so we can terminate the collection faster.
This PR adds an `isDataStreamIndex` property to IndexShard that
indicates whether a shard is part of a data stream.
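For illustration, a minimal sketch of the kind of search this optimizes, assuming a data stream named `my-data-stream` (the name is illustrative):
```
GET /my-data-stream/_search
{
  "size": 100,
  "sort": [
    { "@timestamp": "desc" }
  ]
}
```
With segments ordered by descending max timestamp, collection visits the segments holding the newest documents first and can terminate sooner.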
This introduces a basic public YAML REST test plugin that is intended to be used by external
Elasticsearch plugin authors. This is driven by #76215
- Rename yaml-rest-test to intern-yaml-rest-test
- Use public yaml plugin in example plugins
Co-authored-by: Mark Vieira <portugee@gmail.com>
Add system data streams to the "snapshot feature state" code path, so that if
we're snapshotting a feature by name we grab that feature's system data streams
too. Handle these data streams on the restore side as well.
* Add system data streams to feature state snapshots
* Don't pass system data streams through index name resolution
* Don't add no-op features to snapshots
* Hook in system data streams for snapshot restoration
As we build more functionality around data streams,
it is necessary to quickly identify if an index is part
of a data stream.
For this purpose we need to move the field mapper
for the data stream's timestamp meta-field to core.
This PR also adds a MappingLookup::isDataStreamTimestampFieldEnabled
method that can quickly check whether the data stream timestamp meta-field
is enabled.
This commit modifies the restore process to ensure that the `system`
flag is properly applied to restored data streams. Otherwise, this
flag is lost when restoring system data streams, which causes errors
and/or assertion failures as the backing indices are properly marked
as system indices, but the restored data stream is no longer a
system data stream.
Also adds a test to ensure this flag survives a round trip through
the snapshot/restore process.
There was a simple bug that caused the default of `*` or `_all` on the request to no longer work
when resolving which indices to restore, because we would add data stream indices to the restore array,
thus causing the indices filtering to think we only wanted those specific indices.
Closes #75192
ParseContext is used to parse documents. It was easily confused with ParserContext (now renamed to MappingParserContext) which is instead used to parse mappings.
To remove any confusion, this commit renames ParseContext to DocumentParserContext and adapts its subclasses accordingly.
When removing aliases, allow an alias action's index expression to match
both data streams and regular indices.
This allows API calls like `DELETE /_all/_alias/my-alias`, which are common
in teardown / cleanup logic. In this case `my-alias` just points to regular
indices, but `_all` can also expand to data streams if any exist. This can
then trigger validation logic that prevents adding aliases that refer to both
indices and data streams. However, this API call never adds an alias, it only
removes one, so failing with this validation error doesn't make much sense.
This change adjusts the validation logic so that the 'matching both data streams and
regular indices is disallowed' validation is only executed for alias actions
that add aliases.
Relates to #66163
Fix the update indices aliases API to accept wildcard expressions as alias names
in remove alias actions for aliases referring to data streams.
Also allow add alias actions for data stream aliases to contain multiple
alias names. Both changes are in line with alias actions for regular indices.
Relates to #66163
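As a hedged sketch, an update aliases request that this change permits might look like the following; the data stream and alias names are illustrative:
```
POST /_aliases
{
  "actions": [
    { "remove": { "index": "logs-app-default", "alias": "old-logs-*" } },
    { "add": { "index": "logs-app-default", "aliases": ["logs", "logs-all"] } }
  ]
}
```
Here the remove action uses a wildcard alias name and the add action specifies multiple alias names, mirroring what is already possible for regular indices.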
Change the formatter config to sort / order imports, and reformat the
codebase. We already had a config file for Eclipse users, so Spotless now
uses that.
The "Eclipse Code Formatter" plugin ought to be able to use this file as
well for import ordering, but in my experiments the results were poor.
Instead, use IntelliJ's `.editorconfig` support to configure import
ordering.
I've also added a config file for the formatter plugin.
Other changes:
* I've quietly enabled the `toggleOnOff` option for Spotless. It was
already possible to disable formatting for sections using the markers
for docs snippets, so enabling this option just accepts this reality
and makes it possible via `formatter:off` and `formatter:on` without
the restrictions around line length. It should still only be used as
a very last resort and with good reason.
* I've removed mention of the `paddedCell` option from the contributing
guide, since I haven't had to use that option for a very long time. I
moved the docs to the spotless config.
Use `is_write_index` instead of `is_write_data_stream` to indicate whether a data stream alias
is a write data stream alias. Although the latter is a more accurate name, the former is what is
used to turn a data stream alias into a write data stream in the indices aliases API. The design
of data stream aliases is that they look and behave like any other alias, and
using `is_write_data_stream` would go against this design.
Also, "index" or "indices" is an accepted overloaded term that can mean a regular index,
a data stream or an alias in Elasticsearch APIs. By using `is_write_index`, consumers
of the get aliases API don't need to make changes.
Relates to #66163
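For example, a request along these lines (names illustrative) marks a data stream as the write data stream of an alias using the existing `is_write_index` flag:
```
POST /_aliases
{
  "actions": [
    { "add": { "index": "logs-app-default", "alias": "logs", "is_write_index": true } }
  ]
}
```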
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.
Relates to #73784
* Add new thread pool for critical operations
* Split critical thread pool into read and write
* Add POJO to hold thread pool names
* Add tests for critical thread pools
* Add thread pools to data streams
* Update settings for security plugin
* Retrieve ExecutorSelector from SystemIndices where possible
* Use a singleton ExecutorSelector
This allows indexing documents into a data stream alias.
The ingestion is then forwarded to the write index of the data stream
that is marked as the write data stream.
The `is_write_index` parameter can be used to indicate which data stream is the write data stream
when updating / adding a data stream alias.
Relates to #66163
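A minimal sketch of indexing through a data stream alias, assuming an alias named `logs` whose write data stream has already been set via `is_write_index` (names and field values are illustrative):
```
POST /logs/_doc
{
  "@timestamp": "2099-03-07T11:04:05.000Z",
  "message": "login attempt failed"
}
```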
Backing index names contain a date component.
Instead of defining what the expected backing index names are before creating the data streams,
resolve these expected backing index names after the data streams are created.
This way we avoid failures that can occur around midnight (local time), where the expected
names are computed before midnight and the data streams are created after midnight.
Closes #73510
Currently, when attempting to add an alias that points to both data streams and regular indices,
the alias does get created, but it points only to the data streams. With this change, attempting
to add such an alias results in a client error.
Currently, when adding data stream aliases with unsupported parameters (e.g. filter or routing),
the alias does get created, but without the unsupported parameters. With this change,
attempting to create aliases that point to data streams with unsupported parameters will result
in a client error.
Relates to #66163
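A hedged example of a request that now fails with a client error instead of silently dropping the unsupported parameter; the data stream, alias and filter are illustrative:
```
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "logs",
        "filter": { "term": { "level": "error" } }
      }
    }
  ]
}
```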
When data stream aliases are resolved, the includeDataStreams flag of an action request should be taken into account,
so that data stream aliases aren't resolved to backing indices for APIs that don't support data streams.
Closes #73195
The get aliases API should take the aliases parameter into account when
returning aliases that refer to data streams, and should not return entries
for data streams that don't have any aliases pointing to them.
Relates to #66163
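For example (alias pattern illustrative), a request like the following should now only return entries for data streams that actually have a matching alias:
```
GET /_alias/logs*
```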
Data stream aliases are stored separately from the data streams
in the cluster state. Currently snapshot/restore only takes data
streams into account during snapshotting and restoring; this
change makes snapshot/restore also capture and restore
data stream aliases.
Which alias instances to include depends on the actual data streams
that are included in a snapshot or restored from a snapshot.
Relates to #66163
Aliases to data streams can be defined via the existing update aliases API.
An alias can only refer to data streams or only to indices (not both).
Also, the existing get aliases API has been modified to support returning
aliases that refer to data streams.
Aliases for data streams are stored separately from data streams
and refer to data streams by name, not to the backing indices of
a data stream. This means that when backing indices are added to or removed
from a data stream, the data stream alias doesn't need to be
updated.
The authorization model for aliases that refer to data streams is the
same as for aliases that refer to indices. In security, privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias, access is also granted on the indices that
the alias refers to (regardless of whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. For more details see:
https://github.com/elastic/elasticsearch/issues/66163#issuecomment-824709767
Relates to #66163
Extract usage of internal APIs from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic.
This includes a refactoring of ElasticsearchDistribution to handle types
better, in a way that lets us differentiate between Elasticsearch
distribution types supported in TestClustersPlugin and types only supported
in internal plugins.
It also introduces a set of internal versions of public plugins.
As part of this we also generate the plugin descriptors now.
As a follow-up we can move these publicly used classes into
an extra project (declared as an included build).
We keep LoggedExec and VersionProperties effectively public, with a workaround for RestTestBase.
We always clean up the list of data streams depending on the indices actually
in the snapshot. Even for non-partial snapshots, data stream indices could be
excluded from the snapshot by index exclusions that remove all of the data stream's
indices from the snapshot.
We recently replaced some usages of DocumentMapper with MappingLookup in the search layer, as DocumentMapper is mutable, which can cause issues. In order to do that, MappingLookup grew and became quite similar to DocumentMapper in what it does and holds.
In many cases it makes sense to use MappingLookup instead of DocumentMapper, and we may even be able to remove DocumentMapper entirely in favour of MappingLookup in the long run.
This commit replaces some of its straightforward usages.
If `index.mapping.ignore_malformed` has been set to `true` then
there is no way to override that to `false` for a data stream's
timestamp field.
Before this commit, validation would fail because the use of the
`ignore_malformed` attribute on a data stream's timestamp field was disallowed.
This commit allows the usage of the `ignore_malformed` attribute,
so that `index.mapping.ignore_malformed` can be disabled for a
data stream's timestamp field. The `ignore_malformed` attribute
can only be set to `false`.
This allows the following index template:
```
PUT /_index_template/filebeat
{
  "index_patterns": [
    "filebeat-*"
  ],
  "template": {
    "settings": {
      "index": {
        "mapping.ignore_malformed": true
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date",
          "ignore_malformed": false
        }
      }
    }
  },
  "data_stream": {}
}
```
Closes #71755
Related to #71593, we move all build logic that is only for the Elasticsearch build into
the org.elasticsearch.gradle.internal* packages.
This makes it clearer whether build logic is intended to be used by external projects.
Ultimately we want to expose only TestCluster and PluginBuildPlugin logic
to third-party plugin authors.
This is a very first step in that direction.
We constantly need to look up from index name to `IndexId`, so we might as well
just store this field as a map to simplify some code here and there and save
a few cycles during cluster state updates for snapshots with large index counts.
We think that this PR, along with #70373, has exposed some legitimate bugs
that we need to address. They're currently causing test failures that are hard
to scope and mute. For now we'll revert this commit, with the plan to
reintroduce it along with the bug fixes.
Relates to #72033, #72031.
This commit adds support for system data streams and also the first use
of a system data stream with the fleet action results data stream. A
system data stream is one that is used to store system data that users
should not interact with directly. Elasticsearch will manage these data
streams. REST API access is available for external system data streams
so that other stack components can store system data within a system
data stream. System data streams will not use the system index read and
write threadpools.
* Warn users if security is implicitly disabled
Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it also leads to unintended insecure
clusters.
This change introduces clear warnings when security features are
implicitly disabled.
- a warning header in each REST response if security is implicitly
disabled;
- a log message during cluster boot.
Prior to this commit, when attempting to close the write index of a data stream, a validation error was returned indicating that it is forbidden to close a write index of a data stream. The idea behind that is to ensure that a data stream can always accept writes. For the same reason, deleting a write index is not allowed (the write index can only be deleted when deleting the entire data stream).
However, closing an index isn't as destructive as deleting an index (an open index request makes the write index available again), and there are other cases where a data stream can't accept writes, for example when the primary shards of the write index are not available. So the original reasoning for not allowing a write index to be closed isn't that strong.
On top of this, the restriction prevents certain administrative operations from being performed, for example restoring a snapshot containing data streams that already exist in the cluster (an in-place restore).
Closes #70903, #70861
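As a hedged illustration, after this change a request along the lines of the following, targeting a data stream's write index, is no longer rejected by this validation (the generated backing index name is hypothetical):
```
POST /.ds-my-data-stream-2099.03.07-000002/_close
```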
This commit introduces system index types that will be used to
differentiate behavior. Previously system indices were all treated the
same regardless of whether they belonged to Elasticsearch, a stack
component, or one of our solutions. Upon further discussion and
analysis this decision was not in the best interest of the various
teams and instead a new type of system index was needed. These system
indices will be referred to as external system indices. Within external
system indices, an option exists for these indices to be managed by
Elasticsearch or to be managed by the external product.
In order to represent this within Elasticsearch, each system index will
have a type and this type will be used to control behavior.
Closes #67383
Increasing the number of threads to be used for force-merging does not automatically give you any parallelism, even if
you have many shards per node, as force-merge requests are split into node-level subrequests (see
TransportBroadcastByNodeAction, superclass of TransportForceMergeAction), one for each node, and these
subrequests are then executed sequentially for all the shards on that node.
This reduces the ceremony of declaring test artifacts for a project.
It also solves an issue with the usage of the deprecated testRuntime configuration that
testArtifacts extended from, which seems not to be required at all and would have
broken with Gradle 7.0 anyhow.
Test artifact resolution is now variant-aware, which gives the consuming projects a more adequate
compile and runtime classpath.
We also introduce a convention method in the Elasticsearch build to declare
test artifact dependencies in an easy way, close to how it's done by the Gradle built-in
test fixtures plugin.
Furthermore, we cleaned up some inconsistent test dependency declarations where a project
relied on another project and on its test artifacts.
The response to an `IndicesSegmentsAction` might be large, perhaps 10s
of MBs of JSON, and today it is serialized on a transport thread. It
also might take so long to respond that the client times out, resulting
in the work needed to compute the response being wasted.
This commit introduces the `DispatchingRestToXContentListener` which
dispatches the work of serializing an `XContent` response to a
non-transport thread, and also makes `TransportBroadcastByNodeAction`
sensitive to the cancellability of its tasks.
It uses these two features to make the `RestIndicesSegmentsAction`
serialize its response on a `MANAGEMENT` thread, and to abort its work
more promptly if the client's channel is closed before the response is
sent.
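For reference, a hedged example of the endpoint served by `RestIndicesSegmentsAction`, whose response is now serialized off the transport thread (the index name is illustrative):
```
GET /my-index/_segments
```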
This PR expands the meaning of `include_global_state` for snapshots to include system indices. If `include_global_state` is `true` on creation, system indices will be included in the snapshot regardless of the contents of the `indices` field. If `include_global_state` is `true` on restoration, system indices will be restored (if included in the snapshot), regardless of the contents of the `indices` field. Index renaming is not applied to system indices, as system indices rely on their names matching certain patterns. If restored system indices are already present, they are automatically deleted prior to restoration from the snapshot to avoid conflicts.
This behavior can be overridden to an extent by including a new field in the snapshot creation or restoration call, `feature_states`, which contains an array of strings indicating the "feature" for which system indices should be snapshotted or restored. For example, this call will only restore the `watcher` and `security` system indices (in addition to `index_1`):
```
POST /_snapshot/my_repository/snapshot_2/_restore
{
  "indices": "index_1",
  "include_global_state": true,
  "feature_states": ["watcher", "security"]
}
```
If `feature_states` is present, the system indices associated with those features will be snapshotted or restored regardless of the value of `include_global_state`. All system indices can be omitted by providing a special value of `none` (`"feature_states": ["none"]`), or included by omitting the field or explicitly providing an empty array (`"feature_states": []`), similar to the `indices` field.
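For instance, a create snapshot request along these lines (repository and snapshot names are illustrative) would keep the global state but omit every feature state and its system indices:
```
PUT /_snapshot/my_repository/snapshot_3
{
  "indices": "index_*",
  "include_global_state": true,
  "feature_states": ["none"]
}
```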
The list of currently available features can be retrieved via a new "Get Snapshottable Features" API:
```
GET /_snapshottable_features
```
which returns a response of the form:
```
{
  "features": [
    {
      "name": "tasks",
      "description": "Manages task results"
    },
    {
      "name": "kibana",
      "description": "Manages Kibana configuration and reports"
    }
  ]
}
```
Features currently map one-to-one with `SystemIndexPlugin`s, but this should be considered an implementation detail. The Get Snapshottable Features API and snapshot creation rely upon all relevant plugins being installed on the master node.
Further, the list of feature states included in a given snapshot is exposed by the Get Snapshot API, which now includes a new field, `feature_states`, which contains a list of the feature states and their associated system indices which are included in the snapshot. All system indices in feature states are also included in the `indices` array for backwards compatibility, although explicitly requesting system indices included in a feature state is deprecated. For example, an excerpt from the Get Snapshot API showing `feature_states`:
```
"feature_states": [
{
"feature_name": "tasks",
"indices": [
".tasks"
]
}
],
"indices": [
".tasks",
"test1",
"test2"
]
```
Co-authored-by: William Brafford <william.brafford@elastic.co>
This has been deprecated in Gradle before, but we haven't been warned.
Gradle 7.0 will likely introduce a change in behaviour here, so we
should fix the usage of this configuration upfront.
See https://github.com/gradle/gradle/issues/16027 for further information
about the change in Gradle 7.0
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:
- Updating LICENSE and NOTICE files throughout the code base, as well
as those packaged in our published artifacts
- Update IDE integration to now use the new license header on newly
created source files
- Remove references to the "OSS" distribution from our documentation
- Update build time verification checks to no longer allow Apache 2.0
license header in Elasticsearch source code
- Replace all existing Apache 2.0 license headers for non-xpack code
with updated header (vendored code with Apache 2.0 headers obviously
remains the same).
- Replace all Elastic license 1.0 headers with new 2.0 header in xpack.