* [DOCS] Add Beats config example for ingest pipelines
The Elasticsearch ingest pipeline docs cover ingest pipelines for Fleet and
Elastic Agent. However, the docs don't cover Beats. This adds those docs.
Relates to https://github.com/elastic/beats/pull/28239.
* Update docs/reference/ingest.asciidoc
Co-authored-by: DeDe Morton <dede.morton@elastic.co>
The 'verbose' option to /_segments returns memory information
for each segment. However, Lucene 9 has stopped tracking this memory
information, as it is largely held off-heap and so is no longer significant.
This commit deprecates the 'verbose' parameter and makes it a no-op.
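For reference, the parameter is still accepted but no longer changes the response (the index name here is hypothetical):

```js
GET /my-index-000001/_segments?verbose=true
```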
Fixes #75955
This commit adds a new multi-bucket aggregation: `categorize_text`
The aggregation follows a similar design to significant text in that it reads from `_source`
and re-analyzes the text as it is read.
The key difference is that it does not use the indexed field's analyzer, but instead relies on
the `ml_standard` tokenizer with specialized ML token filters. The tokenizer + filters are the
same ones that machine learning categorization anomaly jobs utilize.
The high-level logical flow is as follows:
- At each shard, read in the text field with a custom analyzer using the `ml_standard` tokenizer
- Read in the resulting tokens from the analyzer
- Feed these tokens to a token tree algorithm (an adaptation of the Drain categorization algorithm)
- Gather the individual log categories (the leaf nodes), sort them by doc_count, and ship those buckets to be merged
- Merge all buckets that have the EXACT same key
- Once all buckets are merged, pass the keys + counts to a new token tree for additional merging
- That tree builds the final buckets, which are returned to the user
Algorithm explanation:
- Each log is parsed with the `ml_standard` tokenizer
- Each token is passed into a token tree
- For the first `max_match_token` tokens, each token is stored in the tree; at `max_match_token + 1` (or at `len(tokens)`, whichever comes first) a log group is created
- If another log group already exists at that leaf, the two are merged if they share `similarity_threshold` percent of their tokens
- Merging simply replaces tokens that differ between the groups with `*`. For example, merging "Connection error from host alpha" with "Connection error from host beta" yields the category "Connection error from host *"
- If a layer in the tree reaches `max_unique_tokens`, we add a `*` child and pass any new tokens through it. The catch here is that on the final merge we first attempt to merge together the subtrees with the smallest number of documents, especially if the new subtree has more documents counted.
## Aggregation configuration

Here is an example using some OpenStack logs:
```js
POST openstack/_search?size=0
{
  "aggs": {
    "categories": {
      "categorize_text": {
        "field": "message",         // the field to categorize
        "similarity_threshold": 20, // merge log groups if they are this similar
        "max_unique_tokens": 20,    // max number of children per token position
        "max_match_token": 4,       // maximum tokens used to build prefix trees
        "size": 1
      }
    }
  }
}
```
This will return buckets like
```json
"aggregations" : {
  "categories" : {
    "buckets" : [
      {
        "doc_count" : 806,
        "key" : "nova-api.log.1.2017-05-16_13 INFO nova.osapi_compute.wsgi.server * HTTP/1.1 status len time"
      }
    ]
  }
}
```
The get SLM status API will only return one of three statuses: `RUNNING`, `STOPPING`, or `STOPPED`.
This corrects the docs to remove the `STARTED` status and document the `RUNNING` status.
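For reference, a minimal status call and one possible response (a sketch, not taken from the corrected docs themselves):

```js
GET _slm/status
```

```json
{
  "operation_mode": "RUNNING"
}
```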
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
In #77686 we added a service to clean up blob store
cache docs after a searchable snapshot is no longer
used. We noticed some situations where some cache
docs could still remain in the system index: when the
system index is not available at the time the searchable
snapshot index is deleted; when the system index is
restored from a backup; or when the searchable
snapshot index was deleted on a version before #77686.
This commit introduces a maintenance task that
periodically scans and cleans up unused blob cache
docs. This task is scheduled to run every hour on the
data node that contains the blob store cache primary
shard. The periodic task works by using a point-in-time
context with search_after.
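As a rough sketch of that scan pattern using the public APIs (the index name is hypothetical; the real task operates on the blob cache system index internally):

```js
POST /my-index/_pit?keep_alive=1m

GET /_search
{
  "size": 1000,
  "pit": { "id": "<pit id returned above>", "keep_alive": "1m" },
  "sort": [ { "_shard_doc": "asc" } ],
  "search_after": [ 12345 ] // sort value of the last hit from the previous page
}
```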
For this grid type, the features on the aggregation layer are represented by a point that is computed from the
centroid of the data inside the cell.
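A minimal sketch of requesting this grid type, assuming the vector tile search API's `grid_type` parameter accepts the new value (index, field, and tile coordinates are hypothetical):

```js
GET museums/_mvt/location/13/4207/2692
{
  "grid_type": "centroid"
}
```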
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Documents the `runs` keyword for running the same event criteria successively in a sequence query.
Relates to #75082.
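As a quick illustration of the syntax (hypothetical index and field names):

```js
GET /my-data-stream/_eql/search
{
  "query": """
    sequence by host.id
      [ process where process.name == "cmd.exe" ] with runs=3
      [ process where process.name == "powershell.exe" ]
  """
}
```

Here `with runs=3` matches three successive events satisfying the same criteria, instead of repeating the same event block three times.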
Documents `archived.*` persistent cluster settings and index settings.
These settings are commonly produced during a major version upgrade.
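For example, archived cluster settings can be removed with a wildcard (a sketch using the standard cluster settings API):

```js
PUT _cluster/settings
{
  "persistent": {
    "archived.*": null
  }
}
```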
Closes #28027
* Add stubs for get API
* Add stub for post API
* Register new actions in ActionModule
* HLRC stubs
* Unit tests
* Add rest api spec and tests
* Add new action to non-operator actions list
This change removes JodaCompatibleZonedDateTime and replaces it with ZonedDateTime for use in
scripting.
Breaking changes:
* JodaCompatibleZonedDateTime no longer exists and cannot be cast to in Painless. Use ZonedDateTime
instead.
* The dayOfWeek method on ZonedDateTime returns the DayOfWeek enum instead of the int returned by
JodaCompatibleZonedDateTime. dayOfWeekEnum still exists on ZonedDateTime as an augmentation to
support the transition to ZonedDateTime, but is now deprecated in favor of dayOfWeek on
ZonedDateTime.
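A minimal sketch of the new behavior via the Painless execute API (the date value is arbitrary):

```js
POST /_scripts/painless/_execute
{
  "script": {
    "source": "ZonedDateTime dt = ZonedDateTime.parse('2021-09-01T00:00:00Z'); return dt.dayOfWeek.getValue();"
  }
}
```

Here `dayOfWeek` resolves to ZonedDateTime's DayOfWeek enum, so `getValue()` is needed if an integer is still required.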
* [DOCS] Always enable file and native realms by default
Adds an 8.0 breaking change for PR #69096.
The copy is based on the 7.13 deprecation notice added with PR #69320.
* reword
* Update docs/reference/migration/migrate_8_0/security.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
* [ML] add documentation for get deployment stats API
* Apply suggestions from code review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
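The documented endpoint presumably has the following shape (the model ID is hypothetical, and the exact path is an assumption; see the API docs added by this PR):

```js
GET _ml/trained_models/my-model/deployment/_stats
```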
* Improve docs for pre-release version compatibility
Follow-up to #78317 clarifying a couple of points:
- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local or remote cluster
is a released build
* Remote cluster build date nit
Monitoring installs a number of ingest pipelines which have historically been used
to upgrade documents when mappings and document structures change between
versions. Since there aren't any changes to the document format, nor will there be
by the time the format is completely retired, we can comfortably remove these
pipelines.
Zero-Shot classification allows for text classification tasks without a pre-trained collection of target labels.
This is achieved through models trained on the Multi-Genre Natural Language Inference (MNLI) dataset. This dataset pairs text sequences with "entailment" clauses. An example could be:
"Throughout all of history, man kind has shown itself resourceful, yet astoundingly short-sighted" could have been paired with the entailment clauses: ["This example is history", "This example is sociology"...].
This training set combined with the attention and semantic knowledge in modern day NLP models (BERT, BART, etc.) affords a powerful tool for ad-hoc text classification.
See https://arxiv.org/abs/1909.00161 for a deeper explanation of the MNLI training and how zero-shot works.
The zeroshot classification task is configured as follows:
```js
{
  // <snip> model configuration </snip>
  "inference_config" : {
    "zero_shot_classification": {
      "classification_labels": ["entailment", "neutral", "contradiction"], // <1>
      "labels": ["sad", "glad", "mad", "rad"], // <2>
      "multi_label": false, // <3>
      "hypothesis_template": "This example is {}.", // <4>
      "tokenization": { /* <snip> tokenization configuration </snip> */ }
    }
  }
}
```
* <1> For all zero-shot models, these are the three particular labels returned when classifying the target sequence: "entailment" is the positive case, "neutral" the case where the sequence is neither positive nor negative, and "contradiction" the negative case
* <2> An optional parameter providing the default labels that zero-shot attempts to classify against
* <3> When returning the probabilities, should the results assume there is exactly one true label, or allow multiple true labels?
* <4> The hypothesis template used when tokenizing the labels. When combined with `sad`, the sequence looks like `This example is sad.`

For inference in a pipeline, one may provide label updates:
```js
{
  // <snip> pipeline definition </snip>
  "processors": [
    // <snip> other processors </snip>
    {
      "inference": {
        // <snip> general configuration </snip>
        "inference_config": {
          "zero_shot_classification": {
            "labels": ["humanities", "science", "mathematics", "technology"], // <1>
            "multi_label": true // <2>
          }
        }
      }
    }
    // <snip> other processors </snip>
  ]
}
```
* <1> The `labels` we care about; these replace the default ones if present.
* <2> Should the results allow multiple true labels?

Similarly, one may provide label changes against the `_infer` endpoint:
```js
{
  "docs": [{ "text_field": "This is a very happy person" }],
  "inference_config": {
    "zero_shot_classification": {
      "labels": ["glad", "sad", "bad", "rad"],
      "multi_label": false
    }
  }
}
```
We deprecated support for multiple data paths (MDP) in 7.13. However,
we won't remove support until after 8.0.
Changes:
* Reverts PR #72267, which removed MDP docs
* Removes a related item from the 8.0 breaking changes.
The reference manual includes docs on version compatibility in various
places, but it's not clear that these docs only apply to released
versions and that the rules for pre-release versions are stricter than
folks expect. This commit adds some words to the docs for unreleased
versions which explain this subtlety.
Changes:
* Documents the `time_series_metric` mapping parameter for PR #76766.
* Renames the `dimension` parameter to `time_series_dimension` for PR #78012.
* Adds support for `unsigned_long` to `time_series_dimension` for PR #78204.
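A minimal mapping sketch using the renamed parameters (index and field names are hypothetical):

```js
PUT my-tsdb-index
{
  "mappings": {
    "properties": {
      "host_id":  { "type": "keyword",       "time_series_dimension": true },
      "packets":  { "type": "unsigned_long", "time_series_dimension": true },
      "cpu_load": { "type": "double",        "time_series_metric": "gauge" }
    }
  }
}
```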
This adds a new "elasticsearch-keystore show" command that displays
the value of a single secure setting from the keystore.
An optional `-o` (or `--output`) parameter can be used to direct
output to a file.
The `-o` option is required for binary keystore values
because the CLI `Terminal` class does not support writing binary data.
Hence this command:
elasticsearch-keystore show xpack.watcher.encryption_key > watcher.key
would not produce a file with the correct contents.
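whereas directing the output with `-o` (per the description above; the option placement shown here is illustrative):

elasticsearch-keystore show -o watcher.key xpack.watcher.encryption_key

writes the raw bytes to the file directly.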
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
* [DOCS] Update remote cluster docs
* Add files, rename files, write new stuff
* Plethora of changes
* Add test and update snippets
* Redirects, moved files, and test updates
* Moved file to x-pack for tests
* Remove older CCS page and add redirects
* Cleanup, link updates, and some rewrites
* Update image
* Incorporating user feedback and rewriting much of the remote clusters page
* More changes from review feedback
* Numerous updates, including request examples for CCS and Kibana
* More changes from review feedback
* Minor clarifications on security for remote clusters
* Incorporate review feedback
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Some review feedback and some editorial changes
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
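For context, the cross-cluster search requests documented in this update address remote indices with the `<cluster alias>:<index>` syntax (names here are hypothetical):

```js
GET /my-remote-cluster:my-index/_search
{
  "query": {
    "match_all": {}
  }
}
```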
This allows consumers of the API to know exactly whether all the features in a tile have been considered
when building the hits layer of a vector tile.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>