10215 Commits

Author SHA1 Message Date
David Turner 4c68382065
Capture thread dump on ShardLockObtainFailedException (#93458)
We sometimes see a `ShardLockObtainFailedException` when a shard failed
to shut down as fast as we expected, often because a node left and
rejoined the cluster. Sometimes this is because it was held open by
ongoing scrolls or PITs, but other times it may be because the shutdown
process itself is too slow. With this commit we add the ability to
capture and log a thread dump at the time of the failure to give us more
information about where the shutdown process might be running slowly.

Relates #93226
2023-02-02 11:17:40 -05:00
Ievgen Degtiarenko 513dc2f24f
Expose per node counts (#93439) 2023-02-02 16:13:01 +01:00
Hendrik Muhs cf5ea0bb1f
[ML] rename frequent_items to frequent_item_sets and make it GA (#93421)
Rename frequent_items to frequent_item_sets and remove the experimental badge.
2023-02-02 09:25:00 +01:00
Benjamin Trent 323a13ac3f
Add `term` query support to rank_features mapped field (#93247)
This adds term query capabilities for rank_features fields. Term queries against rank_features fields are not scored in the same way as term queries against regular fields. This is because the stored feature values take advantage of the term frequency storage mechanism, and thus regular BM25 does not work.

Instead, a term query against a rank_features field is very similar to a linear rank_feature query. If more complicated combinations of features and values are required, the rank_feature query should be used.
2023-02-01 13:32:13 -05:00
Benjamin Trent 7f9f3bcd30
Add new query_vector_builder option to knn search clause (#93331)
This adds a new option to the knn search clause called query_vector_builder. This is a pluggable configuration that allows the query_vector to be created or retrieved.
2023-02-01 13:31:46 -05:00
Artem Prigoda 58c1bcc0f8
[DOCS] [main] Add release notes for 8.6.1 (#93236) (#93404)
Forward ports the release notes from #93236
2023-02-01 11:42:36 +01:00
Abdon Pijpelink d93382bcb6
[DOCS] Remove 'from' parameter from update_by_query/delete_by_query docs (#93379) 2023-02-01 09:09:57 +09:00
Nicolas Ruflin 9f4d7fafad
Add `ignore_missing_component_templates` config option (#92436)
This change introduces the configuration option `ignore_missing_component_templates` as discussed in https://github.com/elastic/elasticsearch/issues/92426. The implementation [option 6](https://github.com/elastic/elasticsearch/issues/92426#issuecomment-1372675683) was picked with a slight adjustment: no patterns are allowed.

## Implementation

During the creation of an index template, the list of component templates is checked to ensure that all of them exist. This check is extended to skip any component templates listed under `ignore_missing_component_templates`. An index template that skips the check for the component template `logs-foo@custom` looks as follows:


```
PUT _index_template/logs-foo
{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```
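
The extended check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the Elasticsearch implementation; the function name `find_missing_component_templates` and its arguments are hypothetical.

```python
def find_missing_component_templates(composed_of, ignore_missing, existing):
    """Return the component templates that would block index template creation.

    Hypothetical helper: a template is a problem only if it is referenced,
    does not exist, and is not listed as ignorable. Names are compared
    exactly, because no patterns are allowed.
    """
    ignorable = set(ignore_missing)
    return [
        name
        for name in composed_of
        if name not in existing and name not in ignorable
    ]


existing = {"logs-foo@package"}
composed_of = ["logs-foo@package", "logs-foo@custom"]

# logs-foo@custom is missing but ignorable, so nothing blocks creation.
print(find_missing_component_templates(composed_of, ["logs-foo@custom"], existing))

# Without the option, the missing template is reported.
print(find_missing_component_templates(composed_of, [], existing))
```

An empty result means the index template can be created; any returned name corresponds to the error case that existed before this change.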

The component template `logs-foo@package` has to exist before creation. It can be created with:

```
PUT _component_template/logs-foo@package
{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}
```

## Testing

For manual testing, different scenarios can be tested. To simplify testing, the commands from a `.http` file are included. Each test run expects a clean cluster.

### New behaviour, missing component template

With the new config option, it must be possible to create an index template with a missing component template without getting an error:

```
### Add logs-foo@package component template

PUT http://localhost:9200/
    _component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo index template

PUT http://localhost:9200/
    _index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream

PUT http://localhost:9200/
    _data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist

GET http://localhost:9200/
    logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```

Check that all templates could be created and that the data stream mappings are correct.

### Old behaviour, with all component templates

In the following, a component template is made optional but it already exists. Check that it shows up in the mappings:

```
### Add logs-foo@package component template

PUT http://localhost:9200/
    _component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo@custom component template

PUT http://localhost:9200/
    _component_template/logs-foo@custom
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}

### Add logs-foo index template

PUT http://localhost:9200/
    _index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream

PUT http://localhost:9200/
    _data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist

GET http://localhost:9200/
    logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```

### Check old behaviour

Ensure that the old behaviour still applies when a component template is used that is not part of `ignore_missing_component_templates`:

```
### Add logs-foo index template

PUT http://localhost:9200/
    _index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
2023-01-31 08:40:29 -07:00
Francisco Fernández Castaño da387b430c
Link to the time-units doc in S3 repository docs instead of explaining it in words (#93351) 2023-01-31 11:59:20 +01:00
Keith Massey 13b71900a6
Download the geoip databases only when needed (#92335)
This commit changes the geoip downloader so that we only download the geoip databases if you
have at least one geoip processor in your cluster, or when you add a new geoip processor (or if
`ingest.geoip.downloader.eager.download` is explicitly set to true).
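
The decision described above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual downloader logic: the function name and the shape of `pipelines` (a mapping from pipeline name to a definition with a `processors` list) are hypothetical, and nested processors are not inspected.

```python
def should_download_geoip_databases(pipelines, eager_download=False):
    """Decide whether the geoip databases need to be downloaded.

    Hypothetical helper: download if eager download is explicitly enabled,
    or if any pipeline contains at least one top-level geoip processor.
    """
    if eager_download:
        return True
    return any(
        "geoip" in processor
        for pipeline in pipelines.values()
        for processor in pipeline.get("processors", [])
    )


pipelines = {
    "plain": {"processors": [{"set": {"field": "a", "value": 1}}]},
}
print(should_download_geoip_databases(pipelines))

# Adding a geoip processor flips the decision.
pipelines["with-geoip"] = {"processors": [{"geoip": {"field": "ip"}}]}
print(should_download_geoip_databases(pipelines))
```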
2023-01-30 13:07:48 -06:00
Andrei Dan f193392bed
[DOCS] List the downsample ILM action in the correct order of execution (#93233) 2023-01-30 16:00:36 +00:00
Ievgen Degtiarenko abbc78dc0f
Add version to discovery node toXContent and toString (#93307) 2023-01-30 10:10:49 +01:00
Glen Smith 81d9cbe0ca
Update frequent-items-aggregation.asciidoc (#93287)
Fix typo: togeher > together
2023-01-27 09:45:17 -05:00
István Zoltán Szabó 05c77534fe
[DOCS] Fixes markup for example in count function docs. (#93308) 2023-01-27 14:41:30 +01:00
David Turner ce736dd0e0
Revert "enhancement: boolean field to support ignore_malformed (#90122)"
This was merged in error without a full CI run, and has some issues.

This reverts commit edcdc43519.
This reverts commit 26c0a35558.
2023-01-25 15:09:59 +00:00
Hritik Kumar edcdc43519
enhancement: boolean field to support ignore_malformed (#90122)
* enhancement: boolean field to support ignore_malformed

* fix: changes in current builder for BooleanFieldMappers within tests files.

* Updating documentation

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>
2023-01-25 13:56:50 +00:00
Keith Massey f327352601
Making JsonProcessor stricter so that it does not silently drop data (#93179)
This PR makes JsonProcessor's JSON parsing a little bit stricter so that
we are not silently dropping data when given bad inputs. Previously if
the input string began with something that could be parsed as a valid
json field, then the processor would grab that and ignore the rest. For
example, `123 "foo"` would be parsed as `123`, dropping the `"foo"`. Now
by default it will throw an IllegalArgumentException on a string like
this. A user can now set the `strict_json_parsing` parameter to false to
get the old behavior. For example:

```
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "",
    "processors" : [
      {
        "json" : {
          "field" : "message",
          "strict_json_parsing": false
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "123 \"foo\""
      }
    }
  ]
}
```
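
The difference between the old and new behaviour can be illustrated by analogy with Python's standard `json` module. This is not the processor's Java implementation; the function names `parse_lenient` and `parse_strict` are made up for this sketch.

```python
import json


def parse_lenient(text):
    """Parse the first JSON value and silently ignore anything after it."""
    value, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return value


def parse_strict(text):
    """Parse the whole string; trailing content raises ValueError."""
    return json.loads(text)


message = '123 "foo"'

# Lenient parsing keeps 123 and drops the trailing "foo".
print(parse_lenient(message))

# Strict parsing refuses the input instead of dropping data.
try:
    parse_strict(message)
except ValueError as error:
    print(f"rejected: {error}")
```

Here `json.JSONDecodeError` is a subclass of `ValueError`, which plays the role that `IllegalArgumentException` plays in the commit description.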

Closes #92898
2023-01-24 18:43:35 -05:00
Ryan Ernst 2cad2266d5
Suggest systemd override file instead of unit file for tmpdir (#93211)
The systemd unit file is part of the Elasticsearch package and should
not be edited. Instead, we recommend creating a service override file.
This commit tweaks the docs for setting tmp dir with systemd to use the
override file instead of editing the unit file.

relates #93121
2023-01-24 13:56:14 -08:00
Craig Taverner e8b4de9a8a
Documentation for geohex_grid over geo_shape (#92999)
* Documentation for geohex_grid over geo_shape

The feature to add support for geohex_grid aggregations over geo_shape
fields was added in https://github.com/elastic/elasticsearch/pull/91956.
This is the associated documentation for that.

* Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

* Fix explanation for geo_point vs geo_shape proj

When aggregating geohex over geo_shape we use equirectangular because the
underlying Lucene index indexes and searches the polygons in that way.

* Correct spelling

According to grammarly, "therefor" is not an alternative spelling
of "therefore". We should use the conjunctive form here.

See https://www.grammarly.com/blog/therefore-vs-therefor/

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-01-24 16:03:27 +01:00
Martijn van Groningen 92f229d643
Update tsdb docs to include warning and additional limitations (#93191)
Update tsdb docs to include a warning that the format of the `_tsid` field shouldn't be relied upon and added additional limitations about dimension fields.
2023-01-24 12:05:50 +01:00
Abdon Pijpelink bfc52d576c
[DOCS] Update CCS compatibility matrix for 8.7 (#93172) 2023-01-24 10:19:35 +01:00
Keith Massey ebb860d1af
Adding to the documentation and tests for the _none pipeline (#93057) 2023-01-23 14:09:57 -06:00
Nikola Grcevski f117f76460
[DOCS] Forward-port persisting vm.max_map_count for WSL2 (#87276) 2023-01-23 13:02:14 -05:00
Abdon Pijpelink e5d0a724ac
[DOCS] ILM Move Step example only phase (#93161)
* [+DOC] ILM Move Step example only phase

Updates [doc](https://www.elastic.co/guide/en/elasticsearch/reference/master/ilm-move-to-step.html?edit) to append example similar to https://github.com/elastic/elasticsearch/pull/75435 (≥v7.15.0) to show users working example of only using `next_step.phase`.

* Move example to the end of the page

* Fix failing code snippet tests

* Skip test

Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
2023-01-23 16:41:24 +01:00
Francisco Fernández Castaño ed9246f8d4
Amend read_timeout S3 repository setting description (#93136) 2023-01-23 15:34:46 +01:00
Abdon Pijpelink e6c9ecb282
Revert "[+DOC] ILM Move Step example only phase (#90329)" (#93154)
This reverts commit a536143e26.
2023-01-23 14:35:59 +01:00
Stef Nestor a536143e26
[+DOC] ILM Move Step example only phase (#90329)
* [+DOC] ILM Move Step example only phase

Updates [doc](https://www.elastic.co/guide/en/elasticsearch/reference/master/ilm-move-to-step.html?edit) to append example similar to https://github.com/elastic/elasticsearch/pull/75435 (≥v7.15.0) to show users working example of only using `next_step.phase`.

* Move example to the end of the page

* Fix failing code snippet tests

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-01-23 13:54:38 +01:00
Stef Nestor ae5f28e3e4
[+DOC] Restore policies in restoring ILM indices (#90119)
* [+DOC] Restore policies in restoring ILM indices

👋 howdy! This may need Asciidoc reformatting. Will you kindly add in express commentary on [Restore a managed Datastream or Index](https://www.elastic.co/guide/en/elasticsearch/reference/master/index-lifecycle-and-snapshots.html?edit) to also restore ILM policies as needed (via `include_global_state`). Otherwise, you induce ILM errors once ILM starts (and have to do a form of repeating the entire outlined procedure to get indices going through correctly.)

* Apply suggestions from code review

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2023-01-23 10:46:19 +01:00
David Turner a4de6bd14a
Document how to set ES_TMPDIR in the service file (#93121)
Today we suggest that users set `ES_TMPDIR` using `export`, which only
works if you're running things directly from the shell. Yet most users
encountering `ES_TMPDIR` problems seem to be on RHEL and trying to run
things via `systemd`, for whom the `export` suggestion doesn't work.

This commit adds to the docs a suggestion of how to adjust the `systemd`
service file to set the appropriate environment variable.

Relates #80651
2023-01-23 08:30:54 +00:00
Yang Wang 9ce06fddc4
JWT realm documentation update - take 2 (#92539)
This PR is another round of documentation updates for the JWT realm, with the goal of achieving better clarity, differentiating more between the two token types and encouraging readers to choose between them carefully.

Relates: #92409
2023-01-23 12:40:55 +11:00
Abdon Pijpelink 648d80e517
[DOCS] Add ssl.verification_mode to secure settings (#93083)
Co-authored-by: Adam Locke <adam.locke@elastic.co>
2023-01-19 17:13:55 +01:00
David Kilfoyle 3f880613f8
[Docs] Remove tech preview notice from downsampling docs (#92913) 2023-01-19 09:05:59 -05:00
Iraklis Psaroudakis 6ff081beef
Clarify searchable snapshot repository reliability (#93023)
To make it clear that repository snapshots should be available and reliable for any mounted searchable snapshots.

Co-authored-by: David Turner <david.turner@elastic.co>
2023-01-19 14:31:01 +02:00
Stef Nestor eb1de9493e
[+DOC] node_concurrent_recoveries default (#90330)
Notes that `node_concurrent_recoveries` default is 2 (same as both sub-settings which already note that).
2023-01-18 13:53:48 +01:00
Michael Bischoff a8706293e6
Pipeline setting missing in reindex.asciidoc (#89125)
* Update reindex.asciidoc

* Update docs/reference/docs/reindex.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-01-18 11:37:40 +01:00
Valeriy Khakhutskyy c24712bfa7
[ML] Add multimodal distribution field processing for anomaly score explanation (#92978)
The companion PR to elastic/ml-cpp#2440 adds processing of multimodal_distribution field in the anomaly score explanation. I added a changelog entry in the ml-cpp PR hence I mark this PR as a non-issue.
2023-01-17 21:16:12 +01:00
Abdon Pijpelink 64ce4d1189
[DOCS] Downsampling code snippet formatting (#92981) 2023-01-17 15:31:25 +01:00
Przemysław Witek 40d32205db
[Transform] Add `from` parameter to Transform Start API (#91116) 2023-01-17 10:36:21 +01:00
Martijn van Groningen e1d0ba83f5
Fixed typo in docs. 2023-01-17 09:34:21 +01:00
Christos Soulios a183843893
[DOCS] Fix incorrect statement for `aggregate_metric_double` field type (#92961)
Documentation incorrectly states that all aggregations are supported by
the `aggregate_metric_double` field.

This PR rectifies this error.

Closes #92236
2023-01-16 12:33:20 -05:00
Iraklis Psaroudakis 555a4d91ee
Update add-repository.asciidoc (#92945)
Our guide on re-registering a corrupt repository should link to the warnings about the potential side-effects of corruption.
2023-01-16 17:20:21 +02:00
Anthony McGlone 436d47da5c
[DOCS] Add API example for setting SQL permissions (#92491)
* [DOCS] Add API example for setting SQL permissions (#86711)

* Update docs/reference/sql/security.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

* Update docs/reference/sql/security.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

* Update docs/reference/sql/security.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

* [DOCS] Update API example for setting SQL permissions (#86711)

* [DOCS] Update console result for API example (#86711)

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2023-01-16 11:11:46 +01:00
Kush 35da7dca72
Fix questions under how highlighters work (#92922) 2023-01-16 10:43:23 +01:00
Dale Visser 1a9150dddb
[Docs] Differentiate runtime field and indexed field (#91057)
Clarify wording of upgrading runtime fields to indexed fields.
2023-01-13 17:05:26 +01:00
David Turner dfab580976
Limit length of lag detector hot threads log lines (#92851)
If debug logging is enabled then the lag detector will capture and
report the hot threads of a lagging node. In some cases the resulting
log message can be very large, exceeding 10kiB, which means it is
truncated in most logging setups. The relevant thread(s) may be waiting
on I/O, which is not considered "hot" and therefore may not appear in
the first 10kiB.

This commit adjusts this logging mechanism to split the message into
chunks of size at most 2kiB (after compression and base64-encoding) to
ensure that the entire hot threads output can be faithfully
reconstructed from these logs.
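
The chunking scheme described above can be sketched in Python as follows. This is an illustration only: the choice of gzip as the compression format, the helper names `to_chunks` and `from_chunks`, and the exact chunk boundaries are assumptions, not the actual implementation.

```python
import base64
import gzip

MAX_CHUNK_SIZE = 2 * 1024  # 2 KiB per chunk, as stated in the commit message


def to_chunks(text, chunk_size=MAX_CHUNK_SIZE):
    """Compress and base64-encode text, then split it into bounded chunks."""
    encoded = base64.b64encode(gzip.compress(text.encode("utf-8"))).decode("ascii")
    return [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]


def from_chunks(chunks):
    """Rebuild the original text from the chunks, in order."""
    return gzip.decompress(base64.b64decode("".join(chunks))).decode("utf-8")


hot_threads = "\n".join(f"thread-{i}: waiting on I/O" for i in range(500))
chunks = to_chunks(hot_threads)

# Every chunk fits within the limit and the output survives a round trip.
print(all(len(chunk) <= MAX_CHUNK_SIZE for chunk in chunks))
print(from_chunks(chunks) == hot_threads)
```

Each chunk would be written as its own log line, so that no single line is long enough to be truncated.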

Closes #88126
2023-01-13 13:11:26 +00:00
Mary Gouseti a7fdd3c036
GA the Health API under the url /_health_report (#92879) 2023-01-13 10:42:38 +01:00
Stef Nestor d9cbefc19c
[DOC] Troubleshooting Expensive Searches (#92725)
* [DOC] Troubleshooting Expensive Searches

👋 re: https://github.com/elastic/elasticsearch/issues/73222 adds in content so we can link to users on how to find source of expensive searches.

* Several edits

* Apply suggestions from code review

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-01-13 09:55:13 +01:00
Benjamin Trent a46e532cda
Allow more than one KNN search clause (#92118)
It makes sense to allow more than one KNN search clause per individual search request. It may be that different documents have separate vector spaces or that a single doc is indexed with more than one vector space. In both of these scenarios, users may want to retrieve a resulting set that takes into account all their indexed vector spaces.

A prime example here would be searching a semantic text embedding along with searching an image embedding. 
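
As a rough sketch, a search request body with two kNN clauses could be built like this. The field names `text_embedding` and `image_embedding`, the vector values, and the helper `knn_clause` are hypothetical and only serve the illustration.

```python
import json


def knn_clause(field, query_vector, k, num_candidates):
    """Build one kNN clause; the field names used below are hypothetical."""
    return {
        "field": field,
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }


# With this change, "knn" may be a list of clauses instead of a single object.
search_body = {
    "knn": [
        knn_clause("text_embedding", [0.1, 0.2, 0.3], k=5, num_candidates=50),
        knn_clause("image_embedding", [0.4, 0.5, 0.6, 0.7], k=5, num_candidates=50),
    ],
    "size": 10,
}

print(json.dumps(search_body, indent=2))
```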


closes https://github.com/elastic/elasticsearch/issues/91187
2023-01-12 11:35:50 -05:00
Tanguy Leroux dfa32a71a5
Adjust doc about dangling indices after node is detached from cluster (#92824)
Dangling indices have not been imported automatically since 8.0, but the
`elasticsearch-node detach-cluster` documentation still suggests they are.
I tried to make it more explicit by listing the Dangling API to use and
by using the word "manually".
2023-01-12 11:21:50 -05:00
Abdon Pijpelink 1bb660c810
[DOCS] Remove 'Watching event data' example (#92872) 2023-01-12 16:02:53 +01:00