389 Commits

Author SHA1 Message Date
Hendrik Muhs 1b556d75fa
mute another node stats test (#91346)
Muting another part of the test, as it causes a lot of CI failures

relates #91081
2022-11-07 06:07:09 -05:00
Mary Gouseti d55059afab
Mute reference/cluster/nodes-stats/line_2751 (#91174) 2022-10-28 11:55:53 +02:00
Francisco Fernández Castaño 1a3032beb6
Keep track of average shard write load (#90768)
This commit adds a new field, write_load, into the shard stats. This new stat exposes the average number of write threads used while indexing documents.

Closes #90102
2022-10-13 16:34:45 +02:00
Ievgen Degtiarenko 4d6d979e0e
Deprecate state field in `/_cluster/reroute` response (#90399) 2022-10-05 08:18:27 +02:00
Iraklis Psaroudakis 3ed7a04d22
Introduce node mappings stats (#89807)
So that they are visible in NodeIndicesStats only at the node and index (but not shard) levels. Also visible in the _cat/nodes table. And make an exact count yaml REST test.
2022-09-19 15:47:47 +03:00
Artem Prigoda 72a6fdc2b8
Support "dry run" mode for updating Desired Nodes (#88305)
Add the dry_run query parameter to support simulating an update of desired nodes. The update request will be validated, but no cluster state updates will be performed. To indicate that the response was the result of a dry run, we add the dry_run field to the JSON representation of the response.
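
A minimal sketch of how a client might build and interpret such a request. The path layout mirrors the `PUT /_internal/desired_nodes/history/1` request quoted in #86434; the helper names and the `dry_run=true` value format are assumptions made for illustration, not something this commit message specifies.

```python
from urllib.parse import urlencode


def desired_nodes_path(history_id: str, version: int, dry_run: bool = False) -> str:
    """Build the request path for a desired nodes update (hypothetical helper)."""
    path = f"/_internal/desired_nodes/{history_id}/{version}"
    if dry_run:
        # Assumption: the flag is sent as `dry_run=true`.
        path += "?" + urlencode({"dry_run": "true"})
    return path


def was_dry_run(response: dict) -> bool:
    """Return True if the response body carries a truthy `dry_run` field."""
    return bool(response.get("dry_run", False))


print(desired_nodes_path("history", 1, dry_run=True))
```

A caller could check `was_dry_run(body)` before assuming that the cluster state was actually updated.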

See #82975
2022-07-26 09:03:12 +02:00
Elasticsearch addict e3dc098a0a
Tasks doc: fix a mistake about the reindex task description (#88669) 2022-07-22 12:17:00 +02:00
Elasticsearch addict 11473964ab
Improve description for task api detailed param (#88493)
Co-authored-by: David Turner <david.turner@elastic.co>
2022-07-14 09:22:28 +02:00
David Turner ff269f8104
Small fixes to clear voting config excls API (#87828)
Fixes the name of the REST param in the error message, and expands the
API docs to emphasise that the exclusions should be empty in normal
operation.
2022-06-20 10:40:39 +01:00
David Turner fcf293f87c
Report overall mapping size in cluster stats (#87556)
Adds measures of the total size of all mappings and the total number of
fields in the cluster (both before and after deduplication).

Relates #86639
Relates #77466
2022-06-14 13:55:14 +01:00
Mayya Sharipova 4dabd5eb8e
Add mapping stats for indexed dense_vectors (#86859)
Add cluster mapping stats for indexed dense_vectors

Currently _cluster/stats mapping section displays all mapping types
along with their count. In 8.0 we introduced indexed dense_vector
types, and we would like to collect more enhanced stats on them:
- number of indexed dense_vector fields
- sum of dims across all indexed dense_vector fields

This makes it possible to differentiate how indexed dense_vector types are
used as opposed to unindexed dense_vector types.
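
The two numbers described above could be computed roughly as follows. This is only a sketch over plain mapping dictionaries: the function name, the output keys, and the choice to treat a missing `index` flag as unindexed are assumptions, and the real implementation lives in the server code.

```python
def indexed_dense_vector_stats(mappings):
    """Count indexed dense_vector fields and sum their dims across mappings."""
    count = 0
    dims_sum = 0

    def walk(properties):
        nonlocal count, dims_sum
        for field in properties.values():
            # Assumption: a field counts as indexed only when `index` is true.
            if field.get("type") == "dense_vector" and field.get("index") is True:
                count += 1
                dims_sum += field["dims"]
            # Recurse into object fields, which nest their own properties.
            walk(field.get("properties", {}))

    for mapping in mappings:
        walk(mapping.get("properties", {}))
    # Hypothetical output keys, chosen for this example only.
    return {"indexed_vector_count": count, "indexed_vector_dim_sum": dims_sum}


sample_mappings = [
    {"properties": {
        "title_vec": {"type": "dense_vector", "dims": 128, "index": True},
        "raw_vec": {"type": "dense_vector", "dims": 64},
    }},
    {"properties": {
        "doc": {"properties": {
            "emb": {"type": "dense_vector", "dims": 3, "index": True},
        }},
    }},
]
print(indexed_dense_vector_stats(sample_mappings))
```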
2022-06-07 08:40:28 -04:00
Joe Gallo 79990fa49b
Remove "Push back excessive requests for stats (#83832)" (#87054) 2022-05-23 12:58:02 -04:00
Francisco Fernández Castaño e91e7e653b
Add support for CPU ranges in desired nodes (#86434)
This commit adds support for CPU ranges in the desired nodes API. 

This aligns better with environments where administrators/orchestrators
can define lower and upper bounds for the amount of CPUs that the
desired node would get once deployed. 

This makes it possible to provide information about the expected CPU and the
allowed overcommit that the desired node will run on.

This was the previous expected body for the desired nodes API (we still support it):
```
PUT /_internal/desired_nodes/history/1
{
    "nodes" : [
        {
            "settings" : {
                 "node.name" : "instance-000187",
                 "node.external_id": "instance-000187",
                 "node.roles" : ["data_hot", "master"],
                 "node.attr.data" : "hot",
                 "node.attr.logical_availability_zone" : "zone-0"
            },
            "processors" : 8, 
            "memory" : "58gb",
            "storage" : "1700gb",
            "node_version" : "8.3.0"
        }
    ]
}
```

Now it's possible to define `processors` or `processors_range` as in:
```
PUT /_internal/desired_nodes/history/1
{
    "nodes" : [
        {
            "settings" : {
                 "node.name" : "instance-000187",
                 "node.external_id": "instance-000187",
                 "node.roles" : ["data_hot", "master"],
                 "node.attr.data" : "hot",
                 "node.attr.logical_availability_zone" : "zone-0"
            },
            "processors_range" : {"min": 8.0, "max": 16.0},
            "memory" : "58gb",
            "storage" : "1700gb",
            "node_version" : "8.3.0"
        }
    ]
}
```
Note that `max` in `processors_range` is optional.
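
As an illustration of the two accepted shapes, the sketch below builds the CPU part of a desired node entry. The helper and its validation rules (exactly one form, `max` not below `min`) are assumptions for the example; the commit message only states that either field may be used and that `max` is optional.

```python
def cpu_fields(processors=None, min_processors=None, max_processors=None):
    """Return a `processors` or a `processors_range` entry (hypothetical helper)."""
    if (processors is None) == (min_processors is None):
        raise ValueError("specify exactly one of processors or min_processors")
    if processors is not None:
        # CPUs are represented as floating point numbers.
        return {"processors": float(processors)}
    cpu_range = {"min": float(min_processors)}
    if max_processors is not None:
        # Assumed sanity check; not taken from the commit message.
        if max_processors < min_processors:
            raise ValueError("max must not be below min")
        cpu_range["max"] = float(max_processors)
    return {"processors_range": cpu_range}


print(cpu_fields(min_processors=8, max_processors=16))
print(cpu_fields(min_processors=8))
print(cpu_fields(processors=8))
```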

This commit also moves from representing CPUs as integers to
accepting floating point numbers.

Note: I disabled the bwc yamlRestTests for versions < 8.3 since we introduced
a few "breaking changes" but since this is an internal API it should be fine.
2022-05-20 11:47:32 +02:00
David Turner 6f0cee0fae
Add master_timeout support to voting config exclusions APIs (#86670)
Today the add/clear voting config exclusions APIs route a request to the
master node but do not expose the usual `?master_timeout` parameter
that allows changing the timeout for this phase of execution. This commit
adds the missing parameter.
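
A sketch of a request that uses the new parameter. It assumes the add API is `POST /_cluster/voting_config_exclusions` with a `node_names` parameter, which this commit message does not spell out; the helper name and the timeout value are illustrative.

```python
from urllib.parse import urlencode


def add_voting_exclusions_request(node_names, master_timeout=None):
    """Return (method, path) for adding voting config exclusions (hypothetical helper)."""
    params = {"node_names": ",".join(node_names)}
    if master_timeout is not None:
        params["master_timeout"] = master_timeout
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    return "POST", "/_cluster/voting_config_exclusions?" + query


print(add_voting_exclusions_request(["node-1", "node-2"], master_timeout="60s"))
```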
2022-05-11 13:56:50 +01:00
Rene Groeschke 62d5aa986c
Port gradle docs test plugin to use internal yaml rest test plugin (#86598)
Remove usage of deprecated elasticsearch.rest-test in DocsTestPlugin

We keep some files in src/test in docs projects, as moving them would require more changes
in the build-docs project outside this repository.
2022-05-11 12:01:23 +02:00
Gabi Davar 43ab984639
Add documentation for "io_time_in_millis" (#84911)
Add documentation for "io_time_in_millis"

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2022-04-25 16:43:19 +01:00
Ryan Ernst d60cde6681
Remove flavor from build (#85796)
The default distribution is the only remaining build flavor, and has been for
quite a while now. This commit removes flavor from the internal Build
class. It keeps rest api compat for nodes info for now by hardcoding
`default`.
2022-04-11 16:46:55 -07:00
Ryan Ernst cf3dc57132
Remove no-jdk deprecations (#85765)
The no-jdk distributions exist in 7.x and before. They were removed with
8.0. This commit removes the remaining deprecation messages for using
the no-jdk distribution. Note that when talking with an older node, we
drop the bundledJdk attribute. This is ok because it is only possible
for this to not be true when talking with a 7.17 node, during an upgrade,
and the usingBundledJdk is retained, which is the important thing if
debugging a problem.

relates #76896
relates #85758
2022-04-11 14:52:31 -07:00
Mary Gouseti ed0bb2a8af
Push back excessive requests for stats (#83832)
Resolves #51992
2022-02-28 08:46:18 +01:00
Nhat Nguyen 86964c9752
Document partial search results with skip_unavailable (#84057)
This commit adds an explanation for the relation between `allow_partial_search_results` and `skip_unavailable` in CCS requests.

Relates to #33915

Closes #82407

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2022-02-23 10:04:52 -05:00
David Turner 02f38e3da9
Make allocation explanations more actionable (#83983)
The cluster allocation explain API includes a top-level status
indicating to the user whether the shard can be assigned/rebalanced/etc
or not. Today this status is fairly terse and experience shows that
users sometimes struggle to understand how to interpret it and to decide
on follow-up actions.

This commit makes the top-level explanation more detailed and
actionable. For instance, in the cases like `THROTTLED` where the status
is transient we instruct the user to wait; if a shard is lost we say to
restore it from a snapshot; if a shard cannot be assigned we say to
choose a specific node where its assignment is expected and to address
the obstacles.

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2022-02-22 09:23:01 +00:00
Tobias Stadler e3deacf547
[DOCS] Fix typos (#83895) 2022-02-15 12:42:17 -05:00
Mark Vieira fcf1380492 Fix documentation snippet tests 2022-02-02 13:29:02 -08:00
Francisco Fernández Castaño 520b8435d0
Add desired nodes API (#82975)
This commit adds the Desired Nodes API, allowing orchestrators
that manage Elasticsearch clusters to let the system know about the
current/planned topology that the cluster will run on.
This allows the system to make better decisions based on the entire
cluster topology, including nodes that will be added/removed in the
near future.
2022-02-01 17:54:57 +01:00
David Turner dc26886ee1
Clarify cluster state API docs (#82930)
The `GET _cluster/state` API is really only suitable for debugging or
diagnostics. Its response format is not documented since it changes
fairly freely between versions.

Today we mention in its docs that this API is unstable, and deliberately
omit a description of its response format, but we don't explicitly say
that it's only for diagnostics and is unsuitable for consumption by
external tools that might try and use it for monitoring.

This commit adjusts the docs to give some more explicit guidance about
how it should and shouldn't be used.
2022-01-24 10:09:45 -05:00
James Rodewig e21a9a0711
[DOCS] Re-add cluster settings precedence (#82738)
Adds some tags and an include statement to re-add cluster settings precedence information to the cluster update settings API page. This should aid with discoverability.

Closes https://github.com/elastic/elasticsearch/issues/82634

Relates to https://github.com/elastic/elasticsearch/pull/79579
2022-01-18 12:32:43 -05:00
Jake Landis fd6f04bb24
[docs] clarify purged http stats (#82123) 2022-01-04 09:51:41 -06:00
Olivier Cavadenti 90e4e8ce63
Add index pressure stats in cluster stats (#80303)
`GET _nodes/stats` returns statistics about indexing pressure for each node.
With this commit `GET _cluster/stats` now returns stats about indexing pressure
computed by aggregating the indexing pressure stats of each node in the
cluster.
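
One way to picture the aggregation is a field-by-field sum over the per-node sections. This is only a sketch: the commit message says the values are aggregated without saying how, so summation is an assumption here, and the field names in the sample are illustrative.

```python
def add_into(total, stats):
    """Recursively add the numeric leaves of `stats` into `total`."""
    for key, value in stats.items():
        if isinstance(value, dict):
            add_into(total.setdefault(key, {}), value)
        else:
            total[key] = total.get(key, 0) + value


def aggregate_indexing_pressure(per_node_stats):
    """Sum indexing pressure stats over all nodes (assumed aggregation)."""
    total = {}
    for stats in per_node_stats:
        add_into(total, stats)
    return total


nodes = [
    {"memory": {"current": {"all_in_bytes": 100},
                "total": {"all_in_bytes": 1000, "coordinating_rejections": 1}}},
    {"memory": {"current": {"all_in_bytes": 200},
                "total": {"all_in_bytes": 3000, "coordinating_rejections": 2}}},
]
print(aggregate_indexing_pressure(nodes))
```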

Closes #79788
2021-12-09 12:41:08 +00:00
David Turner 54e0370b3e
Track histogram of transport handling times (#80581)
Adds to the transport node stats a record of the distribution of the
times for which a transport thread was handling a message, represented
as a histogram.

Closes #80428
2021-11-29 15:41:33 +00:00
Artem Prigoda 89bbac9216
Revert "Return 200 OK response code for a cluster health timeout (#78968)" (#80821)
* Revert "Return 200 OK response code for a cluster health timeout (#78968)"

This reverts commit a2c3daea

* Revert "Allow deprecation warning for the return_200_for_cluster_health_timeout parameter (#80178)"

This reverts commit 1c711e35fc.

* Revert "Drop pre-7.2.0 wire format in ClusterHealthRequest (#79551)"

This reverts commit b9fbe66ab0.

* Revert "Adjust the BWC version for the return200ForClusterHealthTimeout field (#79436)"

This reverts commit f60bda5685.

* Revert "Use query param instead of a system property for opting in for new cluster health response code (#79351)"

This reverts commit 8901a999

* Revert "Deprecate returning 408 for a server timeout on `_cluster/health` (#78180)"

This reverts commit f266eb32

* Drop pre-7.2.0 wire format in ClusterHealthRequest (#79551)

This reverts commit fa4d562c

* Revert "Disable BWC for #80821 (#80839)"

This reverts commit cb0e73e2fc.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-11-18 19:55:16 +01:00
James Rodewig 2f4143267e
[DOCS] Un-deprecate transient cluster settings (#80766) (#80780)
#80556 reverted the deprecation of transient cluster settings. This replaces deprecation language in the docs with a warning/recommendation to avoid transient settings.

Closes #80557
# Conflicts:
#	docs/reference/migration/migrate_7_16.asciidoc
2021-11-16 16:00:13 -05:00
Nikola Grcevski 3308dd5c00
Undo transient settings deprecation (#80558)
This change removes the deprecation warning when calling
the cluster settings APIs with transient settings.

Relates to #80556
2021-11-09 17:07:55 -05:00
Stuart Tettemer 30e15ba838
Script: Time series compile and cache evict metrics (#79078)
Collects compilation and cache eviction metrics for
each script context.

Metrics are available in _nodes/stats in 5m/15m/1d
buckets.

Refs: #62899
2021-11-03 13:13:42 -05:00
James Rodewig cb6347b3da
[DOCS] Add transient settings migration guide (#80091) (#80272)
Changes:

* Adds a transient settings migration guide to the 7.16 docs.
* Updates the related deprecation docs to link to the guide.

Closes #80055

Relates to #79167.
2021-11-03 09:23:25 -04:00
James Rodewig 8f23448870
[DOCS] Update ESS best practice for dynamic cluster settings (#79579)
Changes:

* Updates a tip in the configuration docs to point Cloud users to the [edit user settings](https://www.elastic.co/guide/en/cloud/current/ec-add-user-settings.html) feature.
* Removes some duplicate content from the cluster update settings API docs.

Relates to https://github.com/elastic/cloud/pull/90394

Co-authored-by: David Kilfoyle <41695641+kilfoyle@users.noreply.github.com>
2021-10-26 11:57:42 -04:00
Artem Prigoda 8901a9998e
Use query param instead of a system property for opting in for new cluster health response code (#79351)
The original change was implemented in #78940, but we have decided to move from a system property to a request parameter, so Cloud users/clients have an easier way to opt in to the new status code.

Relates #70849
2021-10-18 22:43:59 +02:00
David Roberts e86de065cf
Allow total memory to be overridden (#78750)
Since #65905 Elasticsearch has determined the Java heap settings
from node roles and total system memory.

This change allows the total system memory used in that calculation
to be overridden with a user-specified value. This is intended to
be used when Elasticsearch is running on a machine where some other
software that consumes a non-negligible amount of memory is running.
For example, a user could tell Elasticsearch to assume it was
running on a machine with 3GB of RAM when actually it was running
on a machine with 4GB of RAM.

The system property is `es.total_memory_bytes`, so, for example,
could be specified using `-Des.total_memory_bytes=3221225472`.
(It is specified in bytes rather than using a unit, because it
needs to be parsed by startup code that does not have access to
the utility classes that interpret byte size units.)
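
The byte value in the example can be reproduced with a short calculation. The helper below is a hypothetical convenience for building the option string, taking 1 GB as 1024^3 bytes, which matches the figure quoted above.

```python
def total_memory_jvm_option(gigabytes: int) -> str:
    """Format the es.total_memory_bytes system property for a size in GB."""
    total_bytes = gigabytes * 1024 ** 3
    return f"-Des.total_memory_bytes={total_bytes}"


print(total_memory_jvm_option(3))  # -Des.total_memory_bytes=3221225472
```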
2021-10-16 12:01:37 +01:00
Nikola Grcevski 055c770083
Deprecation of transient cluster settings (#78794)
This PR changes uses of transient cluster settings to
persistent cluster settings. 

The PR also deprecates the transient settings usage.

Relates to #49540
2021-10-15 13:00:52 -04:00
David Turner 5767d51c2b
Add tests/fix docs for nodes info API (#79223)
The docs for `GET _nodes/<node>/<metric>` omitted a couple of metrics
and indicated that this API returned dynamic stats rather than static
info. They also didn't mention that `_all` is a legal value, nor
did it give a way to suppress all metrics even though this is possible.

This commit adjusts the docs and adds tests to ensure that selecting
metrics works as expected and to ensure that there is a future-proof
legal way to suppress all metrics.

Closes #79187

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-15 15:57:52 +01:00
Keith Massey 4df15f5177
Changing name of shards field in node/stats api to shard_stats (#78531)
If the _nodes/stats API received a level=shards request parameter, then the response would have two "shards" fields,
which would cause problems with json parsers. This commit renames the "shards" field that currently only contains
"total_count" to "shard_stats".
Relates #78311 #75433
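
The parser problem can be seen with Python's standard `json` module, which silently keeps only the last value of a repeated key. The response body below is a simplified, made-up stand-in for the real one.

```python
import json

# Simplified stand-in for a response with two "shards" fields.
body = '{"indices": {"shards": {"total_count": 3}, "shards": {"my-index": []}}}'

parsed = json.loads(body)
# The first "shards" object (with total_count) is lost.
print(parsed["indices"]["shards"])


def reject_duplicates(pairs):
    """object_pairs_hook that raises when an object repeats a key."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate keys in object: {keys}")
    return dict(pairs)


try:
    json.loads(body, object_pairs_hook=reject_duplicates)
except ValueError as error:
    print("strict parse failed:", error)
```

Stricter parsers reject such a body outright, which is why a distinct field name avoids the issue.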
2021-10-06 17:19:04 -05:00
James Rodewig b3cdf60ab3
Adding priority list and executing description to the pending tasks doc (#74456) (#78259)
* Adding priority to the pending tasks doc

https://github.com/elastic/elasticsearch/pull/19448#discussion_r70969307
917fea7c5d/core/src/main/java/org/elasticsearch/common/Priority.java (L29)

* Adding executing into the cluster pending tasks

* Update docs/reference/cluster/pending.asciidoc

Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>

Co-authored-by: Leaf-Lin <39002973+Leaf-Lin@users.noreply.github.com>
2021-09-23 11:17:18 -04:00
David Turner 4a17847b85
Add timing stats to publication process (#76771)
This commit introduces into the node stats API various statistics to
track the time that the elected master spends in various phases of the
cluster state publication process.

Relates #76625
2021-08-23 17:38:32 +01:00
Peter Dyson cad55c8393
[DOCS] Clarify usage of optional human readable jvm uptime metric in Nodes Stats API (#76545)
To return the JVM `uptime` metric, the `human` query parameter must be `true`.

Co-authored-by: Adam Locke <adam.locke@elastic.co>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-08-20 08:55:23 -04:00
David Turner 95edc6deb2
Clarify allocation explain if random shard chosen (#75670)
Today we often encounter users that are confused by the behaviour of
calling `GET _cluster/allocation/explain` without a body: it _seems_ to
work, but it explains a random shard, and if this isn't the shard
they're thinking of then it's unclear how to proceed.

With this commit we add a note to the response when a shard was randomly
chosen indicating that it is possible, and possibly useful, to explain a
different shard. We also adjust the exception message in the case when
all shards are assigned to indicate why it's an invalid request and what
to do to make it valid.
2021-08-02 15:14:09 +01:00
Adrien Grand d15445e0f3
Remove usage of RAM accounting of segments (#75674)
This is a pre-requisite for the upgrade to Lucene 9, which removes the ability to estimate RAM usage of segments.
2021-07-29 08:36:09 +02:00
Keith Massey ddc3b37580
Adding shard count to node stats api (#75433)
* Adding shard count to _nodes/stats api

Added a shards section to each node returned by the _nodes/stats api. Currently this new section only contains a total count of all shards on the node.
2021-07-27 10:39:53 -05:00
James Rodewig 5729bb8d49
[DOCS] Update alias references (#73427)
Updates several `index aliases` references to `aliases`.
2021-05-27 16:00:57 -04:00
David Turner b2956b3ae7
Identify cancelled tasks in list tasks API (#72931)
This commit adds a `cancelled` flag to each cancellable task in the
response to the list tasks API, allowing users to see that a task has
been properly cancelled and will complete as soon as possible.
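
A client could use the flag along these lines. The sketch assumes the usual list tasks response layout of `nodes` → `tasks` keyed by task ID; the sample data and the helper name are made up for the example.

```python
def cancelled_task_ids(list_tasks_response):
    """Return the IDs of cancellable tasks that have been cancelled."""
    ids = []
    for node in list_tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            if task.get("cancellable") and task.get("cancelled"):
                ids.append(task_id)
    return sorted(ids)


sample_response = {
    "nodes": {
        "node-a": {
            "tasks": {
                "node-a:1": {"action": "indices:data/read/search",
                             "cancellable": True, "cancelled": True},
                "node-a:2": {"action": "indices:data/read/search",
                             "cancellable": True, "cancelled": False},
                "node-a:3": {"action": "cluster:monitor/tasks/lists",
                             "cancellable": False},
            }
        }
    }
}
print(cancelled_task_ids(sample_response))
```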

Closes #72907
2021-05-17 11:02:50 +01:00
James Rodewig ba66669eb3
[DOCS] Rename mount types for searchable snapshots (#72699)
Changes:

* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
2021-05-05 16:35:33 -04:00
David Turner dd7f555ca5
Open with better cluster allocation explain ex. (#72245)
Today the only example of calling the cluster allocation explain API above the
fold is the bare `GET /_cluster/allocation/explain` which kind of works but is
not usually what the user wants. This commit changes the docs so that we open
with an example showing how we usually expect it to be called. This will make
it clearer that you should normally specify exactly for which shard you want an
explanation. It also tidies up a few other wrinkles in these docs.
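
For reference, a request that names a specific shard could be assembled as below. The `index`, `shard` and `primary` body fields come from the allocation explain API as it is generally documented rather than from this commit message, and the helper itself is hypothetical.

```python
def allocation_explain_request(index: str, shard: int, primary: bool):
    """Return (method, path, body) for explaining one specific shard."""
    body = {"index": index, "shard": shard, "primary": primary}
    return "GET", "/_cluster/allocation/explain", body


method, path, body = allocation_explain_request("my-index-000001", 0, True)
print(method, path, body)
```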

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-04-26 17:41:22 +01:00
Luca Cavanna 6422fd5df2
Output script stats for indexed fields (#71219)
We have recently introduced the ability to associate an indexed field with a script. This commit updates the existing mappings stats to output stats about the script, similar to what we already do for runtime fields.
2021-04-12 13:32:50 +02:00
James Rodewig 693807a6d3
[DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
Henning Andersen 0f28e97857
Total data set size in stats (#70625)
With shared cache searchable snapshots we have shards that have a size
in S3 that differs from the locally occupied disk space. This commit
introduces `store.total_data_set_size` to node and indices stats, making it
possible to distinguish between the two.

Relates #69820
2021-03-30 15:23:29 +02:00
James Rodewig 69db7ce171
[DOCS] Remove dupe `wait_for_completion` def (#71012) 2021-03-30 06:46:57 -04:00
Dan Hermann 8ff7360901
[DOCS] HTTP client stats (#70512) 2021-03-19 06:22:17 -05:00
James Rodewig d51a04cd8c
[DOCS] Add operator privileges to APIs and settings (#69903) 2021-03-15 09:20:09 -04:00
James Rodewig 71bb0c7714 [DOCS] Reword `ingest` description 2021-03-09 13:14:23 -05:00
Luca Cavanna ffe61fb097
Move runtime fields stats to server (#69487)
Runtime fields usage is currently reported as part of the xpack feature usage API. Now that runtime fields are part of server, their corresponding stats can be moved to be part of the ordinary mapping stats exposed by the cluster stats API.
2021-03-08 12:38:20 +01:00
Yannick Welsch 529c6227fe
Support include_unloaded_segments in node stats (#69682)
Adds support for the include_unloaded_segments flag in node stats, which helps with understanding resource usage of
shared_cache-style searchable snapshots on a per-node basis.
2021-03-01 17:18:47 +01:00
James Rodewig 9af74ec561
[DOCS] Remove added admons (#69452) 2021-02-23 10:35:21 -05:00
David Roberts 6e392a317d
Add processor architectures to cluster stats (#68264)
This change adds a new "architectures" section to the
cluster stats, containing a summary of how many nodes
in the cluster are on each processor architecture.

The intention is to make it easier to see whether
clusters are running on aarch64, or mixed x86_64/aarch64,
which may aid support as aarch64 becomes more commonly
used.
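
Conceptually the summary is a count of nodes per architecture. A minimal sketch, with a made-up output shape (`arch` and `count` keys are assumptions, not the documented response format):

```python
from collections import Counter


def architecture_summary(node_architectures):
    """Count how many nodes run on each processor architecture."""
    counts = Counter(node_architectures)
    # Sort by architecture name for a stable order.
    return [{"arch": arch, "count": n} for arch, n in sorted(counts.items())]


print(architecture_summary(["x86_64", "aarch64", "x86_64"]))
```

A result with more than one entry indicates a mixed-architecture cluster.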
2021-02-02 09:48:20 +00:00
David Turner 2adeb4a666
Expand and consolidate networking docs (#68051)
Today's network config docs are split into "Network", "HTTP" and
"Transport" pages, with unclear relationships between them. We often
encounter users with weird configs that indicate they don't really
understand how these settings all relate. In fact these pages are all
very interrelated, and the HTTP and Transport pages are almost all only
for advanced users. This commit brings these docs into a single page and
rewords some things to try and guide users away from the advanced
settings unless their configuration needs all the extra complexity.

It also adds a section entitled "Binding and publishing" which clarifies
the meanings of the `bind_host` and `publish_host` parameters. This is
also a common source of confusion amongst users.

It also clarifies that many of these settings accept a list of
addresses, and warns that this may not be what you want. Closes #67956.

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-02-01 13:06:20 +00:00
Lee Hinman ac1433d300
Add index creation version stats to cluster stats (#68141)
This commit adds statistics about the index creation versions to the `/_cluster/stats` endpoint. The
stats look like:

```
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "indices" : {
    "count" : 3,
    ...
    "versions" : [
      {
        "version" : "8.0.0",
        "index_count" : 1,
        "primary_shard_count" : 2,
        "total_primary_size" : "8.6kb",
        "total_primary_bytes" : 8831
      },
      {
        "version" : "7.11.0",
        "index_count" : 1,
        "primary_shard_count" : 1,
        "total_primary_size" : "4.6kb",
        "total_primary_bytes" : 4230
      }
    ]
  },
  ...
}
```

(`total_primary_size` is only shown with the `?human` flag)

This is useful for telemetry as it allows us to see if/when a cluster has indices created on a
previous version that would need to be either upgraded or supported during an upgrade.
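
The grouping behind the `versions` array can be sketched as follows. The input records (`version`, `primary_shards`, `primary_bytes`) are a hypothetical simplification of index metadata; only the output field names are taken from the example response above.

```python
def creation_version_stats(indices):
    """Group indices by creation version and total their primary shard stats."""
    grouped = {}
    for index in indices:
        entry = grouped.setdefault(index["version"], {
            "index_count": 0,
            "primary_shard_count": 0,
            "total_primary_bytes": 0,
        })
        entry["index_count"] += 1
        entry["primary_shard_count"] += index["primary_shards"]
        entry["total_primary_bytes"] += index["primary_bytes"]
    # One entry per version, in first-seen order.
    return [{"version": version, **stats} for version, stats in grouped.items()]


sample_indices = [
    {"version": "8.0.0", "primary_shards": 2, "primary_bytes": 8831},
    {"version": "7.11.0", "primary_shards": 1, "primary_bytes": 4230},
]
print(creation_version_stats(sample_indices))
```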
2021-01-28 13:58:21 -07:00
James Rodewig 3e34247570
[DOCS] Add security privileges to cluster API docs (#67589) 2021-01-19 10:18:59 -05:00
Ioannis Kakavas bd873698bc
Ensure CI is run in FIPS 140 approved only mode (#64024)
We were depending on BouncyCastle FIPS's own mechanics to set
itself in approved only mode since we run with the Security
Manager enabled. The check during startup seems to happen before we
set our restrictive SecurityManager though in
org.elasticsearch.bootstrap.Elasticsearch , and this means that
BCFIPS would not be in approved only mode, unless explicitly
configured so.

This commit sets the appropriate JVM property to explicitly set
BCFIPS in approved only mode in CI and adds tests to ensure that we
will be running with BCFIPS in approved only mode when we expect to.
It also sets xpack.security.fips_mode.enabled to true for all test clusters
used in fips mode and sets the distribution to the default one. It adds a
password to the elasticsearch keystore for all test clusters that run in fips
mode.
Moreover, it changes a few unit tests where we would use bcrypt even in
FIPS 140 mode. These would still pass since we are bundling our own
bcrypt implementation, but are now changed to use FIPS 140 approved
algorithms instead for better coverage.

It also addresses a number of tests that would fail in approved only mode
Mainly:

    Tests that use PBKDF2 with a password less than 112 bits (14char). We
    elected to change the passwords used everywhere to be at least 14
    characters long instead of mandating
    the use of pbkdf2_stretch because both pbkdf2 and
    pbkdf2_stretch are supported and allowed in fips mode and it makes sense
    to test with both. We could possibly figure out the password algorithm used
    for each test and adjust password length accordingly only for pbkdf2 but
    there is little value in that. It's good practice to use strong passwords so if
    our docs and tests use longer passwords, then it's for the best. The approach
    is brittle as there is no guarantee that the next test that will be added won't
    use a short password, so we add some testing documentation too.
    This leaves us with a possible coverage gap since we do support passwords
    as short as 6 characters but we only test with > 14 chars but the
    validation itself was not tested even before. Tests can be added in a followup,
    outside of fips related context.

    Tests that use a PKCS12 keystore and were not already muted.

    Tests that depend on running test clusters with a basic license or
    using the OSS distribution as FIPS 140 support is not available in
    either of these.

Finally, it adds some information around FIPS 140 testing in our testing
documentation reference so that developers can hopefully keep in
mind fips 140 related intricacies when writing/changing docs.
2020-12-23 21:00:49 +02:00
James Rodewig 10b036e934
[DOCS] Fix timeout parameter defaults (#66111) 2020-12-21 09:02:06 -05:00
bellengao d14492ca13
[DOCS] Fix some typos in docs (#66672) 2020-12-21 12:45:51 +02:00
James Rodewig 7c0f193b2c
[DOCS] Fix formatting (#66450) 2020-12-16 11:09:55 -05:00
Adam Locke be3bc46111
[DOCS] Add description for node info settings. (#66362) 2020-12-15 11:27:42 -05:00
bellengao e198bb233e
[DOCS] Correct the default value of `wait_for_completion` query param (#65800)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2020-12-04 15:52:35 -05:00
James Rodewig 0f406f1734
[DOCS] Add cluster get settings API example (#65754) 2020-12-02 10:37:01 -05:00
James Rodewig 72621873fd
[DOCS] Remove erroneous `flat_settings` query param (#65670) (#65745)
Co-authored-by: Thiago Souza <thiago@elastic.co>
2020-12-02 09:42:35 -05:00
Wylie Conlon 10ee0f2878
Clarify field data cache behavior in docs (#64375)
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field no longer supports aggregations
* Fold the `fielddata` mapping parameter page into the `text` field docs
* Improve cross-linking
2020-11-20 13:53:23 -08:00
James Rodewig 1ea83359bb
[DOCS] Fix case for 'Boolean' (#64299) 2020-10-29 09:04:43 -04:00
Adam Locke 789ee2d73e
[DOCS] Combining important config settings into a single page (#63849)
* Combining important config settings into a single page.

* Updating ids for two pages causing link errors and implementing redirects.
2020-10-19 10:02:22 -04:00
Lee Hinman 0c3599577e
Add index.routing.allocation.prefer._tier setting (#62589)
This commit adds the `index.routing.allocation.prefer._tier` setting to the
`DataTierAllocationDecider`. This special-purpose allocation setting lets a user specify a
preference-based list of tiers for an index to be assigned to. For example, if the setting were set
to:

```
"index.routing.allocation.prefer._tier": "data_hot,data_warm,data_content"
```

If the cluster contains any nodes with the `data_hot` role, the decider will only allow the index to be
allocated on the `data_hot` node(s). If there are no `data_hot` nodes, but there are `data_warm` and
`data_content` nodes, then the index will be allowed to be allocated on `data_warm` nodes.

This allows us to specify an index's preference for tier(s) without causing the index to be
unassigned if no nodes of a preferred tier are available.
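
The preference order described above amounts to picking the first listed tier that has at least one node. The sketch below illustrates that rule only; it is not the `DataTierAllocationDecider` code, and returning `None` when no listed tier is present is an assumption.

```python
def first_available_tier(preference: str, available_roles: set):
    """Return the first tier in the comma-separated preference that has nodes."""
    for tier in preference.split(","):
        if tier in available_roles:
            return tier
    # Assumption: behaviour when no listed tier exists is not described here.
    return None


preference = "data_hot,data_warm,data_content"
print(first_available_tier(preference, {"data_warm", "data_content"}))
```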

Subsequent work will change the ILM migration to make additional use of this setting.

Relates to #60848
2020-09-18 14:49:59 -06:00
James Rodewig 136275e3e6
[DOCS] Fix typo in nodes stats docs (#61601) (#61716)
Co-authored-by: Henry <henryloh@ucla.edu>
2020-08-31 09:29:40 -04:00
Lee Hinman 28cec563b1
Allocate newly created indices on data_hot tier nodes (#61342)
This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by
default when they are created.

This does not break existing behavior, as nodes with the `data` role are considered to be part of
the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`,
`data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by
default.

This change is a little more complicated than changing the default value for
`index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to
have a plugin inject a setting into the builder for a newly created index. This has the benefit of
allowing this setting to be visible as part of the settings when retrieving the index, for example:

```
// Create an index
PUT /eggplant

// Get an index
GET /eggplant?flat_settings
```

Returns the default settings now of:

```json
{
  "eggplant" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index.creation_date" : "1597855465598",
      "index.number_of_replicas" : "1",
      "index.number_of_shards" : "1",
      "index.provided_name" : "eggplant",
      "index.routing.allocation.include._tier" : "data_hot",
      "index.uuid" : "6ySG78s9RWGystRipoBFCA",
      "index.version.created" : "8000099"
    }
  }
}
```

Once initially set, this setting can be treated like any other index-level setting.

This new setting is *not* set on a new index if any of the following is true:

- The index is created with an `index.routing.allocation.include.<anything>` setting
- The index is created with an `index.routing.allocation.exclude.<anything>` setting
- The index is created with an `index.routing.allocation.require.<anything>` setting
- The index is created with a null `index.routing.allocation.include._tier` value
- The index was created from existing source metadata (shrink, clone, split, etc.)
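
The conditions in the list above can be written as a small decision function. This is a sketch of the rules as stated, using plain dictionaries; the function name and the `from_existing_source` flag are hypothetical, and the real logic is injected by a plugin on the server side.

```python
TIER_KEY = "index.routing.allocation.include._tier"
ALLOCATION_PREFIXES = (
    "index.routing.allocation.include.",
    "index.routing.allocation.exclude.",
    "index.routing.allocation.require.",
)


def with_default_tier(requested_settings, from_existing_source=False):
    """Return the index settings, adding the data_hot default when it applies."""
    settings = dict(requested_settings)
    if from_existing_source:
        # Shrink, clone, split, etc. keep the source's settings.
        return settings
    # Any include/exclude/require setting, including an explicit null
    # include._tier value, suppresses the default.
    if any(key.startswith(ALLOCATION_PREFIXES) for key in settings):
        return settings
    settings[TIER_KEY] = "data_hot"
    return settings


print(with_default_tier({"index.number_of_shards": "1"}))
```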

Relates to #60848
2020-08-27 12:51:12 -06:00
James Rodewig a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
James Rodewig ae01606785
[DOCS] Replace `twitter` dataset in docs (#60604) 2020-08-03 12:49:56 -04:00
Tim Brooks b1a6271ec8
Add configured indexing memory limit to node stats (#60342)
This commit adds the configured memory limit to the node stats API.
2020-07-29 11:20:59 -06:00
David Turner 940d618186
Log and track open/close of transport connections (#60297)
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.

This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
2020-07-28 16:58:00 +01:00
James Rodewig 441c3a21b1
[DOCS] Update my-index examples (#60132)
Changes the following example index names to `my-index-000001` for consistency:

* `my-index`
* `my_index`
* `myindex`
2020-07-27 14:46:39 -04:00
Tim Brooks 5c227dac88
Implement human readable indexing pressure stats (#60022)
The indexing pressure stats do not currently have human readable
variants. This commit adds human readable variants and updates the
documentation.
2020-07-22 09:54:51 -06:00
James Rodewig 80b674fb25
[DOCS] Reformat snippets to use two-space indents (#59973) 2020-07-21 12:24:26 -04:00
Tim Brooks 08506de861
Add indexing pressure documentation (#59456)
This commit adds documentation about the new indexing pressure memory
limit setting and exposure of these metrics in node stats.
2020-07-20 19:35:26 -06:00
David Turner 7bb748da8c
Remove sporadic min/max usage estimates from stats (#59755)
Today `GET _nodes/stats/fs` includes `{least,most}_usage_estimate`
fields for some nodes. These fields have rather strange semantics. They
are only reported on the elected master and on nodes that have been the
elected master since they were last restarted; when a node stops being
the elected master these stats remain in place but we stop updating them
so they may become arbitrarily stale.

This means that these statistics are pretty meaningless and impossible
to use correctly. Even if they were kept up to date they're never
reported for data-only nodes anyway, despite the fact that data nodes
are the ones where we care most about disk usage. The information needed
to compute the path with the least/most available space is already
provided in the rest of the stats output, so we can treat the inclusion of
these stats as a bug and fix it by simply removing them in this commit.
Since these stats were always optional and mostly omitted (for opaque
reasons) this is not considered a breaking change.
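
As a rough illustration of computing these paths from the remaining output, the sketch below picks the data paths with the least and most available space from one node's `fs.data` entries. It assumes each entry carries `path` and `available_in_bytes` fields; the helper name and the sample values are hypothetical.

```python
# Hypothetical helper; assumes entries shaped like a node's `fs.data` array.
def least_and_most_available(fs_data):
    """Return (path with least available space, path with most available space)."""
    least = min(fs_data, key=lambda entry: entry["available_in_bytes"])
    most = max(fs_data, key=lambda entry: entry["available_in_bytes"])
    return least["path"], most["path"]


# Made-up sample values, for illustration only.
sample_fs_data = [
    {"path": "/data/a", "available_in_bytes": 50_000_000_000},
    {"path": "/data/b", "available_in_bytes": 20_000_000_000},
    {"path": "/data/c", "available_in_bytes": 80_000_000_000},
]

least_path, most_path = least_and_most_available(sample_fs_data)
```

With the sample values, `least_path` is `/data/b` and `most_path` is `/data/c`. An empty list raises `ValueError`, which a real consumer would need to handle.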
2020-07-20 14:48:53 +01:00
James Rodewig 2be9db01c8
[DOCS] Replace `datatype` with `data type` (#58972) 2020-07-07 13:52:10 -04:00
James Rodewig e5a1269e6f
[DOCS] Add data streams to cluster APIs docs (#58945)
Makes existing docs for the cluster health and cluster state APIs aware
of data streams.
2020-07-02 17:04:55 -04:00
David Turner 83d6589b2a
Account for remaining recovery in disk allocator (#58029)
Today the disk-based shard allocator accounts for incoming shards by
subtracting the estimated size of the incoming shard from the free space on the
node. This is an overly conservative estimate if the incoming shard has almost
finished its recovery since in that case it is already consuming most of the
disk space it needs.

This change adds to the shard stats a measure of how much larger each store is
expected to grow, computed from the ongoing recovery, and uses this to account
for the disk usage of incoming shards more accurately.
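
The difference between the two estimates can be shown with a little arithmetic. The function names and figures below are hypothetical and only illustrate the idea described above; they are not the allocator's actual code.

```python
# Hypothetical illustration of the two accounting approaches.
def conservative_free_bytes(free_bytes, incoming_shard_sizes):
    """Old estimate: subtract the full estimated size of every incoming shard."""
    return free_bytes - sum(incoming_shard_sizes)


def adjusted_free_bytes(free_bytes, expected_remaining_growth):
    """New estimate: subtract only how much each incoming store is still expected to grow."""
    return free_bytes - sum(expected_remaining_growth)


GB = 1024 ** 3

# A node reports 100 GB free. One 40 GB shard is recovering onto it and has
# already written 35 GB, so only about 5 GB of growth remains.
free = 100 * GB
old_estimate = conservative_free_bytes(free, [40 * GB])
new_estimate = adjusted_free_bytes(free, [5 * GB])
```

Here the old estimate leaves 60 GB and the new one leaves 95 GB, because the 35 GB already written is reflected in the reported free space and is no longer counted a second time.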
2020-07-01 08:04:45 +01:00
Lisa Cawley 27111f9faa
[DOCS] Updates pull and issue release attributes (#58348) 2020-06-18 12:38:49 -07:00
David Turner dc3e047a16
Add admonition to cluster state instability note (#57985)
We document that the cluster state API is an internal representation which may
change, but apparently not emphatically enough. This commit adds a `NOTE:`
admonition to this paragraph.
2020-06-11 15:26:26 +01:00
Lisa Cawley 8b9293b3bf
[DOCS] Replace docdir attribute with es-repo-dir (#57489) 2020-06-01 15:55:05 -07:00
James Rodewig b8a4e00b11
[DOCS] Document `dynamic` and `static` setting types (#56919) 2020-05-19 12:10:59 -04:00
Théophile Helleboid - chtitux 0c00a982be
Docs fix node_id spec for secure settings reload API (#55712)
Fix docs typo for the `node_id` parameter in the secure settings reload API.
2020-05-05 11:20:06 +03:00
David Turner b04a6f4766
Improve same-shard allocation explanations (#56010)
I see occasional confusion about the explanations emitted by the same-shard
allocation decider, particularly amongst new users setting up a single-node
cluster and trying to determine why their cluster has `yellow` health. For
example:

    the shard cannot be allocated to the same node on which a copy of the shard
    already exists

This is technically correct but it's quite a complicated sentence. Also, by
starting with "the shard cannot be allocated" it makes it sound like this is
the problem, whereas in fact this message is a good thing and users should
typically focus their attention elsewhere.

This commit simplifies the wording of these messages and makes them sound more
positive, for example:

    a copy of this shard is already allocated to this node
2020-04-30 16:58:06 +01:00
Igor Motov b909cee8e9
Expose agg usage in Feature Usage API (#55732)
* Expose agg usage in Feature Usage API

Counts usage of the aggs and exposes them on the _nodes/usage/.

Closes #53746

* Refactor to include non value sources aggregations

* Fix reported values source type for parent and children aggs

* Refactor SearchModule constructor

* Fix subtype in TTest and IPRanges

* Fix more subtypes in aggs that don't register themselves

* Fix doc tests

* Fix docs

* Fix ScriptedMetricAggregatorTests

* Fix compilation issues after merge

* Fix merge fallout

* This gets stale quickly...

* Address review comments

* Fix tests that were missing proper agg registration in the search module

* Fix ScriptedMetricAggregatorTests

* Address review comments

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-04-30 09:49:59 -04:00
David Turner 10ab397d7f
Adjust docs for voting config exclusions API (#55006)
In #50836 we deprecated the existing voting config exclusions API and added a
new one. This commit adjusts the docs to match.
2020-04-20 19:47:09 +01:00
James Rodewig 399bc86574
[DOCS] Document analysis/mapping response for cluster stats API (#55054)
PR #51260 moved usage counts about mapping field types and analysis to
the `_cluster/stats` API.

This documents those stats in the response section of the cluster stats
API docs.
2020-04-17 08:42:13 -04:00
Ioannis Kakavas 16e9433ead
Fix ReloadSecureSettings API to consume password (#54771)
The secure_settings_password was never taken into consideration in
the ReloadSecureSettings API. This commit fixes that and adds
necessary REST layer testing. Doing so, it also

- Allows TestClusters to have a password protected keystore
so that it can be set for tests.
- Adds a parameter to the run task so that elasticsearch can
be run with a password protected keystore from source.
2020-04-10 16:48:36 +03:00