Related to issue #77823
This does the following:
- Updates several asciidoc files that contained code snippets with
invalid JSON, most involving unnecessary trailing commas.
- Makes the switch from the Groovy JSON parser to the Jackson parser,
pursuant to the general goal of eliminating Groovy dependence.
- Makes testing of JSON validity at build time more strict.
Note that this update still allows backslash escaping for any
character. Currently that matters because of the file
"docs/reference/ml/anomaly-detection/apis/get-datafeed-stats.asciidoc",
specifically this part:
"attributes" : {
"ml.machine_memory" :
"$body.datafeeds.0.node.attributes.ml\.machine_memory",
"ml.max_open_jobs" : "512"
}
It's not clear to me what change, if any, is appropriate there. So,
I've left in the escaped period and configured the parser to ignore
it for the time being.
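To illustrate the two kinds of invalid JSON discussed above, here is a minimal Python sketch. It uses Python's standard `json` module as a stand-in for the Jackson parser used by the build, so the behavior is analogous rather than identical. `is_valid_json` and `unescape_any` are hypothetical helper names, not part of the build code.

```python
import json
import re


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as strict JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Characters that strict JSON permits directly after a backslash.
_LEGAL_ESCAPES = set('"\\/bfnrtu')


def unescape_any(text: str) -> str:
    """Drop the backslash from escapes that JSON does not define.

    This is only a rough analogue of a parser option that tolerates
    backslash escaping for any character (an assumption for illustration).
    """
    return re.sub(
        r"\\(.)",
        lambda m: m.group(0) if m.group(1) in _LEGAL_ESCAPES else m.group(1),
        text,
    )


# A trailing comma, like the ones removed from the asciidoc snippets.
trailing_comma = '{"ml.max_open_jobs": "512",}'

# The escaped period from get-datafeed-stats.asciidoc.
escaped_period = (
    r'{"ml.machine_memory": '
    r'"$body.datafeeds.0.node.attributes.ml\.machine_memory"}'
)

print(is_valid_json(trailing_comma))                 # strict parse of the trailing comma
print(is_valid_json(escaped_period))                 # strict parse of the escaped period
print(is_valid_json(unescape_any(escaped_period)))   # lenient pre-pass, then strict parse
```

A strict parser rejects both inputs. Only the escaped period is rescued by the lenient pre-pass, which mirrors the decision to keep tolerating that one construct for now.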
Replaces the hard-coded ESS lead-in with the docs attribute.
Previously, this copy omitted Microsoft Azure. This ensures these docs are better maintained.
* Add docs on searchable snaps costs
Adds a note on why searchable snapshots can be cheaper, including warnings
that they might also be more expensive.
* Split into sections
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* data -> the shard contents
* More wording tweaks
* Apply suggestions from code review
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
The current shrink API snippet doesn't show you how to remove replicas or reduce primary shards.
Rather than duplicate those instructions from the shrink API docs, this removes the snippet. A link to the shrink API and shrink ILM action docs is already provided.
It also updates a delete index API snippet to avoid wildcards. Wildcard expansion for the delete index API is disabled by default in 8.0.
This commit changes the es merge policy to apply the maximum segment size
on force merges that only expunge deletes (forceMergeDeletes).
This option is useful for read-write use cases that want to reclaim deleted docs
more aggressively than `index.merge.policy.deletes_pct_allowed` allows.
Closes #61764
Relates #77270
PR #77360 clarifies that a cluster's nodes don't need to be in the same data
center. This adds a similar clarification to the ES introduction docs.
Co-authored-by: David Turner <david.turner@elastic.co>
We make wire changes in #77633 so we need to disable the backwards
compatibility tests in master before merging the wire changes. We'll
re-enable them after the backport is merged.
Today we document that you can use URL repositories with searchable
snapshots, but in fact it only works for HTTP(S) repositories. This
commit adjusts the docs to clarify.
Relates #69521
This adds profiling to the fetch phase so we can tell when fetching is
slower than we'd like and we can tell which portion of the fetch is
slow. The output includes which stored fields were loaded, how long it
took to load stored fields, which fetch sub-phases were run, and how
long those fetch sub-phases took.
Closes #75892
* Skip bwc
* Don't compare fetch profiles
* Use passed one
* no npe
* Do last rename
* Move method down
* serialization tests
* Fix sneaky serialization
* Test for sneaky bug
* license header
* Document
* Fix test
* newline
* Restore assertion
* unit test merging
* Handle inner hits
* Fixup
* Revert unneeded
* Revert inner hits profiling
* Fix names
* Fixup names
* Move results building
* Drop loaded_nested
* Checkstyle
* Fixup more
* Finish writeable cleanup
Add unit tests for merge
* Remove null checking builder
* Fix wire mistake
How did this pass before?!
* Rename
* Remove funny builder
* Remove name munging
Several sentences in the 8.0 breaking changes reference setting
system properties in `elasticsearch.yml`, which is not supported. This corrects
those sentences.
It also fixes a sentence that references the `http.content_type.required`
setting as a system property.
We currently use the plaintext body of a shard request as the key to the
request cache. This has the disadvantage that very large requests can
quickly fill up the cache due to the size of their keys. With this commit,
we instead use a SHA-256 hash of the shard request as the cache key,
which will use a constant (and much smaller) number of bytes.
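The effect on key size can be sketched in Python as follows. This is an illustration only: `cache_key` is a hypothetical helper, and the request bodies are made up. How Elasticsearch actually serializes and hashes the shard request is not shown here.

```python
import hashlib


def cache_key(request_body: bytes) -> bytes:
    """Return a fixed-size cache key: the 32-byte SHA-256 digest of the body."""
    return hashlib.sha256(request_body).digest()


# A tiny request and a very large one (hypothetical bodies).
small = b'{"query": {"match_all": {}}}'
large = (
    b'{"query": {"terms": {"id": ['
    + b",".join(b"%d" % i for i in range(100_000))
    + b"]}}}"
)

# The cache is keyed by the digest, so key size no longer depends on request size.
cache = {
    cache_key(small): "cached result A",
    cache_key(large): "cached result B",
}

print(len(large), len(cache_key(large)))
```

Both keys are 32 bytes, however large the request body is.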
PR #77155 updated the keystore instructions for Docker. However, it removed an
example that included the `KEYSTORE_PASSWORD` env variable.
This replaces a docker compose example with the original example from PR #51123.
* [DOC] Update Persist Keystore via Docker
Based on feedback from ES devs summarized in [^1], I believe this needs to reflect a directory mount rather than a file mount to avoid errors. This also adds the two common mounting errors, though I'm not sure this is the right place for them.
[^1] https://discuss.elastic.co/t/persist-elasticsearch-kibana-keystores-with-docker/283099
* feedback
* Reorganize
* reword
* fix formatting
* address review feedback
* remove extra whitespace
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the multi-zone-cluster design docs say to keep all the nodes in a
single datacenter. This doesn't really reflect what we do in practice:
each zone in AWS/GCP/Azure/etc is a separate datacenter with decent
connectivity to the other zones in the same region. This commit adjusts
the docs to allow for this.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Changes:
* Notes the delete index API can delete multiple indices at once.
* Notes deleting an index deletes its docs, shards, and metadata but does not delete any related Kibana components.
* Relocates a note about deleting a data stream's write index to the description.
* Corrects the default `expand_wildcards` value.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
The char filter replaces the previous default of `first_non_blank_line`.
`first_non_blank_line` worked well to figure out what line had characters at all, but log lines
like the following were handled poorly:
```
--------------------------------------------------------------------------------
Alias 'foo' already exists and this prevents setting up ILM for logs
--------------------------------------------------------------------------------
```
When combined with the `ml_standard` tokenizer, the first line was used:
```
--------------------------------------------------------------------------------
```
This has no valid tokens for our standard tokenizer. Consequently, no tokens were found by the `ml_standard` tokenizer.
The new filter, `first_line_with_letters`, returns the first line that contains any letter character (i.e., a character for which `Character#isLetter` returns true).
For the poorly handled log above, combining the new filter with our `ml_standard` tokenizer yields the following, more appropriate, tokens:
```
"tokens" : ["Alias", "foo", "already", "exists", "and", "this", "prevents", "setting", "up", "ILM", "for", "logs"]
```
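The line selection described above can be sketched in Python as follows. This is not the actual char filter: it uses `str.isalpha` as an approximation of `Character#isLetter`, and returning an empty string when no line contains a letter is an assumption made for illustration.

```python
def first_line_with_letters(text: str) -> str:
    """Return the first line containing at least one letter.

    Returns "" when no line qualifies (an assumption, not confirmed behavior).
    """
    for line in text.splitlines():
        if any(ch.isalpha() for ch in line):
            return line
    return ""


log = "\n".join(
    [
        "-" * 80,
        "Alias 'foo' already exists and this prevents setting up ILM for logs",
        "-" * 80,
    ]
)

print(first_line_with_letters(log))
```

With the sample log, the dashed first line is skipped and the `Alias 'foo' ...` line is passed on to the tokenizer.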
This commit changes the default deprecation logger level to CRITICAL, where default means deprecations emitted by the DeprecationLogger#critical method.
It also introduces WARN deprecations, which are emitted by DeprecationLogger#warn. Log lines emitted at WARN indicate that a functionality is deprecated but will not break in the next major version.
Relates #76754
* Make the ILM `freeze` action a no-op
This changes the ILM `freeze` action to not actually freeze the index, instead performing no
operation.
Relates to #70192
* zoop -> noop in documentation anchor
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Adds additional information about how Elasticsearch uses polygon orientation. Elasticsearch only uses a polygon's orientation to determine if it crosses the international dateline. If so, Elasticsearch splits the polygon at the dateline.
Closes #74891
This commit removes the ability to set the vocabulary location in the model config.
Instead, sane defaults are set and used, and this is wrapped up in an
API.
The index is now always the internally managed `.ml-inference-native` index,
and the document ID is always `<model_id>_vocabulary`.
This API only works for pytorch/nlp type models.
Today we expire the client stats for HTTP channels 5 minutes after they
close. It's possible to open a very large number of HTTP channels in 5
minutes, possibly inadvertently, and the stats for those channels can be
overwhelming.
This commit introduces a limit on the number of channels tracked by each
node which applies in addition to the age limit, and makes these limits
configurable via static settings. It drops the pruning of old stats when
starting to track a new channel and instead uses a queue to expire the
oldest stats when each channel closes if necessary to respect the count
limit; it only performs age-based expiry when retrieving the stats,
since the count limit now bounds the memory needed. Finally, it
tightens up some missing synchronization and makes sure that we expose
only immutable objects to the stats subsystem.
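The expiry scheme described above can be sketched in Python as follows. This is a simplified illustration, not the Elasticsearch implementation: the class and method names are hypothetical, the stats are a bare dict, and the count limit is assumed to apply to closed channels only.

```python
import threading
import time
from collections import OrderedDict
from types import MappingProxyType


class ClientStatsTracker:
    """Tracks per-channel stats, bounded by a count limit and an age limit."""

    def __init__(self, max_closed: int, max_age_seconds: float, clock=time.monotonic):
        self._max_closed = max_closed
        self._max_age = max_age_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._open = {}               # channel id -> stats
        self._closed = OrderedDict()  # channel id -> (closed_at, stats), oldest first

    def on_open(self, channel_id: int) -> None:
        # No pruning here: starting to track a channel stays cheap.
        with self._lock:
            self._open[channel_id] = {"id": channel_id}

    def on_close(self, channel_id: int) -> None:
        with self._lock:
            stats = self._open.pop(channel_id, None)
            if stats is None:
                return
            self._closed[channel_id] = (self._clock(), stats)
            # Count-based expiry: drop the oldest closed channels first.
            while len(self._closed) > self._max_closed:
                self._closed.popitem(last=False)

    def get_stats(self):
        with self._lock:
            # Age-based expiry runs only when the stats are retrieved.
            cutoff = self._clock() - self._max_age
            while self._closed:
                oldest_id = next(iter(self._closed))
                if self._closed[oldest_id][0] >= cutoff:
                    break
                del self._closed[oldest_id]
            snapshot = list(self._open.values())
            snapshot += [stats for _, stats in self._closed.values()]
            # Expose read-only copies so callers cannot mutate tracked state.
            return tuple(MappingProxyType(dict(s)) for s in snapshot)
```

Because the count limit bounds memory on every close, the age check can be deferred to `get_stats` without letting the tracked state grow unbounded.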
Previously, if a model failed to be allocated on any node, the deployment failed.
This commit allows for an allocation to be partially_started and indicates its
current state via a new state value in the deployment stats API.
Additionally, when starting a deployment, the user may set `wait_for` to
`starting`, `partially_started`, or `started`, and the API will block until that state is reached or the timeout expires.
This updates the default for the `is_write_index` parameter of the aliases API and create alias API.
The default behavior for `is_write_index` can vary based on:
1. Whether the alias is used for indices or data streams.
2. If an index alias, whether the alias points to multiple indices.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: 诗心客 <ishixinke@qq.com>
PRs #73062 and #73043 repurposed the `alias` anchor for a new guide for index
and data stream aliases. Previously, this anchor was used for our field alias
documentation.
Repurposing the anchor has caused continuity errors for users selecting
different versions of the ES docs. It could also cause confusion for users with
a `/current/` link to the `alias` page.
This updates the anchor for the alias guide and adds a redirect page to
disambiguate the `alias` anchor.
It also fixes a breadcrumb issue for redirects following the 'Modifying your
Data' redirect page.
Closes #77034.