Fixes the text field mapper and the analyzers class, which also retained heavyweight parameter references.
Reduces each `TextFieldMapper` instance from multiple KB to a few hundred bytes.
closes #73845
This commit updates two task names:
```
yamlRestCompatTest -> yamlRestTestV7CompatTest
transformV7RestTests -> yamlRestTestV7CompatTransform
```
`7` is the N-1 version and is calculated, so that when `8` is the
N-1 version the task names will be `yamlRestTestV8CompatTest` and
`yamlRestTestV8CompatTransform`.
The motivation for `yamlRestCompatTest -> yamlRestTestV7CompatTest` is that
many projects have configured `yamlRestCompatTest`
but that configuration is specific to the N-1 version. For example,
if we blacklist tests when running compatibility with v7, we don't also
want to blacklist those tests when running compatibility with v8.
By introducing a version-specific identifier in the name, the task will not
even exist after bumping the version, creating the need to (correctly) remove
the version-specific condition.
The motivation for `transformV7RestTests -> yamlRestTestV7CompatTransform`
is to provide more consistent naming.
The idea behind the naming is that the main task people
are likely familiar with is `yamlRestTest`, so we will use that as a base:
`yamlRestTestV7CompatTest` to run the version-specific compat tests
`yamlRestTestV7CompatTransform` to run the version-specific transformations for the compat tests
CI should be unaffected since we introduced a lifecycle task
named `checkRestCompat`, which is what CI should be configured to use.
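The naming rule above can be sketched as a small helper. This is only an illustration: `CompatTaskNames` and its methods are hypothetical names, not part of the build, and the sketch assumes the compatible version is simply the current major version minus one.

```java
// Sketch only: CompatTaskNames is a hypothetical helper, not part of the actual build logic.
final class CompatTaskNames {
    // The compatible version is N-1, derived from the current major version N.
    static int compatVersion(int currentMajor) {
        return currentMajor - 1;
    }

    // Task that runs the version-specific compat tests.
    static String testTask(int currentMajor) {
        return "yamlRestTestV" + compatVersion(currentMajor) + "CompatTest";
    }

    // Task that runs the version-specific transformations for the compat tests.
    static String transformTask(int currentMajor) {
        return "yamlRestTestV" + compatVersion(currentMajor) + "CompatTransform";
    }

    public static void main(String[] args) {
        System.out.println(testTask(8));
        System.out.println(transformTask(8));
    }
}
```

Because the version is part of the name, configuration written against the V7 task no longer resolves once the current major version moves on.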
Today `AbstractRefCounted` has a `name` field which is only used to
construct the exception message when calling `incRef()` after it's been
closed. This isn't really necessary: the stack trace will identify the
reference in question and give loads more useful detail besides. It's
also slightly irksome to have to name every single implementation.
This commit drops the name and the constructor parameter, and also
introduces a handy factory method for use when there's no extra state
needed and you just want to run a method or lambda when all references
are released.
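As a rough illustration of the shape described here, the following is a minimal sketch of a nameless ref-counted base class with a factory that takes a lambda. `SketchRefCounted` and all of its members are hypothetical stand-ins; the real `AbstractRefCounted` may differ in signatures and details.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: SketchRefCounted is a hypothetical stand-in, not the real AbstractRefCounted.
abstract class SketchRefCounted {
    // Starts at 1: the creator holds the initial reference.
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Factory for the common case: no extra state, just run an action on final release.
    static SketchRefCounted of(Runnable onClose) {
        return new SketchRefCounted() {
            @Override
            protected void closeInternal() {
                onClose.run();
            }
        };
    }

    final void incRef() {
        // No name in the message; the stack trace identifies the reference in question.
        if (tryIncRef() == false) {
            throw new IllegalStateException("already closed, can't increment ref count");
        }
    }

    final boolean tryIncRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                return false;
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Returns true if this call released the last reference.
    final boolean decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released more references than were acquired");
        }
        if (remaining == 0) {
            closeInternal();
            return true;
        }
        return false;
    }

    protected abstract void closeInternal();
}
```

With such a factory, a caller that only needs a callback could write `SketchRefCounted.of(() -> cleanup())` instead of declaring a named subclass.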
This PR implements support for multiple validators on a FieldMapper.Parameter.
The Parameter#setValidator method was replaced by Parameter#addValidator, which can be called multiple times
to add validation to a parameter.
All validators of a parameter are executed in the order in which they were added, and if any of them fails, validation as a whole fails.
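The behavior can be sketched as follows. `SketchParameter` is a hypothetical, heavily simplified class; modeling validators as `Consumer<T>` instances that throw on failure is an assumption of this sketch, not a statement about the real `FieldMapper.Parameter` signature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch only: SketchParameter is hypothetical and much simpler than FieldMapper.Parameter.
final class SketchParameter<T> {
    // Validators are kept in insertion order.
    private final List<Consumer<T>> validators = new ArrayList<>();

    // May be called multiple times; each call appends one more validator.
    SketchParameter<T> addValidator(Consumer<T> validator) {
        validators.add(validator);
        return this;
    }

    // Runs validators in the order they were added; the first one that throws
    // aborts the loop, so validation as a whole fails.
    void validate(T value) {
        for (Consumer<T> validator : validators) {
            validator.accept(value);
        }
    }
}
```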
This introduces a basic public yaml rest test plugin that is supposed to be used by external
elasticsearch plugin authors. This is driven by #76215
- Rename yaml-rest-test to intern-yaml-rest-test
- Use public yaml plugin in example plugins
Co-authored-by: Mark Vieira <portugee@gmail.com>
Closes #74795.
Introduce two Docker image variants for Cloud. The first bundles
(actually installs) the S3, Azure and GCS repository plugins. The
second bundles all official plugins, but only installs the repository
plugins.
Both images also bundle Filebeat and Metricbeat.
The testing utils have been refactored to introduce a `docker`
sub-package. This allows the static `Docker.containerId` to be
shared without needing all the code in one big class. The code for
checking file ownership / permissions has also been refactored to
a more Hamcrest style, using a custom Docker file matcher.
v7compatibilityNotSupportedTests was introduced to make it easier to
track tests that have been identified as not needing compatible changes
and those that still need to be checked.
We have checked all tests now and the separate list is no longer needed.
relates #51816
relates #73912
* Reformatting to keep Checkstyle happy after formatting
* Configure spotless everywhere, and disable the tasks if necessary
* Add XContentBuilder helpers, fix test
* Tweaks
* Add a TODO
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This refactors the signature of snapshot finalization. For one, it allows removing
a TODO about depending on a mutable `SnapshotInfo`, which was not great. More
importantly, it sets up a follow-up where state can be shared between the
cluster state update at the end of finalization and the subsequent old-shard-generation
cleanup, so that we can resolve another open TODO about leaking shard generation files
in some cases.
This was fixed for the repository plugin at some point as well.
We don't need the Apache http client here since we're using the
net http one.
=> One less dependency on our system wide http dependency to think about.
It's in the title: we were not accounting for relative paths at all
here and were only saved by the fact that we mostly short-circuit to
non-streaming writes.
Extended testing to catch this case for S3; a follow-up will
extend it to the other implementations as well.
Adds minimal fields API support to sort and score scripts.
Example: `field('myfield').getValue(123)` where `123` is the default if the field has no values.
Refs: #61388
If the underlying directory for an HdfsBlobContainer has been deleted (such as by calling HdfsBlobContainer.delete()) then listBlobsByPrefix() was throwing a FileNotFoundException. This change makes listBlobsByPrefix() return an empty array instead, which is in line with the behavior of FsBlobContainer. It also adds HdfsSnapshotRepoTestKitIT, which runs the repo analyzer against the HDFS repo.
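The pattern of treating a missing directory as "no blobs" can be sketched with plain `java.nio`. This is only an analogy: `SketchBlobLister` is hypothetical, it uses the local file system rather than the HDFS client, and it returns an empty map where the text above speaks of an empty result.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch only: SketchBlobLister is hypothetical and uses java.nio, not the HDFS client.
final class SketchBlobLister {
    // Maps blob name to size for every regular file whose name starts with the prefix.
    static Map<String, Long> listBlobsByPrefix(Path dir, String prefix) throws IOException {
        Map<String, Long> blobs = new HashMap<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dir, p -> p.getFileName().toString().startsWith(prefix))) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    blobs.put(p.getFileName().toString(), Files.size(p));
                }
            }
        } catch (NoSuchFileException e) {
            // A deleted container has no blobs; report that instead of failing.
            return Collections.emptyMap();
        }
        return blobs;
    }
}
```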
Closes #73708
SimpleFS is deprecated and will be removed in Lucene 9. This commit
deprecates SimpleFS in 7.x and uses NIOFS for SimpleFS in Elasticsearch
7.15 or later as it offers superior or equivalent performance to
SimpleFS.
In FIPS mode loading the `.p12` keystore used by the new SDK version is not supported
because of "PBE AlgorithmParameters not available". Fortunately, the SDK still includes
the old jks trust store so we can just manually load it the same way it was loaded by
the previous version to fix things.
Also, fixed `SocketAccess` to properly rethrow this kind of exception and not run into
a class cast issue.
Closes #75023
relates https://github.com/googleapis/google-api-java-client/pull/1738
ParseContext is used to parse documents. It was easily confused with ParserContext (now renamed to MappingParserContext) which is instead used to parse mappings.
To remove any confusion, this commit renames ParseContext to DocumentParserContext and adapts its subclasses accordingly.
This PR adds a new API for doing streaming serialization writes to a repository to enable repository metadata of arbitrary size and at bounded memory during writing.
The existing write-APIs require knowledge of the eventual blob size beforehand. This forced us to materialize the serialized blob in memory before writing, costing a lot of memory in case of e.g. very large `RepositoryData` (and limiting us to `2G` max blob size).
With this PR the requirement to fully materialize the serialized metadata goes away and the memory overhead becomes completely bounded by the outbound buffer size of the repository implementation.
As we move to larger repositories, this makes master node stability a lot more predictable, since writing out `RepositoryData` no longer takes as much memory (the same applies to shard-level metadata). It also enables aggregating multiple metadata blobs into a single larger blob without massive overhead and removes the 2G size limit on `RepositoryData`.
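The idea of bounding memory by the outbound buffer size can be illustrated with a small stream that hands off a part whenever its buffer fills. `SketchPartStream` is a hypothetical class, not the API added by this PR; it collects parts in a list where a real repository implementation would upload each one to the blob store.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: SketchPartStream is hypothetical. Parts are kept in a list here,
// where a real repository would send each part to the blob store.
final class SketchPartStream extends OutputStream {
    private final byte[] buffer; // the only memory held while writing
    private int position;
    final List<byte[]> uploadedParts = new ArrayList<>();

    SketchPartStream(int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.buffer = new byte[bufferSize];
    }

    @Override
    public void write(int b) throws IOException {
        // The total size is never needed up front; a full buffer is simply handed off.
        if (position == buffer.length) {
            flushPart();
        }
        buffer[position++] = (byte) b;
    }

    private void flushPart() {
        if (position == 0) {
            return;
        }
        uploadedParts.add(Arrays.copyOf(buffer, position));
        position = 0;
    }

    @Override
    public void close() {
        // Hand off whatever is left as the final, possibly shorter, part.
        flushPart();
    }
}
```

A serializer can write directly into such a stream, so the fully serialized blob never has to exist in memory at once.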
This PR returns the get snapshots API to the 7.x format (and transport client behavior) and enhances it for requests that ask for multiple repositories.
The changes for requests that target multiple repositories are:
* Add `repository` field to `SnapshotInfo` and REST response
* Add `failures` map alongside `snapshots` list instead of returning just an exception response as done for single repo requests
* Pagination now works across repositories instead of being per repository for multi-repository requests
closes #69108
closes #43462
To configure less during the build,
this removes non-required configuration from qa build scripts that do not
contain any sources. We also remove a few non-required afterEvaluate hooks.
Currently the `_ignored` field indexes and stores the names of every field in a document that has been ignored
because, e.g., it was malformed. The `ignore_above` option for keyword-type fields
serves a somewhat similar purpose, so this change adds logic that adds these
fields to the `_ignored` field as well for `keyword`, `wildcard` and
`icu_collation_keyword` fields.
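In rough terms, the added logic looks like the sketch below. `SketchKeywordIndexer` is hypothetical and leaves out everything a real field mapper does; comparing `value.length()` against the limit is a simplification chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: SketchKeywordIndexer is hypothetical and ignores everything a real mapper does.
final class SketchKeywordIndexer {
    final List<String> indexedValues = new ArrayList<>();
    // Stands in for the _ignored metadata field of a single document.
    final List<String> ignoredFields = new ArrayList<>();

    void index(String fieldName, String value, int ignoreAbove) {
        if (value.length() > ignoreAbove) {
            // The value is skipped, but the field name is recorded so the omission is visible.
            ignoredFields.add(fieldName);
            return;
        }
        indexedValues.add(value);
    }
}
```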
Closes #74228
The version field on all Lucene Analyzers is unused, and is being removed
in Lucene 9. This commit deprecates setting a version on an analyzer in
index settings and removes the related calls to Analyzer.setVersion().
Relates to #74057
With work to make repo APIs more async incoming in #73570
we need a non-blocking way to run this check. This adds that async
check and removes the need to manually pass executors around as well.
Modularization of the JDK has been ongoing for several years. Recently
in Java 16 the JDK began enforcing module boundaries by default. While
Elasticsearch does not yet use the module system directly, there are
some side effects even for those projects not modularized (e.g. #73517).
Before we can even begin to think about how to modularize, we must
Prepare The Way by enforcing packages only exist in a single jar file,
since the module system does not allow packages to coexist in multiple
modules.
This commit adds a precommit check to the build which detects split
packages. The expectation is that we will add the existing split
packages to the ignore list so that any new classes will not exacerbate
the problem, and the work to cleanup these split packages can be
parallelized.
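The core of such a check can be sketched independently of the build tooling. `SketchSplitPackages` is hypothetical; it takes the packages of each jar as plain input and applies the ignore list per package, which is a simplifying assumption about how the real precommit check handles ignores.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch only: SketchSplitPackages is hypothetical; the real check is a build task.
final class SketchSplitPackages {
    // Input: jar name -> packages it contains.
    // Output: package -> jars containing it, restricted to packages that appear
    // in more than one jar and are not on the ignore list.
    static Map<String, Set<String>> find(Map<String, Set<String>> packagesByJar, Set<String> ignored) {
        Map<String, Set<String>> jarsByPackage = new TreeMap<>();
        for (Map.Entry<String, Set<String>> jar : packagesByJar.entrySet()) {
            for (String pkg : jar.getValue()) {
                jarsByPackage.computeIfAbsent(pkg, k -> new TreeSet<>()).add(jar.getKey());
            }
        }
        jarsByPackage.entrySet().removeIf(
            entry -> entry.getValue().size() < 2 || ignored.contains(entry.getKey())
        );
        return jarsByPackage;
    }
}
```

A non-empty result would fail the build, while the ignore list lets existing split packages be cleaned up incrementally.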
relates #73525
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.
relates #73784
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.
relates #73784
The recent upgrade of the Azure SDK has caused a few test failures that
have been difficult to debug and do not yet have a fix. In particular, a
change to the netty reactor resolving
(https://github.com/reactor/reactor-netty/issues/1655). We need to wait
for a fix for that issue, so this reverts commit
6c4c4a0ecb.
relates #73493
The org.elasticsearch.bootstrap package exists in server with classes
for starting up Elasticsearch. The elasticsearch-core jar has a handful
of classes that were split out from there, namely java version parsing
and jarhell. This commit moves those classes to a new
org.elasticsearch.jdk package so as to not split the server owned
bootstrap package.
relates #73784
Since Java 16, the default value for illegal-access is deny. This means
the latest release of Elasticsearch, and all current integration tests,
run with deny (since we don't explicitly set it in jvm options). Yet
tests run with illegal-access=warn, for legacy reasons. #71908
proposed to remove the setting from test jvms, but concerns were raised
there about whether this would cause some test failures.
This commit explicitly sets tests to deny. This has the added benefit
that any failures will be caught even when running tests with older
jvms.
This commit upgrades the Azure SDK to 12.11.0 and Jackson to 2.12.2. The
Jackson upgrade must happen at the same time due to Azure depending on
this new version of Jackson.
closes #66555
closes #67214
Co-authored-by: Francisco Fernández Castaño <francisco.fernandez.castano@gmail.com>
This changes the result of AuthorizationEngine.loadAuthorizedIndices
(and dependent methods) from List<String> to Set<String>.
This has the following performance benefits:
1. `contains` checks are faster
2. RBACEngine always formed this collection as a Set, so this
change reduces unnecessary copying.
An additional performance improvement was added for resolving authorized
index names for data streams.
Upgrades to Lucene-8.9 snapshot which includes:
- LUCENE-9507: Custom order for leaves (/cc @mayya-sharipova)
- LUCENE-9935: Enable bulk merge for stored fields with index sort