20 Commits

Author SHA1 Message Date
Armin Braun 8947c1e980
Save Memory on Large Repository Metadata Blob Writes (#74313)
This PR adds a new API for streaming serialization writes to a repository, enabling repository metadata of arbitrary size to be written with bounded memory.
The existing write APIs require the eventual blob size to be known beforehand. This forced us to materialize the serialized blob in memory before writing, which costs a lot of memory for e.g. a very large `RepositoryData` (and limits us to a `2G` max blob size).
With this PR the requirement to fully materialize the serialized metadata goes away, and the memory overhead becomes completely bounded by the outbound buffer size of the repository implementation.

As we move to larger repositories, this makes master node stability much more predictable, since writing out `RepositoryData` no longer takes as much memory (the same applies to shard-level metadata). It also enables aggregating multiple metadata blobs into a single larger blob without massive overhead, and removes the 2G size limit on `RepositoryData`.
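
The idea described above can be illustrated with a minimal sketch. It is not the API added by this PR: `BlobWriter`, `ChunkedUploadStream`, and `writeBlob` are hypothetical names, and the "repository" is just an in-memory list of parts. The point is only that the writer serializes directly into a stream whose buffer has a fixed size, so memory use does not grow with the blob.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StreamingBlobWriteSketch {

    /** Hypothetical callback that serializes metadata straight into a stream. */
    interface BlobWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Hypothetical stream that "uploads" a part whenever its fixed-size buffer is full. */
    static final class ChunkedUploadStream extends OutputStream {
        private final byte[] buffer;
        private int position = 0;
        // Stands in for the remote store; a real repository would send each part over the network.
        final List<byte[]> uploadedParts = new ArrayList<>();

        ChunkedUploadStream(int bufferSize) {
            this.buffer = new byte[bufferSize];
        }

        @Override
        public void write(int b) {
            if (position == buffer.length) {
                flushPart();
            }
            buffer[position++] = (byte) b;
        }

        private void flushPart() {
            if (position > 0) {
                uploadedParts.add(Arrays.copyOf(buffer, position));
                position = 0;
            }
        }

        @Override
        public void close() {
            flushPart(); // send whatever is left in the buffer
        }
    }

    /** Writes a blob of unknown final size while holding at most bufferSize bytes in the buffer. */
    static List<byte[]> writeBlob(int bufferSize, BlobWriter writer) {
        ChunkedUploadStream out = new ChunkedUploadStream(bufferSize);
        try {
            writer.writeTo(out);
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.uploadedParts;
    }

    public static void main(String[] args) {
        byte[] metadata = "0123456789".getBytes(StandardCharsets.UTF_8);
        List<byte[]> parts = writeBlob(4, out -> out.write(metadata));
        System.out.println("uploaded parts: " + parts.size());
    }
}
```

Note that the caller never states the blob size up front, which is the requirement the existing write APIs imposed.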
2021-06-29 11:29:55 +02:00
Rory Hunter a5d2251064
Order imports when reformatting (#74059)
Change the formatter config to sort / order imports, and reformat the
codebase. We already had a config file for Eclipse users, so Spotless now
uses that.

The "Eclipse Code Formatter" plugin ought to be able to use this file as
well for import ordering, but in my experiments the results were poor.
Instead, use IntelliJ's `.editorconfig` support to configure import
ordering.

I've also added a config file for the formatter plugin.

Other changes:
   * I've quietly enabled the `toggleOnOff` option for Spotless. It was
     already possible to disable formatting for sections using the markers
     for docs snippets, so enabling this option just accepts this reality
     and makes it possible via `formatter:off` and `formatter:on` without
     the restrictions around line length. It should still only be used as
     a very last resort and with good reason.
   * I've removed mention of the `paddedCell` option from the contributing
     guide, since I haven't had to use that option for a very long time. I
     moved the docs to the spotless config.
2021-06-16 09:22:22 +01:00
Ryan Ernst 68817d7ca2
Rename o.e.common in libs/core to o.e.core (#73909)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
2021-06-08 09:53:28 -07:00
Armin Braun 52e7b926a9
Make Large Bulk Snapshot Deletes more Memory Efficient (#72788)
Use an iterator instead of a list when passing around what to delete.
In the case of very large deletes the iterator is much smaller than
the actual list of files to delete (since we save all the prefixes,
which adds up if the individual shard folders contain lots of deletes).
As a side effect, this commit also adjusts a few spots in logging where
the log messages could be catastrophic in size when trace logging is activated.
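
A minimal sketch of why the iterator is smaller, assuming a hypothetical map from shard folder prefix to blob names (this is not the code from the commit): full paths are built one at a time on demand, so the prefix is never duplicated into a materialized list of path strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

final class LazyDeletePathsSketch {

    /** Lazily yields prefix + name for every blob, without building the full list of paths. */
    static Iterator<String> pathsToDelete(Map<String, List<String>> blobsByFolder) {
        final Iterator<Map.Entry<String, List<String>>> folders = blobsByFolder.entrySet().iterator();
        return new Iterator<String>() {
            private String prefix;
            private Iterator<String> names = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance to the next folder that still has blobs to delete.
                while (!names.hasNext() && folders.hasNext()) {
                    Map.Entry<String, List<String>> folder = folders.next();
                    prefix = folder.getKey();
                    names = folder.getValue().iterator();
                }
                return names.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // The full path exists only for as long as the consumer holds on to it.
                return prefix + names.next();
            }
        };
    }

    public static void main(String[] args) {
        Map<String, List<String>> blobs = new LinkedHashMap<>();
        blobs.put("indices/a/0/", Arrays.asList("__1", "__2"));
        blobs.put("indices/b/0/", Arrays.asList("__3"));
        Iterator<String> it = pathsToDelete(blobs);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```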
2021-05-10 13:40:57 +02:00
Armin Braun bef9dab643
Cleanup BlobPath Class (#72860)
There should be a singleton for the empty version of this.
All the copying to `String[]` or use as an iterator makes
no sense either when we can just use the list outright.
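
A rough sketch of the shape this cleanup points at, using a hypothetical `PathSketch` class rather than the real `BlobPath` (whose actual fields and method names are not shown here): a single shared empty instance, and direct access to an immutable list of parts instead of array copies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PathSketch {

    /** The one shared instance for the empty path; callers never construct their own. */
    static final PathSketch EMPTY = new PathSketch(Collections.<String>emptyList());

    private final List<String> parts;

    private PathSketch(List<String> parts) {
        this.parts = parts;
    }

    /** Exposes the immutable list directly; no copy to String[] and no iterator wrapper. */
    List<String> parts() {
        return parts;
    }

    /** Returns a new path with one more segment, leaving this instance unchanged. */
    PathSketch add(String part) {
        List<String> extended = new ArrayList<>(parts.size() + 1);
        extended.addAll(parts);
        extended.add(part);
        return new PathSketch(Collections.unmodifiableList(extended));
    }

    /** Joins the parts with '/' and a trailing '/', or returns "" for the empty path. */
    String buildAsString() {
        return parts.isEmpty() ? "" : String.join("/", parts) + "/";
    }

    public static void main(String[] args) {
        System.out.println(EMPTY.add("indices").add("0").buildAsString());
    }
}
```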
2021-05-10 00:10:39 +02:00
David Turner 5838e4def8
Use a deterministic seed for repo test kit tests (#72321)
Today we do not specify the `?seed=` parameter when running the
repository analyzer in REST tests, so we cannot reproduce the set
of operations that led to a failure. This commit introduces a
deterministic value for this parameter.
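
As a small illustration of why a fixed seed makes a failure reproducible (a sketch with hypothetical operation names, not the analyzer's actual logic): the same seed always yields the same sequence of randomly chosen operations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SeededOperationsSketch {

    /** Builds a pseudo-random plan of operations that depends only on the seed and count. */
    static List<String> plan(long seed, int count) {
        Random random = new Random(seed);
        String[] kinds = {"write", "read", "overwrite", "delete"}; // hypothetical operation kinds
        List<String> operations = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            operations.add(kinds[random.nextInt(kinds.length)]);
        }
        return operations;
    }

    public static void main(String[] args) {
        // Re-running with the seed from a failed run replays the same plan.
        System.out.println(plan(42L, 5).equals(plan(42L, 5)));
    }
}
```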

Relates #72358 which seems to indicate some kind of bug in how certain
checksums are calculated in the test fixtures.
2021-04-28 10:28:37 +01:00
David Turner efc22bed82
Longer timeout in RepositoryAnalysisSuccessIT (#72314)
In #72229 a test run was observed to exceed this 5-second timeout. This
commit increases it to 20 seconds.
2021-04-27 16:13:27 +01:00
David Turner 2a1332829e
Fix up and reinstate BWC tests after backport (#72310)
Completes backport of #72077
2021-04-27 15:11:25 +01:00
David Turner 1c4791e398
Abort writes in repo analyzer (#72077)
We rely on the repository implementation correctly handling the case where a
write is aborted before it completes. This is not guaranteed for third-party
repositories.

This commit adds a rare action during analysis which aborts the write
just before it completes and verifies that the target blob is not found
by any node.
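
The check can be sketched roughly as follows. This is an illustration under assumptions, not the analyzer's implementation: `InMemoryStore` is a hypothetical stand-in for a repository that publishes a blob only after the whole stream has been consumed, and `AbortingStream` fails just before the last byte.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AbortedWriteSketch {

    /** Hypothetical store: a blob becomes visible only once its stream was read to the end. */
    static final class InMemoryStore {
        private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

        void write(String name, InputStream in) throws IOException {
            ByteArrayOutputStream staged = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                staged.write(b);
            }
            blobs.put(name, staged.toByteArray()); // reached only if no read failed
        }

        boolean exists(String name) {
            return blobs.containsKey(name);
        }
    }

    /** Yields all but the last byte of the content, then fails to simulate an aborted write. */
    static final class AbortingStream extends InputStream {
        private final InputStream delegate;
        private long remaining;

        AbortingStream(byte[] content) {
            this.delegate = new ByteArrayInputStream(content);
            this.remaining = content.length - 1;
        }

        @Override
        public int read() throws IOException {
            if (remaining-- <= 0) {
                throw new IOException("simulated abort just before completion");
            }
            return delegate.read();
        }
    }

    /** Attempts an aborted write and reports whether the target blob is visible afterwards. */
    static boolean blobVisibleAfterAbort(InMemoryStore store, String name, byte[] content) {
        try {
            store.write(name, new AbortingStream(content));
        } catch (IOException expected) {
            // The abort is the point of the exercise; only the visibility check matters.
        }
        return store.exists(name);
    }

    public static void main(String[] args) {
        boolean visible = blobVisibleAfterAbort(new InMemoryStore(), "test-blob", new byte[] {1, 2, 3, 4});
        System.out.println("blob visible after abort: " + visible);
    }
}
```

A repository that passes this kind of check must report the blob as not found; a store that exposes partially written data would fail it.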
2021-04-27 14:13:22 +01:00
Jason Tedor e119ac60d4
Move data tier roles to server (#71084)
This commit moves the data tier roles to server. It is no longer
necessary to separate these roles from server as we no longer build
distributions that would not contain these roles. Moving these roles
will simplify many things. This is deliberately the smallest possible
commit that moves these roles. Other aspects related to the data tiers
can move in separate, also small, commits.
2021-03-31 15:13:02 -04:00
David Turner 7be55a2a34
Avoid atomic overwrite tests on FS repositories (#70483)
Today we leniently permit overwrites of blobs in a repository not to be
atomic, since they are not in shared filesystem repositories. In fact
it's worse, on Windows overwrites do not even work if there is a
concurrent reader. In practice this isn't very important, we do almost
no overwrites and almost never read the file that's being overwritten,
but we do still test for atomic overwrites in the repository analyzer.

With this commit we suppress the atomic overwrite checks in the
repository analyzer for FS repositories, and remove the lenience since
all other repositories should implement atomic overwrites correctly.

Closes #70303
2021-03-22 10:51:33 +00:00
David Turner 421df6c797
Avoid atomic write of large blobs in repo analyzer (#69960)
Today we randomly perform an atomic write even if there are no early
reads, but we should only do so if the blob is small enough to write
atomically.
2021-03-04 16:38:08 +00:00
David Turner fe6f50e121
Fix lower size bounds in RandomBlobContent*Tests (#69771)
The repository analyzer does not write empty blobs to the repository,
and we assert that the blobs are nonempty, but the tests randomly check
for the empty case anyway. This commit ensures that the blobs used in
tests are always nonempty.

Relates #67247
2021-03-02 10:30:17 +00:00
Francisco Fernández Castaño 28306b4411
Take into account `base_path` setting during repository analysis execution. (#69690)
Relates #67247
2021-03-01 16:51:52 +01:00
David Turner bf8819d04c
Fix testRepositoryAnalysis (#69538)
This test would occasionally generate an invalid request in which the
max total data size cannot be satisfied. This commit ensures the max
total data size is always sufficiently large.

Relates #67247
Closes #69219
2021-02-24 15:50:22 +00:00
Julie Tibshirani 7a0c379c90
Mute RepositoryAnalysisSuccessIT#testRepositoryAnalysis.
2021-02-18 12:09:37 -08:00
David Turner 5dc26dad94
Permit overwritten blobs to be missing (#69102)
Today blob overwrites are not completely atomic: there is an
intermediate state in which the blob is missing, but the repository
analyser does not fully account for that intermediate state.

This commit relaxes the check for missing blobs if they were
overwritten.

Closes #69087
2021-02-17 10:49:18 +00:00
Julie Tibshirani a7c5c2e0ea
Mute all testRepositoryAnalysis tests.
2021-02-16 14:03:02 -08:00
David Turner 6f02a9b088
Use ActionRunnable#wrap not ActionRunnable#run (#69047)
Fixes a bug introduced in #67247.
2021-02-16 15:22:14 +00:00
David Turner 92d13a3f7d
Introduce repository test kit/analyser (#67247)
Today we rely on blob stores behaving in a certain way so that they can be used
as a snapshot repository. There are an increasing number of third-party blob
stores that claim to be S3-compatible, but which may not offer a suitably
correct or performant implementation of the S3 API. We rely on some subtle
semantics with concurrent readers and writers, but some blob stores may not
implement it correctly. Hitting a corner case in the implementation may be rare
in normal use, and may be hard to reproduce or to distinguish from an
Elasticsearch bug.

This commit introduces a new `POST /_snapshot/.../_analyse` API which exercises
the more problematic corners of the repository implementation looking for
correctness bugs and measures the details of the performance of the repository
under concurrent load.
2021-02-16 14:24:40 +00:00