Changes:
* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
* Removes the frozen tier restriction for ESS
* Removes section from 'Use ES for time series data'
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This commit revives the documentation of the "Clear Cache" and
"Shard Stats" APIs of Searchable Snapshots that was removed
in #62217. This is a partial revert of the commit b545c55 with
some light wording changes.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Removes the experimental status for the frozen tier / shared_cache searchable snapshots for the 7.13 release.
Also updates the docs to note that URL repositories are supported for searchable snapshots as of 7.13.
We previously allowed but deprecated the ability for the shared cache to
be positively sized on nodes without the frozen role. This is because we
only allocate shared_cache searchable snapshots to nodes with the frozen
role. This commit completes our intention to deprecate/remove this
ability.
This adds additional documentation for shared_cache searchable snapshots that are targeting the frozen tier:
- it generalizes the introduction section on searchable snapshots, mentioning that they come in two flavors now
as well as the relation to cold and frozen tiers,
- it expands the shared_cache section and
- it adds Cloud-specific instructions for getting started with the frozen tier
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Co-authored-by: debadair <debadair@elastic.co>
Co-authored-by: David Turner <david.turner@elastic.co>
This commit spells out how important repository reliability is to
searchable snapshots, and also documents a procedure for taking a backup
of a snapshot repository.
Relates #54944
A frozen tier is backed by an external object store (like S3) and caches only a
small portion of data on local disks. In this way, users can reduce hardware
costs substantially for infrequently accessed data. For the frozen tier we only
pull in the parts of the files that are actually needed to run a given search.
Further, we don't require the node to have enough space to host all the files.
We therefore have a cache that tracks which file parts are available and which
are not. This node-level shared cache is bounded in size (typically in relation
to the disk size), and will evict items based on an LFU policy, as we expect some
parts of the Lucene files to be used more frequently than other parts. Evictions
happen at the granularity of regions of a file, and do not require evicting
full files. The on-disk representation that was chosen for the
cold tier is not a good fit here, as it won't allow evicting parts of a file.
Instead we are using fixed-size pre-allocated files and have implemented our own
memory management logic to map regions of the shard's original Lucene files onto
regions in these node-level shared files that represent the on-disk
cache.
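
To make the region-level eviction described above concrete, here is a minimal
Python sketch. It is not the implementation in this PR: the class name
`SharedRegionCache`, the `fetch_region` callback, and the tiny `REGION_SIZE` are
all hypothetical, and an in-memory dict stands in for the pre-allocated shared
files. It only illustrates mapping byte ranges onto fixed-size regions and
evicting the least frequently used region rather than a whole file.

```python
from collections import defaultdict

REGION_SIZE = 16  # bytes per region; tiny, hypothetical value for illustration


class SharedRegionCache:
    """Toy node-level cache of fixed-size file regions with LFU eviction."""

    def __init__(self, num_slots, fetch_region):
        self.num_slots = num_slots        # bounded cache size, counted in regions
        self.fetch_region = fetch_region  # callback: (file_name, region) -> bytes
        self.slots = {}                   # (file_name, region) -> cached bytes
        self.freq = defaultdict(int)      # (file_name, region) -> access count

    def _evict_one(self):
        # Evict the least frequently used region, never a whole file.
        victim = min(self.slots, key=lambda key: self.freq[key])
        del self.slots[victim]
        del self.freq[victim]

    def _get_region(self, file_name, region):
        key = (file_name, region)
        if key not in self.slots:
            if len(self.slots) >= self.num_slots:
                self._evict_one()
            # Cache miss: download only this region (e.g. from the object store).
            self.slots[key] = self.fetch_region(file_name, region)
        self.freq[key] += 1
        return self.slots[key]

    def read(self, file_name, offset, length):
        """Read bytes [offset, offset + length), touching only the regions needed."""
        out = bytearray()
        end = offset + length
        while offset < end:
            region = offset // REGION_SIZE
            start = offset % REGION_SIZE
            take = min(REGION_SIZE - start, end - offset)
            data = self._get_region(file_name, region)
            out += data[start:start + take]
            offset += take
        return bytes(out)
```

A read that spans two regions downloads just those two regions. Once the cache
is full, the region with the lowest access count is dropped to make room, while
other regions of the same file can stay cached.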
This PR adds the core functionality to searchable snapshots to power such a
frozen tier:
- It adds the node-level shared cache that evicts file regions based on an LFU
policy.
- It adds the machinery to dynamically download file regions into this cache and
serve their contents when searches execute.
- It extends the mount API with a new parameter, `storage`, which selects the
kind of local storage used to accelerate searches of the mounted index. If set
to `full_copy` (default, used for cold tier), each node holding a shard of the
searchable snapshot index makes a full copy of the shard to its local storage.
If set to `shared_cache`, the shard uses the newly introduced shared cache,
only holding a partial copy of the index on disk (used for frozen tier).
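
As a rough illustration of the `storage` parameter, the Python sketch below
builds (but does not send) a mount request. The helper name
`build_mount_request` is hypothetical, and the
`/_snapshot/<repository>/<snapshot>/_mount` path and `{"index": ...}` body are
assumptions about the shape of the mount API that this message does not spell
out. Only the `storage` values and the `full_copy` default come from the text
above.

```python
import json
from urllib.parse import quote, urlencode

# The two values of the `storage` parameter described above.
STORAGE_OPTIONS = ("full_copy", "shared_cache")


def build_mount_request(repository, snapshot, index, storage="full_copy"):
    """Return (method, path, body) for a mount request; nothing is sent."""
    if storage not in STORAGE_OPTIONS:
        raise ValueError("storage must be one of %s, got %r" % (STORAGE_OPTIONS, storage))
    # Assumed endpoint shape: POST /_snapshot/<repository>/<snapshot>/_mount
    path = "/_snapshot/{}/{}/_mount?{}".format(
        quote(repository, safe=""),
        quote(snapshot, safe=""),
        urlencode({"storage": storage}),
    )
    body = json.dumps({"index": index})  # assumed request body shape
    return "POST", path, body


# Mounting for the frozen tier would select the shared cache explicitly.
method, path, body = build_mount_request(
    "my_repo", "snap-1", "logs", storage="shared_cache"
)
```

Leaving `storage` unset keeps the `full_copy` default used for the cold tier.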
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: David Turner <david.turner@elastic.co>
Taking a snapshot of a cluster containing searchable snapshot indices is
kind of mindbending. This commit adds docs to indicate that this does
work.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today we recommend that every index have at least one replica in our
guidelines for designing a resilient cluster. This advice does not apply
to searchable snapshot indices. This commit adjusts the resiliency docs
to account for this. It also slightly adjusts the wording in the
searchable snapshots docs to be more consistent about the distinction
between a "searchable snapshot" and a "searchable snapshot index".
Clarify that searchable snapshots only result in cost savings for less
frequently accessed data and that the savings do not apply to the entire
cluster.
This commit removes the documentation for some specific Searchable Snapshot REST APIs:
- clear cache
- searchable snapshot stats
- repository stats
These APIs are low-level and are useful for investigating the behavior of
snapshot-backed indices, but we expect them to be removed in the future or to
appear in a different form.
Provides basic repository-level stats that will allow us to get some insight into how many
requests are actually being made by the underlying SDK. Currently only tracks GET and LIST
calls for S3 repositories. Most of the code is unfortunately boilerplate to add a new endpoint
that will help us better understand some of the low-level dynamics of searchable snapshots.
This commit merges the searchable-snapshots feature branch into master.
See #54803 for the complete list of squashed commits.
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>