When creating a v2 template, the mappings are wrapped in a `_doc` key. This caused the stored template to be inconsistent with the template returned when queried.
This change adds a `mappingsEquals` method to the `Template` class to deal with this case.
Reproduced by `MetadataIndexTemplateServiceTests.testAddComponentTemplate`: the test fails when the mappings are not null.
```
Template template = new Template(
Settings.builder().build(),
new CompressedXContent("{\"properties\":{\"@timestamp\":{\"type\":\"date\"}}}"),
ComponentTemplateTests.randomAliases()
);
ComponentTemplate componentTemplate = new ComponentTemplate(template, 1L, new HashMap<>());
```
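The comparison needs to treat mappings wrapped in `_doc` the same as unwrapped ones. A minimal sketch of that normalization over plain `Map`s rather than Elasticsearch's `CompressedXContent` (the class and helper names here are illustrative, not the actual implementation):

```java
import java.util.Map;

public class MappingsCompare {
    // Unwrap a top-level "_doc" layer, if present, so that
    // {"_doc":{"properties":...}} compares equal to {"properties":...}.
    @SuppressWarnings("unchecked")
    public static Map<String, Object> normalizeMappings(Map<String, Object> mappings) {
        if (mappings.size() == 1 && mappings.containsKey("_doc")) {
            return (Map<String, Object>) mappings.get("_doc");
        }
        return mappings;
    }

    public static boolean mappingsEquals(Map<String, Object> a, Map<String, Object> b) {
        return normalizeMappings(a).equals(normalizeMappings(b));
    }
}
```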
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Fixes the text field mapper and the analyzers class, which also retained heavyweight parameter references.
Makes a `TextFieldMapper` instance take hundreds of bytes instead of multiple KB.
Closes #73845
Closes #76812. Closes #77126.
`OsProbe` was only capable of handling cgroup data in the v1 format.
However, Debian 11 uses cgroups v2 by default, and there Elasticsearch isn't
capable of reporting any cgroup information. Therefore, add support for
the v2 layout.
Note that we have to open access to all of /sys/fs/cgroup because with
cgroups v2, the files we need are in an unpredictable location.
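For illustration, in the v2 layout a controller's data lives in flat files under the cgroup directory; for example `cpu.max` holds either `max <period>` (no limit) or `<quota> <period>` in microseconds. A hedged sketch of parsing such file contents (the class and method names are made up, not the actual `OsProbe` code):

```java
public class CgroupV2Parser {
    /**
     * Parse the contents of a cgroup v2 cpu.max file, which holds either
     * "max <period>" (no limit) or "<quota> <period>" in microseconds.
     * Returns -1 for an unlimited quota.
     */
    public static long parseCpuQuotaMicros(String cpuMaxContents) {
        String[] parts = cpuMaxContents.trim().split("\\s+");
        if ("max".equals(parts[0])) {
            return -1L;
        }
        return Long.parseLong(parts[0]);
    }
}
```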
Currently we configure per-field postings formats by asking the MapperService
for the MappedFieldType of the field in question, and then checking to see if it
is a completion field. If no MappedFieldType is available, we emit a warning.
However, MappedFieldTypes are for search fields only, and so we end up emitting
warnings for hidden sub-fields that have no corresponding field type, such as
prefix or phrase accelerator fields on text mappers.
This commit reworks things so that the MappingLookup is responsible for defining
per-field postings formats, and will detect CompletionFieldMappers at build time.
All fields that are not mapped to a completion field will just get the default postings
format. This also means that we no longer need a logger instance on CodecService.
Fixes #77183
It is beneficial to sort the segments within a data stream's index
in descending order of their max timestamp field, so
that the most recent (in terms of timestamp) segments
come first.
This speeds up sort queries on the @timestamp field in descending order,
which is the most common type of query for data streams,
as we are mostly concerned with recent data.
This patch addresses this for writable indices.
The segments' sorter is different from index sorting.
An index sorter is concerned with the order of docs
within an individual segment (not how the segments are organized),
while the segment sorter is only used during search and allows
doc collection to start with the "right" segment,
so we can terminate the collection faster.
This PR adds an `isDataStreamIndex` property to `IndexShard` that
shows whether a shard is part of a data stream.
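Conceptually the segment sorter is just a comparator that orders segments by descending max `@timestamp`. A simplified sketch over plain timestamp values (not the actual Lucene leaf-sorter wiring):

```java
import java.util.Arrays;
import java.util.Comparator;

public class SegmentSortDemo {
    /** Order segments so the one with the largest max timestamp comes first. */
    public static long[] sortByMaxTimestampDesc(long[] maxTimestamps) {
        return Arrays.stream(maxTimestamps)
            .boxed()
            .sorted(Comparator.reverseOrder())
            .mapToLong(Long::longValue)
            .toArray();
    }
}
```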
`AbstractFilteringTestCase` had a ton of carefully formatted
`XContentBuilder` calls that would have been totally unreadable after
the formatter worked its magic. So I formatted a bunch of them by hand
and extracted the rest to json files.
Today ConfiguredHostsResolver and TransportAddressConnector are declared
within PeerFinder, for no particular reason. This commit lifts them to
the top level.
Today it's possible to open two connections to a node, and then we
notice when registering the second connection and close it instead.
Fixing #67873 will require us to keep tighter control over the identity
and lifecycle of each connection, and opening redundant connections gets
in the way of this. This commit adds a check for an existing connection
_after_ marking the connection as pending, which guarantees that we
don't open those redundant connections.
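The pattern is the usual fix for a check-then-register race: after marking the new connection as pending, re-check for an existing one, so two concurrent openers cannot both register. A minimal sketch with a map standing in for the connection registry (names are illustrative, not the actual transport code):

```java
import java.util.concurrent.ConcurrentHashMap;

public class ConnectionRegistry {
    private final ConcurrentHashMap<String, String> connections = new ConcurrentHashMap<>();

    /**
     * Register a connection for a node; returns the connection that wins,
     * which is the existing one if another thread registered first.
     */
    public String register(String nodeId, String connection) {
        String existing = connections.putIfAbsent(nodeId, connection);
        if (existing != null) {
            // A connection already exists: the redundant one is discarded.
            return existing;
        }
        return connection;
    }
}
```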
Today `AbstractRefCounted` has a `name` field which is only used to
construct the exception message when calling `incRef()` after it's been
closed. This isn't really necessary, the stack trace will identify the
reference in question and give loads more useful detail besides. It's
also slightly irksome to have to name every single implementation.
This commit drops the name and the constructor parameter, and also
introduces a handy factory method for use when there's no extra state
needed and you just want to run a method or lambda when all references
are released.
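A hedged sketch of what such a factory method looks like: it wires a `Runnable` to run once all references are released (a simplified stand-in, not the actual `AbstractRefCounted` API):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCounted {
    private final AtomicInteger refCount = new AtomicInteger(1);
    private final Runnable onClose;

    private RefCounted(Runnable onClose) {
        this.onClose = onClose;
    }

    /** Factory: run the given callback when all references are released. */
    public static RefCounted of(Runnable onClose) {
        return new RefCounted(onClose);
    }

    public void incRef() {
        if (refCount.getAndIncrement() <= 0) {
            refCount.getAndDecrement();
            // No name needed: the stack trace identifies the reference.
            throw new IllegalStateException("already closed");
        }
    }

    public void decRef() {
        if (refCount.decrementAndGet() == 0) {
            onClose.run();
        }
    }
}
```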
Previously, we handled the case of a write request to a system index
alias without a backing index by auto-creating the primary index. This
had the unfortunate side-effect of making it impossible to auto-create
non-primary system indices. This commit fixes the bug so that we can
handle both cases.
* Add internal cluster test for system index auto create
* Allow auto-creation of non-primary indices for a system index pattern
* Use primary index if autocreate is called with system index alias name
The `SearchHit.getFields()` method returns both document and metadata
fields commingled. This commit adds new methods to retrieve them
separately.
Fixes #77171
When doing out-of-order finalizations we would sometimes leak shard-level metadata blobs.
This commit enhances the cleanup logic after finalization to catch these leaked blobs,
and adds a test that, without this fix, trips the leaked-blobs assertion in the test
infrastructure.
In certain concurrent indexing scenarios where deletes are
executed followed by a new indexing operation, the following engine
considers those as updates, breaking one of its assumed invariants.
Closes #72527
Investigating the heap use of mapper instances I found this.
It seems quite a bit of overhead for these instances goes into
the builder field. In other mappers we retain the script service
and the script outright, so I did the same thing here to make these
instances a little smaller.
We have to account for queued-up clones when dealing with nodes dropping out,
and start them when they become ready to execute because of a node leaving the cluster.
Added a test to reproduce the issue in #77101 and another test to verify that the more complex
case of a clone queued after a snapshot queued after a clone still works correctly as well.
The solution here is the most direct fix I could think of and by far the easiest to backport.
That said, I added a TODO asking for a follow-up that should allow completely removing
the duplicate code across handling shard updates and external changes. The difference between
the two ways of updating the state is a left-over from the time before we had concurrent
operations and has become needless complexity nowadays.
closes #77101
This adds the pattern into the error message returned when trying to
fetch fields. So this:
```
POST _search
{
  "fields": [ { "field": "*", "format": "date_time" } ]
}
```
Will return an error message like
```
error fetching [foo] which matches [*]: Field [foo] of type [keyword] doesn't support formats
```
In a number of places, we read and write binary data into byte arrays using lucene's
DataInput and DataOutput abstractions. In lucene 9 these abstractions are changing
the endianness of their read/writeInt methods. To avoid dealing with this formatting
change, this commit changes things to use elasticsearch StreamInput/StreamOutput
abstractions instead, which have basically the same API but will preserve endianness.
Relates to #73324
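For context, the format change means the same value serializes to different bytes once `writeInt` switches endianness. A small sketch of the difference using `ByteBuffer` (Elasticsearch's `StreamOutput` keeps the big-endian layout):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndiannessDemo {
    /** Serialize an int with an explicit byte order. */
    public static byte[] writeInt(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }
}
```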
This adds two utility methods to validate the parameters to the
`docValueFormat` method, and replaces a pile of copy-and-pasted code with
calls to them. They emit a standard error message if any
unsupported parameters are provided.
The rate aggregation should support being a sub-aggregation
of a composite agg.
The catch is that the composite aggregation source
must be a date histogram. Other sources can be present,
but there must be exactly one date histogram source,
otherwise the rate aggregation does not know which
interval to compare its unit rate to.
closes https://github.com/elastic/elasticsearch/issues/76988
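The validation described above boils down to counting date histogram sources and rejecting anything other than exactly one. A sketch over source type names (illustrative, not the actual aggregator code):

```java
import java.util.List;

public class RateSourceCheck {
    /** Returns true when exactly one composite source is a date histogram. */
    public static boolean hasSingleDateHistogramSource(List<String> sourceTypes) {
        return sourceTypes.stream().filter("date_histogram"::equals).count() == 1;
    }
}
```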
Closes #76812.
`OsProbe` was only capable of handling cgroup data in the v1 format.
However, Debian 11 uses cgroups v2 by default, and there Elasticsearch isn't
capable of reporting any cgroup information. Therefore, add support for
the v2 layout.
This PR implements support for multiple validators on a `FieldMapper.Parameter`.
The `Parameter#setValidator` method is replaced by `Parameter#addValidator`, which can be called multiple times
to add validation to a parameter.
All validators of a parameter are executed in the order they were added, and if any of them fails, validation fails.
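The shape of the change, sketched with a generic parameter class (hypothetical names, not the real `FieldMapper.Parameter`): validators accumulate in a list and run in insertion order, and any thrown exception fails the whole validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class Parameter<T> {
    private final List<Consumer<T>> validators = new ArrayList<>();

    /** Add a validator; validators run in the order they were added. */
    public Parameter<T> addValidator(Consumer<T> validator) {
        validators.add(validator);
        return this;
    }

    /** Run every validator; any thrown exception fails validation. */
    public void validate(T value) {
        for (Consumer<T> v : validators) {
            v.accept(value);
        }
    }
}
```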
This commit cleans up some cruft left over from older versions of the
`ClusterApplierService`:
- `UpdateTask` doesn't need to implement lots of interfaces and give
access to its internals, it can just pass appropriate arguments to
`runTasks()`.
- No need for the `runOnApplierThread` override with a default priority,
just have callers be explicit about the priority.
- `submitStateUpdateTask` takes a config which never has a timeout; may
  as well just pass the priority and remove the dead code.
- `SafeClusterApplyListener` doesn't need to be a
`ClusterApplyListener`, may as well just be an `ActionListener<Void>`.
- No implementations of `ClusterApplyListener` care about the source
argument, may as well drop it.
- Adds assertions to prevent `ClusterApplyListener` implementations from
throwing exceptions since we just swallow them.
- No need to override getting the current time in the
`ClusterApplierService`, we can control this from the `ThreadPool`.
The randomization of the repo version often wasn't used because of the repository cache.
Force re-creating the repository every time we manually mess with the versions.
The `QueryStringQuery` parser assumes that wildcard queries should use normalized values in queries.
`KeywordScriptFieldType` did not support this, so it was throwing an error. Given there is currently no concept of normalization in scripted fields, I assume it is safe to just add support for this in the same way un-normalized wildcard queries are handled; it feels right that they should behave the same rather than throw an error.
Added a test too.
Closes #76838
This PR adds the `REPLACE` shutdown type. As of this PR, `REPLACE` behaves identically to `REMOVE`.
Co-authored-by: Lee Hinman <lee@writequit.org>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
`TransportFieldCapabilitiesAction` currently holds a lot of logic. This PR
breaks it up into smaller pieces and simplifies its large `doExecute` method.
Simplifying the class will help before we start to make field caps
optimizations.
Changes:
* Factor some methods out of `doExecute` to reduce its length
* Pull `AsyncShardAction` out into its own class to simplify and better match
the code structure in 7.x
When rounding UTC timestamps we convert a timestamp from UTC to local time, round this to the closest midnight, then convert back to UTC.
This means that for a timestamp close to a DST transition we need to make sure we have collected a transition that will be needed when converting back to UTC.
To do this, we decrease minUtcMillis by 2 * unit to ensure that the additional transition that could affect the timestamp is also fetched and the correct minimum is used in further lookups.
closes #73995
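The round trip above can be sketched with `java.time` (using Europe/Paris purely as an example zone, not the rounding code from this change): convert UTC to local, truncate to local midnight, convert back to UTC.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

public class RoundToLocalMidnight {
    /** Round a UTC instant down to the preceding midnight in the given zone. */
    public static Instant roundDownToLocalMidnight(Instant utc, ZoneId zone) {
        LocalDate localDate = utc.atZone(zone).toLocalDate();
        return localDate.atStartOfDay(zone).toInstant();
    }
}
```

For example, 10:00 UTC on a June day in Paris (UTC+2) is local noon, so rounding down lands on 22:00 UTC of the previous day.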
* [TEST] Implement HotThreads unit tests
Add unit tests for the internal HotThreads logic for calculating and
sorting threads by CPU, Wait and Blocked "hotness". Also adds tests
for identifying certain threads as idle, as well as supported report
types (e.g. cpu, wait, blocked).
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
We can have a freak situation here where running the get-snapshots
request concurrently with a delete produces a missing snapshot exception
if a snapshot is deleted just as its metadata is fetched.
This is a known issue and a fix is tricky (SLM etc. work around this issue
in tests and prod code by using the ignore-unavailable flag, for example).
In this test we can easily fix the problem by deterministically waiting
on the cluster state before asserting that the snapshots are gone from the repo.
closes #76549