Commit Graph

33 Commits

Author SHA1 Message Date
Stuart Tettemer d8b2c52c82
Metrics refactor - split registry and service (#101154)
This splits out the registry and the service, which makes testing easier and removes much of the delegation from the old `APMMeter` to `Instruments` (now renamed `APMMeterRegistry`).

APMMeterService takes care of the lifecycle and APMMeterRegistry holds the instruments.
2023-10-23 13:28:46 -05:00
Przemyslaw Gomulka 3465a2bf18
Fix metric gauge creation model (#100609)
OTEL gauges should follow the callback model (or use BatchCallback); otherwise
they will not be sent by the APM Java agent.
This commit changes the gauge creation model to return Observable*Gauge
and uses AtomicLong/Double to store the current value, which is polled when
metrics are exported (and the callback is called).
2023-10-10 13:28:02 -05:00
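
The callback model described in #100609 can be sketched with the JDK alone. This is a minimal illustration, not the code from the PR: `CallbackLongGauge`, `record` and `callback` are hypothetical names, and the real change returns OpenTelemetry Observable*Gauge instances rather than a `LongSupplier`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical stand-in for a callback-style gauge: the instrument only stores
// the latest value, and an exporter reads it through a callback when it polls.
final class CallbackLongGauge {
    private final AtomicLong current = new AtomicLong();

    // Called by application code whenever the measured value changes.
    void record(long value) {
        current.set(value);
    }

    // The callback an exporter would invoke at export time.
    // It always observes the most recently recorded value.
    LongSupplier callback() {
        return current::get;
    }
}
```

The point of the design is that recording a value does not push anything; the value is only read when the exporter invokes the callback.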
Simon Cooper d81dbfa8da
Fix race condition in InstrumentsConcurrencyTests (#100518)
Fix a race condition between the two threads in
InstrumentsConcurrencyTests. If the second thread gets the lock first,
the test fails.

Fixes #100251
2023-10-09 11:57:13 -04:00
Stuart Tettemer 110dd5ed16
Tracing: Use doPriv when working with spans, use SpanId (#100232)
`SpanId` is used when explicitly closing the trace in `executeQueryPhase` to avoid double closing the associated task.

`doPrivileged` avoids hitting `java.lang.UnsupportedOperationException: Cannot define class using reflection: access denied ("java.lang.reflect.ReflectPermission" "suppressAccessChecks")` when classes are sometimes injected while switching spans.

Removed `default Releasable withScope(Task task)` from the Tracer API because it automatically created a span id and, in one of the three uses, that SpanId was necessary to close the span.

Fixes: #100072
2023-10-05 12:58:18 -05:00
Luca Cavanna 5bfaa7d2f0 Address bad merge
Adjust the RegExp import in APMTracer
2023-10-02 16:10:16 +02:00
Luca Cavanna 689a1e490a Merge branch 'main' into lucene_snapshot_9_8 2023-10-02 13:56:12 +02:00
Lorenzo Dematté cc572fd92d
Moved APM service version from Version to Build.version() (#100084) 2023-10-02 12:12:35 +02:00
Przemyslaw Gomulka b856bf264d
Update the elastic-apm-agent version (#100064)
The latest version contains a fix that allows sending metrics to the APM server. It also adds the APM agent JVM option
"enable_experimental_instrumentations", "true",
which is required to enable the otel-metrics-instrumentation.

Relates to https://github.com/elastic/elasticsearch/pull/99832
2023-09-29 14:35:04 -05:00
Stuart Tettemer f8d09e9c6c
APM Metering API (#99832)
Adds Metering instrument interfaces and adapter implementations for opentelemetry instrument types:
* Gauge - a single number that can go up or down
* Histogram - bucketed samples
* Counter - monotonically increasing summed value
* UpDownCounter - summed value that may decrease

Supports both Long* and Double* versions of the instruments.

Instruments can be registered and retrieved by name through APMMeter which is available via the APMTelemetryProvider.

The metering provider starts as the open telemetry noop provider.

`telemetry.metrics.enabled` turns on metering.
2023-09-28 19:35:46 -05:00
Luca Cavanna 15c87b681c Merge branch 'main' into lucene_snapshot_9_8 2023-09-28 12:19:14 +02:00
Przemyslaw Gomulka eca41871aa
Use TelemetryProvider in Plugin::createComponents (#99737)
In order to avoid adding yet another parameter to createComponents,
the Tracer interface is replaced with TelemetryProvider.
This makes it possible to get both the Tracer and (in the future) Metric interfaces.
2023-09-22 14:48:11 +02:00
Przemyslaw Gomulka 0efa67821d
Rename TracerPlugin to TelemetryPlugin (#99735)
With the support of metrics, the TracerPlugin name is no longer adequate, so it is renamed to TelemetryPlugin.
This also introduces the TelemetryProvider interface. While it is only used in Node.java at the moment to fetch the Tracer instance, it is intended to be used in Plugin::createComponents (to be done in a separate commit due to the broad scope of this method).
This will allow plugins to get access to both the Tracer and Metric interfaces
without the need to add yet another argument to createComponents.

Also adding internal subpackage in module/apm so that it is more obvious
which packages are not exported
2023-09-22 13:35:36 +02:00
Luca Cavanna 270de88ea0 Merge branch 'main' into lucene_snapshot_9_8 2023-09-20 20:41:55 +02:00
Przemyslaw Gomulka b6747b48ba
Rename tracing to telemetry package (#99710)
This commit renames the tracing package to telemetry.tracing in both xpack/APM and Elasticsearch's org.elasticsearch.tracing.Tracer (the API).
The xpack/APM packages are renamed as follows:
org.elasticsearch.telemetry.apm - the only exported package
org.elasticsearch.telemetry.apm.settings - APMSettings
org.elasticsearch.telemetry.apm.tracing - APMTracer

org.elasticsearch.tracing.Tracer is moved to org.elasticsearch.telemetry.tracing.Tracer (responsible for the majority of the changes in this PR)
2023-09-20 16:58:02 +02:00
Przemyslaw Gomulka c7c3f877e8
Add java.net.NetPermission for apm's plugin security (#99474)
When APM is enabled, it throws a security manager exception:
java.security.AccessControlException: access denied ("java.net.NetPermission" "getProxySelector")
This commit adds the permission so that APM can be enabled.
2023-09-19 20:45:37 +02:00
elasticsearchmachine daaafffe1e Merge remote-tracking branch 'origin/main' into lucene_snapshot 2023-09-14 10:05:48 +00:00
Stuart Tettemer 886e35fa76
Tracer requires io.opentelemetry.api (#99550) 2023-09-13 14:43:06 -05:00
Stuart Tettemer cb380afb03
APM module-info (#99548)
Add module-info.java for APM. This allows it to be excluded in other builds.
2023-09-13 12:19:40 -05:00
Adrien Grand e7356a0680
Remove now useless RegExp wrapper. (#99226)
Parsing regexps no longer raises stack overflows thanks to apache/lucene#12462.
2023-09-06 16:53:12 +02:00
Ievgen Degtiarenko d9b6c5ae29
Wire IndicesService to plugins (#97081)
This change exposes IndicesService to the plugins via Plugin#createComponents
2023-06-27 18:02:23 +02:00
Albert Zaharovits 669c50cbc7
ES APM traces for HTTP requests include authn duration (#96205)
Following the changes in #95112, which relocated the calls
into the AuthenticationService that authenticate HTTP
requests, the authentication duration no longer fell
between Tracer#startTrace and Tracer#stopTrace.
Consequently, the span records no longer covered
the authentication duration.

This PR remedies that by changing the Tracer
implementation, i.e. APMTracer, to look for the trace start
time instant in the transient thread context and use that
when starting traces (overriding the default, which is the current time).
The trace start time is set in the thread context when
the request-wise thread context is first populated
(with HTTP request headers).
2023-05-18 19:57:10 +03:00
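
The start-time lookup described in #96205 can be sketched as follows. This is a minimal illustration under stated assumptions: a plain `Map` stands in for the transient thread context, and `TraceStartTimeSketch`, `START_TIME_KEY` and `resolveStartTime` are hypothetical names, not the identifiers used by APMTracer.

```java
import java.time.Instant;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for the transient thread context,
// and the key name below is invented for illustration.
final class TraceStartTimeSketch {
    static final String START_TIME_KEY = "trace_start_time";

    // Prefer a start time recorded earlier in the request (for example, when
    // the HTTP headers were first copied into the context); otherwise fall
    // back to the supplied "now".
    static Instant resolveStartTime(Map<String, Object> transientContext, Instant now) {
        Object recorded = transientContext.get(START_TIME_KEY);
        return recorded instanceof Instant ? (Instant) recorded : now;
    }
}
```

With this shape, work done before the trace is formally started, such as authentication, is still covered by the span, because the span begins at the recorded instant.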
Armin Braun af5c11702b
Use string list setting throughout codebase (#95901)
We can dry things up a little here and also make things a little faster
(in case we missed a corner case where a list setting is hot) with the optimized
string list setting constructor.
2023-05-09 11:22:28 +02:00
Rory Hunter fe1083f6c5
Upgrade spotless plugin to 6.17.0 (#94994)
Fixes #82794. Upgrade the spotless plugin, which addresses the issue
around formatting `instanceof` expressions. Formatting of statements
including lambdas seems to have improved too.
2023-04-04 10:03:32 +01:00
David Turner 431a7b2f53
Destringify APM tracer interface (#94864)
Today the APM `Tracer` interface identifies each span by a raw string,
but in practice there is structure to these strings: task-related spans
have IDs like `task-NNNN` and spans that relate to REST requests have
IDs like `rest-NNNN`. This convention is distributed across the codebase
a little too widely, so with this commit we centralise it into a
`SpanId` class, and introduce specific overrides for `Task` and
`RestRequest` to avoid callers needing to construct IDs themselves.
2023-03-30 10:19:39 +01:00
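
The idea behind #94864, keeping the `task-NNNN` / `rest-NNNN` convention in one place, can be sketched as below. `SpanIdSketch` and its factory methods are hypothetical and take plain `long` IDs as an assumption; the actual `SpanId` class and its `Task` and `RestRequest` overrides may look different.

```java
import java.util.Objects;

// Hypothetical sketch of an identifier type that owns the naming convention,
// so callers never build "task-NNNN" or "rest-NNNN" strings themselves.
final class SpanIdSketch {
    private final String rawId;

    private SpanIdSketch(String rawId) {
        this.rawId = Objects.requireNonNull(rawId);
    }

    // Span ID for a task-related span.
    static SpanIdSketch forTask(long taskId) {
        return new SpanIdSketch("task-" + taskId);
    }

    // Span ID for a span that relates to a REST request.
    static SpanIdSketch forRestRequest(long requestId) {
        return new SpanIdSketch("rest-" + requestId);
    }

    String rawId() {
        return rawId;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof SpanIdSketch && rawId.equals(((SpanIdSketch) other).rawId);
    }

    @Override
    public int hashCode() {
        return rawId.hashCode();
    }

    @Override
    public String toString() {
        return "SpanIdSketch[" + rawId + "]";
    }
}
```

Because the constructor is private, the prefix convention can only be applied through the factory methods.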
SylvainJuge 16c5b7e258
Update apm agent to 1.36.0 (#94716)
Fixes #94689.

The APM agent version 1.33.0 fails to start on JDK 20, which prevents
the APM integration from working as expected. As a consequence,
tracing does not work.

When setting `ELASTIC_APM_LOG_LEVEL=debug` and
`ELASTIC_APM_LOG_FILE=/tmp/log.txt`, the agent log shows that there
is an issue with accessing `Unsafe` (sorry I don't have the exact
stack trace).

There were a few changes in the APM agent regarding the security manager
(SM) in recent versions, and updating the agent seems to make it
work as expected.

However, there is one known caveat so far
(https://github.com/elastic/apm-agent-java/issues/3074): keeping
the agent at `debug` log level with `ELASTIC_APM_LOG_LEVEL=debug`
makes it trigger another security exception when trying to establish a
connection with apm-server, because the agent prints a few details about
whether a proxy is used (which is forbidden by default by the SM and
isn't yet wrapped in a privileged call).
2023-03-24 20:50:06 +00:00
Ievgen Degtiarenko ad229dd70e
Update createComponents to supply AllocationService instead of AllocationDeciders (#92785) 2023-01-10 14:18:33 +01:00
Rory Hunter 31ac6b0cc8
Make redaction configurable for APM tracing (#92358)
Closes #92338.

When tracing REST requests with APM, we capture HTTP headers as labels
on the trace, but redact sensitive values. However, we can't know ahead
of time what all the possible sensitive values are.

Push this redaction into the tracer, and make the redaction terms
configurable. Switch the defaults to the APM Java agent's defaults.
2022-12-15 09:23:19 +00:00
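
Configurable redaction of the kind described in #92358 might look roughly like this. It is a sketch under stated assumptions: `HeaderRedactorSketch` is a hypothetical class, the case-insensitive substring matching is a simplification chosen for illustration, and the `[REDACTED]` placeholder is invented; the tracer's actual matching rules and default terms are not shown in the commit message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: redacts header values whose names contain any of the
// configured terms (case-insensitive substring match, an assumption).
final class HeaderRedactorSketch {
    static final String REDACTED = "[REDACTED]";

    private final List<String> sensitiveTerms = new ArrayList<>();

    HeaderRedactorSketch(List<String> configuredTerms) {
        for (String term : configuredTerms) {
            sensitiveTerms.add(term.toLowerCase(Locale.ROOT));
        }
    }

    // Returns a copy of the headers with sensitive values replaced.
    Map<String, String> redact(Map<String, String> headers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : headers.entrySet()) {
            String value = isSensitive(header.getKey()) ? REDACTED : header.getValue();
            result.put(header.getKey(), value);
        }
        return result;
    }

    private boolean isSensitive(String headerName) {
        String lower = headerName.toLowerCase(Locale.ROOT);
        for (String term : sensitiveTerms) {
            if (lower.contains(term)) {
                return true;
            }
        }
        return false;
    }
}
```

Passing the terms in through the constructor is what makes the redaction configurable: the caller, not the redactor, decides which header names count as sensitive.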
Rick Boyd f7bb5e02c5
Support profiling queries in Tracer (#90574)
This pull request adds the necessary support, and implementation, for profiling queries in the Tracer.

In order to use the APM Agent's inferred spans functionality, the active span's context has to be open in the current thread. This PR adds context-sensitive methods to the Tracer interface, implements them in APMTracer, and makes use of them in the private SearchService.executeQueryPhase(), which is on the stack for a lot of our most critical operations.
2022-10-04 08:45:16 -04:00
Yannick Welsch a00a85cd8f
Safeguard RegExp use against StackOverflowError (#84624)
Closes #82923
2022-09-14 05:36:39 +09:30
Nikola Grcevski fc819609a1
Add allocation deciders in createComponents (#89836)
With this change we add the allocation deciders
to createComponents, so that we can simplify their use in the
Autoscaling plugin and implement a reserved state handler
in the future.
2022-09-07 09:28:07 -04:00
Rory Hunter c541610fb5
Upgrade OpenTelemetry API and remove workaround (#89438)
Closes #89414. Remove the workaround from #89135 that addressed #89107,
and instead upgrade the OpenTelemetry API, which contains a fix for the
underlying issue.
2022-08-18 14:43:45 +01:00
David Turner c08111b5b7
Avoid expensive call to Span.fromContextOrNull(null) (#89135)
Workaround for #89107
2022-08-05 02:07:15 +09:30
Rory Hunter 512bfebc10
Provide tracing implementation using OpenTelemetry + APM agent (#88443)
Part of #84369. Implement the `Tracer` interface by providing a
module that uses OpenTelemetry, along with Elastic's APM
agent for Java.

See the file `TRACING.md` for background on the changes and the
reasoning for some of the implementation decisions.

The configuration mechanism is the most fiddly part of this PR. The
Security Manager permissions required by the APM Java agent make
it prohibitive to start an agent from within Elasticsearch
programmatically, so it must be configured when the ES JVM starts.
That means that the startup CLI needs to assemble the required JVM
options.

To complicate matters further, the APM agent needs a secret token
in order to ship traces to the APM server. We can't use Java system
properties to configure this, since otherwise the secret will be
readable to all code in Elasticsearch. It therefore has to be
configured in a dedicated config file. This in itself is awkward,
since we don't want to leave secrets in config files. Therefore,
we pull the APM secret token from the keystore, write it to a config
file, then delete the config file after ES starts.

There's a further issue with the config file. Any options we set
in the APM agent config file cannot later be reconfigured via system
properties, so we need to make sure that only "static" configuration
goes into the config file.

I generated most of the files under `qa/apm` using an APM test
utility (I can't remember which one now, unfortunately). The goal
is to set up a complete system so that traces can be captured in
APM server, and the results in Elasticsearch inspected.
2022-08-03 14:13:31 +01:00
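
The write-then-delete handling of the secret token described in #88443 can be sketched as follows. This is only an illustration: `TemporaryAgentConfigSketch`, the temp-file location, and the file contents are assumptions, and the sketch leaves out reading the token from the keystore and launching the JVM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: write a secret to a short-lived config file, then
// remove the file once the process that needed it has started.
final class TemporaryAgentConfigSketch {

    // Writes the token to a new temporary file and returns its path.
    static Path write(String secretToken) {
        try {
            Path file = Files.createTempFile("apm-agent-", ".properties");
            String contents = "secret_token=" + secretToken + "\n";
            Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the file so the secret does not linger on disk after startup.
    static void deleteAfterStartup(Path file) {
        try {
            Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only "static" settings would belong in such a file, since, as the commit message notes, options set there cannot later be reconfigured via system properties.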