-
Notifications
You must be signed in to change notification settings - Fork 23
add dynamically disabled instrumentation capability #422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
jackshirazi
merged 7 commits into
elastic:main
from
jackshirazi:dynamic-instrumentation
Nov 21, 2024
Merged
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
ec204e7
add dynamically disabled instrumentation capability
jackshirazi 6b67554
spotless
jackshirazi fb0779c
spotless
jackshirazi 8835435
fix access and concurrency for single-threaded checker
jackshirazi 80114b5
Update custom/src/main/java/co/elastic/otel/config/DynamicInstrumenta…
jackshirazi df90a7d
feedback changes: daemon thread and priority note/todo
jackshirazi a7e2244
spotless
jackshirazi File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
269 changes: 269 additions & 0 deletions
269
custom/src/main/java/co/elastic/otel/config/DynamicInstrumentation.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,269 @@ | ||
| /* | ||
| * Licensed to Elasticsearch B.V. under one or more contributor | ||
| * license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright | ||
| * ownership. Elasticsearch B.V. licenses this file to you under | ||
| * the Apache License, Version 2.0 (the "License"); you may | ||
| * not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package co.elastic.otel.config; | ||
|
|
||
| import io.opentelemetry.api.GlobalOpenTelemetry; | ||
| import io.opentelemetry.api.trace.Tracer; | ||
| import io.opentelemetry.api.trace.TracerProvider; | ||
| import io.opentelemetry.sdk.common.InstrumentationScopeInfo; | ||
| import io.opentelemetry.sdk.internal.ComponentRegistry; | ||
| import io.opentelemetry.sdk.internal.ScopeConfigurator; | ||
| import io.opentelemetry.sdk.trace.SdkTracerProvider; | ||
| import io.opentelemetry.sdk.trace.SdkTracerProviderBuilder; | ||
| import io.opentelemetry.sdk.trace.internal.TracerConfig; | ||
| import java.lang.reflect.Field; | ||
| import java.lang.reflect.InvocationTargetException; | ||
| import java.lang.reflect.Method; | ||
| import java.util.ArrayList; | ||
| import java.util.HashMap; | ||
| import java.util.HashSet; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Set; | ||
| import java.util.concurrent.ConcurrentHashMap; | ||
| import java.util.concurrent.ConcurrentMap; | ||
| import java.util.logging.Level; | ||
| import java.util.logging.Logger; | ||
|
|
||
| /** | ||
| * Notes: 1. The instrumentation can't have been disabled by configuration, eg using | ||
| * -Dotel.instrumentation.[name].enabled=false as in that case it is never initialized so can't be | ||
| * "re-enabled" 2. The specific instrumentation name is used, you can see these by setting this | ||
| * class logging level to j.u.l.Level.CONFIG 3. The disable/re-enable is eventually consistent, | ||
| * needing the application to pass a synchronization barrier to take effect - but for most | ||
| * applications these are very frequent | ||
| */ | ||
| public class DynamicInstrumentation { | ||
|
|
||
| private static final Logger logger = Logger.getLogger(DynamicInstrumentation.class.getName()); | ||
| public static final String INSTRUMENTATION_NAME_PREPEND = "io.opentelemetry."; | ||
| // note the option can't be an env because no OSes support changing envs while the program runs | ||
| public static final String INSTRUMENTATION_DISABLE_OPTION = | ||
| "elastic.otel.java.disable_instrumentations"; | ||
|
|
||
| private static Object getField(String fieldname, Object target) { | ||
| try { | ||
| Field field = target.getClass().getDeclaredField(fieldname); | ||
| field.setAccessible(true); | ||
| return field.get(target); | ||
| } catch (NoSuchFieldException | IllegalAccessException e) { | ||
| throw new IllegalStateException( | ||
| "Error getting " + fieldname + " from " + target.getClass(), e); | ||
| } | ||
| } | ||
|
|
||
| private static Object call(String methodname, Object target) { | ||
| try { | ||
| Method method = target.getClass().getDeclaredMethod(methodname); | ||
| method.setAccessible(true); | ||
| return method.invoke(target); | ||
| } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) { | ||
| throw new IllegalStateException( | ||
| "Error calling " + methodname + " on " + target.getClass(), e); | ||
| } | ||
| } | ||
|
|
||
| private static <T> Object call( | ||
| String methodname, Object target, T arg1, Class<? super T> arg1Class) { | ||
| try { | ||
| Method method = target.getClass().getDeclaredMethod(methodname, arg1Class); | ||
| method.setAccessible(true); | ||
| return method.invoke(target, arg1); | ||
| } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) { | ||
| throw new IllegalStateException( | ||
| "Error calling " + methodname + " on " + target.getClass() + "(" + arg1Class + ")", e); | ||
| } | ||
| } | ||
|
|
||
| // SdkTracerProviderBuilder.setTracerConfigurator(ScopeConfigurator<<TracerConfig>> configurator) | ||
| // here because it's not currently public | ||
| public static SdkTracerProviderBuilder setTracerConfigurator( | ||
| SdkTracerProviderBuilder sdkTracerProviderBuilder, | ||
| ScopeConfigurator<TracerConfig> configurator) { | ||
| call("setTracerConfigurator", sdkTracerProviderBuilder, configurator, ScopeConfigurator.class); | ||
| return sdkTracerProviderBuilder; | ||
| } | ||
|
|
||
| // SdkTracerProvider.getTracerConfig(InstrumentationScopeInfo instrumentationScopeInfo) | ||
| // here because it's not currently public | ||
| private static TracerConfig getTracerConfig( | ||
| SdkTracerProvider provider, InstrumentationScopeInfo instrumentationScopeInfo) { | ||
| return (TracerConfig) | ||
| call("getTracerConfig", provider, instrumentationScopeInfo, InstrumentationScopeInfo.class); | ||
| } | ||
|
|
||
| // SdkTracer.getInstrumentationScopeInfo() | ||
| // here because it's not currently public | ||
| private static InstrumentationScopeInfo getInstrumentationScopeInfo(Tracer sdkTracer) | ||
| throws NoSuchFieldException, IllegalAccessException { | ||
| return (InstrumentationScopeInfo) call("getInstrumentationScopeInfo", sdkTracer); | ||
| } | ||
|
|
||
| // Not an existing method | ||
| // SdkTracerProvider.updateTracerConfigurations() | ||
| // updates all tracers with the current SdkTracerProvider.tracerConfigurator | ||
| // Code implementation equivalent to | ||
| // this.tracerSdkComponentRegistry | ||
| // .getComponents() | ||
| // .forEach( | ||
| // sdkTracer -> | ||
| // sdkTracer.updateTracerConfig( | ||
| // getTracerConfig(sdkTracer.getInstrumentationScopeInfo()))); | ||
| // where SdkTracer.updateTracerConfig(TracerConfig tracerConfig) is equivalent to | ||
| // this.tracerEnabled = tracerConfig.isEnabled(); | ||
| private static void updateTracerConfigurations(TracerProvider provider) { | ||
| if (!(provider instanceof SdkTracerProvider)) { | ||
| provider = (TracerProvider) getField("delegate", provider); | ||
| } | ||
| ComponentRegistry<Tracer> tracerSdkComponentRegistry = | ||
| (ComponentRegistry<Tracer>) getField("tracerSdkComponentRegistry", provider); | ||
| SdkTracerProvider finalProvider = (SdkTracerProvider) provider; | ||
| final List<String> activatedTracers; | ||
| if (logger.isLoggable(Level.CONFIG)) { | ||
| activatedTracers = new ArrayList<>(); | ||
| } else { | ||
| activatedTracers = null; | ||
| } | ||
| tracerSdkComponentRegistry | ||
| .getComponents() | ||
| .forEach( | ||
| sdkTracer -> { | ||
| try { | ||
| InstrumentationScopeInfo instrumentationScopeInfo = | ||
| getInstrumentationScopeInfo(sdkTracer); | ||
| TracerConfig tConfig = getTracerConfig(finalProvider, instrumentationScopeInfo); | ||
| Field tracerEnabledField = sdkTracer.getClass().getDeclaredField("tracerEnabled"); | ||
| tracerEnabledField.setAccessible(true); | ||
| // Update is synced but the reader is NOT necessarily so this is eventual | ||
| // consistency, takes effect when the application passes a sync boundary | ||
| synchronized (sdkTracer) { | ||
| tracerEnabledField.set(sdkTracer, tConfig.isEnabled()); | ||
| } | ||
| if (logger.isLoggable(Level.CONFIG)) { | ||
| String name = instrumentationScopeInfo.getName(); | ||
| if (name.startsWith(INSTRUMENTATION_NAME_PREPEND)) { | ||
| name = name.substring(INSTRUMENTATION_NAME_PREPEND.length()); | ||
| } | ||
| activatedTracers.add(name); | ||
| activatedTracers.add(tConfig.isEnabled() ? "enabled" : "disabled"); | ||
| } | ||
| } catch (NoSuchFieldException | IllegalAccessException e) { | ||
| throw new RuntimeException(e); | ||
| } | ||
| }); | ||
| if (logger.isLoggable(Level.CONFIG)) { | ||
| logger.log(Level.CONFIG, "Activated Tracers: " + activatedTracers); | ||
| } | ||
| } | ||
|
|
||
| public static void reenableTracesFor(String instrumentationName) { | ||
| UpdatableConfigurator.INSTANCE.put( | ||
| InstrumentationScopeInfo.create(INSTRUMENTATION_NAME_PREPEND + instrumentationName), | ||
| TracerConfig.enabled()); | ||
| updateTracerConfigurations(GlobalOpenTelemetry.getTracerProvider()); | ||
| } | ||
|
|
||
| public static void disableTracesFor(String instrumentationName) { | ||
| UpdatableConfigurator.INSTANCE.put( | ||
| InstrumentationScopeInfo.create(INSTRUMENTATION_NAME_PREPEND + instrumentationName), | ||
| TracerConfig.disabled()); | ||
| updateTracerConfigurations(GlobalOpenTelemetry.getTracerProvider()); | ||
| } | ||
|
|
||
| public static class UpdatableConfigurator implements ScopeConfigurator<TracerConfig> { | ||
| public static final UpdatableConfigurator INSTANCE = new UpdatableConfigurator(); | ||
| private final ConcurrentMap<String, TracerConfig> map = new ConcurrentHashMap<>(); | ||
|
|
||
| private UpdatableConfigurator() {} | ||
|
|
||
| @Override | ||
| public TracerConfig apply(InstrumentationScopeInfo scopeInfo) { | ||
| return map.getOrDefault(scopeInfo.getName(), TracerConfig.defaultConfig()); | ||
| } | ||
|
|
||
| public void put(InstrumentationScopeInfo scope, TracerConfig tracerConfig) { | ||
| map.put(scope.getName(), tracerConfig); | ||
| } | ||
| } | ||
|
|
||
| static { | ||
| if ("true".equals(System.getProperty(INSTRUMENTATION_DISABLE_OPTION + ".checker")) | ||
| || "true".equals(System.getenv("ELASTIC_OTEL_JAVA_DISABLE_INSTRUMENTATIONS_CHECKER"))) { | ||
| Thread checker = new Thread(new OptionChecker(), "Elastic dynamic_instrumentation checker"); | ||
| checker.setDaemon(true); | ||
| checker.start(); | ||
| } | ||
| } | ||
|
|
||
| static class OptionChecker implements Runnable { | ||
| private Map<String, Boolean> alreadyDisabled = new HashMap<>(); | ||
JonasKunz marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| // Note that if the property and the API are both used to specify enablement | ||
| // for a particular instrument, and this thread is executing, the property | ||
| // will take priority if the instrument is in the property - by virtue of running | ||
| // more frequently; but won't if the instrument is removed from the property! | ||
| // TODO define priority of enablement by source of disabler | ||
| @Override | ||
| public void run() { | ||
| while (true) { | ||
| String disableList; | ||
| synchronized (this) { | ||
| disableList = System.getProperty(INSTRUMENTATION_DISABLE_OPTION); | ||
| } | ||
| if (disableList != null && !disableList.trim().isEmpty()) { | ||
| // some values in the disable_instrumentations list | ||
| Set<String> toBeEnabled = null; | ||
| if (!alreadyDisabled.isEmpty()) { | ||
| toBeEnabled = new HashSet<>(alreadyDisabled.keySet()); | ||
| } | ||
| for (String toBeDisabled : disableList.split(",")) { | ||
| toBeDisabled = toBeDisabled.trim(); | ||
| if (alreadyDisabled.containsKey(toBeDisabled)) { | ||
| // already disabled and keep it that way | ||
| if (toBeEnabled != null) { | ||
| toBeEnabled.remove(toBeDisabled); | ||
| } | ||
| } else { | ||
| DynamicInstrumentation.disableTracesFor(toBeDisabled); | ||
| alreadyDisabled.put(toBeDisabled, Boolean.TRUE); | ||
| } | ||
| } | ||
| if (toBeEnabled != null) { | ||
| for (String instrumentation : toBeEnabled) { | ||
| DynamicInstrumentation.reenableTracesFor(instrumentation); | ||
| alreadyDisabled.remove(instrumentation); | ||
| } | ||
| } | ||
| } else { | ||
| // empty list so anything currently disabled should be re-enabled | ||
| if (!alreadyDisabled.isEmpty()) { | ||
| for (String instrumentation : new HashSet<>(alreadyDisabled.keySet())) { | ||
| DynamicInstrumentation.reenableTracesFor(instrumentation); | ||
| alreadyDisabled.remove(instrumentation); | ||
| } | ||
| } | ||
| } | ||
| try { | ||
| Thread.sleep(1000L); | ||
| } catch (InterruptedException ignored) { | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
57 changes: 57 additions & 0 deletions
57
custom/src/test/java/co/elastic/otel/config/DynamicInstrumentationTest.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| /* | ||
| * Licensed to Elasticsearch B.V. under one or more contributor | ||
| * license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright | ||
| * ownership. Elasticsearch B.V. licenses this file to you under | ||
| * the Apache License, Version 2.0 (the "License"); you may | ||
| * not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package co.elastic.otel.config; | ||
|
|
||
| import static org.assertj.core.api.Assertions.assertThat; | ||
| import static org.assertj.core.api.Assertions.assertThatThrownBy; | ||
|
|
||
| import io.opentelemetry.sdk.common.InstrumentationScopeInfo; | ||
| import io.opentelemetry.sdk.internal.ScopeConfigurator; | ||
| import io.opentelemetry.sdk.trace.SdkTracerProvider; | ||
| import io.opentelemetry.sdk.trace.SdkTracerProviderBuilder; | ||
| import java.lang.reflect.Method; | ||
| import java.lang.reflect.Modifier; | ||
| import org.junit.jupiter.api.Test; | ||
|
|
||
| public class DynamicInstrumentationTest { | ||
| @Test | ||
| // Functional testing is in DynamicInstrumentationSmokeTest | ||
| // These tests are so that when the SDK implementation stops being | ||
| // experimental, we switch from reflection to actual method calls | ||
| public void checkForPublicImplementations() throws NoSuchMethodException, ClassNotFoundException { | ||
|
|
||
| Method method1 = | ||
| SdkTracerProviderBuilder.class.getDeclaredMethod( | ||
| "setTracerConfigurator", ScopeConfigurator.class); | ||
| assertThat(Modifier.toString(method1.getModifiers())).isNotEqualTo("public"); | ||
|
|
||
| Method method2 = | ||
| SdkTracerProvider.class.getDeclaredMethod( | ||
| "getTracerConfig", InstrumentationScopeInfo.class); | ||
| assertThat(Modifier.toString(method2.getModifiers())).isNotEqualTo("public"); | ||
|
|
||
| Class<?> sdkTracer = Class.forName("io.opentelemetry.sdk.trace.SdkTracer"); | ||
| Method method3 = sdkTracer.getDeclaredMethod("getInstrumentationScopeInfo"); | ||
| assertThat(Modifier.toString(method3.getModifiers())).isNotEqualTo("public"); | ||
|
|
||
| assertThatThrownBy( | ||
| () -> SdkTracerProvider.class.getDeclaredMethod("updateTracerConfigurations")) | ||
| .isInstanceOf(NoSuchMethodException.class); | ||
| } | ||
| } |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the tools we have available upstream, I can see why we need to do the config update this way, though I was wondering (not sure if it has been discussed) if instead of evaluating a field here, we could instead change that line by something like
if (!tracerConfig.isEnabled()) {instead? That way we might not need to keep a background thread alive constantly checking for config changes and applying them one by one by getting a new config and then resetting the tracerEnabled field as we could do the check reactively wheneverspanBuilderis called.I guess a downside might be that, depending on what's done in the
TracerConfig.isEnabled()impl, it might cause some delays, though a good implementation shouldn't take long to return.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the delay concerns are a blocker, I think another alternative might be to make
TracerConfigobservable and then make theSdkTracerimpl subscribe to changes in it, updating itstracerEnabledfield whenever TracerConfig notifies about a change.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not a blocker. It does not need to be immediate, it's fine for the instrumentation to be disabled but that not take effect for a while. Also, adding an observer-subscription model doesn't change the delay, because the delay is from
SdkTracer.tracerEnabledbeing non-volatile. callbacks wouldn't change anything because the state change needs to propagate across threads and only making it volatile would speed that. I originally had it volatile in the PR that made it non-final, but the maintainers decided to go with a non-volatile implementation, at least for nowThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The background thread checker is no on by default. I expect normal operation for something else - eg an OpAMP client - to do regular checking from central config (and maybe property and config file, to be decided) and update the instrumentation enablement from that. The background thread checker is there mainly so that testing can be done cleanly, though it is fully operational if someone wanted to use it in production
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I think I didn't properly explain what I meant by delay, I was more talking about the possibility that by changing this condition to always query the config by calling
TracerConfig.isEnabled()instead of evaluating a field, then it could cause a performance overhead to the instrumented app in case that the providedTracerConfigimpl was poorly created, for example, let's say aTracerConfigimpl reads a file every time itsisEnabledmethod is called, then that could block the span creation process for a little while, causing the host app performance to get affected every time it creates a span. On the other hand, if we decide to keep the field then I agree we should make it volatile, though I was more of the opinion of removing that field and instead evaluating the config on every span creation.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though it makes sense what you mentioned about the OpAMP client triggering the update process when it gets a new config. I guess with the alternatives I mentioned I was also trying to make the "update process" something that couldn't be triggered from the outside since I'm not sure if it'd be desirable for any part of the code that might have access to the Otel instance (including via the GlobalOpenTelemetry singleton) to make config changes. However, if it's only a matter of adding the
updateTracerConfigurations()method without params, but only as a sort of "notify changes" signal to get the tracer to re-evaluate its existing config provider, then it should be fine.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to open the TracerConfig to be implementable by the user, but the maintainers preferred to keep it as is, essentially a boolean. So atm it's not possible for it to be inefficient