Skip to content

Conversation

@rli
Copy link
Contributor

@rli rli commented Oct 17, 2024

The OpenTelemetry SDK can automatically attach the same trace ID to spans recorded under the same 'context'. This is somewhat challenging because the JetBrains platform frequently requires switching threads and coroutine scope contexts, which loses the ThreadLocal state needed to connect parent->child when spans are started.

This on its own is not too difficult, but becomes a significant ergonomic challenge when we also want to bring some type-safety to how span metadata is built. This PR allows spans generated in aws/aws-toolkit-common#899 to automatically perform context propagation.

See internal documentation for details. A contributor-friendly version of the doc will be written after we validate this pattern with some initial metrics.

License

I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions
Copy link

github-actions bot commented Oct 17, 2024

Qodana Community for JVM

8 new problems were found

Inspection name Severity Problems
Unused symbol 🔶 Warning 3
Unstable API Usage 🔶 Warning 1
Non-distinguishable logging calls ◽️ Notice 2
Function or property has platform type ◽️ Notice 1
Redundant 'requireNotNull' or 'checkNotNull' call ◽️ Notice 1

💡 Qodana analysis was run in the pull request mode: only the changed files were checked
☁️ View the detailed Qodana report

Contact Qodana team

Contact us at [email protected]

try {
val item = TraceRequestMarshaler.create(listOf(data))

httpPost(traceUrl, contentLength = item.binarySerializedSize.toLong(), contentType = ContentType.XProtobuf) {

Check warning

Code scanning / QDJVMC

Unstable API Usage Warning

'httpPost(java.lang.String, long, com.intellij.platform.util.http.ContentType, kotlin.jvm.functions.Function2,? extends java.lang.Object>, kotlin.coroutines.Continuation)' is marked unstable with @ApiStatus.Internal
} catch (e: CancellationException) {
throw e
} catch (e: ConnectException) {
thisLogger().warn("Cannot export (url=$traceUrl): ${e.message}")

Check notice

Code scanning / QDJVMC

Non-distinguishable logging calls Note

Similar log messages
} catch (e: CancellationException) {
throw e
} catch (e: ConnectException) {
thisLogger().warn("Cannot export (url=$traceUrl): ${e.message}")

Check notice

Code scanning / QDJVMC

Non-distinguishable logging calls Note

Similar log messages
import java.io.ByteArrayOutputStream
import java.net.ConnectException

private class BasicOtlpSpanProcessor(

Check warning

Code scanning / QDJVMC

Unused symbol Warning

Class "BasicOtlpSpanProcessor" is never used
}
}

private class SigV4OtlpSpanProcessor(

Check warning

Code scanning / QDJVMC

Unused symbol Warning

Class "SigV4OtlpSpanProcessor" is never used
}

companion object {
fun getSdk() = service<OTelService>().sdk

Check warning

Code scanning / QDJVMC

Unused symbol Warning

Function "getSdk" is never used
import com.intellij.platform.diagnostic.telemetry.helpers.use as ijUse
import com.intellij.platform.diagnostic.telemetry.helpers.useWithScope as ijUseWithScope

val AWS_PRODUCT_CONTEXT_KEY = ContextKey.named<AWSProduct>("pluginDescriptor")

Check notice

Code scanning / QDJVMC

Function or property has platform type Note

Declaration has type inferred from a platform call, which can lead to unchecked nullability issues. Specify type explicitly as nullable or non-nullable.
}
setParent(parent)
}
requireNotNull(parent)

Check notice

Code scanning / QDJVMC

Redundant 'requireNotNull' or 'checkNotNull' call Note

Redundant 'requireNotNull' call
@rli rli marked this pull request as ready for review October 28, 2024 20:53
@rli rli requested a review from a team as a code owner October 28, 2024 20:53
it.identity(creds.resolveIdentity().get())
it.request(httpRequest)
it.payload(payload)
it.putProperty(AwsV4HttpSigner.SERVICE_SIGNING_NAME, "osis")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this property name stand for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

open search ingestion service

.setResource(
Resource.create(
Attributes.builder()
.put(AttributeKey.stringKey("os.type"), SystemInfoRt.OS_NAME)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would we use this recorded information other than telemetry?
Should this have all the fields that we record for the telemetry publisher?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are common fields understood by OTLP processors

fun `context propagates from parent to child when child overrides context`() {
spanBuilder("tracer", "parentSpan").use {
// parent->child relationship is still maintained because Context.current() will return parent context
spanBuilder("anotherTracer", "childSpan").setParent(Context.current().with(AWS_PRODUCT_CONTEXT_KEY, AWSProduct.AMAZON_Q_FOR_VS_CODE)).use {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change this to AMAZON_Q_FOR_JETBRAINS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intentional so we know for sure it's working and not just picking up a default field somewhere

assertThat(parent.parentSpanContext.traceId).isEqualTo(TraceId.getInvalid())
assertThat(child.parentSpanContext.traceId).isEqualTo(parent.spanContext.traceId)
assertThat(parent.getAttribute(PLUGIN_ATTRIBUTE_KEY)).isNotEqualTo(child.getAttribute(PLUGIN_ATTRIBUTE_KEY))
assertThat(child.getAttribute(PLUGIN_ATTRIBUTE_KEY)).isEqualTo("Amazon Q For VS Code")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would also change to JB?

}

override fun onEnd(span: ReadableSpan) {
println(span)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this?

Copy link
Contributor Author

@rli rli Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helpful for debugging without a debugger but this one can be removed

} catch (e: CancellationException) {
throw e
} catch (e: ConnectException) {
thisLogger().warn("Cannot export (url=$traceUrl): ${e.message}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should this also be log.error?

abstract class AbstractSpanBuilder<
BuilderType : AbstractSpanBuilder<BuilderType, SpanType>,
SpanType : AbstractBaseSpan<SpanType>,
>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should the > be on the previous line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't get formatter to do it

parent = if (s is AbstractBaseSpan<*> && s.context != null) {
s.context.with(Span.fromContext(parent))
} else {
parent.with(AWS_PRODUCT_CONTEXT_KEY, resolvePluginName())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should resolvePluginName be under a try catch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returns null if fails

@rli rli merged commit 06181dd into main Nov 5, 2024
16 of 17 checks passed
@rli rli deleted the rli/otel branch November 5, 2024 00:48
rli added a commit to aws/aws-toolkit-common that referenced this pull request Nov 5, 2024
## Problem
At the moment, metrics in the JetBrains extensions are collected as
singular datapoints independent of each other. Modern observability
frameworks, as well as our long-term goals are to be able to correlate
these data flows over a single user flow. This requires the 'trace ID'
concept which is high-friction in the current model.

## Solution
Generate 'fluent builders' that wrap the OpenTelemetry SDK to allow us
to leverage the SDK to record trace/span IDs automatically, but still
expose a good amount of type safety based on the metric schema. This PR
references base classes proposed in
aws/aws-toolkit-jetbrains#4982.

## License

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
rli added a commit that referenced this pull request Nov 6, 2024
…5027)

* Depends on #4982
* Depends on aws/aws-toolkit-common#899
* Rename Unit -> MetricUnit directly in SDK codegen
* Use newer provider convention syntax for input/output in generator task
* expose `passive`/`unit`/`value` as special fields
* add smoke test for `validateRequiredAttributes`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants