chore(tracing): move meta/metrics type checking to the encoder #14982
base: 4.0-breaking-changes
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 251 ± 9 ms.
The average import time from base is: 265 ± 8 ms.
The import time difference between this PR and base is: -14.1 ± 0.4 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOs
Comparing candidate brettlangdon/encoding.skip_unsupported_types (0657610) with baseline 4.0-breaking-changes (b2cee41)

❌ Test Failures (2 suites)

❌ otelspan - 20/22
✅ add-event | Time: ✅ 40.151ms (SLO: <47.150ms 📉 -14.8%) vs baseline: -0.2% | Memory: ✅ 43.784MB (SLO: <47.000MB -6.8%) vs baseline: +4.9%
❌ add-metrics | Time: ✅ 316.569ms (SLO: <344.800ms -8.2%) vs baseline: -0.2% | Memory: ❌ 652.206MB (SLO: <630.000MB +3.5%) vs baseline: +4.7%
❌ add-tags | Time: ✅ 287.700ms (SLO: <314.000ms -8.4%) vs baseline: -0.2% | Memory: ❌ 654.021MB (SLO: <630.000MB +3.8%) vs baseline: +5.0%
✅ get-context | Time: ✅ 80.034ms (SLO: <92.350ms 📉 -13.3%) vs baseline: -0.3% | Memory: ✅ 39.637MB (SLO: <46.500MB 📉 -14.8%) vs baseline: +4.9%
✅ is-recording | Time: ✅ 37.914ms (SLO: <44.500ms 📉 -14.8%) vs baseline: -0.2% | Memory: ✅ 43.126MB (SLO: <47.500MB -9.2%) vs baseline: +4.9%
✅ record-exception | Time: ✅ 58.493ms (SLO: <67.650ms 📉 -13.5%) vs baseline: +0.9% | Memory: ✅ 39.831MB (SLO: <47.000MB 📉 -15.3%) vs baseline: +4.6%
✅ set-status | Time: ✅ 43.942ms (SLO: <50.400ms 📉 -12.8%) vs baseline: +0.4% | Memory: ✅ 43.130MB (SLO: <47.000MB -8.2%) vs baseline: +4.6%
✅ start | Time: ✅ 37.462ms (SLO: <43.450ms 📉 -13.8%) vs baseline: +0.5% | Memory: ✅ 43.178MB (SLO: <47.000MB -8.1%) vs baseline: +4.7%
✅ start-finish | Time: ✅ 82.099ms (SLO: <88.000ms -6.7%) vs baseline: +0.1% | Memory: ✅ 34.505MB (SLO: <46.500MB 📉 -25.8%) vs baseline: +4.7%
✅ start-finish-telemetry | Time: ✅ 83.414ms (SLO: <89.000ms -6.3%) vs baseline: -0.2% | Memory: ✅ 34.603MB (SLO: <46.500MB 📉 -25.6%) vs baseline: +5.0%
✅ update-name | Time: ✅ 39.063ms (SLO: <45.150ms 📉 -13.5%) vs baseline: +0.7% | Memory: ✅ 43.429MB (SLO: <47.000MB -7.6%) vs baseline: +4.7%

❌ telemetryaddmetric - 29/30
✅ 1-count-metric-1-times | Time: ✅ 2.968µs (SLO: <20.000µs 📉 -85.2%) vs baseline: +0.8% | Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.5%
✅ 1-count-metrics-100-times | Time: ✅ 209.354µs (SLO: <220.000µs -4.8%) vs baseline: +4.4% | Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.6%
✅ 1-distribution-metric-1-times | Time: ✅ 3.283µs (SLO: <20.000µs 📉 -83.6%) vs baseline: -1.0% | Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.6%
❌ 1-distribution-metrics-100-times | Time: ❌ 220.914µs (SLO: <220.000µs +0.4%) vs baseline: +2.5% | Memory: ✅ 31.968MB (SLO: <34.000MB -6.0%) vs baseline: +4.2%
✅ 1-gauge-metric-1-times | Time: ✅ 2.204µs (SLO: <20.000µs 📉 -89.0%) vs baseline: ~same | Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.6%
✅ 1-gauge-metrics-100-times | Time: ✅ 136.928µs (SLO: <150.000µs -8.7%) vs baseline: -0.3% | Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.6%
✅ 1-rate-metric-1-times | Time: ✅ 3.090µs (SLO: <20.000µs 📉 -84.5%) vs baseline: +0.2% | Memory: ✅ 32.027MB (SLO: <34.000MB -5.8%) vs baseline: +4.3%
✅ 1-rate-metrics-100-times | Time: ✅ 225.974µs (SLO: <250.000µs -9.6%) vs baseline: +5.1% | Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.6%
✅ 100-count-metrics-100-times | Time: ✅ 21.124ms (SLO: <22.000ms -4.0%) vs baseline: +4.2% | Memory: ✅ 32.067MB (SLO: <34.000MB -5.7%) vs baseline: +4.7%
✅ 100-distribution-metrics-100-times | Time: ✅ 2.299ms (SLO: <2.300ms 🟡 ~same) vs baseline: +2.5% | Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.4%
✅ 100-gauge-metrics-100-times | Time: ✅ 1.407ms (SLO: <1.550ms -9.2%) vs baseline: +0.5% | Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.8%
✅ 100-rate-metrics-100-times | Time: ✅ 2.286ms (SLO: <2.550ms 📉 -10.3%) vs baseline: +4.2% | Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.5%
✅ flush-1-metric | Time: ✅ 4.418µs (SLO: <20.000µs 📉 -77.9%) vs baseline: -1.6% | Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.7%
✅ flush-100-metrics | Time: ✅ 173.734µs (SLO: <250.000µs 📉 -30.5%) vs baseline: -1.0% | Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +4.8%
✅ flush-1000-metrics | Time: ✅ 2.114ms (SLO: <2.500ms 📉 -15.5%) vs baseline: -1.0% | Memory: ✅ 32.873MB (SLO: <34.500MB -4.7%) vs baseline: +5.0%

🟡 Near SLO Breach (3 suites)

🟡 djangosimple - 30/30
✅ appsec | Time: ✅ 20.509ms (SLO: <22.300ms -8.0%) vs baseline: +0.2% | Memory: ✅ 65.428MB (SLO: <67.000MB -2.3%) vs baseline: +4.9%
✅ exception-replay-enabled | Time: ✅ 1.346ms (SLO: <1.450ms -7.2%) vs baseline: +0.5% | Memory: ✅ 64.585MB (SLO: <67.000MB -3.6%) vs baseline: +4.8%
✅ iast | Time: ✅ 20.434ms (SLO: <22.250ms -8.2%) vs baseline: +0.2% | Memory: ✅ 65.476MB (SLO: <67.000MB -2.3%) vs baseline: +4.8%
✅ profiler | Time: ✅ 15.237ms (SLO: <16.550ms -7.9%) vs baseline: +0.1% | Memory: ✅ 53.733MB (SLO: <54.500MB 🟡 -1.4%) vs baseline: +4.7%
✅ resource-renaming | Time: ✅ 20.532ms (SLO: <21.750ms -5.6%) vs baseline: -0.2% | Memory: ✅ 65.512MB (SLO: <67.000MB -2.2%) vs baseline: +4.9%
✅ span-code-origin | Time: ✅ 25.412ms (SLO: <28.200ms -9.9%) vs baseline: +0.5% | Memory: ✅ 67.748MB (SLO: <69.500MB -2.5%) vs baseline: +4.6%
✅ tracer | Time: ✅ 20.454ms (SLO: <21.750ms -6.0%) vs baseline: -0.2% | Memory: ✅ 65.475MB (SLO: <67.000MB -2.3%) vs baseline: +4.8%
✅ tracer-and-profiler | Time: ✅ 22.035ms (SLO: <23.500ms -6.2%) vs baseline: -0.2% | Memory: ✅ 66.637MB (SLO: <67.500MB 🟡 -1.3%) vs baseline: +4.9%
✅ tracer-dont-create-db-spans | Time: ✅ 19.296ms (SLO: <21.500ms 📉 -10.3%) vs baseline: -0.4% | Memory: ✅ 65.495MB (SLO: <66.000MB 🟡 -0.8%) vs baseline: +4.9%
✅ tracer-minimal | Time: ✅ 16.585ms (SLO: <17.500ms -5.2%) vs baseline: -0.2% | Memory: ✅ 65.451MB (SLO: <66.000MB 🟡 -0.8%) vs baseline: +4.9%
✅ tracer-native | Time: ✅ 20.465ms (SLO: <21.750ms -5.9%) vs baseline: -0.1% | Memory: ✅ 71.450MB (SLO: <72.500MB 🟡 -1.4%) vs baseline: +4.8%
✅ tracer-no-caches | Time: ✅ 18.452ms (SLO: <19.650ms -6.1%) vs baseline: +0.3% | Memory: ✅ 65.425MB (SLO: <67.000MB -2.4%) vs baseline: +4.8%
✅ tracer-no-databases | Time: ✅ 18.712ms (SLO: <20.100ms -6.9%) vs baseline: -0.3% | Memory: ✅ 65.287MB (SLO: <67.000MB -2.6%) vs baseline: +4.6%
✅ tracer-no-middleware | Time: ✅ 20.127ms (SLO: <21.500ms -6.4%) vs baseline: -0.2% | Memory: ✅ 65.454MB (SLO: <67.000MB -2.3%) vs baseline: +4.7%
✅ tracer-no-templates | Time: ✅ 20.274ms (SLO: <22.000ms -7.8%) vs baseline: +0.1% | Memory: ✅ 65.441MB (SLO: <67.000MB -2.3%) vs baseline: +4.8%

🟡 errortrackingdjangosimple - 6/6
✅ errortracking-enabled-all | Time: ✅ 18.210ms (SLO: <19.850ms -8.3%) vs baseline: +1.4% | Memory: ✅ 65.294MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +5.0%
✅ errortracking-enabled-user | Time: ✅ 18.155ms (SLO: <19.400ms -6.4%) vs baseline: +0.7% | Memory: ✅ 65.274MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.8%
✅ tracer-enabled | Time: ✅ 18.067ms (SLO: <19.450ms -7.1%) vs baseline: ~same | Memory: ✅ 65.254MB (SLO: <66.500MB 🟡 -1.9%) vs baseline: +4.9%

🟡 flasksimple - 18/18
✅ appsec-get | Time: ✅ 4.591ms (SLO: <4.750ms -3.3%) vs baseline: ~same | Memory: ✅ 61.892MB (SLO: <65.000MB -4.8%) vs baseline: +4.8%
✅ appsec-post | Time: ✅ 6.612ms (SLO: <6.750ms -2.0%) vs baseline: +0.2% | Memory: ✅ 62.030MB (SLO: <65.000MB -4.6%) vs baseline: +4.9%
✅ appsec-telemetry | Time: ✅ 4.588ms (SLO: <4.750ms -3.4%) vs baseline: ~same | Memory: ✅ 61.991MB (SLO: <65.000MB -4.6%) vs baseline: +4.9%
✅ debugger | Time: ✅ 1.869ms (SLO: <2.000ms -6.5%) vs baseline: +0.2% | Memory: ✅ 45.377MB (SLO: <47.000MB -3.5%) vs baseline: +4.8%
✅ iast-get | Time: ✅ 1.859ms (SLO: <2.000ms -7.0%) vs baseline: -0.3% | Memory: ✅ 42.369MB (SLO: <49.000MB 📉 -13.5%) vs baseline: +4.9%
✅ profiler | Time: ✅ 1.909ms (SLO: <2.100ms -9.1%) vs baseline: ~same | Memory: ✅ 46.282MB (SLO: <47.000MB 🟡 -1.5%) vs baseline: +4.2%
✅ resource-renaming | Time: ✅ 3.379ms (SLO: <3.650ms -7.4%) vs baseline: ~same | Memory: ✅ 52.278MB (SLO: <53.500MB -2.3%) vs baseline: +4.9%
✅ tracer | Time: ✅ 3.373ms (SLO: <3.650ms -7.6%) vs baseline: +0.2% | Memory: ✅ 52.258MB (SLO: <53.500MB -2.3%) vs baseline: +4.8%
✅ tracer-native | Time: ✅ 3.361ms (SLO: <3.650ms -7.9%) vs baseline: +0.1% | Memory: ✅ 58.395MB (SLO: <60.000MB -2.7%) vs baseline: +5.1%
Description
The benefit of moving the type checking into the encoder is that we can do a first pass over meta/metrics to filter out bad values before we try to pack them. This way we can throw out individual bad values instead of throwing out the whole span/trace.
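A minimal sketch of that first pass, in plain Python rather than the encoder's actual code. The helper names `filter_meta` and `filter_metrics` are hypothetical, and the accepted types (string keys, string meta values, int/float metric values, bools rejected) are assumptions made for illustration only:

```python
from typing import Any, Dict, List, Tuple, Union

Number = Union[int, float]


def filter_meta(meta: Dict[Any, Any]) -> List[Tuple[str, str]]:
    """Keep only str -> str pairs; drop any other entry individually."""
    return [
        (k, v)
        for k, v in meta.items()
        if isinstance(k, str) and isinstance(v, str)
    ]


def filter_metrics(metrics: Dict[Any, Any]) -> List[Tuple[str, Number]]:
    """Keep only str -> int/float pairs; drop any other entry individually."""
    kept: List[Tuple[str, Number]] = []
    for k, v in metrics.items():
        # Assumption: bools are rejected even though bool subclasses int.
        if isinstance(k, str) and isinstance(v, (int, float)) and not isinstance(v, bool):
            kept.append((k, v))
    return kept
```

With a filter like this, a span carrying one unsupported value still gets encoded; only the offending key is lost, and the length of the filtered list gives the entry count needed before packing.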
Testing
Risks
This will have a negative performance impact for now, until #14943 can be merged.
Right now we do type checking in set_tag/set_metric. With this PR we will do the type checking both there and in the encoder. Once #14943 merges, we'll remove the type checking from set_tag/set_metric and only have it in the encoder. That should have the benefit of using the C-API for the type checking (faster), and we won't do type checking on spans that aren't going to get encoded.
Note: the end result could still be a tad slower on average, because we are keeping the for k, v in meta.items(): loop but then adding a for k, v in filtered_meta_list: loop, so the additional iteration could bite us. It could be worth trying to find a way to pack the meta/metrics without first knowing the end size.
Additional Notes
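One possible way to pack without knowing the final size up front, sketched in plain Python and not taken from this PR: always reserve a fixed-width msgpack map32 header, filter and pack in a single loop, then back-patch the entry count. The function names below are hypothetical, and only string keys and values are handled. The trade-off is that small maps pay 5 header bytes instead of 1.

```python
import struct
from typing import Any, Dict


def pack_str(buf: bytearray, s: str) -> None:
    """Append a msgpack-encoded string (fixstr, str8, str16 or str32)."""
    data = s.encode("utf-8")
    n = len(data)
    if n <= 31:
        buf.append(0xA0 | n)
    elif n <= 0xFF:
        buf += struct.pack(">BB", 0xD9, n)
    elif n <= 0xFFFF:
        buf += struct.pack(">BH", 0xDA, n)
    else:
        buf += struct.pack(">BI", 0xDB, n)
    buf += data


def pack_meta_single_pass(buf: bytearray, meta: Dict[Any, Any]) -> int:
    """Filter and pack str -> str pairs in one loop; return the count kept."""
    header_at = len(buf)
    # map32 marker (0xdf) followed by a 4-byte placeholder for the count.
    buf += b"\xdf\x00\x00\x00\x00"
    kept = 0
    for k, v in meta.items():
        if not (isinstance(k, str) and isinstance(v, str)):
            continue  # drop only this entry, not the whole map
        pack_str(buf, k)
        pack_str(buf, v)
        kept += 1
    # Back-patch the real entry count now that it is known.
    struct.pack_into(">I", buf, header_at + 1, kept)
    return kept
```

Usage: `buf = bytearray(); pack_meta_single_pass(buf, {"a": "b", "bad": object()})` leaves a one-entry map in `buf`. Whether the saved iteration outweighs the larger header would need to be measured against the benchmarks above.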