@@ -96,6 +96,11 @@ task, but not currently run per-platform, which means there is no way to find
out the status of clippy per platform without running it on that platform as a
developer.

+ ### Import `rustup-macros::{integration,unit}_test` into test modules
+
+ These test helpers add pre- and post-test logic to enable the use of tracing
+ inside tests, which can be helpful for tracking down behaviours in larger tests.
+
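+ A minimal sketch of how one of these helpers might be used, assuming they are
+ attribute macros applied in place of a bare `#[test]` (the test body here is
+ illustrative, not a real rustup test):
+
+ ```rust
+ #[cfg(test)]
+ mod tests {
+     // Hypothetical test: the macro wraps it with tracing setup and teardown,
+     // so `tracing` calls made during the test are captured and flushed.
+     #[rustup_macros::unit_test]
+     fn toolchain_name_round_trips() {
+         tracing::info!("starting check");
+         assert_eq!(2 + 2, 4);
+     }
+ }
+ ```
+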
## Version numbers

If you ever see a released version of rustup which has `::` in its version string
@@ -217,8 +222,8 @@ break our testing (like `RUSTUP_TOOLCHAIN`, `SHELL`, `ZDOTDIR`, `RUST_BACKTRACE`
But if you want to debug locally, you may need backtraces. `RUSTUP_BACKTRACE`
is used like `RUST_BACKTRACE` to enable backtraces of failed tests.

- **NOTE**: This is a backtrace for the test, not for any rustup process running
- in the test
+ **NOTE**: This is a backtrace for the test itself, not for any rustup
+ subprocess invoked by the test.

```bash
$ RUSTUP_BACKTRACE=1 cargo test --release --test cli-v1 -- remove_toolchain_then_add_again
@@ -273,3 +278,74 @@ test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 26 filtered out
error: test failed, to rerun pass '--test cli-v1'
```
+
+ ## Tracing
+
+ The feature "otel" can be used when building rustup to turn on OpenTelemetry
+ tracing with an OTLP gRPC exporter. This requires `protoc` to be installed; it
+ can be downloaded from GitHub or installed via a package manager.
+
+ The normal [OTLP environment
+ variables](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md)
+ can be used to customise its behaviour, but often the simplest thing is to just
+ run a Jaeger Docker container on the same host:
+
+ ```sh
+ docker run -d --name jaeger -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 -e COLLECTOR_OTLP_ENABLED=true -p 6831:6831/udp -p 6832:6832/udp -p 5778:5778 -p 16686:16686 -p 4317:4317 -p 4318:4318 -p 14250:14250 -p 14268:14268 -p 14269:14269 -p 9411:9411 jaegertracing/all-in-one:latest
+ ```
+
+ Then build rustup-init with tracing:
+
+ ```sh
+ cargo build --features=otel
+ ```
+
+ Run the operation you want to analyze:
+
+ ```sh
+ RUSTUP_FORCE_ARG0="rustup" ./target/debug/rustup-init show
+ ```
+
+ And [look in Jaeger for a trace](http://localhost:16686/search?service=rustup).
+
+ ### Tracing and tests
+
+ The custom macro `rustup_macros::test` adds a prelude and suffix to each test to
+ ensure that there is a tracing context set up, that the test function is a span,
+ and that the spans from the test are flushed. Build with `--features=otel` to
+ use this feature.
+
+ ### Adding instrumentation
+
+ The `otel` feature uses conditional compilation to add function instrumentation
+ only when enabled. Instrumenting a currently uninstrumented function is most
+ simply done like so:
+
+ ```rust
+ #[cfg_attr(feature = "otel", tracing::instrument(err, skip_all))]
+ ```
+
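+ For instance, applied to a fallible function (this example function is
+ hypothetical, not taken from rustup's codebase), the attribute compiles to
+ nothing without the `otel` feature and wraps the call in a span with it:
+
+ ```rust
+ // Illustrative sketch only.
+ #[cfg_attr(feature = "otel", tracing::instrument(err, skip_all))]
+ fn fetch_manifest(url: &str) -> anyhow::Result<String> {
+     // `err` records a returned Err in the span; `skip_all` keeps the
+     // span lightweight by not recording `url` as a field.
+     Ok(format!("manifest from {url}"))
+ }
+ ```
+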
+ `skip_all` is not required, but some core structs don't implement `Debug` yet,
+ and others have a lot of output in `Debug`: tracing adds some overhead, so
+ keeping spans lightweight can help avoid frequency bias in the results, where
+ functions that are called often and have parameters with large `Debug` output
+ show up as much slower than they really are.
+
+ Some good general heuristics:
+
+ - Do instrument slow blocking functions.
+ - Do instrument functions with many callers or that call many different things,
+   as these tend to help figure out the puzzle of what is happening.
+ - Default to not instrumenting thin shim functions (or at least, only instrument
+   them temporarily while figuring out the shape of a problem).
+ - Be wary of debug build timings: release optimisations make a huge difference,
+   though debug is a lot faster to iterate on. If something isn't a problem in
+   release, don't pay it too much heed in debug.
+
+ ### Caveats
+
+ Cross-thread propagation isn't connected yet. This will cause instrumentation in
+ a thread to make a new root span until it is fixed. If any Tokio runtime-related
+ code gets added in those threads this will also cause a panic. We have a couple
+ of threadpools in use today; if you need to instrument within that context, use
+ a thunk to propagate the Tokio runtime into those threads.
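+
+ Until that is fixed, a span can be carried across a thread boundary by hand; a
+ minimal sketch using only the `tracing` crate (function and message names are
+ illustrative):
+
+ ```rust
+ // Capture the current span and re-enter it on the worker thread so the
+ // work is attached to the parent trace rather than becoming a new root.
+ fn spawn_traced_worker() -> std::thread::JoinHandle<()> {
+     let span = tracing::Span::current();
+     std::thread::spawn(move || {
+         let _guard = span.enter();
+         tracing::info!("running inside the parent span");
+     })
+ }
+ ```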