[Tests][UserEvents] Add userevents functional runtime tests #122134
Pull request overview
This PR adds comprehensive end-to-end functional tests for the .NET user_events feature, which was introduced in PR #115265. The tests validate that .NET Runtime user events can be emitted via EventPipe and collected using Microsoft's record-trace tool from the one-collect project.
Key Changes
- New shared test infrastructure that orchestrates tracing via `record-trace`, manages tracee processes, and validates collected events
- Three test scenarios validating different user_events use cases: runtime events (basic), custom EventSource events (managedevent), and multi-threaded event emission (multithread)
- Build system integration to restore dependencies, copy executables, and handle Helix-specific requirements like permission restoration
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| `src/tests/tracing/userevents/common/UserEventsTestRunner.cs` | Shared test orchestration logic for launching record-trace, running tracee processes, and validating traces |
| `src/tests/tracing/userevents/common/UserEventsRequirements.cs` | Environment validation checking kernel version, glibc version, and user_events availability |
| `src/tests/tracing/userevents/common/userevents_common.csproj` | Common project configuration with dependencies and record-trace executable deployment |
| `src/tests/tracing/userevents/common/NuGet.config` | NuGet source configuration for Microsoft.OneCollect.RecordTrace package |
| `src/tests/tracing/userevents/basic/basic.cs` | Test scenario for runtime AllocationSampled events |
| `src/tests/tracing/userevents/basic/basic.csproj` | Project configuration for basic scenario |
| `src/tests/tracing/userevents/basic/basic.script` | record-trace script for collecting runtime events |
| `src/tests/tracing/userevents/managedevent/managedevent.cs` | Test scenario for custom EventSource events |
| `src/tests/tracing/userevents/managedevent/managedevent.csproj` | Project configuration for managedevent scenario |
| `src/tests/tracing/userevents/managedevent/managedevent.script` | record-trace script for collecting custom events |
| `src/tests/tracing/userevents/multithread/multithread.cs` | Test scenario for concurrent event emission across multiple threads |
| `src/tests/tracing/userevents/multithread/multithread.csproj` | Project configuration for multithread scenario |
| `src/tests/tracing/userevents/multithread/multithread.script` | record-trace script for collecting multi-threaded events |
| `src/tests/tracing/userevents/README.md` | Documentation explaining test structure and approach |
| `src/tests/build.proj` | Added userevents_common.csproj to restore projects list |
| `src/tests/Common/helixpublishwitharcade.proj` | Added logic to restore execute permissions for extra test executables on Helix |
| `eng/Versions.props` | Updated TraceEvent version and added MicrosoftOneCollectRecordTrace version |
/ba-g "The failure is this unrelated issue #118603"
With user_events support added in #115265, this PR adds tests for a few end-to-end user_events scenarios.
Alternative testing approaches considered
Existing EventPipe runtime tests
Existing EventPipe tests under `src/tests/tracing/eventpipe` are incompatible with testing the user_events scenario due to:
Starting EventPipeSessions through DiagnosticClient ❌
DiagnosticClient does not have the support to send the IPC command to start a user_events based EventPipe session, because it requires the user_events_data file descriptor to be sent using SCM_RIGHTS (see https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md#passing_file_descriptor).
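To illustrate the mechanism that DiagnosticClient lacks, here is a minimal sketch of passing an open file descriptor over a Unix domain socket with SCM_RIGHTS. It is written in Python for brevity and is not the diagnostics IPC protocol: the payload, the temporary file standing in for user_events_data, and the helper names `send_fd`, `recv_fd`, and `roundtrip_write_via_passed_fd` are all assumptions made for the example.

```python
import os
import socket
import tempfile


def send_fd(sock: socket.socket, fd: int, payload: bytes = b"fd") -> None:
    """Send `fd` as SCM_RIGHTS ancillary data next to a small placeholder payload."""
    socket.send_fds(sock, [payload], [fd])


def recv_fd(sock: socket.socket) -> int:
    """Receive one descriptor passed with SCM_RIGHTS and return the new local fd."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def roundtrip_write_via_passed_fd() -> bytes:
    """Pass a temp file's descriptor across a socket pair and write through the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # The temporary file is a stand-in for user_events_data (assumption).
    with sender, receiver, tempfile.TemporaryFile() as f:
        send_fd(sender, f.fileno())
        passed = recv_fd(receiver)
        try:
            # The received fd refers to the same open file as f.
            os.write(passed, b"hello")
        finally:
            os.close(passed)
        f.seek(0)
        return f.read()
```

The receiver gets a new descriptor number that refers to the same open file, which is why a plain byte-stream IPC command cannot carry it and ancillary data is required.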
Using an EventPipeEventSource to validate events streamed through EventPipe ❌
User_events based EventPipe sessions do not stream events. Instead, events are written to configured TraceFS tracepoints, and currently only RecordTrace from https://github.com/microsoft/one-collect/ is capable of generating `.nettrace` traces from tracepoint user_events.
Native EventPipe Unit Tests
There are Mono Native EventPipe tests under `src/mono/mono/eventpipe/test` that are not hooked up to CI. These unit tests are built by linking the shared EventPipe interface library against Mono's EventPipe runtime shims and using Mono's test runner. Moving these unit tests into the standard runtime test structure would need a larger investment: either migrate EventPipe from runtime shims to an OS PAL source shared by coreclr/nativeaot/mono (see #118874 (comment)), or build an EventPipe shared library specifically for the runtime test using a runtime-agnostic shim.
Because the existing Mono unit tests don't currently test IPC commands, and there is no existing runtime infrastructure to read events from tracepoints, testing the user_events scenario would require even more work on top of updating the Mono native EventPipe unit tests.
End-to-End Testing Added
A low-cost approach to testing .NET Runtime's user_events functionality leverages RecordTrace from https://github.com/microsoft/one-collect/, which is already capable of starting user_events based EventPipe sessions and generating `.nettrace` files. (Note: dotnet-trace wraps around RecordTrace)
This adds an external dependency, so a RecordTrace failure can fail the end-to-end test. However, user_events was initially added with the intent to depend on RecordTrace for the end-to-end scenario, and there is no other way to functionally test a user_events based EventPipe session.
Approach
Each scenario uses the same pattern:
Scenario invokes the shared test runner
User events scenarios can differ in their tracee logic, the events expected in the .nettrace, the record-trace script used to collect those events, and how long it takes for the tracee to emit them and for record-trace to resolve symbols and write the .nettrace. To handle this variance, UserEventsTestRunner lets each scenario pass in its scenario-specific record-trace script path, the path to its test assembly (used to spawn the tracee process), a validator that checks for the expected events from the tracee, and optional timeouts for both the tracee and record-trace to exit gracefully.
`UserEventsTestRunner` orchestrates tracing and validation
Using this configuration, UserEventsTestRunner first checks whether user events are supported. It then starts record-trace with the scenario’s script and launches the tracee process so it can emit events. After the run completes, the runner stops both the tracee and record-trace, opens the resulting .nettrace with EventPipeEventSource, and applies the scenario’s validator to confirm that the expected events were recorded. Finally, it returns an exit code indicating whether the scenario passed or failed.
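The flow described above can be sketched roughly as follows. This is an illustrative outline in Python, not the C# code in UserEventsTestRunner.cs: the `Scenario` fields, the injected callables, the default timeouts, the `PASS`/`FAIL` values, and the choice to treat an unsupported environment as a pass are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

PASS, FAIL = 0, 1  # placeholder exit codes, not the real test convention


@dataclass
class Scenario:
    """Per-scenario inputs handed to the shared runner (hypothetical shape)."""
    script_path: str                            # record-trace script
    tracee_path: str                            # test assembly used to spawn the tracee
    validator: Callable[[Iterable[str]], bool]  # checks the collected events
    tracee_timeout_s: float = 30.0              # assumed default
    tracer_timeout_s: float = 60.0              # assumed default


def run_scenario(scenario, is_supported, start_tracer, run_tracee,
                 stop_tracer, read_events) -> int:
    """Check support, trace the tracee, then validate the resulting trace."""
    if not is_supported():
        return PASS  # assumption: unsupported environments are skipped
    tracer = start_tracer(scenario.script_path)
    try:
        run_tracee(scenario.tracee_path, scenario.tracee_timeout_s)
    finally:
        # Always stop the tracer so the trace file is flushed, even on failure.
        trace_path = stop_tracer(tracer, scenario.tracer_timeout_s)
    events = read_events(trace_path)
    return PASS if scenario.validator(events) else FAIL
```

Injecting the process-handling steps as callables keeps the sketch free of any real record-trace invocation; a scenario only supplies its script, its tracee, and its validator.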
Dependencies:
Helix Nuances
UserEvents functional runtime tests differ from other runtime tests because they depend on OneCollect's Record-Trace tool to enable a userevents-based EventPipe session and to collect events. By design, Record-Trace requires elevated privileges, so these tests invoke a record-trace executable with sudo.
When tests run on Helix, test artifacts are stripped of their permissions, so the test infrastructure was modified to give record-trace execute permissions (helix-extra-executables.list). Moreover, to avoid having one copy of record-trace per scenario, which in turn requires re-adding execute permissions for each, more modifications were added to copy over a single record-trace executable that would be used by all scenarios (OutOfProcess marker).
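As a rough illustration of the permission restoration step, the sketch below re-adds execute bits to every file named in a list file. It is written in Python and is not the MSBuild logic in helixpublishwitharcade.proj: the list format (one relative path per line) and the function name `restore_execute_bits` are assumptions.

```python
import stat
from pathlib import Path


def restore_execute_bits(list_file: Path, root: Path) -> list[Path]:
    """Add user/group/other execute bits to each file named in `list_file`.

    Assumes one path per line, relative to `root`; blank lines are ignored.
    """
    restored = []
    for line in list_file.read_text().splitlines():
        rel = line.strip()
        if not rel:
            continue
        target = root / rel
        mode = target.stat().st_mode
        target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        restored.append(target)
    return restored
```

With a single shared record-trace executable, the list needs only one entry rather than one per scenario.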
Additionally, in Helix environments, TMPDIR is set to a Helix-specific temporary directory like `/datadisks/disk1/work/<id>/t`, and at this time record-trace only scans /tmp/ for the runtime's diagnostic ports. As a workaround, the tracee apps are spawned with TMPDIR set to /tmp.
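The TMPDIR workaround amounts to overriding one environment variable for the child process only. A minimal sketch in Python, where `tracee_environment` and `spawn_tracee` are hypothetical helpers and not names from the test runner:

```python
import os
import subprocess


def tracee_environment(base=None) -> dict:
    """Copy the parent environment and force TMPDIR to /tmp for the tracee."""
    env = dict(os.environ if base is None else base)
    env["TMPDIR"] = "/tmp"  # where record-trace looks for diagnostic ports
    return env


def spawn_tracee(argv: list) -> subprocess.CompletedProcess:
    """Run the tracee with the overridden TMPDIR; the parent process is unchanged."""
    return subprocess.run(argv, env=tracee_environment(),
                          capture_output=True, text=True, check=True)
```

Because only the child's environment is changed, the rest of the Helix work item keeps using its own temporary directory.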
Lastly, the job steps that run tests on AzDO prevent restoring individual runtime test projects. Because record-trace is currently only resolvable through the dotnet-diagnostics-tests source, userevents_common.csproj was added to the group of projects restored at the beginning of copying native test components, so that Microsoft.OneCollect.RecordTrace is restored.