[Diagnostics][dotnet-trace] Add collect-linux verb #47894

Draft: wants to merge 3 commits into base: main

@mdh1418 (Member) commented Aug 7, 2025

Summary

Add a new collect-linux verb that uses the new user_events-based event tracing.


Internal previews

| 📄 File | 🔗 Preview link |
|:--|:--|
| docs/core/diagnostics/dotnet-trace.md | dotnet-trace performance analysis utility |
| docs/core/diagnostics/eventsource-collect-and-view-traces.md | Collect and View EventSource Traces |

@mdh1418 requested review from tommcdon and a team as code owners August 7, 2025 20:53
@dotnetrepoman bot added this to the August 2025 milestone Aug 7, 2025
@mdh1418 marked this pull request as draft August 7, 2025 21:16
@mdh1418 force-pushed the diagnostics_dotnet_trace_collect_linux branch from bc293c0 to e1a3ec7 August 7, 2025 21:19
@@ -43,7 +43,7 @@ The `dotnet-trace` tool:
* Is a cross-platform .NET Core tool.
* Enables the collection of .NET Core traces of a running process without a native profiler.
* Is built on [`EventPipe`](./eventpipe.md) of the .NET Core runtime.
* Delivers the same experience on Windows, Linux, or macOS.
* On Linux, provides additional integration with kernel user_events for native tracing tool compatibility.
Review comment (Member):

We might want to tweak the wording a bit later, but not focusing on this for the moment :)

[--stopping-event-provider-name <stoppingEventProviderName>]
[--stopping-event-event-name <stoppingEventEventName>]
[--stopping-event-payload-filter <stoppingEventPayloadFilter>]
[--event-filters <list-of-comma-separated-event-filters>]
Review comment (Member):

I'd suggest we avoid adding this right now to keep the scope of changes smaller. I'm guessing it would be only rarely used. My understanding is that most dotnet-trace users are looking for simple configurations with modest excess events rather than complex configurations that capture the minimal possible set of events.

It certainly might be useful to return to this and add it later, but I'd rather not pile up too many new UI features now.

### Synopsis

```dotnetcli
dotnet-trace collect-linux
Review comment (Member):

I think we can prune away some of these options

  • buffersize - If OneCollect can reasonably pick the size, then we may not need an option to set it explicitly.
  • diagnostics-port - I think we'd only need this in some advanced scenarios. Given that we could be profiling multiple processes, maybe we'd even need multiple ports? I'd suggest we leave it out for now and wait to see what happens.
  • resume-runtime - If we don't have diagnostics-port, then we shouldn't need this.
  • event-filters - I'd suggest we leave this out for now to keep things simpler.
  • tracepoint-configs - Can we leave this out? If the user cares about the tracepoint names, that implies they are going to use some tool other than dotnet-trace to record the events. But if that is true, I'm not sure why they'd want dotnet-trace to be creating a nettrace file as well.


- **`--kernel-events <list-of-kernel-events>` (optional)**

A comma-separated list of kernel event categories to include in the trace. These events are automatically grouped into kernel-named tracepoints. Available categories include:
Review comment (Member):

> A comma-separated list of kernel event categories to include ...

Are the things being enumerated categories or individual events? They sound like individual events.


### Options

`dotnet-trace collect-linux` supports all the same options as [`dotnet-trace collect`](#dotnet-trace-collect), excluding `--dsrouter`, and additionally offers:
Review comment (Member):

If we wind up having both a set of omissions and a set of additions, I'm guessing it will be easier to describe the options explicitly rather than relative to the collect verb. I think copy-and-paste for options that are the same is completely fine.

[-n, --name <name>]
[--diagnostic-port]
[-o|--output <trace-file-path>]
[-p|--process-id <pid>]
Review comment (Member):

For the collect verb, it is required to specify one of process-id, name, command, or dsrouter. I'm guessing that requirement won't exist for collect-linux, and that not specifying any of them is equivalent to collecting all processes on the machine? Mentioning the options here is what I'd expect, but we'd need to describe the "collect all processes by default" behavior somewhere.

[--diagnostic-port]
[-o|--output <trace-file-path>]
[-p|--process-id <pid>]
[--profile <profile-name>]
Review comment (Member):

We'll need to decide what profiles are available and which events they each collect. I expect this is mostly the same as for the 'collect' verb, but cpu-sampling probably should be different. We also might want a thread-time profile that collects context switches.

If we do update these profiles, we should strongly consider also renaming the highly misleading "cpu-sampling" profile for the collect verb. Currently that profile collects thread-time information, not CPU samples.

# All enabled events from MyCustomProvider will be written to MyCustomProvider_custom_events
```

- **`--kernel-events <list-of-kernel-events>` (optional)**
Review comment (Member):

We should work with Beau to figure out what kinds of event names can be supported. For example, https://man7.org/linux/man-pages/man1/perf-record.1.html shows that a whole bunch of different things can be specified as an event, and I have no idea which part of that OneCollect handles.

I'm hoping we can at least support anything in /sys/kernel/tracing/available_events as well as the symbolic PMU events like cpu-cycles.
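
For reference while scoping this, a rough sketch of how to see which names /sys/kernel/tracing/available_events would cover on a given machine. The helper name `list_tracepoints` is hypothetical, the tracefs mount point is an assumption (older systems may use /sys/kernel/debug/tracing), and reading the file typically requires root. It says nothing about what OneCollect actually accepts.

```shell
# Hypothetical helper: list the tracepoints of one subsystem from the
# tracefs event list. Each line of the file has the form "subsystem:event".
# Assumption: tracefs is mounted at /sys/kernel/tracing; pass another
# path as the second argument if it is mounted elsewhere.
list_tracepoints() {
    subsystem="$1"
    events_file="${2:-/sys/kernel/tracing/available_events}"
    # Keep only lines that belong to the requested subsystem.
    grep "^${subsystem}:" "$events_file"
}

# Example (usually needs root): list_tracepoints sched
```

For the symbolic PMU names, `perf list hw` prints the hardware events that perf itself knows about (for example cpu-cycles); whether OneCollect accepts the same names is the open question above.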

| `net` | Network-related events | `net:netif_rx`, `net:net_dev_xmit` |
| `fs` | Filesystem I/O events | `ext4:*`, `vfs:*` |
| `mm` | Memory management events | `kmem:*`, `vmscan:*` |

Review comment (Member):

If someone specified "sched:sched_wakeup,sched:sched_switch", what provider name, event name, and field lists would we expect to show up in the nettrace file? (This level of detail may not go into the docs, but we should understand it ourselves to decide what info should go in the docs.)

- **perf**: Use `perf list user_events*` to see available events
- **System monitoring tools**: Any tool that can consume Linux tracepoints

### Examples
Review comment (Member):

Lower in this doc are some examples of using dotnet-trace collect in various situations. No need just yet, but we'd probably want to update or add to those examples for the collect-linux functionality once we've clarified what it will be.

2 participants