[Diangostics][dotnet-trace] Add collect-linux verb #47894
@@ -43,7 +43,7 @@ The `dotnet-trace` tool:
* Is a cross-platform .NET Core tool.
* Enables the collection of .NET Core traces of a running process without a native profiler.
* Is built on [`EventPipe`](./eventpipe.md) of the .NET Core runtime.
* Delivers the same experience on Windows, Linux, or macOS.
* On Linux, provides additional integration with kernel user_events for native tracing tool compatibility.
We might want to tweak the wording a bit later, but not focusing on this for the moment :)
[--stopping-event-provider-name <stoppingEventProviderName>]
[--stopping-event-event-name <stoppingEventEventName>]
[--stopping-event-payload-filter <stoppingEventPayloadFilter>]
[--event-filters <list-of-comma-separated-event-filters>]
I'd suggest we avoid adding this right now to keep the scope of changes smaller. I'm guessing it would be only rarely used. My understanding is that most dotnet-trace users are looking for simple configurations with modest excess events rather than complex configurations that capture the minimal possible set of events.
It certainly might be useful to return to this and add it later, but I'd rather not pile up too many new UI features now.
### Synopsis

```dotnetcli
dotnet-trace collect-linux
I think we can prune away some of these options
- buffersize - If OneCollect can reasonably pick the size then we may not need an option to explicitly set it.
- diagnostics-port - I think we'd only need this in some advanced scenarios. Given that we could be profiling multiple processes, maybe we'd even need multiple ports? I'd suggest we leave it out for now and wait to see what happens.
- resume-runtime - if we don't have diagnostics port then we shouldn't need this.
- event-filters - I'd suggest we leave this out for now to keep things simpler.
- tracepoint-configs - can we leave this out? If the user cares about the tracepoint names, that implies they are going to use some other tool besides dotnet-trace to record the events. But if that is true, I'm not sure why they'd want dotnet-trace to be creating a nettrace file as well.

- **`--kernel-events <list-of-kernel-events>` (optional)**

A comma-separated list of kernel event categories to include in the trace. These events are automatically grouped into kernel-named tracepoints. Available categories include:
"A comma-separated list of kernel event categories to include ..."
Are the things being enumerated categories, or individual events? They sound like individual events.

### Options

`dotnet-trace collect-linux` supports all the same options as [`dotnet-trace collect`](#dotnet-trace-collect), excluding `--dsrouter`, and additionally offers:
If we wind up having both a set of omissions and a set of additions I'm guessing it will be easier to describe the options explicitly rather than as a relative reference to the collect verb. I think copy-and-paste for options that are the same is completely fine.
[-n, --name <name>]
[--diagnostic-port]
[-o|--output <trace-file-path>]
[-p|--process-id <pid>]
For the collect verb it is required to specify either the process-id, name, command or dsrouter. I'm guessing that requirement won't exist for collect-linux and not specifying any of them is equivalent to collecting all processes on the machine? Mentioning the options here is what I'd expect, but we'd need to describe the "collect all processes by default" behavior somewhere.
[--diagnostic-port]
[-o|--output <trace-file-path>]
[-p|--process-id <pid>]
[--profile <profile-name>]
We'll need to decide what profiles are available and which events they each collect. I expect this is mostly the same as for the 'collect' verb, but cpu-sampling probably should be different. We also might want a thread-time profile that collects context switches.
If we do update these profiles, we should strongly consider also renaming the highly misleading "cpu-sampling" profile for the collect verb. Currently that profile collects thread-time information, not CPU samples.
# All enabled events from MyCustomProvider will be written to MyCustomProvider_custom_events
```

- **`--kernel-events <list-of-kernel-events>` (optional)**
We should work with Beau to figure out what kind of event names can be supported. For example https://man7.org/linux/man-pages/man1/perf-record.1.html shows a whole bunch of different things can be specified as an event and I have no idea what part of that OneCollect handles.
I'm hoping we can at least support anything in /sys/kernel/tracing/available_events as well as the symbolic PMU events like cpu-cycles.
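To make the `/sys/kernel/tracing/available_events` idea concrete, here is a minimal sketch of checking a requested event list against that file. It assumes tracefs is mounted at that path and is readable (often root only). The helper names `parse_available_events` and `split_supported` are hypothetical and are not part of dotnet-trace or OneCollect. Symbolic PMU events such as `cpu-cycles` are not listed in that file, so they show up as unsupported here and would need a separate check.

```python
from pathlib import Path

# tracefs location named in the comment above; the mount point may differ per system.
AVAILABLE_EVENTS = Path("/sys/kernel/tracing/available_events")


def parse_available_events(text):
    """Return the set of 'subsystem:event' names, one per non-empty line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def split_supported(requested, available):
    """Split a comma-separated event list into (supported, unsupported) lists."""
    names = [name.strip() for name in requested.split(",") if name.strip()]
    supported = [name for name in names if name in available]
    unsupported = [name for name in names if name not in available]
    return supported, unsupported


if __name__ == "__main__":
    try:
        text = AVAILABLE_EVENTS.read_text()
    except OSError as err:
        # Typical causes: tracefs not mounted, or insufficient privileges.
        print(f"cannot read {AVAILABLE_EVENTS}: {err}")
    else:
        available = parse_available_events(text)
        print(split_supported("sched:sched_switch,sched:sched_wakeup", available))
```

Keeping the parsing separate from the file read means the matching logic can be exercised without root access.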
| `net` | Network-related events | `net:netif_rx`, `net:net_dev_xmit` |
| `fs` | Filesystem I/O events | `ext4:*`, `vfs:*` |
| `mm` | Memory management events | `kmem:*`, `vmscan:*` |
If someone specified "sched:sched_wakeup,sched:sched_switch" what provider name, event name, and field lists would we expect to show up in the nettrace file? (This level of detail may not go into the docs, but we should understand it ourselves to decide what info should go in the docs)
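As a starting point for that discussion, the sketch below only splits such a specification into its `subsystem:event` parts. It does not settle how those parts map to provider names, event names, or field lists in the nettrace file, which is the open question. `KernelEvent` and `parse_kernel_events` are hypothetical names used for illustration.

```python
from typing import NamedTuple


class KernelEvent(NamedTuple):
    subsystem: str  # text before the colon, e.g. "sched"
    name: str       # text after the colon, e.g. "sched_wakeup"


def parse_kernel_events(spec):
    """Parse 'sys:event,sys:event' into KernelEvent tuples.

    Raises ValueError for entries that are not of the form 'subsystem:event'.
    """
    events = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        subsystem, sep, name = item.partition(":")
        if not sep or not subsystem or not name:
            raise ValueError(f"expected 'subsystem:event', got {item!r}")
        events.append(KernelEvent(subsystem, name))
    return events


if __name__ == "__main__":
    for event in parse_kernel_events("sched:sched_wakeup,sched:sched_switch"):
        print(event.subsystem, event.name)
```

One candidate mapping would use the subsystem as the provider name and the tracepoint as the event name, but that is an assumption to confirm, not documented behavior.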
- **perf**: Use `perf list user_events*` to see available events
- **System monitoring tools**: Any tool that can consume Linux tracepoints

### Examples
Lower in this doc are some examples of using dotnet-trace collect in various situations. No need just yet, but we'd probably want to update or add to those examples for the collect-linux functionality once we've clarified what it will be.
Summary
Add new `collect-linux` verb to utilize new user_events based event tracing.