-
-
Notifications
You must be signed in to change notification settings - Fork 550
Darwin: Add GPU monitoring support #1855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
darwin/Platform.c
Outdated
| goto cleanup; | ||
|
|
||
| #if __LP64__ | ||
| unsigned long long gpuTime = (unsigned long long) Hashtable_get(gps, (ht_key_t) pid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't do this. This makes code hard to maintain. You are trying to type-pun, aren't you, by treating the void* member as uintptr_t and storing an integer to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and given the only thing stored is an uint64_t it's much more efficient (in the long run) to use this storage directly instead of fragmenting the memory with thousands of malloc calls over time.
BUT, instead of polluting the code line with unsigned long long int abominations, just using uintptr_t here is much clearer and much friendlier on the eye. Cf. note in my review regarding some precautions related to this.
BenBE
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the + in GPU Time+? We should try to minimize active allocations here to avoid memory fragmentation where possible.
darwin/Platform.c
Outdated
| CFRelease(accumulatedGPUTimeRef); | ||
|
|
||
| gpuTime += accumulatedGPUTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| CFRelease(accumulatedGPUTimeRef); | |
| gpuTime += accumulatedGPUTime; | |
| gpuTime += accumulatedGPUTime; | |
| CFRelease(accumulatedGPUTimeRef); |
Re-order things to have resource management as last action in branch.
darwin/Platform.c
Outdated
| cleanup: | ||
| IOObjectRelease(entry); | ||
| } | ||
| IOObjectRelease(iterator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| cleanup: | |
| IOObjectRelease(entry); | |
| } | |
| IOObjectRelease(iterator); | |
| cleanup: | |
| IOObjectRelease(entry); | |
| } | |
| IOObjectRelease(iterator); |
Labels should go first column of the line. Also let's give the code some space.
darwin/Platform.c
Outdated
| if (!data) { | ||
| data = xCalloc(1, sizeof(gpuTime)); | ||
| Hashtable_put(gps, (ht_key_t) pid, data); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not handle this early when fetching the item from the hashtable. This will create some entries with zeros, but it makes the overall handling much simpler, as you can skip later NULL checks (knowing you have memory).
Another thing I kinda dislike here is the amount of malloc calls (memory fragmentation) this causes. But given we are talking ull here (why not write uint64_t instead?), why not drop this hole thing entirely and store the actual data into the pointers directly using uintptr_t? Given the unit of gpuTime this might be limiting on 32 bit systems, of which we do not have too many (~0) left that would also support this interface, I guess …
Given this only affects Darwin 10.13 and 10.14 in compat mode, let's put an assert(sizeof(uintptr_t) > sizeof(uint32_t)); in and be done on that front …
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BenBE No, I can think of a better idea. It is to remove this new Hashtable, and instead expand DarwinProcess for this GPU time information.
Consider that this new Hashtable also uses PID as the indexing key, the two tables can be merged to one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Explorer09 That make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be careful that you may encounter items where no process object has been stored yet. There are some items in the process struct that are only set upon first encountering the entry. Apart from this, the idea makes sense.
|
Please rebase things properly, cf. style guide for details as to why … |
|
I use GPU TIME+ to keep it consistent with TIME+, in terms of numerical formatting. However, considering that there is no "GPU TIME" without the plus sign, I can change it back. |
It's fine. I just wanted to know the origins. |
BenBE
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please squash/rebase your changes/commits. More details in the style guide. ;-)
darwin/Platform.c
Outdated
| const Table* table = &dpt->super.super; | ||
| DarwinProcess* dp = (DarwinProcess*) Hashtable_get(table->table, pid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ProcessTable_findProcess cast to DarwinProcess*, which is always valid on Darwin. Will return NULL if no such process exists.
Also avoids this extra variable only used once.
| io_struct_inband_t buffer; | ||
| uint32_t size = sizeof(buffer); | ||
| if (IORegistryEntryGetProperty(entry, "IOUserClientCreator", buffer, &size) != KERN_SUCCESS) | ||
| goto cleanup; | ||
|
|
||
| pid_t pid; | ||
| if (sscanf(buffer, "pid %d", &pid) != 1) | ||
| goto cleanup; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide some (short) inline documentation about what the contents of buffers are expected to be. Reference to corresponding documentation is fine; throw a copy to archive.org if possible in case the link breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I don't like the assumption that pid_t always has the same width as int. It's better to sscanf into an int-type buffer then cast to pid_t when using.
darwin/Platform.c
Outdated
| // Clear GPU time and percent for processes not found in the GPU process list | ||
| const Vector* rows = dpt->super.super.rows; | ||
| int table_size = Vector_size(rows); | ||
| for (int i = 0; i < table_size; ++i) { | ||
| DarwinProcess* dp = (DarwinProcess*) Vector_get(rows, i); | ||
| if (!dp->gpu_time_updated) { | ||
| dp->gpu_time = 0; | ||
| dp->gpu_percent = 0.0F; | ||
| } | ||
| IOObjectRelease(iterator); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not reset these all back in ProcessTable_goThroughEntries? They'll stay 0 if no update occured.
Keep the gpu_time_updated flag for rendering to shadow/darken the entry if no GPU time was used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot be reset, as the percentage requires calculating the difference from the last gpu time value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shatyuka If the gpu_time stored in DarwinProcess is cumulative, you shouldn't reset it here.
Also, you probably don't need the gpu_time_updated boolean flag.
My thought is this: First, loop over all DarwinProcess entries and set gpu_percent to zero. Then, loop on the accumulatedGPUTimeRef objects and update the gpu_time and gpu_percent with the respective processes.
The processes that did not have GPU time updated would keep the old gpu_time value but gpu_percent reset to zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Explorer09 Yes, I've considered that approach as well. My original intention was to handle situations where the GPU context is released by the process.
Keeping the GPU time is also a good solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A process can have multiple IOGPUDeviceUserClient entries, so it appears we must add an additional variable to accumulate values during iteration. I overlooked this when deleting the Hashtable, and the current implementation has some issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shatyuka What should happen if a process has multiple IOGPUDeviceUserClient entries? Are those values in the entries supposed to sum up or what?
Yes, sum up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is calculating the delta between the previous run and the current one supposed to work in situations where one of the contexts got removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current behavior is that when the total GPU time decreases, the GPU percentage calculation for this process is skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current behavior is that when the total GPU time decreases, the GPU percentage calculation for this process is skipped.
I agree with this behavior. When the cumulative GPU time decreases, we "update" our cached value to a lower one but show 0% for the percentage. The next measurement would compute the right percentage then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is acceptable, it's not fully correct technically.
Example:
Two components for one process specifying 1000 and 2000 time units were needed previously. Now on the next run one resource is freed and we are left with 2500 as the single usage value for the process. So while we were at 3000 previously and are now at 2500, we had a usage of (at least) 500, but will report 0.
linux/LinuxProcess.c
Outdated
| [SCHEDULERPOLICY] = { .name = "SCHEDULERPOLICY", .title = "SCHED ", .description = "Current scheduling policy of the process", .flags = PROCESS_FLAG_SCHEDPOL, }, | ||
| #endif | ||
| [TIME_GPU] = { .name = "GPU_TIME", .title = "GPU TIME+", .description = "Total GPU time", .flags = PROCESS_FLAG_LINUX_GPU, .defaultSortDesc = true, }, | ||
| [TIME_GPU] = { .name = "GPU_TIME", .title = "GPU TIME", .description = "Total GPU time", .flags = PROCESS_FLAG_LINUX_GPU, .defaultSortDesc = true, }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this is meant as an analog to TIME+ for indicating the total, including the + is fine.
linux/LinuxProcess.c
Outdated
| case AUTOGROUP_NICE: | ||
| return SPACESHIP_NUMBER(p1->autogroup_nice, p2->autogroup_nice); | ||
| case GPU_PERCENT: { | ||
| case PERCENT_GPU: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question. What motivated you to rename the field from GPU_PERCENT to PERCENT_GPU? (Similar question to GPU_TIME to TIME_GPU.)
For me, the renaming of the fields doesn't make them easier to read.
This is beside the compatibility problem I pointed out to you before, which you have fixed. I just wonder why this idea in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I noticed that the previous percentages were all named like PERCENT_CPU and PERCENT_MEM. Of course, due to compatibility issues, these modifications are now meaningless. Should I revert it back to GPU_PERCENT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I noticed that the previous percentages were all named like
PERCENT_CPUandPERCENT_MEM.
Oh I see. Thus it was an error when the GPU percentage fields were introduced in htop Linux monitoring.
Because htop had no requirement on whether the word PERCENT should be placed at the start or at the end.
If you grep -r TIME in the htop codebase, you would get what I mean: We had a lot of fields with "TIME" word at the end. STARTTIME, CPU_TOTAL_TIME, GUEST_TIME, etc.
Of course, due to compatibility issues, these modifications are now meaningless. Should I revert it back to GPU_PERCENT?
It's not my call. Maybe BenBE can decide this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given there are other fields like PERCENT_CPU etc. I actually prefer this change. Otherwise I'd already mention this change. Thus can stay as is rn. ;-)
Add gpu percentage and time monitoring to macOS.
The data is consistent with Activity Monitor, and the table header titles have been slightly adjusted.
Close: #1597
Related: #1632