Skip to content

Conversation

@shatyuka
Copy link
Contributor

@shatyuka shatyuka commented Jan 8, 2026

Add gpu percentage and time monitoring to macOS.
The data is consistent with Activity Monitor, and the table header titles have been slightly adjusted.

Close: #1597
Related: #1632

goto cleanup;

#if __LP64__
unsigned long long gpuTime = (unsigned long long) Hashtable_get(gps, (ht_key_t) pid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't do this. This makes code hard to maintain. You are trying to type-pun, aren't you, by treating the void* member as uintptr_t and storing an integer to it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and given the only thing stored is an uint64_t it's much more efficient (in the long run) to use this storage directly instead of fragmenting the memory with thousands of malloc calls over time.

BUT, instead of polluting the code line with unsigned long long int abominations, just using uintptr_t here is much clearer and much friendlier on the eye. Cf. note in my review regarding some precautions related to this.

@BenBE BenBE added the MacOS 🍏 MacOS / Darwin related issues label Jan 9, 2026
Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the + in GPU Time+? We should try to minimize active allocations here to avoid memory fragmentation where possible.

Comment on lines 861 to 863
CFRelease(accumulatedGPUTimeRef);

gpuTime += accumulatedGPUTime;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CFRelease(accumulatedGPUTimeRef);
gpuTime += accumulatedGPUTime;
gpuTime += accumulatedGPUTime;
CFRelease(accumulatedGPUTimeRef);

Re-order things to have resource management as last action in branch.

Comment on lines 876 to 879
cleanup:
IOObjectRelease(entry);
}
IOObjectRelease(iterator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cleanup:
IOObjectRelease(entry);
}
IOObjectRelease(iterator);
cleanup:
IOObjectRelease(entry);
}
IOObjectRelease(iterator);

Labels should go first column of the line. Also let's give the code some space.

Comment on lines 869 to 872
if (!data) {
data = xCalloc(1, sizeof(gpuTime));
Hashtable_put(gps, (ht_key_t) pid, data);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not handle this early when fetching the item from the hashtable. This will create some entries with zeros, but it makes the overall handling much simpler, as you can skip later NULL checks (knowing you have memory).

Another thing I kinda dislike here is the amount of malloc calls (memory fragmentation) this causes. But given we are talking ull here (why not write uint64_t instead?), why not drop this hole thing entirely and store the actual data into the pointers directly using uintptr_t? Given the unit of gpuTime this might be limiting on 32 bit systems, of which we do not have too many (~0) left that would also support this interface, I guess …

Given this only affects Darwin 10.13 and 10.14 in compat mode, let's put an assert(sizeof(uintptr_t) > sizeof(uint32_t)); in and be done on that front …

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenBE No, I can think of a better idea. It is to remove this new Hashtable, and instead expand DarwinProcess for this GPU time information.

Consider that this new Hashtable also uses PID as the indexing key, the two tables can be merged to one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Explorer09 That make sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful that you may encounter items where no process object has been stored yet. There are some items in the process struct that are only set upon first encountering the entry. Apart from this, the idea makes sense.

@BenBE
Copy link
Member

BenBE commented Jan 9, 2026

Please rebase things properly, cf. style guide for details as to why …

@shatyuka
Copy link
Contributor Author

I use GPU TIME+ to keep it consistent with TIME+, in terms of numerical formatting. However, considering that there is no "GPU TIME" without the plus sign, I can change it back.

@shatyuka shatyuka requested a review from BenBE January 10, 2026 08:53
@BenBE
Copy link
Member

BenBE commented Jan 11, 2026

I use GPU TIME+ to keep it consistent with TIME+, in terms of numerical formatting. However, considering that there is no "GPU TIME" without the plus sign, I can change it back.

It's fine. I just wanted to know the origins.

Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash/rebase your changes/commits. More details in the style guide. ;-)

Comment on lines 825 to 826
const Table* table = &dpt->super.super;
DarwinProcess* dp = (DarwinProcess*) Hashtable_get(table->table, pid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ProcessTable_findProcess cast to DarwinProcess*, which is always valid on Darwin. Will return NULL if no such process exists.

Also avoids this extra variable only used once.

Comment on lines 815 to 822
io_struct_inband_t buffer;
uint32_t size = sizeof(buffer);
if (IORegistryEntryGetProperty(entry, "IOUserClientCreator", buffer, &size) != KERN_SUCCESS)
goto cleanup;

pid_t pid;
if (sscanf(buffer, "pid %d", &pid) != 1)
goto cleanup;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide some (short) inline documentation about what the contents of buffers are expected to be. Reference to corresponding documentation is fine; throw a copy to archive.org if possible in case the link breaks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I don't like the assumption that pid_t always has the same width as int. It's better to sscanf into an int-type buffer then cast to pid_t when using.

Comment on lines 883 to 892
// Clear GPU time and percent for processes not found in the GPU process list
const Vector* rows = dpt->super.super.rows;
int table_size = Vector_size(rows);
for (int i = 0; i < table_size; ++i) {
DarwinProcess* dp = (DarwinProcess*) Vector_get(rows, i);
if (!dp->gpu_time_updated) {
dp->gpu_time = 0;
dp->gpu_percent = 0.0F;
}
IOObjectRelease(iterator);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reset these all back in ProcessTable_goThroughEntries? They'll stay 0 if no update occured.

Keep the gpu_time_updated flag for rendering to shadow/darken the entry if no GPU time was used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot be reset, as the percentage requires calculating the difference from the last gpu time value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shatyuka If the gpu_time stored in DarwinProcess is cumulative, you shouldn't reset it here.
Also, you probably don't need the gpu_time_updated boolean flag.

My thought is this: First, loop over all DarwinProcess entries and set gpu_percent to zero. Then, loop on the accumulatedGPUTimeRef objects and update the gpu_time and gpu_percent with the respective processes.

The processes that did not have GPU time updated would keep the old gpu_time value but gpu_percent reset to zero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Explorer09 Yes, I've considered that approach as well. My original intention was to handle situations where the GPU context is released by the process.
Keeping the GPU time is also a good solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A process can have multiple IOGPUDeviceUserClient entries, so it appears we must add an additional variable to accumulate values during iteration. I overlooked this when deleting the Hashtable, and the current implementation has some issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shatyuka What should happen if a process has multiple IOGPUDeviceUserClient entries? Are those values in the entries supposed to sum up or what?

Yes, sum up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is calculating the delta between the previous run and the current one supposed to work in situations where one of the contexts got removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behavior is that when the total GPU time decreases, the GPU percentage calculation for this process is skipped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behavior is that when the total GPU time decreases, the GPU percentage calculation for this process is skipped.

I agree with this behavior. When the cumulative GPU time decreases, we "update" our cached value to a lower one but show 0% for the percentage. The next measurement would compute the right percentage then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is acceptable, it's not fully correct technically.

Example:
Two components for one process specifying 1000 and 2000 time units were needed previously. Now on the next run one resource is freed and we are left with 2500 as the single usage value for the process. So while we were at 3000 previously and are now at 2500, we had a usage of (at least) 500, but will report 0.

[SCHEDULERPOLICY] = { .name = "SCHEDULERPOLICY", .title = "SCHED ", .description = "Current scheduling policy of the process", .flags = PROCESS_FLAG_SCHEDPOL, },
#endif
[TIME_GPU] = { .name = "GPU_TIME", .title = "GPU TIME+", .description = "Total GPU time", .flags = PROCESS_FLAG_LINUX_GPU, .defaultSortDesc = true, },
[TIME_GPU] = { .name = "GPU_TIME", .title = "GPU TIME", .description = "Total GPU time", .flags = PROCESS_FLAG_LINUX_GPU, .defaultSortDesc = true, },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is meant as an analog to TIME+ for indicating the total, including the + is fine.

case AUTOGROUP_NICE:
return SPACESHIP_NUMBER(p1->autogroup_nice, p2->autogroup_nice);
case GPU_PERCENT: {
case PERCENT_GPU: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question. What motivated you to rename the field from GPU_PERCENT to PERCENT_GPU? (Similar question to GPU_TIME to TIME_GPU.)

For me, the renaming of the fields doesn't make them easier to read.

This is beside the compatibility problem I pointed out to you before, which you have fixed. I just wonder why this idea in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I noticed that the previous percentages were all named like PERCENT_CPU and PERCENT_MEM. Of course, due to compatibility issues, these modifications are now meaningless. Should I revert it back to GPU_PERCENT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I noticed that the previous percentages were all named like PERCENT_CPU and PERCENT_MEM.

Oh I see. Thus it was an error when the GPU percentage fields were introduced in htop Linux monitoring.

Because htop had no requirement on whether the word PERCENT should be placed at the start or at the end.

If you grep -r TIME in the htop codebase, you would get what I mean: We had a lot of fields with "TIME" word at the end. STARTTIME, CPU_TOTAL_TIME, GUEST_TIME, etc.

Of course, due to compatibility issues, these modifications are now meaningless. Should I revert it back to GPU_PERCENT?

It's not my call. Maybe BenBE can decide this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given there are other fields like PERCENT_CPU etc. I actually prefer this change. Otherwise I'd already mention this change. Thus can stay as is rn. ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

MacOS 🍏 MacOS / Darwin related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GPU monitoring (percentage and time) in Darwin/macOS

3 participants