-
Notifications
You must be signed in to change notification settings - Fork 188
Affinity CPU thread affinity parameter to gvadetect/gvaclassify #719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
125a9b8
9dac795
23579ac
3397bdc
551037d
8a0b709
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -80,6 +80,8 @@ | |||||||||||||||||||
|
|
||||||||||||||||||||
| #define DEFAULT_SHARE_VADISPLAY_CTX TRUE | ||||||||||||||||||||
|
|
||||||||||||||||||||
| #define DEFAULT_CORE_PINNING nullptr | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
Comment on lines
+83
to
+84
|
||||||||||||||||||||
| #define DEFAULT_CORE_PINNING nullptr |
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New core-pinning property and parsing logic are introduced here, but there are existing GStreamer pipeline unit tests under tests/unit_tests/tests_gstgva/ that exercise gvadetect/gvaclassify properties. Please add coverage for core-pinning (valid list/range, invalid format, and default behavior) to prevent regressions (especially around parse errors and mask application).
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpu_set_to_bitmask iterates i < num_cores and does bitmask |= (1ULL << i). If the system has 64+ online cores, shifting by >=64 is undefined behavior and the 64-bit mask can’t represent those cores anyway. Cap the loop at 64 (or at least guard if (i < 64)), and document/handle truncation explicitly.
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file now uses sysconf, cpu_set_t/CPU_* macros, and pthread_getaffinity_np, but it doesn’t include the required headers (commonly <unistd.h>, <sched.h>, <pthread.h> under Linux, and <windows.h> under Windows for GetProcessAffinityMask). Add explicit includes under the existing _WIN32 guards to avoid relying on transitive includes and to prevent build breaks.
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _WIN32 branch of set_core_pinning_mask has an unterminated if (GetProcessAffinityMask(...)) { block: there’s no closing } before #endif, which will break Windows builds. Ensure all opened braces in the Windows branch are properly closed before the #endif and the function’s closing brace.
| } | |
| } | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about a possibility that we have a CPU with more than 64 cores?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates a full copy of the input string, then std::istringstream on line 636 copies it again. For a read-only parameter, this can be wasteful.
Since this is C++17 code (using std::from_chars would be better anyway), consider using std::string_view for the parameter:
void gva_base_inference_set_core_pinning(GvaBaseInference *base_inference,
const gchar *range_str) {
std::string_view sv(range_str);
// Then parse sv directly without copying
}Or better yet, pass through to a helper function that takes std::string_view to avoid C string → string_view conversion overhead in the common path. 📎
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gva_base_inference_set_core_pinning constructs std::string str(range_str); without checking for range_str == nullptr. Since the property default is nullptr, and users can also set the property to NULL via g_object_set, this can crash. Handle null/empty strings explicitly (e.g., treat as “no override” and keep the existing mask, or report a settings error).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whitespace trimming only handles spaces and tabs. Consider using a more robust approach that handles all whitespace characters (including newlines, carriage returns, etc.):
| part.erase(0, part.find_first_not_of(" \t")); | |
| // Trim whitespace | |
| auto is_space = [](unsigned char c) { return std::isspace(c); }; | |
| part.erase(0, std::find_if_not(part.begin(), part.end(), is_space) - part.begin()); | |
| part.erase(std::find_if_not(part.rbegin(), part.rend(), is_space).base(), part.end()); |
This handles all standard whitespace characters and is more maintainable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::stoi throws exceptions on invalid input, which is expensive for control flow. For performance-critical parsing, use C++17's std::from_chars which is:
- Exception-free (returns error code)
- Significantly faster (~10x in benchmarks)
- Doesn't allocate or use locale
#include <charconv>
int start, end;
auto [ptr1, ec1] = std::from_chars(part.data(), part.data() + dash_pos, start);
auto [ptr2, ec2] = std::from_chars(part.data() + dash_pos + 1, part.data() + part.size(), end);
if (ec1 == std::errc{} && ec2 == std::errc{}) {
// Valid parse
} else {
// Handle error without exception overhead
}Applies to lines 648, 649, and 657.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing bounds validation causes undefined behavior when core index >= 64. Shifting a 64-bit value by >= 64 bits is UB per C++ standard. Consider adding validation:
| for (int i = start; i <= end; i++) { | |
| // Set bits in the range | |
| for (int i = start; i <= end; i++) { | |
| if (i >= 0 && i < 64) { | |
| core_mask |= (1ULL << i); | |
| } else { | |
| GST_WARNING_OBJECT(base_inference, "Core index %d out of range [0-63], skipping", i); | |
| } | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as the range loop: missing bounds check before bit shift. Consider adding validation:
| core_mask |= (1ULL << std::stoi(part)); | |
| // Set single bit | |
| int core_id = std::stoi(part); | |
| if (core_id >= 0 && core_id < 64) { | |
| core_mask |= (1ULL << core_id); | |
| } else { | |
| GST_WARNING_OBJECT(base_inference, "Core index %d out of range [0-63], skipping", core_id); | |
| } |
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gva_base_inference_set_core_pinning builds a uint64_t mask using 1ULL << i where i comes from user input/ranges. Negative core IDs (e.g. "-1") or IDs >= 64 will cause undefined behavior when shifting. Validate parsed core IDs (and ranges) are within [0, 63] (and ideally within available CPU cores) before shifting/setting bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is outside the try/catch block. When parsing fails, core_mask remains 0 (from line 631 initialization) and gets assigned to core_pinning_mask, which silently disables all cores and overwrites any previously valid configuration.
Consider moving the assignment inside the try block after successful parsing to preserve the existing value on error:
| base_inference->core_pinning_mask = core_mask; | |
| } catch (const std::exception &e) { | |
| GST_ELEMENT_ERROR(base_inference, RESOURCE, SETTINGS, ("Invalid core-pinning format"), | |
| ("Failed to parse core-pinning property: %s", e.what())); | |
| return; // Preserve existing core_pinning_mask on error | |
| } | |
| // If string parsed without errors, update core pinning | |
| base_inference->core_pinning_mask = core_mask; |
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If parsing core-pinning fails, the catch logs an error but the function still overwrites core_pinning_mask with core_mask (which is 0 on error). That can later produce an empty cpu_set_t and make pthread_setaffinity_np fail. Only update core_pinning_mask when parsing succeeds; on error, keep the previous/default mask (or explicitly restore it).
Copilot
AI
Apr 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike several other property setters in this file (e.g., model, model-proc, labels) that check check_gva_base_inference_stopped() and warn/error when changed at runtime, core-pinning is applied unconditionally. Since affinity is only set during model creation, changing core-pinning while the element is running likely won’t take effect. Consider enforcing the same “only mutable while stopped” rule (or re-applying affinity when changed) to avoid a misleading runtime-configurable property.
| case PROP_CORE_PINNING: | |
| case PROP_CORE_PINNING: | |
| if (GST_STATE(base_inference) > GST_STATE_READY) { | |
| GST_WARNING_OBJECT(base_inference, | |
| "core-pinning property can only be changed while the element is stopped"); | |
| break; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getter outputs individual cores ("0,1,2,3") while the setter accepts range notation ("0-3"). This asymmetry means:
- Setting "0-3" then getting returns "0,1,2,3"
- Copy-pasting the getter output works, but loses the compact representation
Consider implementing range compaction in the getter to match the setter's format. For example:
- Consecutive cores 0,1,2,3 → "0-3"
- Non-consecutive 0,2,5,6,7 → "0,2,5-7"
This makes the property symmetric and more user-friendly for large core counts.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -834,6 +834,10 @@ InferenceImpl::Model InferenceImpl::CreateModel(GvaBaseInference *gva_base_infer | |
| ie_config[KEY_BASE]["frame-height"] = std::to_string(gva_base_inference->info->height); | ||
| } | ||
|
|
||
| // Set affinity mask that might set as the result of set_core_pinning_mask() call or | ||
| // set by the user directly via the core-pinning attribute. | ||
| SetAffinityMask(gva_base_inference->core_pinning_mask); | ||
|
|
||
| auto image_inference = ImageInference::createImageInferenceInstance( | ||
| memory_type, ie_config, allocator.get(), std::bind(&InferenceImpl::InferenceCompletionCallback, this, _1, _2), | ||
| std::bind(&InferenceImpl::PushFramesIfInferenceFailed, this, _1), std::move(va_dpy)); | ||
|
|
@@ -977,6 +981,39 @@ bool InferenceImpl::IsRoiSizeValid(const GstAnalyticsODMtd roi_meta) { | |
| return w > 1 && h > 1; | ||
| } | ||
|
|
||
| /** | ||
| * Pins current thread to the CPU core set as specified by affinity mask. | ||
| */ | ||
| void InferenceImpl::SetAffinityMask(uint64_t mask) { | ||
| GVA_INFO("Setting CPU affinity mask to 0x%lx\n", mask); | ||
| #ifndef _WIN32 | ||
| cpu_set_t cpuset; | ||
| CPU_ZERO(&cpuset); // Initialize to zero | ||
| int num_cores = sysconf(_SC_NPROCESSORS_ONLN); // Get number of available CPU cores | ||
| for (int core_id = 0; core_id < num_cores; ++core_id) { | ||
|
Comment on lines
+987
to
+993
|
||
| if (mask & (1ULL << core_id)) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here mask can have bits set beyond The input
This creates inconsistency: the user thinks they pinned to cores 0-63, but actually only pinned to 0-15. Please consider validating mask against actual core count: // After line 1016
if (mask >= (1ULL << num_cores)) {
GVA_WARNING("Affinity mask 0x%lx has bits set beyond available cores (%d), truncating\n", mask, num_cores);
mask &= ((1ULL << num_cores) - 1); // Clear invalid bits
} |
||
| CPU_SET(core_id, &cpuset); // Add the specific core | ||
| } | ||
| } | ||
|
Comment on lines
+992
to
+997
|
||
|
|
||
| pthread_t current_thread = pthread_self(); // Get current thread handle | ||
| int result = pthread_setaffinity_np(current_thread, sizeof(cpu_set_t), &cpuset); | ||
|
|
||
| if (result != 0) { | ||
| GVA_ERROR("Error pinning thread: (%d)\n", result); | ||
| } | ||
|
Comment on lines
+987
to
+1004
|
||
| #else | ||
| DWORD_PTR mask = static_cast<DWORD_PTR>(&mask); | ||
|
|
||
| HANDLE thread = GetCurrentThread(); // Get handle for the current thread | ||
| DWORD_PTR result = SetThreadAffinityMask(thread, mask); | ||
|
|
||
|
Comment on lines
+1006
to
+1010
|
||
| if (result == 0) { | ||
| GVA_ERROR("Error pinning thread: (%d)\n", GetLastError()); | ||
| } | ||
| #endif | ||
| } | ||
|
|
||
| /** | ||
| * Acquires output_frames_mutex with std::lock_guard. | ||
| */ | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic number 64 appears throughout the code. Consider defining it as a constant for maintainability:
Then use
MAX_CPU_CORESin all the bounds checks and getter loop instead of hardcoded 64.