Bunch of assorted perf improvements of hot path functions#8447
Bunch of assorted perf improvements of hot path functions#8447alexey-tikhonov merged 3 commits intoSSSD:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces several performance improvements for hot-path functions. The changes include using conditional debug logging to avoid unnecessary work, and adding caching to avoid repeated string conversions.
My review has identified two main issues:
- A potential portability bug in
sss_is_ascii_lowercasedue to comparing acharagainst0x7Fwithout casting tounsigned char. - A critical thread-safety issue in the new caching function
sss_get_lc_dom_name, which uses static variables without any locking, leading to race conditions in a multi-threaded environment.
Please address these points to ensure the code is both correct and robust.
064ac71 to
2a0c7c9
Compare
justin-stephenson
left a comment
There was a problem hiding this comment.
Ack, changes make sense to me. Thank you.
2a0c7c9 to
b8944ca
Compare
|
(amended commit message of one of patches) |
|
Note: Covscan is green. |
b8944ca to
0a3dcee
Compare
|
Argh... Processed a wrong PR :-/ |
0a3dcee to
b8944ca
Compare
src/util/usertools.c
Outdated
| } | ||
| } else { | ||
| cache_ctx = talloc_new(NULL); | ||
| if (!cache_ctx) return NULL; |
There was a problem hiding this comment.
Coding style: "Do not use !pointer"
| } else { | ||
| cache_ctx = talloc_new(NULL); | ||
| if (!cache_ctx) return NULL; | ||
| hret = hash_create(0, &lc_dom_name_cache, NULL, NULL); |
There was a problem hiding this comment.
Hi,
I wonder why there is no issue with the test running under valgrind. Since the hash table and it's entries are never freed shouldn't valgrind complain?
bye,
Sumit
There was a problem hiding this comment.
b8944ca to
8f8509f
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces several performance improvements for hot path functions. The changes include using DEBUG_CONDITIONAL to reduce logging overhead, optimizing string-to-lowercase conversion by checking for already-lowercase ASCII strings, and adding a cache for lowercased domain names. While the optimizations are valuable, the new caching mechanism in usertools.c is not thread-safe and could lead to race conditions. I've provided a critical comment with a suggested fix to address this.
|
Note: Covscan is green. |
sumit-bose
left a comment
There was a problem hiding this comment.
Hi,
thank you for the updates, I have no further comments, ACK.
bye,
Sumit
Both `perf` and manual measurement confirms ~6..8% perf gain in the test case: - INITGROUPS lookup for a user that is a member of 5k groups, no groups were cached; - debug_level = 3 - debug_microseconds = true Note `debug_microseconds = true` - without this setting impact isn't that dramatic. Reviewed-by: Justin Stephenson <jstephen@redhat.com> Reviewed-by: Sumit Bose <sbose@redhat.com>
In vast majority of cases strings are ascii and lowercase. In other cases overhead added should be negligible. Reviewed-by: Justin Stephenson <jstephen@redhat.com> Reviewed-by: Sumit Bose <sbose@redhat.com>
This helper is heavily used, including in hot paths. Since number of domains used is very limited, hash table used for caching should be very small and lookup much more efficient as compared with `sss_tc_utf8_str_tolower()` Assisted-by: Claude Code (Opus 4.6) Reviewed-by: Justin Stephenson <jstephen@redhat.com> Reviewed-by: Sumit Bose <sbose@redhat.com>
8f8509f to
3ff1964
Compare
Those patches are based on a profiling of a following test case:
tu1andtu2are members of 5k LDAP groups (RFC2307 case, no nested groups)time id tu1@ldap.test | tr ',' '\n' | wc -lis executed(Note you need to tweak both LDAP (limits) and SSSD (
timeout,client_idle_timeout) settings to make this req complete successfully).Combined, those patches reduce lookup time on my laptom from ~51+s to ~45-s (at least with
debug_microseconds = true)Note that much more fruitful optimizations are possible under
sdap_initgr_common_store(), including getting rid of O(N^2) loop oversdap_get_group_primary_name()insdap_add_incomplete_groups(), and some of those optimization will make patches in this PR kind of obsolete for this specific test case. But given that those helpers are heavily used across all code base, I think it's still worth consideration.