Skip to content

Conversation

@jiebinn
Copy link
Owner

@jiebinn jiebinn commented Oct 19, 2022

The mutex lock cost lots of cycles in t_blosc_do_job when the thread number is large. The updating to global values of protected part is not that heavy. It will save lots of cycles if we Replace the mutex lock with spin lock.

Apply the patch and test the c-blosc2/bench.
$ ./bench/b2bench blosclz noshuffle single 160 83886080 4

Score gain: 7.4%
CPU: ICX 8380 x 2 sockets
Core number: 160 threads

int rc2;

/* Initialize mutex and condition variable objects */
pthread_spin_init(&context->count_spin, PTHREAD_PROCESS_SHARED);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTHREAD_PROCESS_SHARED why we need this flag?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTHREAD_PROCESS_SHARED why we need this flag?

int pthread_spin_init(pthread_spinlock_t *lock, int pshared);

The pshared must be one of the following.
PTHREAD_PROCESS_PRIVATE/PTHREAD_PROCESS_SHARED
The 1st one has limited to the threads that are forked from the same process.
The 2nd one could be obtained by all the threads.

Reference Link: https://man7.org/linux/man-pages/man3/pthread_spin_init.3.html

blosc/context.h Outdated
pthread_t *threads;
struct thread_context *thread_contexts; /* only for user-managed threads */
pthread_spinlock_t count_spin;
pthread_mutex_t count_mutex;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if count_mutex is unused, we should remove it.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if count_mutex is unused, we should remove it.

Done.

@jiebinn jiebinn force-pushed the mitigate_lock_contention branch from bff61e0 to 5dffc7c Compare October 20, 2022 01:22
@jiebinn jiebinn requested a review from guowangy October 20, 2022 01:23
@jiebinn jiebinn force-pushed the mitigate_lock_contention branch from 5dffc7c to 51fda93 Compare October 20, 2022 01:30
The mutex lock cost lots of cycles in t_blosc_do_job when the thread
number is large. The updating to global values of protected part is
not that heavy. It will save lots of cycles if we Replace the mutex
lock with spin lock.

Apply the patch and test the c-blosc2/bench.
$ ./bench/b2bench blosclz noshuffle single 160 83886080 4

Score gain: 7.4%
CPU: ICX 8380 x 2 sockets
Core number: 160 threads
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants