Closed as not planned
Labels: extension-modules (C modules in the Modules dir), performance (Performance or resource usage), type-feature (A feature request or enhancement)
Description
Feature or enhancement
Proposal:
We can improve the performance of random.randbelow (and of methods that depend on it, such as random.shuffle) by adding a C implementation of _random._randbelow_with_getrandbits. A quick prototype, written as a straightforward port of the Python code, shows a speedup of a factor of 2 or better in many cases.
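For context, here is a minimal sketch of the rejection-sampling approach that the pure-Python method takes, as far as I understand it. It is not the C prototype. The free-function form and the names randbelow_with_getrandbits and expected_draws are illustrative assumptions, not part of the random module.

```python
import random


def randbelow_with_getrandbits(rng, n):
    """Return a random int in [0, n) by rejection sampling. Requires n > 0.

    Hypothetical free-function version of the method discussed in this issue.
    """
    k = n.bit_length()          # smallest k with n < 2**k
    r = rng.getrandbits(k)      # 0 <= r < 2**k
    while r >= n:               # reject values outside [0, n) and draw again
        r = rng.getrandbits(k)
    return r


def expected_draws(n):
    """Expected number of getrandbits calls per result (illustrative helper).

    Each draw is accepted with probability n / 2**k, so the number of draws
    is geometric with mean 2**k / n.
    """
    return 2 ** n.bit_length() / n


if __name__ == "__main__":
    rng = random.Random()
    for n in (4, 10, 123, 129, 314):
        value = randbelow_with_getrandbits(rng, n)
        print(f"n={n:<4} sample={value:<4} expected draws={expected_draws(n):.2f}")
```

The loop body is small, so per-call interpreter overhead is a large part of the cost, which is presumably why a direct C port helps. The expected number of draws depends on how close n is to the next power of two.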
Benchmark:
rng.getrandbits(8) 59.75599982775748 [ns]
rng._randbelow_with_getrandbits(4) 230.81 [ns]
rng._randbelow_with_getrandbits_c(4) 83.28 [ns]
rng._randbelow_with_getrandbits(10) 208.89 [ns]
rng._randbelow_with_getrandbits_c(10) 83.92 [ns]
rng._randbelow_with_getrandbits(123) 179.30 [ns]
rng._randbelow_with_getrandbits_c(123) 60.23 [ns]
rng._randbelow_with_getrandbits(129) 225.87 [ns]
rng._randbelow_with_getrandbits_c(129) 79.82 [ns]
rng._randbelow_with_getrandbits(314) 225.01 [ns]
rng._randbelow_with_getrandbits_c(314) 112.94 [ns]
rng._randbelow_with_getrandbits(10**20) 324.53 [ns]
rng._randbelow_with_getrandbits_c(10**20) 135.60 [ns]
rng._randbelow_with_getrandbits(10**250) 1233.00 [ns]
rng._randbelow_with_getrandbits_c(10**250) 1048.73 [ns]
shuffle(x) 442.07 [us]
shuffle_c(x) 204.40 [us]
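To make the comparison easier to read, the short script below computes the speedup ratio for each pair of timings. The numbers are copied from the benchmark output above. The dictionary layout and the speedup helper are only illustrative.

```python
# (pure Python time, C prototype time) copied from the benchmark output.
# randbelow timings are in ns, the shuffle timing is in us; only the ratio
# within each pair is used, so the differing units do not matter.
timings = {
    "randbelow(4)": (230.81, 83.28),
    "randbelow(10)": (208.89, 83.92),
    "randbelow(123)": (179.30, 60.23),
    "randbelow(129)": (225.87, 79.82),
    "randbelow(314)": (225.01, 112.94),
    "randbelow(10**20)": (324.53, 135.60),
    "randbelow(10**250)": (1233.00, 1048.73),
    "shuffle(x)": (442.07, 204.40),
}


def speedup(python_time, c_time):
    """Return how many times faster the C prototype is than pure Python."""
    return python_time / c_time


if __name__ == "__main__":
    for label, (py, c) in timings.items():
        print(f"{label:<20} {speedup(py, c):.2f}x")
```

With these figures the gain is roughly 2x to 3x for small and medium n and for shuffle, and much smaller for 10**250.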
Prototype: main...eendebakpt:randbelow_c
Benchmark script
import random
import _random
import timeit

rng = random.Random()

number = 100_000
dt = timeit.timeit('rng.getrandbits(8)', globals=globals(), number=number)
print(f'rng.getrandbits(8) {1e9*dt/number} [ns]')
print()

for a in [4, 10, 123, 129, 314, 10**20, 10**250]:
    # Use fewer repetitions for the very large bounds
    if a > 1e8:
        number = 100
    else:
        number = 100_000
    # Keep the large bounds readable in the statement and in the output
    astr = a
    if a == 10**250:
        astr = '10**250'
    if a == 10**20:
        astr = '10**20'
    dt = timeit.timeit(f'rng._randbelow_with_getrandbits({astr})', globals=globals(), number=number)
    print(f'rng._randbelow_with_getrandbits({astr}) {1e9*dt/number:.2f} [ns]')
    # The C variant only exists in a build that includes the prototype
    if hasattr(rng, '_randbelow_with_getrandbits_c'):
        dt = timeit.timeit(f'rng._randbelow_with_getrandbits_c({astr})', globals=globals(), number=number)
        print(f'rng._randbelow_with_getrandbits_c({astr}) {1e9*dt/number:.2f} [ns]')
    print()


def shuffle(x):
    """Shuffle list x in place, and return None."""
    randbelow = rng._randbelow_with_getrandbits
    for i in reversed(range(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = randbelow(i + 1)
        x[i], x[j] = x[j], x[i]


number = 1000
x = list(range(1600))
dt = timeit.timeit('shuffle(x)', globals=globals(), number=number)
print(f'shuffle(x) {1e6*dt/number:.2f} [us]')

if hasattr(rng, '_randbelow_with_getrandbits_c'):
    def shuffle_c(x):
        """Shuffle list x in place, and return None."""
        randbelow = rng._randbelow_with_getrandbits_c
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = randbelow(i + 1)
            x[i], x[j] = x[j], x[i]

    number = 1000
    x = list(range(1600))
    dt = timeit.timeit('shuffle_c(x)', globals=globals(), number=number)
    print(f'shuffle_c(x) {1e6*dt/number:.2f} [us]')
@rhettinger In your opinion, is the speed-up from the C implementation worth the additional complexity it adds? If so, I will clean up the code and open a PR.
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response