The power of coroutines in statistics #10147
rkompass started this conversation in Show and tell
Replies: 3 comments
-
That is a nice use of coroutines/generators. Another approach is to use closures:

def amin():
    s = None

    def inner(x=None):
        nonlocal s
        if x is not None:
            s = x if s is None else min(s, x)
        return s

    return inner

Demo:

mymin = amin()
for x in range(10, 2, -1):
    mymin(x)  # Returns running minimum if required
print(f"minimum is {mymin()}")
-
It occurred to me later that the use of closures can be generalised. This will work with any function of two variables:

def do_func(func):
    s = None

    def inner(x=None):
        nonlocal s
        if x is not None:
            s = x if s is None else func(s, x)
        return s

    return inner

Demo:

def demo():
    def add(x, y):
        return x + y

    mymin = do_func(min)  # Create a function for each operation
    mymax = do_func(max)
    mysum = do_func(add)
    for x in range(10, 2, -1):
        mymin(x)  # Returns running minimum
        mymax(x)
        mysum(x)
    print(f"minimum is {mymin()} max is {mymax()} sum is {mysum()}")  # Final values
-
Everything in one function:

def min_max_sum(values):
    iterator = iter(values)
    _min = _max = _sum = next(iterator)
    for value in iterator:
        _min = min(_min, value)
        _max = max(_max, value)
        _sum += value
    return _min, _max, _sum

def demo():
    result = min_max_sum(range(10, 2, -1))
    print("minimum is {} max is {} sum is {}".format(*result))

You can also easily convert it to a generator:

def min_max_sum_gen(values):
    iterator = iter(values)
    _min = _max = _sum = next(iterator)
    yield _min, _max, _sum
    for value in iterator:
        _min = min(_min, value)
        _max = max(_max, value)
        _sum += value
        yield _min, _max, _sum

def demo2():
    for row in min_max_sum_gen(range(10, 2, -1)):
        print("minimum is {} max is {} sum is {}".format(*row))
-
Hello, I want to share an approach I found when studying "Clean Code in Python" by Mariano Anaya.
Basically in his book on page 253 he suggests doing statistics with a stream of data retrieved from a file (CSV format).
As we want to compute the minimum, maximum and average, he suggests tripling the stream with itertools.tee and letting min(), max() and sum() each consume their own copy.
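A minimal sketch of such a tee-based processing function, assuming a stream of numeric values (the function name and the way the average is formed are illustrative, not the book's exact code):

import itertools

def process(stream):
    # Three independent iterators over the same stream; each reducing
    # step consumes its own copy in full, so tee has to buffer whatever
    # the slower iterators have not yet seen.
    s_min, s_max, s_sum = itertools.tee(stream, 3)
    total = count = 0
    for value in s_sum:        # sum and count in one pass, for the average
        total += value
        count += 1
    return min(s_min), max(s_max), total / count

Called on a generator over the parsed CSV values, this returns minimum, maximum and average in a single call.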
This looks elegant, but behind the curtain it is not: tee stores the whole sequence and returns independent iterators over it, which are then consumed by min(), max() and sum(). As a consequence, on my Pico only a data file of 1000 lines can be processed; 2000 lines already lead to a memory error.

The idea: why not program min, max and sum the other way round? We stuff data into them and get a result when desired.
This is achieved with coroutines:
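A sketch of what such an accumulator coroutine can look like, written as a generator-based coroutine that is primed with next() and fed with send(); the name running and the send(None) query convention are illustrative choices, not necessarily the original snippet:

def running(func):
    # Generator-based coroutine holding a running result:
    # send(x) folds x into the result with func and yields the new result,
    # send(None) yields the current result without changing it.
    result = None
    while True:
        value = yield result
        if value is not None:
            result = value if result is None else func(result, value)

For example:

rmin = running(min)
next(rmin)               # prime the coroutine up to its first yield
rmin.send(7)
rmin.send(3)
print(rmin.send(None))   # -> 3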
Now the processing function looks like:
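Again only a sketch, using the running coroutine from above with illustrative names: the processing function keeps nothing but the current value and three running results, so memory use stays flat however long the stream is.

def process(stream):
    rmin, rmax, rsum = running(min), running(max), running(lambda a, b: a + b)
    for coro in (rmin, rmax, rsum):
        next(coro)                        # prime each coroutine
    count = 0
    for value in stream:
        rmin.send(value)
        rmax.send(value)
        rsum.send(value)
        count += 1
    # send(None) only reads the current results
    return rmin.send(None), rmax.send(None), rsum.send(None) / count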
and I can process very long (arbitrarily long?) datasets. Speed is roughly the same as before, probably because of the slowness of tee in the original version. It takes 8 s to process 10000 data points this way.
If you like this approach, here is a coroutine for mean and standard deviation:
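A sketch of one way to write such a coroutine, using Welford's online algorithm; the use of the sample standard deviation and the send(None) query convention are assumptions, not necessarily the original snippet:

import math

def mean_std():
    # send(x) updates the running statistics,
    # send(None) yields the current (mean, standard deviation) unchanged.
    n = 0
    mean = 0.0
    m2 = 0.0                        # sum of squared deviations from the mean
    result = (None, None)
    while True:
        value = yield result
        if value is not None:
            n += 1
            delta = value - mean
            mean += delta / n
            m2 += delta * (value - mean)
            std = math.sqrt(m2 / (n - 1)) if n > 1 else 0.0
            result = (mean, std)

Usage:

ms = mean_std()
next(ms)                            # prime
for x in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
    ms.send(x)
print(ms.send(None))                # mean 5.0, sample std about 2.14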