Lessons from a SO question (race, gather, map, for...) #17

2colours · 2025-04-05T13:51:52Z

2colours
Apr 5, 2025
Collaborator

https://stackoverflow.com/questions/79550260/use-gather-take-with-race

This question is worth investigating; I also played around with the code snippets.

the "race for race" paradigm

This seemed redundant at first, but then I noticed that if I delete either `race` from `raku -e 'race for (^8).race(batch => 1, degree => 4) {sleep rand; .say}'`, the execution returns to sequential.
`for` serializing parallel sequences is documented behavior, but the code snippet given there doesn't really explain why `race for ^8` would seem sequential.
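
One way to check this without relying on output order is to record which thread ran each iteration. The following is a minimal sketch, not taken from the question or the docs; the `Channel`s and variable names are assumptions made for the illustration, and the thread counts will depend on the machine and Rakudo version.

```raku
# Plain `for` over a RaceSeq: record the thread that runs each iteration.
my $plain = Channel.new;
for (^8).race(batch => 1, degree => 4) {
    sleep 0.05;
    $plain.send: $*THREAD.id;   # a Channel is safe to send to from any thread
}
$plain.close;
my @plain-ids = $plain.list;

# `race for` over the same kind of RaceSeq.
my $raced = Channel.new;
race for (^8).race(batch => 1, degree => 4) {
    sleep 0.05;
    $raced.send: $*THREAD.id;
}
$raced.close;
my @raced-ids = $raced.list;

say "plain for: ", @plain-ids.unique.elems, " distinct thread(s)";
say "race for:  ", @raced-ids.unique.elems, " distinct thread(s)";
```

If the observation above holds, the first count should stay at 1 while the second can go above 1.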

This is where I noticed that the given snippet, claimed to be "around 3x faster than the bare `for`", is actually around 2.5x slower, and the case is even worse with `race`! When I checked the performance characteristics with the oldest Rakudo version downloadable via rakubrew, running `for ^100_000 { .say if .is-prime && ++$ %% 1000 }`, the difference wasn't as significant, but the bare `for` version was still faster (even though `$` indeed wasn't shared in the `race` version, which therefore didn't print anything at all). Interesting observations:

- when running without the state variable, logging all primes, it could be observed that the order is still de facto sequential with `race for` over a range - I'm yet to see an example where this is not the case!
- kudos to the developers: I'm not sure what they changed, but the time difference between Rakudo 2019.11 and 2025.03 is mind-blowing: 30/15 seconds (with/without `race`) versus 0.9/0.3 seconds on my setup!
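
For reference, this is how the anonymous state variable `$` behaves in a plain sequential loop; a minimal sketch with a made-up `@counts` array. The counter survives between iterations because every iteration runs the same closure, which is exactly the sharing that got lost in the `race` version.

```raku
my @counts;
for ^5 { @counts.push: ++$ }   # `$` is an anonymous state variable
say @counts;                   # [1 2 3 4 5]
```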

Anyway: `race for` over a range does seem to do something, but nothing useful so far: all examples were made slower by it, and the actual order didn't change. I could imagine this is the result of a bad batch/degree default, but I have no idea how to change it, so you are probably stuck with the "race for race" thing, which is, again, required because a plain `for` linearizes the data.

the `gather race` problem

This is actually explained by Nick Logan's answer. `gather`/`take` operates with control exceptions, and these control exceptions obviously(?) aren't shared across threads. This is very similar to the `++$` issue - seems clear and fair enough.
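
The single-threaded side of this is easy to demonstrate: `take` does not need to be written lexically inside the `gather` block, it only needs a `gather` somewhere up the call chain. A minimal sketch (the `give-twice` sub is a made-up name for the illustration):

```raku
# `take` looks for the enclosing `gather` dynamically, through the call stack.
sub give-twice($x) { take $x; take $x }

my @out = gather { give-twice($_) for 1..3 };
say @out;   # [1 1 2 2 3 3]
```

A block running on a worker thread has a different call stack, so, following the explanation above, its `take` presumably has no `gather` to reach.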

`gather/take` vs `map` vs `[do] for`

I can recall hearing this "urban legend" that `do for` does `map` under the hood - well, maybe there are cases where this can happen, but surely not in the general case. As we have seen, `for` "serializes" (whatever the word is) parallel data, and fortunately, `do for` is consistent with it. `map`, however, is a method well known for preserving the parallelism of its data.

Intermezzo: I was at first uncertain whether it's a good choice to refer to `map` as a method - after all, there is a subroutine version of it as well - but it turns out that this behavior only applies to the method version. This can be seen by this example.
I wanted to figure out how the sub form works because the source part is easy to find. The REPL immediately punished me: entering `(^5).race` makes the prompt never return (using Terminal::LineEditor, at least). You have to ctrl-c out. This seems to be the case for basically anything that produces a parallel data structure in the REPL - `sink` has to be called explicitly, or else funny things can happen, like... from the method... `IterationEnd` showing up. Anyway, I couldn't figure out more than that the `map` subroutine somehow degrades the `RaceSeq` to a bare `Seq`. UPDATE: it turns out the `+` slurpy itself already saw a `List` from the inside. I think this is the kind of "non-lazy lazy list", i.e. a list that doesn't know that it's lazy but hasn't generated its values yet. Either way: the wrong `.map` will be called.
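
The difference between the two forms can be checked by looking at the type of the result. This is a minimal sketch run as a script rather than in the REPL; the variable names are made up, and the comment on the sub form repeats the observation above rather than a documented guarantee.

```raku
# Method form: the RaceSeq's own .map is called, so the result stays parallel.
my $via-method = (^8).race(batch => 1, degree => 4).map({ $_ + 1 });
say $via-method.^name;   # RaceSeq

# Sub form: the values go through the sub's own parameter handling first.
my $via-sub = map({ $_ + 1 }, (^8).race(batch => 1, degree => 4));
say $via-sub.^name;      # reported above to be a plain Seq
```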

Now, onto the `race gather for` scenario. First, when I ran this snippet: `race gather for (^8).race(batch => 1, degree => 4) { say ++$; .take }` in the REPL, I got a different type of funky error ("a worker (...) died at: continuationreset requires a continuation or code handle"), almost expected by now. :D Anyway, as you can see, here I'm explicitly using the anonymous state variable in order to deduce whether the code ran on the same thread - and it did: if you remove the `gather`/`take` related bits, the number printed will always be 1; with `gather`/`take`, it goes all the way up to 8.
This basically means there is no `race gather for` - it's an application of `race` to the result of `gather for`. (`race` can be misleading - it is indeed a keyword, but it's more like a contextualizer than a control structure. If you do `race { say "RACE"; }`, after some waiting it will complain that the `.race` method cannot be called on a Boolean value - the return type of `say`... maybe the REPL didn't hang previously, it just didn't time out yet? I don't know...)
If you check what `gather for` returns - that's a usual `Seq`, probably the "non-lazy lazy" type, but in any case a sequential data type.
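
Both properties, the `Seq` type and the on-demand production of values, can be seen in a small script. A minimal sketch with made-up names; the `note` calls go to STDERR and only show when each value is actually produced.

```raku
# Storing the result in a scalar does not run the loop body yet.
my $s = gather for ^3 { note "producing $_"; .take };

say $s.^name;                      # Seq
say "nothing has been produced so far";
say $s.head(2);                    # values are only produced on demand
```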

Now we are at the point where we can make some conclusions about these data processing utilities:

- `for` (without the `race`) "linearizes" data (I wouldn't be surprised if it used `.list` for that but I don't have any evidence)
- `.map` is actually lower level - it has overrides for different data types and will "do the right thing" for the given data type
- `gather`/`take` is also sequential by nature - it is a more stateful, customizable way to produce data on a single thread

In all cases, single-threaded data will default to Seqs - "non-lazy lazy" style, at least in the usual case. (I can imagine that for certain data, it can be "admittedly lazy" as well.)

(This is where I also noticed that --target=ast often takes a concerningly long time - much longer than running the script, even. Not sure if this is related but soon after I saw that my WSL died.)

2colours · 2025-04-09T12:11:31Z

2colours
Apr 9, 2025
Collaborator Author

Update: Timo Paulssen posted a huge answer on this one - I'm about to read it, maybe it gives more insight.

3 replies

2colours Apr 9, 2025
Collaborator Author

First note:

> Just `hyper for` or `race for` do not guarantee that something is hypered, but a `.hyper` or `.race` call on something will.

This is a very strange statement that I think deserves some backing. Just think about it: as we saw with state variables, for example, the behavior of a code block can be completely different depending on whether it executes in a shared context or completely independently.

2colours Apr 9, 2025
Collaborator Author

Second note: `.map({ slip do gather { take "$_ ..."; sleep rand; take "... $_" } })` is an interesting paradigm :)
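
A minimal sequential sketch of what this paradigm does, with the `sleep` and the parallelism left out so the result is deterministic (the `@out` name is made up): each input produces several values through its own `gather`, and `slip` merges them into the output of `map`.

```raku
my @out = (1..3).map({ slip do gather { take "$_ ..."; take "... $_" } });

say @out.elems;   # 6
.say for @out;    # "1 ...", "... 1", "2 ...", "... 2", "3 ...", "... 3"
```

Each `gather` and its `take`s live inside a single call of the `map` block, which is presumably why this keeps working when the `map` is raced.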

2colours Apr 9, 2025
Collaborator Author

Third note:

There is an example with channels. I'm still a bit confused by `react` and `supply` blocks, but even more confused by this line:

> We have to start a "worker" process if we want to actually immediately see values as they arrive.

Why, and how do we know this is the case? I thought the only problem would be that `react` is blocking (until `done` is called).

I would also be curious about the alternatives provided: what runs on which thread, and what the thread lifecycles are here.
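
To have something concrete to experiment with, here is a minimal sketch of the general shape, not the code from the answer: a producer started with `start` feeds a `Channel`, while a blocking `react` consumes it. All names are made up for the illustration.

```raku
my $channel = Channel.new;

# The producer runs on a thread pool thread, so values can arrive
# while the main thread is blocked inside `react`.
my $producer = start {
    for ^3 { $channel.send: $_; sleep 0.1 }
    $channel.close;              # closing the channel ends the `whenever`
}

my @seen;
react {
    whenever $channel { @seen.push: $_ }
}
await $producer;

say @seen;   # [0 1 2]
```

Here `react` returns without an explicit `done` because the only `whenever` finishes when the channel is closed.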
