Replace multiprocessing with threading #10

Open
coleaeason wants to merge 7 commits into main from cole/threading

@coleaeason coleaeason commented Feb 17, 2026

Problem

On macOS, calling fork() without a subsequent exec() can cause segfaults in system libraries because they are not thread safe. The segfaults occur because the memory locations copied from the parent into the child process via fork() are no longer valid.

One suggested fix is to use the spawn start method instead of fork, but that only works if all code can be pickled, which isn't always the case. The other option, which is implemented here, is to use threading instead.
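To illustrate the pickling constraint mentioned above, here is a minimal sketch. The names `make_task`, `is_picklable`, and `run_in_thread` are hypothetical and not part of this PR; the sketch only assumes that a spawned process must receive its target through `pickle`, while a thread uses the callable in place.

```python
import pickle
import threading


def make_task(host):
    # A closure defined inside another function: pickle cannot look it up
    # by name, so it cannot be serialized for a spawned process.
    def task():
        return f"salted {host}"
    return task


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def run_in_thread(func):
    # Threads share the parent's memory, so the callable is used directly
    # and never has to be serialized.
    results = []
    thread = threading.Thread(target=lambda: results.append(func()))
    thread.start()
    thread.join()
    return results[0]


task = make_task("web1")
print(is_picklable(task))
print(run_in_thread(task))
```

The closure cannot be pickled, yet it runs unchanged inside a thread, which is the trade-off that motivates the threading approach here.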

Note: I implemented thread-local versions of the env and other shared variables to replicate the current behavior. I know we internally have some desire to share globals across execution threads for parallel tasks, but I didn't want to let that scope creep into this PR, so we can address it later with a more thought-out design.
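As a rough sketch of what "thread-local" means for a shared settings object, the example below uses a bare `threading.local()` as a stand-in for `env`. The attribute name `host_string` and the `worker` function are assumptions for illustration only. Note that a `threading.local` starts empty in every new thread, so any parent values a worker needs would have to be passed in explicitly.

```python
import threading

# Hypothetical stand-in for a shared settings object such as `env`.
env = threading.local()

# Makes both workers assign before either one reads, so the isolation is visible.
barrier = threading.Barrier(2)


def worker(host, seen):
    # Each thread sets its own value; assignments made in one thread are
    # invisible to the others.
    env.host_string = host
    barrier.wait()
    seen[host] = env.host_string


seen = {}
threads = [
    threading.Thread(target=worker, args=(host, seen))
    for host in ("web1", "web2")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# The main thread never assigned host_string, so it has no such attribute.
main_has_value = hasattr(env, "host_string")
print(seen, main_has_value)
```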

This should also have some added benefits when run on free-threaded (ft) Python.

Fixes https://github.com/Expensify/Expensify/issues/601176
Fixes https://github.com/Expensify/Expensify/issues/591584

@coleaeason coleaeason self-assigned this Feb 17, 2026
@coleaeason coleaeason changed the title from "Cole/threading" to "Replace multiprocessing with threading" Feb 17, 2026
@coleaeason coleaeason marked this pull request as ready for review February 17, 2026 22:24
@coleaeason
Author

@rafecolton as far as I can tell, this works perfectly fine. I tested locally with saltfab: salting servers, etc., in parallel and in serial all works.

Member

@rafecolton rafecolton left a comment

Surprisingly simple change! My biggest concerns are around eating exit codes and setting .daemon = True on all threads.

- # a multiprocessing.Queue.
- def __init__(self, message=None, wrapped=None):
+ # Must allow for calling with zero args/kwargs for consistency.
+ def __init__(self, message, wrapped):
Member

The signature change seems extraneous, so unless it's needed, I'd rather see it reverted to reduce the diff and the chance of bugs. It seems reasonable to update the comment, though I would like to see it be a little more informative (i.e., consistency with what?).

Author

Updating the comment without changing the code doesn't really make sense. The only reason this exists is as a hack for pickling to work in a multiprocessing queue, which we removed entirely. I can clarify the comment.

Member

👍 I'm inclined to leave the code as it was then, but if you're confident this works, then we can move forward with an updated comment
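For context on the pickling hack discussed in this thread, here is a small sketch of why default arguments can matter when an exception crosses a process boundary. The classes `RequiredArgsError` and `DefaultArgsError` are hypothetical stand-ins, not the class from the diff. When an exception that overrides `__init__` is constructed with keyword arguments only, its `args` tuple is empty, so unpickling calls the class with zero arguments. A `queue.Queue` shared between threads hands over the same object and never pickles it.

```python
import pickle
import queue


class RequiredArgsError(Exception):
    # Hypothetical exception whose constructor has no defaults.
    def __init__(self, message, wrapped):
        self.message = message
        self.wrapped = wrapped


class DefaultArgsError(Exception):
    # Same exception, but callable with zero arguments.
    def __init__(self, message=None, wrapped=None):
        self.message = message
        self.wrapped = wrapped


def survives_pickle(error):
    """Return True if error can be pickled and restored with its message."""
    try:
        restored = pickle.loads(pickle.dumps(error))
    except TypeError:
        # Unpickling called the class with no arguments and __init__ refused.
        return False
    return restored.message == error.message


# A thread-safe queue passes references, so no pickling is involved.
results = queue.Queue()
error = RequiredArgsError(message="connection refused", wrapped=None)
results.put(error)
same_object = results.get() is error

print(survives_pickle(DefaultArgsError(message="x", wrapped=None)))
print(survives_pickle(RequiredArgsError(message="x", wrapped=None)))
print(same_object)
```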

Comment on lines +182 to +185
if isinstance(datum['result'], BaseException):
    results[datum['name']]['exit_code'] = 1
else:
    results[datum['name']]['exit_code'] = 0
Member

This looks like it could eat exit codes, which I don't love. Is there any way to retain the exit codes if a command fails on the server?
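One possible way to keep a more specific exit code is sketched below, under the assumption that the failing code raises an exception that carries the status. `CommandFailed`, its `exit_code` attribute, and `exit_code_for` are hypothetical names for illustration; whether the real exceptions in this codebase expose such a value is not confirmed by the diff.

```python
class CommandFailed(Exception):
    # Hypothetical exception that remembers the remote command's exit status.
    def __init__(self, message, exit_code):
        super().__init__(message)
        self.exit_code = exit_code


def exit_code_for(result):
    """Map a task result (value or exception) to a process-style exit code."""
    if not isinstance(result, BaseException):
        return 0
    if isinstance(result, SystemExit):
        # SystemExit keeps its status in .code, which may be None, an int,
        # or an arbitrary object such as a message string.
        if result.code is None:
            return 0
        return result.code if isinstance(result.code, int) else 1
    # Fall back to 1 when the exception carries no usable status.
    code = getattr(result, "exit_code", 1)
    return code if isinstance(code, int) else 1


print(exit_code_for("ok"))
print(exit_code_for(CommandFailed("command not found", 127)))
print(exit_code_for(ValueError("bad input")))
```

With a helper like this, the two-branch assignment in the diff could become a single call, while unknown exceptions still map to 1.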


def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    dict.__setattr__(self, '_tl', threading.local())
Member

Prefer unabbreviated variable names. This whole class seems a bit over-engineered - e.g. why do you need the _local() function as opposed to just naming the variable _local? And why use this syntax vs just self._thread_local = threading.local()?

Author

I looked at this originally. I'm wondering if it's needed at all. I know Claude made it to create support for non-local versions of the variables, but I don't know that that is actually needed... I will re-review.

'env': local_env,
}
- p = multiprocessing.Process(target=_parallel_wrap, kwargs=kwarg_dict)
+ p = threading.Thread(target=_parallel_wrap, kwargs=kwarg_dict)
Member

The name p is no longer applicable; consider something more relevant (and unabbreviated) like thread.

}
- p = multiprocessing.Process(target=_parallel_wrap, kwargs=kwarg_dict)
+ p = threading.Thread(target=_parallel_wrap, kwargs=kwarg_dict)
+ p.daemon = True
Member

Hm, I think this can result in background threads sticking around even if the main process is killed, is that right? Are you sure we want this?

Author

Interesting point, I'll think about this a little more.

Member

I think the main process waits for all the threads to finish, so I don't think there is a concern about it killing threads prematurely. If somebody ^C's the main process, we actively want it to kill threads.
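The behavior described in this thread can be sketched as follows. In Python, a daemon thread does not keep the interpreter alive: once only daemon threads remain, the program exits and those threads are stopped abruptly. An explicit `join()` still waits for a daemon thread to finish. The `work` function and host names below are assumptions for illustration, not code from this PR.

```python
import threading
import time


def work(name, results):
    # Stand-in for a task that takes a little while to complete.
    time.sleep(0.05)
    results[name] = "done"


results = {}
threads = []
for name in ("web1", "web2"):
    thread = threading.Thread(target=work, args=(name, results))
    # Daemon threads are abandoned if the main thread exits (e.g. on Ctrl-C).
    thread.daemon = True
    thread.start()
    threads.append(thread)

# Joining blocks until each worker finishes, daemon or not, so results are
# complete as long as the main thread reaches this point.
for thread in threads:
    thread.join()

print(results)
```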

super().__init__(*args, **kwargs)
# Use dict.__setattr__ to avoid triggering _AttributeDict's __setattr__
# which would create a key in the dict.
dict.__setattr__(self, '_tl', threading.local())
Member

prefer unabbreviated variable name
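The code comment in the snippet above can be illustrated with a simplified sketch. `AttributeDict` here is a hypothetical stand-in that assumes `_AttributeDict` maps attribute assignment onto dict keys; the real class may differ. Under that assumption, a plain `self._thread_local = ...` would create a dict key, while `dict.__setattr__` stores a normal instance attribute.

```python
import threading


class AttributeDict(dict):
    # Simplified stand-in: attribute access is redirected to dict keys.
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value


class ThreadLocalAttributeDict(AttributeDict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Bypass AttributeDict.__setattr__ so the helper is stored as a real
        # instance attribute instead of becoming a dict key.
        dict.__setattr__(self, "_thread_local", threading.local())


plain = AttributeDict()
plain._thread_local = threading.local()  # goes through __setattr__

wrapped = ThreadLocalAttributeDict(user="deploy")

print("_thread_local" in plain)
print("_thread_local" in wrapped)
print(list(wrapped))
```

The bypass keeps internal bookkeeping out of the mapping, so iterating over the dict or copying it only sees real settings.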

return ret


class _ThreadLocalAttributeDict(_AttributeDict):
Member

This seems somewhat redundant with _ThreadLocalHostConnectionCache? Not sure if they can be combined somehow

Author

Only sort of. These thread local wrappers are wrapped around the previously defined classes, which are different from each other. The wrappers look similar because they are just making each of the classes thread local-able.

Member

👍 It would be great to have something less redundant, but if that's not feasible for this round, then please consider the same comments as above: unabbreviated variable names, and confirming whether all the functions the AI added are necessary.
