Conversation
|
@rafecolton as far as I can tell this works perfectly fine. I tested locally with saltfab. Salting servers, etc in parallel and serial all works. |
rafecolton
left a comment
There was a problem hiding this comment.
Surprisingly simple change! Biggest concerns are around eating exit codes and setting .daemon = True on all threads
| # a multiprocessing.Queue. | ||
| def __init__(self, message=None, wrapped=None): | ||
| # Must allow for calling with zero args/kwargs for consistency. | ||
| def __init__(self, message, wrapped): |
There was a problem hiding this comment.
Signature change seems extraneous, so unless it's needed, I'd rather see it reverted to reduce the diff and chances of bugs. Seems reasonable to update the comment, though would like to see it be a little more informative (i.e. consistency with what)
There was a problem hiding this comment.
Updating the comment without changing the code doesn't really make sense. The only reason this exists is a hack for pickling to work in a multiprocessing queue which we removed entirely. i can clarify the comment
There was a problem hiding this comment.
👍 I'm inclined to leave the code as it was then, but if you're confident this works, then we can move forward with an updated comment
| if isinstance(datum['result'], BaseException): | ||
| results[datum['name']]['exit_code'] = 1 | ||
| else: | ||
| results[datum['name']]['exit_code'] = 0 |
There was a problem hiding this comment.
This looks like it could eat exit codes, which I don't love. Is there any way to retain the exit codes if a command fails on the server?
|
|
||
| def __init__(self, *args, **kwargs): | ||
| super().__init__(*args, **kwargs) | ||
| dict.__setattr__(self, '_tl', threading.local()) |
There was a problem hiding this comment.
Prefer unabbreviated variable names. This whole class seems a bit over-engineered - e.g. why do you need the _local() function as opposed to just naming the variable _local? Any why use this syntax vs just self._thread_local = threading.local()?
There was a problem hiding this comment.
I looked at this originally. I'm wondering if it's needed at all. I know claude made it so create support for non-local versions of the variables but I don't know that that actually is needed... I will re-review.
| 'env': local_env, | ||
| } | ||
| p = multiprocessing.Process(target=_parallel_wrap, kwargs=kwarg_dict) | ||
| p = threading.Thread(target=_parallel_wrap, kwargs=kwarg_dict) |
There was a problem hiding this comment.
The name p is no longer applicable, consider something more relevant (an unabbreviated) like thred
| } | ||
| p = multiprocessing.Process(target=_parallel_wrap, kwargs=kwarg_dict) | ||
| p = threading.Thread(target=_parallel_wrap, kwargs=kwarg_dict) | ||
| p.daemon = True |
There was a problem hiding this comment.
Hm, I think this can result in background threads sticking around even if the main process is killed, is that right? Are you sure we want this?
There was a problem hiding this comment.
interesting point, i'll think about this a little more
There was a problem hiding this comment.
I think the main process waits for all the threads to finish, so I don't think there is a concern about it killing threads prematurely. If somebody ^C's the main process, we actively want it to kill threads.
| super().__init__(*args, **kwargs) | ||
| # Use dict.__setattr__ to avoid triggering _AttributeDict's __setattr__ | ||
| # which would create a key in the dict. | ||
| dict.__setattr__(self, '_tl', threading.local()) |
There was a problem hiding this comment.
prefer unabbreviated variable name
| return ret | ||
|
|
||
|
|
||
| class _ThreadLocalAttributeDict(_AttributeDict): |
There was a problem hiding this comment.
This seems somewhat redundant with _ThreadLocalHostConnectionCache? Not sure if they can be combined somehow
There was a problem hiding this comment.
Only sort of. These thread local wrappers are wrapped around the previously defined classes, which are different from each other. The wrappers look similar because they are just making each of the classes thread local-able.
There was a problem hiding this comment.
👍 would be great to have something less redundant but if that's not feasible for this round, then please consider the same comments as above - unabbreviated variable names & confirming if all the functions AI added are necessary
Problem
On
macOS()callingfork()afterexec()can cause seg faults in system libraries because they are not thread safe. This causes segfaults because the memory locations copied from the parent into the child process viafork()are no longer valid.One suggested fix is to use
spawn()instead offork()but that only works if all code can be pickled, which isn't always the case. The other option, which is implemented here, is to usethreadinginstead.Note, I implemented thread local versions of the
envand other shared variables to replicate the current behavior. I know we internally have some desire to be able to share globals across execution threads for parallel tasks, but I didn't want to scope creep that into this PR, so we can address it later with a more thought out design.This should also have some added benefits when run on ft python.
Fixes https://github.com/Expensify/Expensify/issues/601176
Fixes https://github.com/Expensify/Expensify/issues/591584