Stop abusing shared memory lock to protect exception #2509
Use a separate global lock instead. Fixes: bytecodealliance#2407
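As a rough illustration of the idea in the description, the sketch below guards an instance's exception buffer with one dedicated process-wide mutex rather than reusing a lock that belongs to shared memory. The names `demo_instance`, `g_exception_lock`, `demo_set_exception` and `demo_copy_exception` are hypothetical and are not the runtime's actual API; only `cur_exception` and the `"Exception: %s"` format come from the diff under review.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a module instance. */
typedef struct demo_instance {
    char cur_exception[128]; /* empty string means "no exception" */
} demo_instance;

/* One process-wide lock dedicated to exception state. Assumption: the
 * real runtime may create and scope its lock differently. */
static pthread_mutex_t g_exception_lock = PTHREAD_MUTEX_INITIALIZER;

static void demo_exception_lock(void)
{
    pthread_mutex_lock(&g_exception_lock);
}

static void demo_exception_unlock(void)
{
    pthread_mutex_unlock(&g_exception_lock);
}

/* Set the exception, or clear it when msg is NULL, under the lock. */
static void demo_set_exception(demo_instance *inst, const char *msg)
{
    assert(inst != NULL);
    demo_exception_lock();
    if (msg != NULL)
        snprintf(inst->cur_exception, sizeof(inst->cur_exception),
                 "Exception: %s", msg);
    else
        inst->cur_exception[0] = '\0';
    demo_exception_unlock();
}

/* Copy the exception text out under the lock, so the caller never reads
 * the buffer while another thread is writing it. Returns 1 if an
 * exception is set, 0 otherwise. */
static int demo_copy_exception(demo_instance *inst, char *buf, size_t len)
{
    int has;
    assert(inst != NULL && buf != NULL && len > 0);
    demo_exception_lock();
    has = inst->cur_exception[0] != '\0';
    snprintf(buf, len, "%s", inst->cur_exception);
    demo_exception_unlock();
    return has;
}
```

A single global lock trades some contention for simplicity: exception updates are rare, so one mutex is usually enough and stays independent of whether the instance has shared memory at all.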
exception_lock(wasm_inst);
if (data->exception != NULL) {
    snprintf(wasm_inst->cur_exception, sizeof(wasm_inst->cur_exception),
             "Exception: %s", data->exception);
The check "Only spread non 'wasi proc exit' exception" has been dropped; see L1255 and L1260 of the original file.
it's intentional. is it a problem? do you have a test case?
The check was introduced in PR #1988.
Raising the "wasi proc exit" exception is intentional runtime behavior. It is not an actual exception; it works more like a flag that tells the current thread to stop running opcodes. After the thread stops, at the end of wasm_runtime_call_wasm, the thread clears this exception, so it ends normally without an exception being thrown. With multi-threading, the thread doesn't spread the "wasi proc exit" exception; it just sets the terminate flags of the other threads so that they exit too.
Spreading this exception to other threads may cause unexpected behavior: another thread (say thread B) may stop running opcodes first, then handle the "wasi proc exit" exception and clear the exceptions of the other threads, including thread A. Once thread A's exception is cleared, it may continue to run and throw an "unreachable" exception. (After a call to wasi_proc_exit, the next opcode is in most cases unreachable; the bytecode is generated by emsdk or wasi-sdk.)
I believe we found the issue when testing the wasi-thread related test cases and then fixed it; the issue occurred only occasionally. To reproduce it, we may try running the wasi-thread cases many times.
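The rule described in this comment could be sketched roughly as follows: every other thread gets its terminate flag set, but the exception text is copied to them only when it is not "wasi proc exit". The types `demo_thread` and `demo_cluster` and the function `demo_spread` are hypothetical simplifications, not the runtime's real structures, and locking is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_THREADS 4

/* Hypothetical per-thread state. */
typedef struct demo_thread {
    char exception[128];   /* empty string means "no exception" */
    bool should_terminate; /* polled by the thread's interpreter loop */
} demo_thread;

typedef struct demo_cluster {
    demo_thread threads[DEMO_MAX_THREADS];
    int count;
} demo_cluster;

/* "wasi proc exit" is a stop request, not a real trap. */
static bool demo_is_proc_exit(const char *exception)
{
    return strstr(exception, "wasi proc exit") != NULL;
}

/* Ask every other thread to stop. A real trap is also copied to them;
 * "wasi proc exit" is not, so only the thread that raised it holds it
 * and later clears it. */
static void demo_spread(demo_cluster *c, int src)
{
    assert(c != NULL && src >= 0 && src < c->count);
    const char *exc = c->threads[src].exception;
    for (int i = 0; i < c->count; i++) {
        if (i == src)
            continue;
        c->threads[i].should_terminate = true;
        if (exc[0] != '\0' && !demo_is_proc_exit(exc))
            snprintf(c->threads[i].exception,
                     sizeof(c->threads[i].exception), "%s", exc);
    }
}
```

In this sketch, a thread that calls `demo_spread` while holding "wasi proc exit" leaves the other threads' exception buffers empty, while a thread holding "unreachable" propagates that text to all of them.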
> The check was introduced in PR #1988.
> Raising the "wasi proc exit" exception is intentional runtime behavior. It is not an actual exception; it works more like a flag that tells the current thread to stop running opcodes. After the thread stops, at the end of wasm_runtime_call_wasm, the thread clears this exception, so it ends normally without an exception being thrown. With multi-threading, the thread doesn't spread the "wasi proc exit" exception; it just sets the terminate flags of the other threads so that they exit too.

while proc exit is not a real trap, what the runtime should do is almost the same as for real traps.
ie. terminate all threads and return the exit/trap to the api user as the result of the whole "thread group".

> Spreading this exception to other threads may cause unexpected behavior: another thread (say thread B) may stop running opcodes first, then handle the "wasi proc exit" exception and clear the exceptions of the other threads, including thread A. Once thread A's exception is cleared, it may continue to run and throw an "unreachable" exception. (After a call to wasi_proc_exit, the next opcode is in most cases unreachable; the bytecode is generated by emsdk or wasi-sdk.)

if it's a problem, real traps have the same problems, don't they?
my impression is that many (all?) of the code paths clearing other threads' exceptions are just broken: #2481

> I believe we found the issue when testing the wasi-thread related test cases and then fixed it; the issue occurred only occasionally. To reproduce it, we may try running the wasi-thread cases many times.

i guess i will restore this (IMO wrong) behavior for now because it isn't the main point of this PR.
> The check was introduced in PR #1988.
> Raising the "wasi proc exit" exception is intentional runtime behavior. It is not an actual exception; it works more like a flag that tells the current thread to stop running opcodes. After the thread stops, at the end of wasm_runtime_call_wasm, the thread clears this exception, so it ends normally without an exception being thrown. With multi-threading, the thread doesn't spread the "wasi proc exit" exception; it just sets the terminate flags of the other threads so that they exit too.
> while proc exit is not a real trap, what the runtime should do is almost the same as for real traps. ie. terminate all threads and return the exit/trap to the api user as the result of the whole "thread group".

Yes, almost the same, except that it doesn't spread the exception to other threads and it clears the exception before it ends.

> Spreading this exception to other threads may cause unexpected behavior: another thread (say thread B) may stop running opcodes first, then handle the "wasi proc exit" exception and clear the exceptions of the other threads, including thread A. Once thread A's exception is cleared, it may continue to run and throw an "unreachable" exception. (After a call to wasi_proc_exit, the next opcode is in most cases unreachable; the bytecode is generated by emsdk or wasi-sdk.)
> if it's a problem, real traps have the same problems, don't they?

No, real traps are spread to other threads and terminate flags are also set for other threads, but the trap isn't cleared before the thread ends, so thread A's exception won't be cleared by thread B.

> my impression is that many (all?) of the code paths clearing other threads' exceptions are just broken: #2481

Do you mean to unify wasm_runtime_set_exception(inst, NULL) and wasm_runtime_clear_exception(inst), and to remove some unneeded exception clears?

> I believe we found the issue when testing the wasi-thread related test cases and then fixed it; the issue occurred only occasionally. To reproduce it, we may try running the wasi-thread cases many times.
> i guess i will restore this (IMO wrong) behavior for now because it isn't the main point of this PR.

Yes, it's better to restore this and fix it in another PR if needed.
> i guess i will restore this (IMO wrong) behavior for now because it isn't the main point of this PR.

done
> if it's a problem, real traps have the same problems, don't they?
> No, real traps are spread to other threads and terminate flags are also set for other threads, but the trap isn't cleared before the thread ends, so thread A's exception won't be cleared by thread B.

a real trap can misbehave in a similar way if the exception is suddenly cleared by the other thread.

> my impression is that many (all?) of the code paths clearing other threads' exceptions are just broken: #2481
> Do you mean to unify wasm_runtime_set_exception(inst, NULL) and wasm_runtime_clear_exception(inst), and to remove some unneeded exception clears?

yes.
unifying the two apis is just cosmetic.
the other one is a bit cumbersome. i guess we need to investigate them one-by-one to see if each should clear other threads' exceptions. (i guess most of them need to clear only the local exception.)
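To make the concern concrete, here is a small deterministic sketch (no real threads) of the difference between clearing only the local exception and clearing every thread's exception. All names (`demo_group`, `demo_clear_local`, `demo_clear_all`, `demo_would_keep_running`) are hypothetical and only model the behavior discussed in this thread; they are not the runtime's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_THREADS 2

/* Hypothetical exception slots, one per thread (0 = thread A, 1 = B).
 * An empty string means "no exception pending". */
typedef struct demo_group {
    char exception[DEMO_THREADS][64];
} demo_group;

/* Clear only the calling thread's own exception. */
static void demo_clear_local(demo_group *g, int self)
{
    assert(g != NULL && self >= 0 && self < DEMO_THREADS);
    g->exception[self][0] = '\0';
}

/* Clear the exception of every thread in the group. */
static void demo_clear_all(demo_group *g)
{
    assert(g != NULL);
    for (int i = 0; i < DEMO_THREADS; i++)
        g->exception[i][0] = '\0';
}

/* A thread keeps executing opcodes only while nothing is pending. If
 * thread A's pending exception vanishes, it would move on to the next
 * opcode, which after a proc exit call is usually unreachable. */
static bool demo_would_keep_running(const demo_group *g, int self)
{
    assert(g != NULL && self >= 0 && self < DEMO_THREADS);
    return g->exception[self][0] == '\0';
}
```

With both slots holding an exception, `demo_clear_local(&g, 1)` leaves thread A stopped, whereas `demo_clear_all(&g)` lets thread A resume, which is the misbehavior described above for proc exit and, as noted, possible for real traps as well.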
OK, thanks, it really takes effort to investigate them one by one.
wenyongh left a comment
LGTM
LGTM
…e#2509) Use a separate global lock instead. Fixes: bytecodealliance#2407
Fixes: #2407