-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Add new proxying modes foo__proxy: 'abort' and foo__proxy: 'abort_debug' #22648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ug', which will add compile time checks to verify that given JS function is not called in pthreads or Wasm Workers.
|
How to fix the lint errors? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable if there is use case for it. Are there existing system library function that you had in mind when designing this attribute?
src/jsifier.mjs
Outdated
| const insideWorker = (PTHREADS && WASM_WORKERS) | ||
| ? '(ENVIRONMENT_IS_PTHREAD || ENVIRONMENT_IS_WASM_WORKER)' | ||
| : (PTHREADS ? 'ENVIRONMENT_IS_PTHREAD' : 'ENVIRONMENT_IS_WASM_WORKER'); | ||
| prefix = `assert(!${insideWorker}, "Attempted to call function '${mangled}' inside a pthread/Wasm Worker, but this function has been declared to only be callable from the main browser thread");`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "but this function can only be called from the main thread"?
Using the term main browser thread I a little misleading because emscripten application can be run on workers where there is no main browser thread involved. We sometimes call this "main application thread", but I think just "main thread" here is clear enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proxying modes __proxy: 'sync' and __proxy: 'async' definitely mean to run the JS code in the main browser thread, and not in the thread that executes 'main'.
I.e. in PROXY_TO_PTHREAD builds, __proxy: 'sync' and __proxy: 'async' run the proxied JS code in the main browser thread, which is not the same thread as the main application thread (the main runtime thread)
The earlier PROXY_TO_WORKER build mode (which I presume is what you mean by "can be run on workers where there is no main browser thread involved") predates pthreads/SharedArrayBuffer, and does not interact with SAB, or the proxying architecture. So these proxying modes are not relevant there when pthreads are not used.
So I think saying "main browser thread" here is the correct thing to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "but this function can only be called from the main thread"?
Changed the message into form
"Attempted to call function '${mangled}' inside a pthread/Wasm Worker, but by using the proxying directive "'${proxyingMode}'", this function has been declared to only be callable from the main browser thread"
I would like to keep the mention about the proxy declaration that prevents the call. Otherwise a developer might be surprised to mistakenly ponder "how does it know this function is not callable from a worker?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proxying modes
__proxy: 'sync'and__proxy: 'async'definitely mean to run the JS code in the main browser thread, and not in the thread that executes 'main'.
The thread on which the emscripten-generated code is first run I call the main application thread, and it might be running in worker, in which case there is no main browser thread at all in the picture. When you write MAIN_THREAD_EM_ASM or __proxy: 'sync' its is the main application thread that runs you code. It just happens that this is also normally the main browser thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, has there been some kind of change in configuration that I am not aware of?
My expectation:
Building with -sPROXY_TO_WORKER: main() is run in a Worker thread. __proxy architecture is not available there.
Building with -pthread: main() is run in main browser thread. main browser thread == main application thread, __proxy targets main browser thread.
Building with -pthread -sPROXY_TO_PTHREAD: main() is run in a separate Worker thread, i.e. main application thread is not main browser thread. __proxy: 'sync' and MAIN_THREAD_EM_ASM() proxy to the main browser thread, and not to the main application thread.
In each case, proxying does target the main browser thread. Has this changed somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have several users that do this.
Is this case a scenario where the user manually spawns a new Worker(), and inside that Worker, they load up the Emscripten-compiled pthread-enabled program?
Such use case was never supported originally, because of the problem that in such a manual "I loaded the app into a Worker" mode, Emscripten would still need to be able to load some stuff (a lot of stuff) into the JS context of the main application thread, in order for any kind of general proxying to work. And Emscripten runtime JS libraries were not developed with the impression that the target thread that proxying occurs to, would not be the main browser thread.
For example, if I start searching for __proxy: 'sync' in /src/, there are functions that assume that they are always the main browser thread:
emscripten_set_window_title(): document.title only exists in main browser thread
emscripten_get_screen_size(): screen.width / screen.height exist only in main browser thread
emscripten_hide_mouse(): document.styleSheets only in main browser thread
emscripten_set_canvas_size: Browser.setCanvasSize() works only in main browser thread
emscripten_get_canvas_size: Module['canvas'] only works in main browser thread
emscripten_create_worker: requires Subworkers support in order to work (https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers#spawning_subworkers)
library_html5.js: everything here that has __proxy: 'sync' expects that proxy target lands to the main browser thread
library_egl.js, library_glut.js: everything with __proxy: 'sync' expects that proxy target is the main browser thread
library_openal.js, library_html5_webgl.js, library_sdl.js: likewise
that is why I added the whole PROXY_TO_PTHREAD mode so that users would not need to figure out how to manually launch Emscripten content main() in a Worker - they could reuse the system-provided mechanism for that.
Although now that I think about it, iirc subworkers are supported widely, so that is no longer a problem (in 2015 Chrome did not support subworkers)
So maybe those users that do go and launch their pthread builds in a Worker are careful to not use any of the above APIs, and just instead do generic computation there? I.e. those people are not using Canvases or library_browser.js or library_html5.js or any other DOM accessing things that Emscripten provides?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I believe it was actually you who added those two functions back in 174a082)
Yeah, I added those, but that is not for supporting the "user manually launches a pthread-enabled build inside their manually created new Worker()".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this case a scenario where the user manually spawns a new Worker(), and inside that Worker, they load up the Emscripten-compiled pthread-enabled program?
Such use case was never supported originally, because of the problem that in such a manual "I loaded the app into a Worker" mode, Emscripten would still need to be able to load some stuff (a lot of stuff) into the JS context of the main application thread, in order for any kind of general proxying to work. And Emscripten runtime JS libraries were not developed with the impression that the target thread that proxying occurs to, would not be the main browser thread.
Yes exactly. It turns out that this configuration is supported, at least for some programs. Obviously such programs cannot use DOM or anything like that, but I believe we do at least some testing of this configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I added those, but that is not for supporting the "user manually launches a pthread-enabled build inside their manually created new Worker()".
Can you explain what you intended the difference between those two functions to be then? I seems that unless we support that configuration then then the main runtime thread and the main browser thread would always be the same? Unless I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very originally those functions intended to be a generic set to enable common handling of each of single-threaded, -pthread, and PROXY_TO_PTHREAD and PROXY_TO_WORKER build modes, so that users would be able to do multiple variants of builds from the same codebase and we'd be able to write system library code that would be aware of each of those modes.
So, for example, in a PROXY_TO_PTHREAD build, it would be expected to happen that emscripten_is_main_runtime_thread() == emscripten_is_main_browser_thread(). But then if one would make a PROXY_TO_WORKER build of the same, then those functions would change to return the different respective things. So that way one could have one codebase that was ready to mutate to any of those build modes.
But I did not originally intend for pthreads enabled builds to ever be loaded while inside a Worker(), since the __proxy: 'sync' semantics would proxy to the hosting Worker instead of going to the main browser thread, and we have never had any facilities to do proxying to the main browser thread in such a mode.
Although now that I think about it, yeah, it is probably the case that if user avoids all those library_x.js files that I listed above (that suffer from this problem), things will probably be pretty safe for them, even if they manually load pthread-enabled builds in their own Workers. And one can build "reasonable" use cases in the absence of relying on those library_x.js files.
I scanned library_browser.js and library_wasm_worker.js to see if there might be functions like this, but none stood out. The motivation for adding this is that at Unity I am seeing more and more feature developers add their own JS functions, and they generally copy-paste the __proxy: 'sync' directives when they are unsure about what that means. Many of these JS library functions should never use proxying, but we only ever call those JS functions in the main browser thread - if any pthread might call any of those JS functions, there would be trouble. So that is I why I wanted to make the proxying mechanism check this for me: by documenting each function that should be strictly main thread only, I can run the test suites to confirm that it indeeds happens to be the case, and that we are not getting any surprising unwanted proxying taking effect. |
|
It looks like there is this wholesale proxy: 'sync' directive that is happening with the filesystem, so changed https://github.com/emscripten-core/emscripten/pull/22648/files#diff-5de5cdf403acefc1dc87b4446a186049adc104ceca678d993f3d5a007bcafcc5R273 to make tests pass. |
src/library_wasi.js
Outdated
| #else | ||
| fd_write__deps: ['$printChar'], | ||
| #endif | ||
| fd_write__proxy: 'none', // Each Worker can do stdout/stderr printing on its own, without needing to proxy to the main thread. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure doWritev won't work if you call it on a thread. And I think its not true for printChar either since you would end up with a different printCharBuffers on each thread.. which is not correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you would end up with a different
printCharBufferson each thread.. which is not correct.
I looked into this some decade ago, and from back then, I could not find defined semantics on how multiple threads writing on stdout/stderr should synchronize, when fflush is not called by the user. In most implementations, like ours, printing \n is an implicit fflush.
If multiple threads race to print to stdout/stderr, then it does seem to me it would be better semantics for each thread to construct full \n buffered lines that they would then race to print, rather than each thread mixing their individual character writes that could result in the data from multiple threads inter-mixing with each other? I could not find info to say such behavior would not be compliant.
Each Worker has a copy of MEMFS/library_tty.js in them, so they would each end up calling to default_tty_ops.put_char() and pushing out console.log() calls independently?
Having each thread synchronously block to perform stdout prints would seem like a rough performance overhead to have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the function is not only used for stdout/stderr, its used for all file access.
If we do make this change it would be very large one (basically making the filesystem multi-threaded) and I would at least expect a separate RP for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err, you're right of course. I somehow was interpreting this from the perspective that this function was already just for the stdout/stderr writing case. That would not work. I'll think of something here.
src/jsifier.mjs
Outdated
| const insideWorker = (PTHREADS && WASM_WORKERS) | ||
| ? '(ENVIRONMENT_IS_PTHREAD || ENVIRONMENT_IS_WASM_WORKER)' | ||
| : (PTHREADS ? 'ENVIRONMENT_IS_PTHREAD' : 'ENVIRONMENT_IS_WASM_WORKER'); | ||
| prefix = `assert(!${insideWorker}, "Attempted to call function '${mangled}' inside a pthread/Wasm Worker, but by using the proxying directive '${proxyingMode}', this function has been declared to only be callable from the main browser thread");`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC the assert function does not exist in release builds (at least not in MINIMAL_RUNTIME or STRICT mode). So maybe when ASSERTIONS is not set this needs to simply be an abort?
Or just skip the check and call it undefined behaviour in release builds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good catch. Updated the code to an abort() and added that case as a test.
test/wasm_worker/proxy_abort.c
Outdated
| } | ||
|
|
||
| void worker_main() { | ||
| REPORT_RESULT(might_throw(test)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you not just do might_throw(proxied_js_function) here? (i.e. do we need the test function at all?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, since that will throw an unhandled exception, and the test run will then hang without calling out to REPORT_RESULT() at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just mean the test wrapper function:
void test() {
proxied_js_function();
}
It seems like a pointless wrapper around proxied_js_function? Is it needed for some reason? Why can't the address of proxied_js_function be used in place of the address of test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that can be removed.
…hen there is no filesystem enabled.
| // When we don't have a filesystem enabled, we do not need to proxy fd_write | ||
| // to the main thread, because all we do is log out() and err() calls. | ||
| // This enables printf() to work from Wasm Workers. | ||
| fd_write__proxy: 'none', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a separate PR? It seems like a good idea but unrelated to the abort stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it back because it is needed by the correctness of this PR.
Adding the new ASSERTIONS time check that functions are not attempted to be proxied in MINIMAL_RUNTIME mode exposed this function being proxied. It is correct here to declare this as proxy none, and otherwise the test suite won't pass in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see so there is a new assertion at play here, independent of the new abort stuff.
I do think it would be nice to split that out so that the adding of the new abort feature can be added in isolation, but if you feel strongly that its no worth it then this change LGTM.
From a casual reading of the diff its quite hard to see how/where this new assertion was added (it looks like that assertion is pre-existing, but maybe not in all cases?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a casual reading of the diff its quite hard to see how/where this new assertion was added (it looks like that assertion is pre-existing, but maybe not in all cases?)
The previous code was
if (PTHREADS) {
...
} else if (WASM_WORKERS && ASSERTIONS) {
// add check: "assert(!ENVIRONMENT_IS_WASM_WORKER, "Attempted to call proxied function '${mangled}' in a Wasm Worker, but in Wasm Worker enabled builds, proxied function architecture is not available!");"
}The refactored code is
if (PTHREADS) {
...
}
if (WASM_WORKERS && ASSERTIONS) {
// add check: "assert(!ENVIRONMENT_IS_WASM_WORKER, "Attempted to call proxied function '${mangled}' in a Wasm Worker, but in Wasm Worker enabled builds, proxied function architecture is not available!");"
}When the original code was written PTHREADS vs WASM_WORKERS were mutually exclusive features - you couldn't build with both of them at the same time, so the two code blocks were effectively equivalent.
But then at some point that restriction was lifted that you could build with both of them, so the new code now accounts for that.
I agree it is a bit subtle, I am not sure if I did that fix knowingly at the time, but I am kind of wrangling a queue of more than 30 PRs here, so I'd prefer not to rewrite Git history PR commits just for the prose. I think this new check is more accurate to what we'll want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
It still seem odd though since I would have through that most of our tests are for either WASM_WORKERS or PTHREADS... so that assertion for fdwrite should be firing in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figuring out exactly what is going on here seems like a good reason to split out the fdwrite change... but i get that you might be tired on splitting up changes at this point :)
Personally I'd love to see the assertions fix + fdwrite fix land first, then the new modes. But I really appreciate all the PRs you've been landing recently and I don't want to discourage your too much so lgtm either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll defer this feature altogether now for the time being, to avoid any chances.
| snippet, | ||
| (args, body) => ` | ||
| function(${args}) { | ||
| assert(!ENVIRONMENT_IS_WASM_WORKER, "Attempted to call proxied function '${mangled}' in a Wasm Worker, but in Wasm Worker enabled builds, proxied function architecture is not available!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come the fdwrite syscall wasn't hitting this assertion already?
# Conflicts: # test/code_size/test_codesize_minimal_pthreads.json # test/code_size/test_codesize_minimal_pthreads_memgrowth.json
|
I'll close this for later when I have time to split up the PR. |
Add new proxying modes foo__proxy: 'abort' and foo__proxy: 'abort_debug', which will add compile time checks to verify that given JS function is not called in pthreads or Wasm Workers.