-
Couldn't load subscription status.
- Fork 929
Java bindings exception handling fix #1803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixed an error where if there were no MPI exceptions, a JNI error could still exist and not get handled. Signed-off-by: Nathaniel Graham <[email protected]>
|
@nrgraham23 could you open a PR on ompi-release for this commit? milestone 2.0.1. Thanks. |
| return JNI_TRUE; | ||
| } | ||
| else if (JNI_TRUE == jni_exception) { | ||
| return JNI_TRUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nrgraham23 the result of this function is not checked in collectives. shouldn't we "cancel" the native exception and throw a MPIException here ?
my knowledge of C/Java interaction is pretty limited, so the terminology I used might be incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ggouaillardet @hppritcha @jsquyres @siegmargross
I would say no. JNI exceptions are directly related to problems with something the JNI has tried to do. It seems to me like this is a separate issue from MPI because it is a result of a user trying to do something they shouldn't in Java. In the case of the index out of bounds error that has brought this up, a java array is too short for the attempted call.
That said, I have thoroughly traced the way the exceptions are being thrown, and discovered that it is actually acting differently from the way I had originally thought. The ompi_java_exceptionCheck method is returning a true when a JNI error occurs, however it's return is almost never checked (you can do a "git grep ompi_java_exceptionCheck" from the top ompi directory to see what I am talking about).
Instead, in almost every case, the code continues to execute in the C code, and the Java exception is not actually thrown until it returns to Java land.
I propose we change it to either throw an exception in ompi_java_exceptionCheck back to Java so the code stops executing and change the method return to void, or change all of the places where ompi_java_exceptionCheck is being called to actually check the return value.
The first option is far less code, but the second option would allow us to do memory cleanup stuff. Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently, in such a case, the cleanup is performed (e.g. ompi_java_releaseWritePtr or ompi_java_releaseReadPtr) but a java.lang.*Exception is thrown instead of a mpi.MPIException, am i right ?
what if there is an MPI error (for example, root rank is negative ) ? in my understanding, no cleanup is performed.
at first, could/should we
- do the cleanup first
- and then
ompi_java_exceptionCheck?
an other way to put things is, is there any reason not to do any cleanup if the MPI subroutine returned an error ?
if the buffer is too short, throwing a mpi.MPIException is still half baked to me, since blocking MPI subroutines would throw a mpi.MPI_Exception but non blocking MPI subroutines would throw a java.lang.*Exception and likely after the MPI subroutine returned successfully.
bottom line, i do not feel comfortable with that kind of approach.
- having Java throw a
java.lang.*Exceptionis a valid option to me - having MPI Java bindings check user buffer (a la
memchecker) when directed by the user (this is not cheap from a performance point of view) can be seen as an interesting but new feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I have seen so far as I have started changing some of the error code is that cleanup is happening in the case of any exception (MPI included), but objects are being returned to Java that should not be. The JNI stuff gets cleaned up, but its possible there is some MPI related stuff that is not getting cleaned up. Ill have to look more closely at that, but a first pass probably won't include those changes.
I think the non blocking MPI subroutines get their errors through the Request object that is created, but I could be wrong about that. Its been a while since I have looked at that code.
I personally don't think we should be checking user buffers, but I suppose if someone wanted to add it as a feature they could. So long as there was a switch to turn it on or off. And that is probably something that would need to be discussed with more people.
Fixed an error where if there were no MPI exceptions, a
JNI error could still exist and not get handled.
Signed-off-by: Nathaniel Graham [email protected]
Fixes #1698
@hppritcha @jsquyres