Merged
14 changes: 8 additions & 6 deletions ompi/mpi/java/c/mpi_MPI.c
@@ -1124,23 +1124,22 @@ void ompi_java_releasePtrArray(JNIEnv *env, jlongArray array,

 jboolean ompi_java_exceptionCheck(JNIEnv *env, int rc)
 {
+    jboolean jni_exception;
+
     if (rc < 0) {
         /* handle ompi error code */
         rc = ompi_errcode_get_mpi_code (rc);
         /* ompi_mpi_errcode_get_class CAN NOT handle negative error codes.
          * all Open MPI MPI error codes should be > 0. */
         assert (rc >= 0);
     }
+    jni_exception = (*env)->ExceptionCheck(env);

-    if(MPI_SUCCESS == rc)
+    if(MPI_SUCCESS == rc && JNI_FALSE == jni_exception)
     {
         return JNI_FALSE;
     }
-    else if((*env)->ExceptionCheck(env))
-    {
-        return JNI_TRUE;
-    }
-    else
+    else if(MPI_SUCCESS != rc)
     {
         int errClass = ompi_mpi_errcode_get_class(rc);
         char *message = ompi_mpi_errnum_get_string(rc);
@@ -1154,6 +1153,9 @@ jboolean ompi_java_exceptionCheck(JNIEnv *env, int rc)
         (*env)->DeleteLocalRef(env, jmessage);
         return JNI_TRUE;
     }
+    else if (JNI_TRUE == jni_exception) {
+        return JNI_TRUE;
Contributor

@nrgraham23 The result of this function is not checked in the collectives. Shouldn't we "cancel" the native exception and throw an MPIException here?
My knowledge of C/Java interaction is pretty limited, so the terminology I used might be incorrect.

Contributor Author

@ggouaillardet @hppritcha @jsquyres @siegmargross

I would say no. JNI exceptions are directly related to problems with something the JNI code has tried to do. This seems to me like a separate issue from MPI, because it results from a user doing something they shouldn't in Java. In the case of the index-out-of-bounds error that brought this up, a Java array is too short for the attempted call.

That said, I have thoroughly traced how the exceptions are thrown and discovered that it behaves differently from what I had originally thought. ompi_java_exceptionCheck does return true when a JNI error occurs; however, its return value is almost never checked (run "git grep ompi_java_exceptionCheck" from the top ompi directory to see what I am talking about).

Instead, in almost every case, execution continues in the C code, and the Java exception is not actually thrown until control returns to Java land.

I propose we either (1) have ompi_java_exceptionCheck throw an exception back to Java so the code stops executing, and change its return type to void, or (2) change every call site of ompi_java_exceptionCheck to actually check the return value.

The first option is far less code, but the second option would allow us to do memory cleanup stuff. Any thoughts?
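The difference between ignoring and checking the return value can be sketched in plain C without JNI. Everything below is a hypothetical stand-in, not Open MPI code: `fake_env` models only the "exception pending" flag, `fake_exception_check` models ompi_java_exceptionCheck, and the two `call_*` functions model a native method body. The sketch assumes, as described above, that raising an exception only marks it as pending and does not interrupt the C caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a JNI environment: it only records whether
 * an exception is pending. A real JNIEnv is far richer. */
typedef struct {
    bool exception_pending;
} fake_env;

/* Stand-in for ompi_java_exceptionCheck: a nonzero rc marks an exception
 * as pending. The C caller keeps running either way. */
static bool fake_exception_check(fake_env *env, int rc)
{
    assert(env != NULL);
    if (rc != 0) {
        env->exception_pending = true;
    }
    return env->exception_pending;
}

/* Return value ignored: the work after the check still runs even though
 * an exception is pending. Returns the number of later steps executed. */
static int call_unchecked(fake_env *env, int rc)
{
    int steps_after_check = 0;
    (void)fake_exception_check(env, rc);
    steps_after_check++; /* e.g. building a result object for Java */
    return steps_after_check;
}

/* Return value checked: on failure, skip the remaining work but still
 * run the cleanup path. Returns -1 if the scratch allocation fails. */
static int call_checked(fake_env *env, int rc)
{
    int steps_after_check = 0;
    int *scratch = malloc(sizeof *scratch);
    if (scratch == NULL) {
        return -1;
    }
    if (fake_exception_check(env, rc)) {
        goto cleanup;
    }
    steps_after_check++; /* only reached when nothing is pending */
cleanup:
    free(scratch);
    return steps_after_check;
}
```

In this model, `call_unchecked` executes its post-check step even when the check fails, while `call_checked` skips it and still frees its scratch memory, which is the cleanup opportunity the second option would provide.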

Contributor

Currently, in such a case, the cleanup is performed (e.g. ompi_java_releaseWritePtr or ompi_java_releaseReadPtr), but a java.lang.*Exception is thrown instead of an mpi.MPIException, am I right?

What if there is an MPI error (for example, the root rank is negative)? In my understanding, no cleanup is performed.

At first, could/should we

  • do the cleanup first
  • and then call ompi_java_exceptionCheck?

Another way to put it: is there any reason not to do the cleanup if the MPI subroutine returned an error?
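The "cleanup first, then check" ordering asked about above can be sketched as follows. This is a simplified model, not the bindings' actual code: `buffer_stats`, `fake_mpi_call`, and `bcast_cleanup_first` are hypothetical names, the counters stand in for acquiring and releasing a buffer pointer, and a negative root is used as the error case only because it is the example given in the comment.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping: counts acquire/release pairs so the ordering
 * can be observed. */
typedef struct {
    int acquired;
    int released;
} buffer_stats;

/* Hypothetical stand-in for an MPI call: a negative root yields a
 * nonzero error code, anything else succeeds. */
static int fake_mpi_call(int root)
{
    return (root < 0) ? 1 : 0;
}

/* Release the buffer unconditionally, then report the error last.
 * Returns true when an error should be raised. */
static bool bcast_cleanup_first(buffer_stats *st, int root)
{
    st->acquired++;               /* models obtaining the buffer pointer */
    int rc = fake_mpi_call(root);
    st->released++;               /* models the release call: always runs */
    return rc != 0;               /* models the exception check, done last */
}
```

With this ordering the acquire and release counts stay balanced whether or not the call fails, so the error path cannot leak the buffer.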

If the buffer is too short, throwing an mpi.MPIException still seems half-baked to me, since blocking MPI subroutines would throw an mpi.MPIException, but non-blocking MPI subroutines would throw a java.lang.*Exception, likely after the MPI subroutine has returned successfully.

Bottom line, I do not feel comfortable with that kind of approach.

  • Having Java throw a java.lang.*Exception is a valid option to me.
  • Having the MPI Java bindings check the user buffer (à la memchecker) when directed by the user (this is not cheap from a performance point of view) can be seen as an interesting but new feature.

Contributor Author

From what I have seen so far, now that I have started changing some of the error-handling code, cleanup happens in the case of any exception (MPI included), but objects are being returned to Java that should not be. The JNI stuff gets cleaned up, but it's possible there is some MPI-related stuff that is not. I'll have to look more closely at that, but a first pass probably won't include those changes.

I think the non-blocking MPI subroutines get their errors through the Request object that is created, but I could be wrong about that. It's been a while since I have looked at that code.

I personally don't think we should be checking user buffers, but I suppose someone could add it as a feature, so long as there is a switch to turn it on or off. That is probably something that would need to be discussed with more people.
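As a rough illustration of what an opt-in buffer check behind a switch might look like, here is a minimal sketch. Nothing in it exists in Open MPI: the `check_user_buffers` flag and the `buffer_ok` helper are hypothetical, and how such a switch would actually be exposed is left open.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical switch, off by default. Open MPI does not define this. */
static bool check_user_buffers = false;

/* Returns true when the call may proceed: either checking is disabled,
 * or the range [offset, offset + count) fits inside an array of
 * array_len elements. */
static bool buffer_ok(size_t array_len, size_t offset, size_t count)
{
    if (!check_user_buffers) {
        return true;  /* disabled: only this one branch is paid */
    }
    if (offset > array_len) {
        return false;
    }
    /* Compare against the remaining space to avoid size_t overflow. */
    return count <= array_len - offset;
}
```

With the switch off, every request passes and a too-short array would still surface as a Java exception later; with it on, the mismatch is detected before the native call is made.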

+    }
 }

 void* ompi_java_attrSet(JNIEnv *env, jbyteArray jval)
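The branch order of ompi_java_exceptionCheck before and after this change can be summarized with a small JNI-free model. The enum and both functions are hypothetical illustrations derived only from the hunk above: `mpi_ok` stands in for `MPI_SUCCESS == rc` and `jni_pending` for the result of `ExceptionCheck`.

```c
#include <stdbool.h>

/* Hypothetical outcome labels, not part of Open MPI. */
typedef enum {
    CHECK_OK,          /* returns JNI_FALSE */
    CHECK_MPI_ERROR,   /* throws an MPIException, returns JNI_TRUE */
    CHECK_JNI_PENDING  /* leaves the pending exception, returns JNI_TRUE */
} check_outcome;

/* Branch order before the change: MPI success is tested first, so a
 * pending JNI exception is not reported when rc is MPI_SUCCESS. */
static check_outcome classify_old(bool mpi_ok, bool jni_pending)
{
    if (mpi_ok) {
        return CHECK_OK;
    } else if (jni_pending) {
        return CHECK_JNI_PENDING;
    } else {
        return CHECK_MPI_ERROR;
    }
}

/* Branch order after the change: success requires both conditions, and
 * an MPI error takes the MPIException branch even if a JNI exception is
 * also pending. */
static check_outcome classify_new(bool mpi_ok, bool jni_pending)
{
    if (mpi_ok && !jni_pending) {
        return CHECK_OK;
    } else if (!mpi_ok) {
        return CHECK_MPI_ERROR;
    } else {
        return CHECK_JNI_PENDING;
    }
}
```

The two versions differ in exactly two cases: a pending JNI exception with a successful MPI call now yields JNI_TRUE instead of JNI_FALSE, and an MPI error combined with a pending JNI exception now goes through the MPIException branch.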