fix: Fix null pointer exception on reopen (for some readers) #4412
tlambert03 wants to merge 1 commit into ome:develop from
After digging into this a bit more, I'm less confident that this is safe. I guess I'm just unclear on what the API contract is for What is the semantic meaning of (and, feel free to close this if you're certain it's an incorrect fix that needs more discussion)
Thanks, @tlambert03. On the surface, this doesn't look like a problematic change, but it would be helpful to have a little more information in order to evaluate more carefully. Do you have a minimal example that we can use to reproduce the original
isn't necessarily widely tested. Can you explain a bit under what circumstances

As a side note, we would need a signed Contributor License Agreement in order to merge this pull request once it passes review (see also https://ome-contributing.readthedocs.io/en/latest/third-party-policy.html#third-party-contributions).
Thanks @melissalinkert, here's a minimal reproducer. If you pass in an

    import loci.formats.ImageReader;

    public class ReopenBug {
      public static void main(String[] args) throws Exception {
        ImageReader reader = new ImageReader();
        reader.setId(args[0]);                // path to the input file
        reader.openBytes(0, 0, 0, 512, 512);  // first read works
        reader.close(true);                   // fileOnly=true: close the file only
        reader.reopenFile();
        reader.openBytes(0, 0, 0, 512, 512);  // NPE here on develop!
        reader.close();
        System.out.println("Success!");
      }
    }
Sure! The real challenge there lies in resource management. If we present something like:

    zarr_group = open_bio_zarr('some_file.svs')

then it's a bit of a bummer to also have to demand that the end user only ever use that object within some single context where the underlying reader is open. On the other hand, we don't want to leak file handles, and if we completely spin up and tear down the whole thing on every partial read, it's a performance disaster. So, that's why I wanted to use

I will say that it's actually working quite nicely for all the other files I tried (ome.tiff, nd2, czi, dv, etc...). It just seems that a handful of the Readers aren't ready to support that. I can also just accept that this is a "risky maneuver", and we can either raise file-specific exceptions or refuse to use certain Reader classes with this pattern... but if it can be improved in a safe way, then I thought it worth trying.
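For illustration only, the trade-off described in this comment (keep the object usable, release the OS file handle between reads, avoid a full teardown on every read) can be sketched in plain Java. Everything here is hypothetical: `LazyHandle`, `readAt`, `release`, and `tempFile` are made-up names, and `RandomAccessFile` merely stands in for whatever stream a real reader holds. It is not how Bio-Formats or bffile implement this.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical holder: the object outlives the OS file handle it wraps. */
class LazyHandle implements AutoCloseable {
  private final String path;
  private RandomAccessFile in;  // null whenever the handle has been released
  private int opens = 0;        // how many times the file was actually opened

  LazyHandle(String path) {
    this.path = path;
  }

  /** Reads one byte at the given offset, reopening the file first if needed. */
  int readAt(long offset) {
    try {
      if (in == null) {
        in = new RandomAccessFile(path, "r");
        opens++;
      }
      in.seek(offset);
      return in.read();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Releases the OS handle but leaves this object usable for later reads. */
  void release() {
    try {
      if (in != null) in.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      in = null;  // never keep a reference to a closed handle
    }
  }

  @Override
  public void close() {
    release();
  }

  int opens() {
    return opens;
  }

  /** Writes the given bytes to a temp file so the sketch needs no external data. */
  static String tempFile(byte... data) {
    try {
      Path p = Files.createTempFile("lazy", ".bin");
      p.toFile().deleteOnExit();
      Files.write(p, data);
      return p.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

class LazyHandleDemo {
  public static void main(String[] args) {
    LazyHandle h = new LazyHandle(LazyHandle.tempFile((byte) 7, (byte) 9));
    h.readAt(0);
    h.readAt(1);   // reuses the handle opened by the first read
    h.release();   // no file handle is held from here on
    h.readAt(1);   // transparently reopens
    System.out.println("opens = " + h.opens());
    h.close();
  }
}
```

The design choice in this sketch is that the reference is always nulled when the handle is released, so "is a usable handle held?" and "is the field non-null?" are the same question.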
Hello! My first PR here, so please put me in my place if I'm suggesting something stupid :)
Over in https://github.com/imaging-formats/bffile I'm working on a new Python wrapper for Bio-Formats (essentially an improved variant of the one contributed a while back to aicsimageio/bioio). As I was playing with some patterns I hit a java.lang.NullPointerException upon calling reopenFile(), after having called close() with fileOnly=True.

After some time with Claude I came to this fix, but it touches the actual FormatReader base... I'll give you the full conversation we had here just so you can see my understanding (and naiveté 😄)
Details
I asked it:
it said:
YES, there is a clear bug. Here's the evidence:
FormatReader.close(boolean fileOnly) (lines 946-956):
The Bug: When close(fileOnly=true) is called, the stream is closed but the in field is NOT set to null. This leaves in pointing to a closed RandomAccessInputStream object.
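To make this failure mode concrete, here is a minimal, self-contained sketch in plain Java. It is a toy model, not the Bio-Formats source: `StaleHandleReader`, its methods, and the use of `RandomAccessFile` in place of `RandomAccessInputStream` are all assumptions made for illustration. It also assumes a reopen path that only creates a stream when `in` is null, so a closed but non-null `in` gets reused. In this toy the second read fails with an `IOException` rather than the `NullPointerException` reported above, since the real exception depends on reader internals.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy stand-in for a reader base class; NOT the Bio-Formats source. */
class StaleHandleReader {
  protected RandomAccessFile in;  // plays the role of the `in` stream field
  protected String currentId;

  void setId(String id) {
    currentId = id;
    in = open(id);
  }

  /** Models the described behavior: fileOnly closes `in` but keeps the reference. */
  void close(boolean fileOnly) {
    try {
      if (in != null) in.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    if (!fileOnly) {
      in = null;
      currentId = null;
    }
  }

  /** Hypothetical reopen that only creates a stream when `in` is null. */
  void reopenFile() {
    if (in == null) in = open(currentId);
  }

  int readFirstByte() {
    try {
      in.seek(0);
      return in.read();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static RandomAccessFile open(String id) {
    try {
      return new RandomAccessFile(id, "r");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes a one-byte temp file so the demo needs no external data. */
  static String tempFile() {
    try {
      Path p = Files.createTempFile("reopen", ".bin");
      p.toFile().deleteOnExit();
      Files.write(p, new byte[] {42});
      return p.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

class StaleHandleDemo {
  public static void main(String[] args) {
    StaleHandleReader r = new StaleHandleReader();
    r.setId(StaleHandleReader.tempFile());
    System.out.println("first read: " + r.readFirstByte());
    r.close(true);   // stream closed, reference kept
    r.reopenFile();  // sees a non-null `in` and does nothing
    try {
      r.readFirstByte();
    } catch (UncheckedIOException e) {
      System.out.println("second read failed: " + e.getCause());
    }
  }
}
```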
Why This Matters:
NO, fixing it in FormatReader would be quite safe. Here's why:
I examined all the reopenFile() implementations in the codebase (~186 readers total). They fall into three patterns:
Pattern 1: Don't check, just recreate (TileJPEGReader, ICSReader, etc.):
✅ Would work fine with the fix (harmless to close null, and they recreate regardless)
Pattern 2: Explicitly null first (MinimalTiffReader):
✅ Would work fine with the fix (they already expect this behavior)
Pattern 3: Delegate to super (FlowSightReader, OMETiffReader):
✅ Would work fine with the fix (they rely on base class behavior)
Conclusion:
Adding in = null after in.close() in FormatReader.close(fileOnly=true) is safe and would not break any existing readers. In fact, it would fix
latent bugs in multiple readers.
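As a hedged illustration of this conclusion, the sketch below applies the one-line change to a toy reader and exercises two reopen styles against it. This is not Bio-Formats code: `NullingReader`, `reopenAlwaysRecreate`, and `reopenIfNull` are hypothetical names, `RandomAccessFile` stands in for `RandomAccessInputStream`, and passing here says nothing about the roughly 186 real readers.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy model of the proposed change; names are illustrative only. */
class NullingReader {
  protected RandomAccessFile in;
  protected String currentId;

  void setId(String id) {
    currentId = id;
    in = open(id);
  }

  /** close() with the proposed change: the reference is dropped in both modes. */
  void close(boolean fileOnly) {
    if (in != null) {
      closeUnchecked(in);
      in = null;  // the one-line change under discussion
    }
    if (!fileOnly) currentId = null;
  }

  /** "Just recreate" style: close whatever is held, then build a fresh stream. */
  void reopenAlwaysRecreate() {
    if (in != null) closeUnchecked(in);
    in = open(currentId);
  }

  /** Lazy style: only build a stream when none is held. */
  void reopenIfNull() {
    if (in == null) in = open(currentId);
  }

  int readFirstByte() {
    try {
      in.seek(0);
      return in.read();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static void closeUnchecked(RandomAccessFile f) {
    try {
      f.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static RandomAccessFile open(String id) {
    try {
      return new RandomAccessFile(id, "r");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes a one-byte temp file so the demo needs no external data. */
  static String tempFile() {
    try {
      Path p = Files.createTempFile("nulling", ".bin");
      p.toFile().deleteOnExit();
      Files.write(p, new byte[] {42});
      return p.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

class NullingReaderDemo {
  public static void main(String[] args) {
    NullingReader r = new NullingReader();
    r.setId(NullingReader.tempFile());
    r.close(true);
    r.reopenIfNull();  // now creates a fresh stream because `in` is null
    System.out.println("lazy reopen read: " + r.readFirstByte());
    r.close(true);
    r.reopenAlwaysRecreate();
    System.out.println("recreate reopen read: " + r.readFirstByte());
    r.close(false);
  }
}
```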
Potentially Affected Readers:
Any reader that uses an initialization pattern like:
Actual Impact in Practice:
a. close(fileOnly=true) is called
b. Then reopenFile() or similar initialization is called
c. The initialization checks if (in == null) before creating a stream
Summary
The current workaround in MinimalTiffReader (setting in = null in reopenFile()) masks the bug but doesn't fix the root cause. Fixing it in FormatReader would be
the cleanest, safest, and most maintainable solution.
After making this change, all tests continued passing, and reopening SVS also passed. Happy to have it rejected or modified of course 👍