Skip to content

Conversation

@kaixiong
Copy link
Member

No description provided.

@kaixiong kaixiong added the bug label Jan 24, 2025
@kaixiong kaixiong requested a review from hartwork January 24, 2025 13:34
@kaixiong kaixiong self-assigned this Jan 24, 2025
Copy link
Member

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaixiong honestly I feel the fix is too tricky. If the idea is to have the stream be reset to position saved_stream_pos (which would be interesting to have a documented reason for, I'm still not sure why that's wanted or needed) I would expect that that seeking code is run (a) before every return nullptr or (b) outside the function or (c) in a goto error section (without intending to promote use of goto). What the pull request is currently doing is to have error-only code in the regular flow path. That's so weird to me (and undocumented) that I have a hard time signing up for it. If I missed something, that's likely adding to the argument of this being too tricky 😄 What do you think?

@kaixiong
Copy link
Member Author

@hartwork, the current front-facing loader function basically test-decodes image data using different decoders in sequence, so it relies on the stream position being reset properly on error. The contract has been to 'read all or read nothing'. The PNG loader is failing to uphold this in some cases and this PR is meant to fix this.

The change resets the stream position right after reading the signature. There's quite a number of calls to libpng after the signature test passes and each can fail (how likely I'm unable to tell though) and return early. So this saves quite a few repeated lines of cleanup (reset).

Conceptually, you could view the decoder broken into two phases, one testing the signature*, and one performing the actual decode. The second seek (in the patch) sets the stream position to the beginning of the data proper. I generally avoid gotos in C++ code. There's a similar pattern in the BMP code, though in that case, it's done because the header sizes are variable.

(Ideally, I would like to have a separate function for testing if some file data is of a particular format, kind of like what the file command does, but I don't really see a need for that at the moment.)

@kaixiong
Copy link
Member Author

kaixiong commented Jan 24, 2025

@hartwork Just to add, I did think it was a bit odd or confusing to reset-restore the stream position like this, but it made a lot of sense once I saw the decoder in two stages. This would even facilitate refactoring in the future.

@hartwork
Copy link
Member

@hartwork, the current front-facing loader function basically test-decodes image data using different decoders in sequence, so it relies on the stream position being reset properly on error. The contract has been to 'read all or read nothing'. The PNG loader is failing to uphold this in some cases and this PR is meant to fix this.

@kaixiong makes sense, thanks for elaborating.

So this saves quite a few repeated lines of cleanup (reset).

Yes, at the cost of making the code too tricky. You seem to want to go that route, so I'll do line-based review now, give me a second…


png_byte signature[8];
input.read (reinterpret_cast<char*> (signature), sizeof (signature));
input.seekg (saved_stream_pos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please add a comment to this line why we're rewinding the string here, the code does not make it obvious.
  • Is there a chance that seekg could kill an error state flag set by read above? I'm having trouble finding docs whether error flags are set until reset by the user or being auto-reset by every new API call?
  • PS: Does this have any effect on the input.seekg (saved_stream_pos); in line 93 being still needed or no longer needed?

Copy link
Member Author

@kaixiong kaixiong Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please add a comment to this line why we're rewinding the string here, the code does not make it obvious.

  • Is there a chance that seekg could kill an error state flag set by read above? I'm having trouble finding docs whether error flags are set until reset by the user or being auto-reset by every new API call?

According to this, seekg():

  1. Clears the eof flag before anything else
  2. Sets the input position only if fail() != true (i.e. either bad or fail flags are set)

So there are two failure scenarios:

  1. Partial read of signature. Both eof and fail flags are set, so seekg() does nothing. Clearly bad.
  2. Complete read of signature but stream ends immediately (no image data), so only eof is set. seekg() clears the eof and resets the input position. Okay, the decoding runs after and read errors will be caught there and then.
  • PS: Does this have any effect on the input.seekg (saved_stream_pos); in line 93 being still needed or no longer needed?

Did you mean line 98? Indeed, any read failure prior to that clean up routine will prevent the position reset, so we need to clear the fail flag with clear().

So, back to the drawing board..

input.read (reinterpret_cast<char*> (signature), sizeof (signature));
input.seekg (saved_stream_pos);

if (!input) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This reads rather weirdly to me, could we use .bad instead maybe? (https://cplusplus.com/reference/ios/ios/bad/))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ! operator tests for the 'fail' and 'bad' flags, so they're not equivalent. However, neither accounts for EOF it seems (I thought ! did). It looks like we ought to use !input.good() 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaixiong I would like !good a lot better 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartwork, sorry bad() is indeed equivalent to !. I seriously hate iostream.

@kaixiong kaixiong requested a review from hartwork January 24, 2025 20:36
Copy link
Member

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaixiong I like the new approach 👍 I think one of the seekg calls is missing (or could use?) error handling:

// Read PNG image data

// Skip to the first chunk, which comes right after the signature.
input.seekg (start_stream_pos + std::streampos {sizeof (signature)});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
input.seekg (start_stream_pos + std::streampos {sizeof (signature)});
if (!input.seekg (start_stream_pos + std::streampos {sizeof (signature)})) {
png_destroy_read_struct (&png_ptr, &info_ptr, &end_info);
return nullptr;
}

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subsequent read() calls (called via libpng hooks) will continue to keep the fail flag and trigger the cleanup in the setjmp() block, which also resets the stream position.

Specifically, read() first checks if good() is false (this happens if the seek fails), sets the fail flag again before returning.

One thing about the iostream (with exceptions off) is that once an operation fails, it basically stops trying to do anything. The remaining operations become effectively no-ops. So, one can safely issue a series of operations (e.g. read) and only check the status at the end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaixiong I think you're saying that you consider status quo good enough. That's alright with me, I will approve now…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaixiong I think you're saying that you consider status quo good enough. That's alright with me, I will approve now…

@hartwork Thank you.

In case I wasn't clear, there's a subtle difference between your suggested change and the original.

Namely that, with the change, when that seek() does fails, in particular when there's no more data to read (e.g. immediate end-of-file after signature), the handler fails to reset the stream position (only freeing the read struct).

The status quo on the other hand, will go through the entire cleanup process triggered on the first subsequent read() that is guaranteed to fail.

@kaixiong kaixiong merged commit 9f20838 into master Jan 25, 2025
6 checks passed
@kaixiong kaixiong deleted the fix-png-stream-pos-reset branch January 25, 2025 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants