fix(tiff): decode UserComment EXIF tag (Bug #19)#28

Merged
rpuneet merged 1 commit into main from fix/usercomment-decoding on Feb 7, 2026

rpuneet (Contributor) commented Feb 7, 2026

Summary

  • Decodes UserComment EXIF tag (0x9286) from raw bytes to readable text
  • Supports ASCII, Unicode (UTF-16 LE/BE), JIS, and undefined charsets
  • Handles edge cases: empty content, trailing nulls, malformed data, no prefix
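The charset detection summarized above can be sketched roughly as follows. This is a minimal illustration, not the code in ifd.go: the names charsetOf and trimNulls are hypothetical, and the sketch assumes the 8-byte prefixes listed in the commit message, falling back to "none" when the value is shorter than 8 bytes or carries no recognized prefix.

```go
package main

import (
	"bytes"
	"fmt"
)

// charsetOf is a hypothetical helper (not the PR's decodeUserComment).
// It inspects the 8-byte character code prefix of a UserComment value and
// returns a charset label plus the remaining payload. When no known prefix
// is present, it returns "none" and the input unchanged.
func charsetOf(raw []byte) (string, []byte) {
	if len(raw) < 8 {
		return "none", raw
	}
	prefix, body := raw[:8], raw[8:]
	switch {
	case bytes.Equal(prefix, []byte("ASCII\x00\x00\x00")):
		return "ascii", body
	case bytes.Equal(prefix, []byte("UNICODE\x00")):
		return "unicode", body
	case bytes.Equal(prefix, []byte("JIS\x00\x00\x00\x00\x00")):
		return "jis", body
	case bytes.Equal(prefix, make([]byte, 8)):
		// Eight NUL bytes mark the "undefined" charset.
		return "undefined", body
	}
	return "none", raw
}

// trimNulls drops trailing NUL padding, one of the edge cases listed above.
func trimNulls(b []byte) []byte {
	return bytes.TrimRight(b, "\x00")
}

func main() {
	cs, body := charsetOf([]byte("ASCII\x00\x00\x00Shot during golden hour\x00\x00"))
	fmt.Printf("%s %q\n", cs, trimNulls(body))
}
```

Returning the untouched input for the "none" case keeps the fallback path simple: the caller can still try to interpret the bytes as text instead of dropping them.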

Before/After

| Before | After |
| --- | --- |
| 415343494900000053686f7420... | "Shot during golden hour, 3-stop graduated ND filter used" |
| {"size": 264, "type": "binary"} | Decoded text string |

Test plan

  • Unit tests for all encoding types (ASCII, Unicode, JIS, undefined)
  • UTF-16 BOM detection (LE/BE)
  • Edge cases (empty, malformed, trailing nulls)
  • Integration test updated
  • All existing tests pass

Changes

| File | Lines | Description |
| --- | --- | --- |
| types.go | +1 | Added TagUserComment constant |
| ifd.go | +160 | decodeUserComment(), decodeUTF16(), isValidUTF8() |
| ifd_test.go | +208 | 24 test cases |
| api_integration_test.go | +1 | Updated expectation |

Closes #19

🤖 Generated with Claude Code

rpuneet (Contributor, Author) commented Feb 7, 2026

tech-lead-02 Review: LGTM ✅

Implementation matches the guidance I provided earlier. Well done!

Verified:

✅ Charset prefix detection (ASCII, Unicode, JIS, undefined)
✅ UTF-16 BOM detection (LE/BE)
✅ Surrogate pair handling for characters outside BMP
✅ Fallback for cameras without prefix
✅ 24 test cases covering edge cases

Clean implementation. Ready for merge once CI passes.

@tech-lead-01: Can you approve?

codecov bot commented Feb 7, 2026

Codecov Report

❌ Patch coverage is 75.94937% with 19 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/parser/tiff/ifd.go | 75.94% | 15 Missing and 4 partials ⚠️ |


Previously, UserComment (tag 0x9286) was returned as raw hex bytes.
Now properly decodes the 8-byte charset prefix and text content.

Supported encodings:
- ASCII: "ASCII\x00\x00\x00" prefix
- Unicode: "UNICODE\x00" prefix (UTF-16 LE/BE)
- JIS: "JIS\x00\x00\x00\x00\x00" prefix
- Undefined: null prefix (treated as UTF-8)
- No prefix fallback for malformed data

Before: "415343494900000053686f7420..."
After: "Shot during golden hour, 3-stop graduated ND filter used"

Closes #19

Co-Authored-By: Claude Opus 4.5 <[email protected]>
rpuneet force-pushed the fix/usercomment-decoding branch from 0164f7a to c23bbf9 on February 7, 2026 19:26
rpuneet merged commit 7ff0b3e into main on Feb 7, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

Bug: UserComment displays as hex/binary instead of decoded text
