Online Discussion Forum: JPEG file structure? #3062
This question isn't related to ImageSharp per se, but I very much appreciated your earlier help, so I thought I'd ask it here.

I'm playing around with writing a C# library to read/write the data elements/segments in various image files. I started this because I was curious to see how hard it would be to write something to read/write metadata (specifically keywords), and then got hooked on exploring the JPG file structure :). I'm not trying to write yet another image library, as there are plenty of great ones already available (e.g., ImageSharp).

While there's a lot of documentation online about the subject, much of it is not clearly written and/or is inconsistent or incomplete. I've looked at the official standards documents, too, but those, like most such documents, assume the reader already knows a fair bit; they're references, not tutorials.

Do you know of online discussion forums where I could raise specific questions about JPG file structure? For example, I'm currently stuck on parsing the APP1 segment in my test JPG file because at one point there appear to be two extraneous 0x00 bytes between what I can recognize as standard IFD entries. The two bytes appear immediately after a four-byte 0x00000000 entry, which I believe indicates "no more IFDs in this particular IFD chain".
Replies: 1 comment 2 replies
APP1 in JPEG is just a container. JPEG itself only defines the segment framing: the `FF E1` marker, a length field, and the payload that follows.
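The framing JPEG defines for a segment like APP1 is small: a two-byte marker (`FF E1` for APP1), a two-byte big-endian length that counts itself but not the marker, then the payload. Here is a minimal Python sketch of walking those segments (the same logic ports directly to C#). The function name `iter_segments` and the sample bytes are made up for illustration, and the sketch assumes a well-formed file and stops at the start-of-scan marker rather than handling every marker type.

```python
import struct

def iter_segments(data: bytes):
    """Yield (marker, payload) for each header segment, stopping at SOS/EOI.

    Hypothetical helper: assumes a well-formed file and ignores
    length-less markers that normally only appear inside scan data.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker (FF D8)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:           # fill byte before the real marker
            pos += 1
            continue
        if marker in (0xDA, 0xD9):   # SOS (scan data follows) or EOI
            break
        (length,) = struct.unpack_from(">H", data, pos + 2)
        # The length field includes its own two bytes, not the marker.
        yield marker, data[pos + 4 : pos + 2 + length]
        pos += 2 + length

# Tiny synthetic file: SOI, one APP1 segment with an 8-byte payload, SOS.
data = (
    b"\xff\xd8"
    + b"\xff\xe1" + struct.pack(">H", 2 + 8) + b"Exif\x00\x00MM"
    + b"\xff\xda\x00\x02"
)
segments = list(iter_segments(data))
```

Note that the walker never looks inside the payload; at this level it is opaque bytes.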
JPEG does not define what is inside that payload. In almost all real-world JPEGs, APP1 contains EXIF metadata, and EXIF is defined as a TIFF-style structure carried in that payload. So when you start seeing things like IFD entries and IFD chains, you are no longer parsing "JPEG structure". You are parsing a TIFF-style structure embedded inside APP1. That is by design, not a coincidence.

Now to your actual issue: after the 4-byte `0x00000000` "no more IFDs" field, you see two `0x00` bytes that do not seem to belong. That is normal. The EXIF/TIFF layout does not require the next structure to begin immediately after the IFD chain ends. Writers are allowed to leave unused bytes between structures, for example as padding.

So those two bytes are almost certainly just padding or slack space. They are not a new structure, and they are not an error. The important rule when parsing APP1 EXIF is: do not parse sequentially assuming everything is tightly packed. Reach each structure through the offset that points to it. If nothing points to those two bytes, they are just padding and should be ignored.
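To make the "follow offsets, don't read sequentially" rule concrete, here is a rough Python sketch (again, easy to port to C#). It builds a tiny TIFF block whose IFD0 ends with a `0x00000000` next-IFD field, followed by two slack bytes and then a sub-IFD. The helper names `read_ifd` and `parse_tiff` and the sample bytes are invented for illustration. It assumes the sub-IFD is reached through the Exif IFD pointer tag (`0x8769`), which may or may not be what your file does, and that offsets are counted from the start of the TIFF header (the bytes right after `Exif\0\0`).

```python
import struct

EXIF_IFD_POINTER = 0x8769  # IFD0 tag whose value is the offset of the Exif sub-IFD

def read_ifd(tiff: bytes, order: str, offset: int):
    """Read one IFD at `offset`; return ({tag: (type, count, raw4)}, next_offset)."""
    (count,) = struct.unpack_from(order + "H", tiff, offset)
    entries = {}
    for i in range(count):
        # Each entry is 12 bytes: tag, type, count, 4-byte value-or-offset field.
        tag, typ, n, raw = struct.unpack_from(order + "HHI4s", tiff, offset + 2 + 12 * i)
        entries[tag] = (typ, n, raw)
    (next_offset,) = struct.unpack_from(order + "I", tiff, offset + 2 + 12 * count)
    return entries, next_offset

def parse_tiff(tiff: bytes):
    """Return (ifd0, next_offset_after_ifd0, exif_sub_ifd_or_None)."""
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if order is None:
        raise ValueError("bad byte-order mark")
    magic, first = struct.unpack_from(order + "HI", tiff, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    ifd0, next_offset = read_ifd(tiff, order, first)
    exif = None
    if EXIF_IFD_POINTER in ifd0:
        # Jump to wherever the pointer says; never assume "right after IFD0".
        (sub_offset,) = struct.unpack(order + "I", ifd0[EXIF_IFD_POINTER][2])
        exif, _ = read_ifd(tiff, order, sub_offset)
    return ifd0, next_offset, exif

# Synthetic example: IFD0 occupies bytes 8..25, bytes 26..27 are slack,
# and the sub-IFD starts at offset 28.
tiff = (
    b"MM\x00\x2a" + struct.pack(">I", 8)
    + struct.pack(">H", 1)
    + struct.pack(">HHII", EXIF_IFD_POINTER, 4, 1, 28)
    + struct.pack(">I", 0)            # "no more IFDs" in this chain
    + b"\x00\x00"                     # slack bytes nothing points to
    + struct.pack(">H", 1)
    + struct.pack(">HHI4s", 0x9000, 7, 4, b"0232")
    + struct.pack(">I", 0)
)
ifd0, next_offset, exif = parse_tiff(tiff)
```

The parser never touches bytes 26 and 27 because no offset leads there, so the slack is skipped without any special handling.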