@KryziK KryziK commented Dec 6, 2025

Update binary serialization documentation from Godot version < 4 to 4.5.1.

Disclosure: AI was used as a tool to help me quickly parse the marshalls.h and marshalls.cpp files into the full list of data structures now available in Godot 4.5.

AI was also helpful in answering questions I had about specific data structures while I wrote the new tables and documentation.

One such question was to understand why the existing docs state that a Variant::OBJECT can have 3 different serializations and not just 2, to which the reply was that a NULL object is technically just variant data type 0 (NIL) rather than an object type itself.

Through this back-and-forth Q&A session, I was able to better understand how to write some small nuances and notes into the documentation.

~~

It was definitely a small challenge learning how to use RST, so if there are any formatting or consistency issues I need to fix, please let me know.

I did manually write the final file, mainly leveraging copy/paste of tables from the old documentation to understand the RST format. I did combine this with some manually-reviewed notes based on answers I got from Claude. No part of this document was output directly by Claude without manual review and clean-up to try my best at keeping the changes high-quality.

Thank you for your time.

Update binary serialization documentation from Godot version < 4 to 4.5.1
@KryziK KryziK marked this pull request as ready for review December 6, 2025 22:15
Author

KryziK commented Dec 6, 2025

#4945

@AThousandShips AThousandShips self-requested a review December 7, 2025 08:52
Member

AThousandShips commented Dec 7, 2025

No part of this document was output directly by Claude without manual review and clean-up

So parts of this were generated by Claude?

Edit: Reading this, I can definitely tell it was written by an LLM. It is not particularly well worded, uses a lot of incorrect terminology, and has a lot of grammatical and formatting errors. Please go through it again and comment when you've fixed them, and I can review after that; I am not interested in reviewing it as it is right now.

@AThousandShips AThousandShips added the needs work label Dec 7, 2025
Author

KryziK commented Dec 7, 2025

No part of this document was output directly by Claude without manual review and clean-up

So parts of this were generated by Claude?

Edit: Reading this, I can definitely tell it was written by an LLM. It is not particularly well worded, uses a lot of incorrect terminology, and has a lot of grammatical and formatting errors. Please go through it again and comment when you've fixed them, and I can review after that; I am not interested in reviewing it as it is right now.

Thank you for your time.

Could you highlight an example of wrong terminology/grammar? I hand-wrote all of the sentences/statements and used the old documentation as my guide.

And, as we previously discussed, Claude was used to parse all of the variant types and provide me with the new list of types, of which there are now many more than in v3. However, I output all of that into a separate file, and then manually went through that entire file and typed out the new documentation for the PR I submitted. I did all of the manual creation of these little tables, their text blurbs, and the numbered types with their references.

Admittedly it was a lot of manual work to keep track of, so I could have made some mistakes or typos, but I'm not seeing anything obvious that is incorrect after a few passes through my own PR. The only one I did notice is my use of "sent" instead of "serialized", which was just semantics (sending values through the serializer to the file) and I can change that if that's your concern.

As far as formatting errors, I'm not sure which pieces or sections are not formatted properly. I took care to make the note sections look similar to the old doc's note sections, and tried to use all the same symbols, separators, etc.

Again, thank you for your time. I am working on this with my own free time and have no intention of disregarding the rules/guidelines; there's nothing I would gain from that anyways. I'm more than happy to redo any of the work necessary to bring it up to your standards, but indeed a pointer on where to start would help a lot.

I apologize if you feel as if your time is being wasted, and I understand any skepticism you may have.

I did take a look at the build log and fixed a typo I made with the spacing at the end of one of the tables. I am not sure how to re-queue the build.

Member

AThousandShips commented Dec 8, 2025

Thank you for clarifying. I assumed otherwise given the phrasing, which seemed to indicate you used it to write the text, and the very AI-looking errors and word choices, which were very different from the existing documentation and from the rest of the page.

There are several unexplained changes and uses of unusual terminology; "type kind", for example, is not a term that is used.

I will take a look at some point this week, but please go over it and look at existing content to see the difference in terminology.

Member

@AThousandShips AThousandShips left a comment

A basic pass. This needs some more work and tweaking, but it's a good start.

| 0        | 4     | Integer   | val & 0x7FFFFFFF = element count, val & 0x80000000 = shared (bool)   |
+----------+-------+-----------+----------------------------------------------------------------------+

Then what follows is, for each element, a key-value pair encoded one after
Member

This, while from the original, is a bit too much like prose; it should be more technical.

Author

I've refactored almost the entire document, including these blurbs, the format of tables, and the offset 'calculations' in tables. Please go ahead and take a look when you have the time.

| 0        | 4     | Integer   | val & 0x7FFFFFFF = element count, val & 0x80000000 = shared (bool)   |
+----------+-------+-----------+----------------------------------------------------------------------+

Then what follows is, for each element, a value encoded using this same format.
Member

Same here
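For readers following the two table rows quoted above: both pack an element count and a "shared" marker into a single 32-bit integer. A minimal sketch of how that word could be split, using only the masks given in the rows (the function and constant names are hypothetical, not engine identifiers):

```python
SHARED_FLAG = 0x80000000  # top bit: "shared" marker, per the quoted row
COUNT_MASK = 0x7FFFFFFF   # low 31 bits: element count, per the quoted row


def split_count_word(val: int) -> tuple[int, bool]:
    """Return (element count, shared) from the 32-bit count word."""
    count = val & COUNT_MASK
    shared = bool(val & SHARED_FLAG)
    return count, shared


print(split_count_word(0x80000003))  # (3, True)
print(split_count_word(5))           # (5, False)
```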

+----------+-------+-----------+------------------------------+
| Offset   | Len   | Type      | Description                  |
+==========+=======+===========+==============================+
| 4        | 4     | Integer   | String length (in byte, N)   |
Member

Suggested change
| 4 | 4 | Integer | String length (in byte, N) |
| 4 | 4 | Integer | String length (in bytes, N) |


9: :ref:`Plane<class_plane>`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If flag ``ENCODE_FLAG_64`` is set (flags & 1 == 1), the floats are sent as 64 bit double precision:
Member

I'd say either all types should have their 64-bit version fully described, or none should; the longer ones are more useful.

Author

Could you clarify what you mean? Are you suggesting I remove the 32-bit information from each variant? While 64-bit is more useful, we are describing a file structure here, so if it's possible that someone would have a file with a 32-bit value, wouldn't that still be important information as far as file formats go?

Member

No, I'm suggesting that either all types should have a detailed 64-bit version, or none should. The 64-bit version is only available in double-precision builds, so it's pretty unusual.

So to be clear, the 32-bit version should always be presented, as that's what most people will be working with.
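To make the `flags & 1 == 1` check from the quoted line concrete, here is a minimal sketch of a reader that picks the float width from the header. It assumes little-endian encoding and that the 32-bit header keeps the type in its low 16 bits and the flags in its high 16 bits; the helper names are hypothetical, and the type number 9 is simply the one shown in the quoted heading.

```python
import struct

ENCODE_FLAG_64 = 1  # assumed: bit 0 of the flags half, per "flags & 1 == 1"


def split_header(val: int) -> tuple[int, int]:
    """Split the 32-bit header into (base type, flags)."""
    return val & 0xFFFF, val >> 16


def read_four_floats(buf: bytes, offset: int = 0) -> tuple[int, tuple[float, ...]]:
    """Read a header followed by four floats, 4 or 8 bytes each."""
    (header,) = struct.unpack_from("<I", buf, offset)
    base_type, flags = split_header(header)
    fmt = "<4d" if flags & ENCODE_FLAG_64 else "<4f"
    return base_type, struct.unpack_from(fmt, buf, offset + 4)


# Hypothetical sample data: the same four values in both widths.
single = struct.pack("<I4f", 9, 0.0, 1.0, 0.0, 2.5)
double = struct.pack("<I4d", 9 | (ENCODE_FLAG_64 << 16), 0.0, 1.0, 0.0, 2.5)
print(read_four_floats(single))
print(read_four_floats(double))
```

Both calls yield the same values; only the number of bytes consumed after the header differs (16 versus 32).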

Author

Could you help clarify something for me:

The existing docs state that the "Float" type always uses double precision (line 112 in the prior documentation). Wouldn't this be an 8-byte "double", and not a 4-byte "float" then? This seems to contradict the offsets and length in tables (and most standard IEEE documentation) that state that floats are 4 bytes?

And, having never used Godot, I was unaware that double-precision builds are unusual. Could you elaborate on this a tiny bit? Some things I'm reading say that Godot always uses 64-bit ints and floats. But in that case, they would never be just 4 bytes regardless of the build precision?

Just a little bit confused on this

Member

This is something that is critical to understand in order to write this page, so it may be an indication that you need additional help with it. I'd suggest reading things like the documentation for Vector2 to see what this difference is about, also here

Since this is a critical point of understanding, please check the documentation and come back when you understand the engine core systems; that is better than me trying to explain to you how to write this page.

Author

@KryziK KryziK Dec 8, 2025

Okay, understood. I think I had a handle on the correct answer, but as I was reading more I was straying from that.

The fact that Godot uses 64-bit floats internally/in the engine is irrelevant to this task, as it will still serialize that into a 32-bit or 64-bit value, depending upon the build flag.

When strictly talking about binary serialization/deserialization, 32-bit values are used (and as you said, much more common) unless the 64-bit build flag is set.

If that's correct, then the only thing I still want to read into (albeit not relevant to this PR) is how Godot ensures that the runtime double will never overflow a 32-bit space.

If I'm incorrect or going down the wrong line of thinking, let me know. But to summarize, it seems like for serialization, the documentation is accurate: the float class, as well as any floats in things like vector2, basis, etc., will use 32 bits unless otherwise specified by the build flag (and therefore the header flag). And in that case, I can remove all of the 64-bit versions of the tables and instead keep a blurb at the top indicating that the value in the header flag will control this across the board.

Member

No; rather, float in Variant is always 64-bit, but Vector2/3/4 and other math types like Basis, Projection, Transform2D/3D, etc. use 32 or 64 bits depending on the compile-time setting. So it's not strictly serialization related, though it doesn't matter here beyond the serialization part.

Now, the compiler flag will affect how these values are stored. When binary data from a double-precision build is loaded in a single-precision one, the values are parsed as 64-bit and then stored in 32-bit types, which is handled without issue (they just lose precision).

The 64-bit entries can be useful, but they are niche, so it might be best to drop them. In that case I'd just keep the note.

Adding a note to the top about the encoding flag being controlled is also good

Author

So, you said "No", but were you talking about the instantiated/runtime variant types themselves or were you talking about serialization strictly? If you re-read my last message but apply everything I said only to serialization and not to what those types look like while running in the engine, is what I said correct?

If we ignore the runtime types, the compiler, and the engine and focus solely on serialization/deserialization,

Variant Float (type 3) is no different than any other type (Basis, Projection, etc.) in that it may be 32 or 64 bits when serialized, and the header will indicate which one it is.

If this is correct, then the idea that a Float variant is always 64 bits is irrelevant to us because we're not talking about what Godot does internally while running, we're just talking about what it encodes/decodes to a file.

Is this right?

Again, thank you for your time and patience. I think maybe I got a bit derailed by looking too much into the runtime/engine portion of things and not staying strictly focused on the serialization of that information.

Member

Good clarification. The flag is a bit more complicated, in that it signals that the 64-bit storage scheme is active. I forgot to clarify that the flag is used only when 64-bit values need to be stored, which happens when:

  • Storing an int that is outside the 32-bit signed range
  • Storing a float that requires it, i.e. storing the value as 32-bit would lose precision or range (internally checked with float(d) != d)
  • Storing any of the precision-controlled types on a double-precision build
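The first two conditions in the list can be mimicked outside the engine. The sketch below is an illustration under stated assumptions, not the engine's code: it emulates the `float(d) != d` check by round-tripping a value through a 32-bit float with `struct`, and both function names are hypothetical.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def int_needs_64(value: int) -> bool:
    """True if the integer falls outside the signed 32-bit range."""
    return not (INT32_MIN <= value <= INT32_MAX)


def float_needs_64(d: float) -> bool:
    """True if narrowing d to 32 bits would change it (precision or range)."""
    try:
        as_single = struct.unpack("<f", struct.pack("<f", d))[0]
    except OverflowError:
        # Finite value too large for a 32-bit float: range would be lost.
        return True
    return as_single != d


print(int_needs_64(2**31 - 1), int_needs_64(2**31))  # False True
print(float_needs_64(0.5), float_needs_64(0.1))      # False True
```

Values such as 0.5 are exactly representable in 32 bits and pass unchanged, while 0.1 is not, so it would need the wider form. The third condition depends on how the engine was built and cannot be derived from the value itself.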

| 64       | 4     | Float   | w.w (Column 3, Row 3)        |
+----------+-------+---------+------------------------------+

Total of 16 floats (64 bytes for single precision, 128 bytes for double).
Member

Suggested change
Total of 16 floats (64 bytes for single precision, 128 bytes for double).
Total of 16 floats (64 bytes for single precision, 128 bytes for double).

Not sure why this type alone has a description of the number of bytes it takes up?

Author

It was one of the lengthiest types, so it was mainly just for clarification. I was actually looking at an example of this type in ImHex and, having done some manual counting, it was just habit to add that information.

Would you prefer I remove it for consistency?

Member

I'd say to add it to all or to none of the types that aren't a single value
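If byte totals were added to every multi-value type, they could be derived rather than counted by hand: the payload is the component count times the component width. A minimal sketch (the helper name `payload_size` is hypothetical; the 16-component figure comes from the quoted line, and the 4-byte type header is not included):

```python
def payload_size(component_count: int, double_precision: bool = False) -> int:
    """Bytes taken by the float components, excluding the type header."""
    return component_count * (8 if double_precision else 4)


print(payload_size(16))        # 64
print(payload_size(16, True))  # 128
```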

+----------+-------+-----------+----------------------------------------+
| Offset   | Len   | Type      | Description                            |
+==========+=======+===========+========================================+
| 0        | 4     | Integer   | String length (N, including null term) |
Member

Suggested change
| 0 | 4 | Integer | String length (N, including null term) |
| 0 | 4 | Integer | String length (N, including the null terminator) |

+==========+=======+===========+========================================+
| 0        | 4     | Integer   | String length (N, including null term) |
+----------+-------+-----------+----------------------------------------+
| 4        | N     | Bytes     | UTF-8 encoded string with null         |
Member

Suggested change
| 4 | N | Bytes | UTF-8 encoded string with null |
| 4 | N | Bytes | UTF-8 encoded string with null terminator |

+==================+=======+===========+===================================+
| 4                | 4     | Integer   | Array length (Floats)             |
+------------------+-------+-----------+-----------------------------------+
| 8..8+length\*4   | 4     | Float     | IEEE 754 single-precision float   |
Member

Suggested change
| 8..8+length\*4 | 4 | Float | IEEE 754 single-precision float |
| 8+i\*4 | 4 | Float | IEEE 754 single-precision float |

This is for uniformity; it's a bit confusing otherwise, IMO. The same applies to the int and byte versions.
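To illustrate the suggested `8+i*4` form: element `i` sits at a fixed stride from offset 8, directly after the length field at offset 4. A minimal sketch, assuming little-endian encoding and offsets measured from the start of the variant's type header; `read_packed_float32` is a hypothetical helper, not an engine function.

```python
import struct


def read_packed_float32(buf: bytes, base: int = 0) -> list[float]:
    """Read a length-prefixed run of 32-bit floats starting at base."""
    (length,) = struct.unpack_from("<I", buf, base + 4)  # offset 4: array length
    return [
        struct.unpack_from("<f", buf, base + 8 + i * 4)[0]  # offset 8 + i*4
        for i in range(length)
    ]


# Hypothetical sample: placeholder header word, length 3, then three floats.
sample = struct.pack("<II3f", 0, 3, 1.0, 2.0, 0.5)
print(read_packed_float32(sample))  # [1.0, 2.0, 0.5]
```

The double-precision variant would use the same shape with a stride of 8 and the `"<d"` format.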

Author

@KryziK KryziK Dec 8, 2025

I wanted your opinion on this, actually. For some of the types with multiple consecutive tables, I wasn't sure if I should start each table at offset 4 or 0, or instead index them relative to the start of the data (most have offset 4 because of the type ID being specified at offset 0).

I could:

  1. Add the type header to every table, at offset 4, that way no single data structure is "composed" of multiple individual tables. Every table would fully define its structure, including the header.
  2. For tables with potentially repeating sections, I could specify a small series to showcase the structure, or a subtable.

I also wanted to ask if the godot docs support the :math: syntax? If so, I was going to maybe update the Len column for things like the String variant to be

:math:`\lceil N / 4 \rceil \cdot 4`

to indicate that the length of the string itself is Ceil(N / 4) * 4, i.e. ⌈N/4⌉·4 (padded to 4 bytes). That way we wouldn't have to specify that it's padded to 4 bytes in every single table.
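The padding rule in the formula above can be checked with integer arithmetic. A minimal sketch, assuming little-endian encoding and a length field that counts only the UTF-8 bytes (as in the "String length (in bytes, N)" row; the variant that includes a null terminator would differ). Both helper names are hypothetical.

```python
import struct


def padded_len(n: int) -> int:
    """Round n up to the next multiple of 4, i.e. ceil(n / 4) * 4."""
    return (n + 3) // 4 * 4


def encode_padded_utf8(text: str) -> bytes:
    """Length prefix followed by the UTF-8 bytes, zero-padded to 4 bytes."""
    data = text.encode("utf-8")
    return struct.pack("<I", len(data)) + data.ljust(padded_len(len(data)), b"\0")


print([padded_len(n) for n in (0, 1, 4, 5)])  # [0, 4, 4, 8]
print(encode_padded_utf8("hello"))            # b'\x05\x00\x00\x00hello\x00\x00\x00'
```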

+==================+=======+===========+===================================+
| 4                | 4     | Integer   | Array length (Floats)             |
+------------------+-------+-----------+-----------------------------------+
| 8..8+length\*8   | 8     | Float     | IEEE 754 double-precision float   |
Member

Suggested change
| 8..8+length\*8 | 8 | Float | IEEE 754 double-precision float |
| 8+i\*8 | 8 | Float | IEEE 754 double-precision float |

@AThousandShips AThousandShips added the enhancement and area:manual labels Dec 8, 2025