Planned work related to fixed-format serialization #20072

@JukkaL

We've discussed some changes to the fixed-format serialization format and library (librt.internal). The list below includes these and maybe one or two new ideas.

It would be nice to have these changes in the mypy 1.19 release, and I'm planning to work on them. Most of them aren't critical, though, since we don't guarantee that the cache format is stable.

  • Support a 2-byte format for integers, instead of just 1-byte and 8-byte formats (possibly also a 4-byte format).
  • Always use type tags (and end tags for composite data) to make the data "self-describing", i.e. a generic tokenizer for .ff files can skip arbitrary objects easily without full deserialization.
  • Perform some fuzz testing to ensure deserialization fails gracefully.
  • Create a checked primitive type for Buffer so that the C functions don't need to check the Buffer parameter type repeatedly (export the type object via the capsule).
  • Experiment with using more pointers instead of integer index fields in BufferObject in case it would improve performance.
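
To make the first two items concrete, here is a rough Python sketch of tagged integers with 1-, 2- and 8-byte widths, plus an end tag for lists so a reader can skip values without building objects. The tag values, the function names (`write_value`, `read_value`, `skip_value`) and the little-endian layout are all assumptions made for illustration; they are not the actual librt.internal format or API.

```python
from __future__ import annotations

import struct

# Hypothetical tag values, chosen only for this sketch.
TAG_INT1 = 0x01  # followed by a 1-byte signed integer
TAG_INT2 = 0x02  # followed by a 2-byte signed integer
TAG_INT8 = 0x03  # followed by an 8-byte signed integer
TAG_LIST = 0x10  # followed by zero or more values, then TAG_END
TAG_END = 0x11   # closes the innermost composite value

_INT_FORMATS = {TAG_INT1: "<b", TAG_INT2: "<h", TAG_INT8: "<q"}


def write_value(out: bytearray, value: object) -> None:
    """Append a tagged int or (possibly nested) list of ints to out."""
    if isinstance(value, int):
        # Pick the narrowest width that can hold the value.
        if -0x80 <= value <= 0x7F:
            tag = TAG_INT1
        elif -0x8000 <= value <= 0x7FFF:
            tag = TAG_INT2
        else:
            tag = TAG_INT8
        out.append(tag)
        out += struct.pack(_INT_FORMATS[tag], value)
    elif isinstance(value, list):
        out.append(TAG_LIST)
        for item in value:
            write_value(out, item)
        out.append(TAG_END)
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")


def read_value(data: bytes, pos: int) -> tuple[object, int]:
    """Decode one value starting at pos; return (value, new position)."""
    if pos >= len(data):
        raise ValueError("truncated data")
    tag = data[pos]
    pos += 1
    if tag in _INT_FORMATS:
        fmt = _INT_FORMATS[tag]
        size = struct.calcsize(fmt)
        if pos + size > len(data):
            raise ValueError("truncated integer")
        (value,) = struct.unpack_from(fmt, data, pos)
        return value, pos + size
    if tag == TAG_LIST:
        items = []
        while True:
            if pos >= len(data):
                raise ValueError("missing end tag")
            if data[pos] == TAG_END:
                return items, pos + 1
            item, pos = read_value(data, pos)
            items.append(item)
    raise ValueError(f"unknown tag: {tag:#x}")


def skip_value(data: bytes, pos: int) -> int:
    """Return the position just past one value, without decoding it."""
    depth = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated data")
        tag = data[pos]
        pos += 1
        if tag in _INT_FORMATS:
            pos += struct.calcsize(_INT_FORMATS[tag])
            if pos > len(data):
                raise ValueError("truncated integer")
        elif tag == TAG_LIST:
            depth += 1
        elif tag == TAG_END:
            if depth == 0:
                raise ValueError("unexpected end tag")
            depth -= 1
        else:
            raise ValueError(f"unknown tag: {tag:#x}")
        if depth == 0:
            return pos
```

Because every value starts with a tag and every list ends with one, `skip_value` only needs the tag table, not any knowledge of the object being skipped. Malformed input (unknown tag, truncated payload, stray end tag) raises `ValueError` here, which is the kind of graceful failure that fuzz testing would check for.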

cc @ilevkivskyi
