Collaborator

@maxdml maxdml commented Oct 6, 2025

This PR does two things:

  • Move away from registering types with gob during workflow registration / RunAsStep, and register them lazily at encoding time instead. This fixes the case where a workflow or step type is an interface: at registration time we cannot identify the underlying concrete type, and this will always panic.
  • Skip encoding if the value is a nil pointer wrapped in an interface (gob cannot encode it and returns an error).

Example panic for the 2nd case:

failed to encode data: gob: gob: cannot encode nil pointer of type *dbos.PointerResultStruct inside interface
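
A minimal sketch of both points, not the code in this PR: the helper names (`isNilPointer`, `lazyEncode`, `rawEncode`) are hypothetical, and `PointerResultStruct` is a stand-in for the type named in the error above. It registers the concrete type right before encoding and skips typed nil pointers, which gob refuses to encode inside an interface.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"reflect"
)

// PointerResultStruct is a stand-in for the type named in the error message.
type PointerResultStruct struct {
	Value string
}

// isNilPointer reports whether v holds a typed nil pointer, e.g. (*T)(nil).
func isNilPointer(v any) bool {
	if v == nil {
		return false
	}
	rv := reflect.ValueOf(v)
	return rv.Kind() == reflect.Pointer && rv.IsNil()
}

// lazyEncode (hypothetical) registers the concrete type of v right before
// encoding it. A nil interface or a typed nil pointer is skipped and yields
// no bytes.
func lazyEncode(v any) ([]byte, error) {
	if v == nil || isNilPointer(v) {
		return nil, nil
	}
	// The concrete type is known here, even when the declared workflow or
	// step type is an interface.
	gob.Register(v)
	var buf bytes.Buffer
	// Passing &v makes gob encode an interface value.
	if err := gob.NewEncoder(&buf).Encode(&v); err != nil {
		return nil, fmt.Errorf("failed to encode data: %w", err)
	}
	return buf.Bytes(), nil
}

// rawEncode encodes v as an interface value without the nil check.
func rawEncode(v any) error {
	var buf bytes.Buffer
	return gob.NewEncoder(&buf).Encode(&v)
}

func main() {
	var p *PointerResultStruct // typed nil pointer

	// Without the check, gob returns an error for a nil pointer in an interface.
	fmt.Println("raw encode error:", rawEncode(p))

	// With the check, the value is skipped.
	data, err := lazyEncode(p)
	fmt.Println("skipped:", data == nil, "err:", err)

	data, err = lazyEncode(&PointerResultStruct{Value: "ok"})
	fmt.Println("encoded:", len(data) > 0, "err:", err)
}
```

Skipping the value means the decode side has to treat an empty payload as a nil result; the sketch leaves that part out.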

@maxdml maxdml changed the title More Gob fixes Lazy registration for gob encoding Oct 7, 2025
@maxdml maxdml force-pushed the handle-nilable-types branch from 400f98c to b43c5b6 Compare October 8, 2025 16:00
Member

@kraftp kraftp left a comment

How does deserialization during recovery work if types are only registered at serialization time?

Collaborator Author

maxdml commented Oct 8, 2025

How does deserialization during recovery work if types are only registered at serialization time?

Good point: it doesn't, so we need to keep the lazy registration from RunAsStep. During recovery:

  • If a recovered workflow is already done, nothing will happen: the input/output are not read during RunWorkflow, and the recovery process dismisses the polling handle it obtains from RunWorkflow
  • If a recovered workflow is not done, it'll re-execute all the steps, decoding the output of completed ones from the DB. That's where we need the RunAsStep lazy registration.
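
A rough sketch of the second case, assuming the step's return type is available as a type parameter at the call site. The names `registerFor`, `encodeOutput`, `decodeOutput`, and `StepOutput` are hypothetical and not the library's API. Registering the type before decoding is what lets a recovered step read its recorded output. When the type parameter is itself an interface, there is nothing concrete to register.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"errors"
	"fmt"
	"reflect"
)

// StepOutput is a hypothetical step result type.
type StepOutput struct {
	Count int
}

// registerFor registers T with gob when T is a concrete type. If T is an
// interface type, its zero value carries no concrete type, so there is
// nothing to register and the function reports false.
func registerFor[T any]() bool {
	var zero T
	if reflect.TypeOf(zero) == nil {
		return false
	}
	gob.Register(zero)
	return true
}

// encodeOutput mimics what a completed step stores in the database.
func encodeOutput(v any) ([]byte, error) {
	if v == nil {
		return nil, errors.New("nothing to encode")
	}
	gob.Register(v)
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(&v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeOutput mimics a recovered step reading its recorded output. It
// registers T first, because a fresh process may never have encoded a T.
func decodeOutput[T any](data []byte) (T, error) {
	var zero T
	registerFor[T]()
	var v any
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&v); err != nil {
		return zero, fmt.Errorf("failed to decode data: %w", err)
	}
	out, ok := v.(T)
	if !ok {
		return zero, fmt.Errorf("unexpected decoded type %T", v)
	}
	return out, nil
}

func main() {
	data, err := encodeOutput(StepOutput{Count: 3})
	if err != nil {
		panic(err)
	}
	out, err := decodeOutput[StepOutput](data)
	fmt.Println(out.Count, err)

	// An interface type parameter gives registration nothing to work with.
	fmt.Println(registerFor[error]())
}
```

Because the encode call in the same process already registers the type, this program cannot reproduce a fresh-process recovery on its own.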

And to re-iterate the generalized challenge:

If an application reads any workflow/step input/output whose types have not been registered by the current code (e.g., data written by a previous version, or a recovered workflow that calls ListWorkflows on other workflows the runtime has not seen yet), the read will error because we can't decode it.

Another challenge btw is being able to test all these edge cases. The gob registry is private, so it's not exactly easy to wipe it and simulate, say, a fresh process recovery scenario.

Collaborator Author

maxdml commented Oct 8, 2025

Given that it'll be very complicated, if not outright impossible, to understand yet-unseen user-defined types, I think we should:

  • Factor out the serializer as an interface that can be provided by the user
  • Use JSON as the default serializer. It'll work for most data.
  • Also provide a gob serializer that does lazy registration (and we can do some more for the user), and document clearly what types must be registered.
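
One possible shape for that proposal, as a sketch only. The `Serializer` interface, its method names, `JSONSerializer`, `GobSerializer`, and `Payload` are all hypothetical and illustrate the idea rather than any API in the library.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
)

// Serializer is a hypothetical user-pluggable interface.
type Serializer interface {
	Encode(v any) ([]byte, error)
	Decode(data []byte, out any) error // out must be a non-nil pointer
}

// JSONSerializer is the proposed default: no type registry is needed.
type JSONSerializer struct{}

func (JSONSerializer) Encode(v any) ([]byte, error)      { return json.Marshal(v) }
func (JSONSerializer) Decode(data []byte, out any) error { return json.Unmarshal(data, out) }

// GobSerializer registers types lazily, both when encoding and when decoding.
type GobSerializer struct{}

func (GobSerializer) Encode(v any) ([]byte, error) {
	if v == nil {
		return nil, errors.New("gob serializer: nothing to encode")
	}
	gob.Register(v) // lazy registration of the concrete type
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(&v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func (GobSerializer) Decode(data []byte, out any) error {
	dst := reflect.ValueOf(out)
	if dst.Kind() != reflect.Pointer || dst.IsNil() {
		return errors.New("gob serializer: out must be a non-nil pointer")
	}
	// Register the destination type when it is concrete. Types that only
	// appear behind an interface must still be registered by the user.
	if dst.Elem().Kind() != reflect.Interface {
		gob.Register(dst.Elem().Interface())
	}
	var v any
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&v); err != nil {
		return err
	}
	rv := reflect.ValueOf(v)
	if !rv.IsValid() || !rv.Type().AssignableTo(dst.Elem().Type()) {
		return fmt.Errorf("gob serializer: cannot store %T in %T", v, out)
	}
	dst.Elem().Set(rv)
	return nil
}

// Payload is a hypothetical workflow input.
type Payload struct {
	Name string
	N    int
}

// roundTrip encodes and decodes a Payload with the given serializer.
func roundTrip(s Serializer, in Payload) (Payload, error) {
	var out Payload
	data, err := s.Encode(in)
	if err != nil {
		return out, err
	}
	err = s.Decode(data, &out)
	return out, err
}

func main() {
	for _, s := range []Serializer{JSONSerializer{}, GobSerializer{}} {
		out, err := roundTrip(s, Payload{Name: "a", N: 2})
		fmt.Printf("%T: %+v %v\n", s, out, err)
	}
}
```

The JSON path needs no registry, which is why it works for most data. The gob path still depends on the process having seen every concrete type that is stored behind an interface.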

@maxdml maxdml closed this Oct 27, 2025
3 participants