Skip to content

Conversation

@NeoLegends
Copy link
Member

@NeoLegends NeoLegends commented May 9, 2025

This PR adds an adapted version of the v2 serialization written by @albertz. I intend for this PR to house discussion on what features to support in v2 serialization and what not to support.

The adaptations I made mainly concern themselves around being compatible with existing RETURNN configs that use the v1 serialization mechanisms like the Import, PartialImport, etc.

@NeoLegends
Copy link
Member Author

I brought back Dim serialization in a way where RETURNN is not imported in the manager.

known_modules: Collection[str] = (),
sis_path_handling: SisPathHandling = None,
):
from returnn.tensor import Dim, batch_dim, single_step_dim
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import requires returnn to be available for i6_core, I think this is not the case in many setups, maybe we should make the import conditional.

Copy link
Collaborator

@Icemole Icemole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go through all of the code (until half of the _Serializer class), but here's my feedback. Feel free to fix what you think is appropriate :)


We handle those objects specially:
- primitive types (int, float, bool, str)
- Sisyphus Path objects
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding Path serialization (1/2, also see below): as far as I understand, you also seem to be handling sisyphus variables by using _instanciate_delayed_copy (as seen here). Is this correct?

If it's indeed correct, are you sure that Path handling is triggered? I assume that you added this because you found some case in which it wasn't supported. But if it's not needed, maybe you could also drop explicit support for Path. Path inherits from AbstractPath, which in turn inherits from DelayedBase (source). That would make the code easier by also dropping e.g. SisPathHandling entirely.

Note: Path.get() calls Path.get_path(), as seen here, so either call is equivalent.

.strip()
)
assert isinstance(_base_sys_path_list, list) and all(isinstance(p, str) for p in _base_sys_path_list)
return _base_sys_path_list
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you always use the output of this function along with the in operator. I'd recommend to convert to a set (all elements are strings anyway, so they're hashable).

return name.isidentifier() and not iskeyword(name)


class ReturnnConfigWithNewSerialization(ReturnnConfig):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReturnnConfigV2 is a shorter name, and matches the directory structure. Better or not, I leave it to you... I personally think it's better 🙂

except OSError:
# cannot get source, e.g. code is a lambda
assert name is not None
args = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it seems "obvious" that you're trying to serialize the function manually, the code seems extremely obscure. I'd recommend adding a source as a comment that we can look for in case of doubt about the functionality of this code.

Is it just an exhaustive enumeration of the second column in https://docs.python.org/3/library/inspect.html, type "code"? If so, here's a suggestion for a comment:

Suggested change
args = [
# Dump all elements that identify this piece of code as a function.
# The exhaustive enumeration below is taken from https://docs.python.org/3/library/inspect.html, type "code".
args = [

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this comes from https://github.com/rwth-i6/i6_core/blob/main/returnn/config.py#L293. The code is practically equivalent, so I strongly suggest to find ways of combining both chunks of code.

For a preliminary idea: you're only using _parse_python inside ReturnnConfigWithNewSerialization. Moreover, I see that the only difference is an assertion in the code (https://github.com/rwth-i6/i6_core/blob/main/returnn/config.py#L299) which you don't seem to have (is this intended?). Because of this two factors, I suggest to make the _parse_python function a method in ReturnnConfigWithNewSerialization with the following structure:

class ReturnnConfigWithNewSerialization(ReturnnConfig):
    # Nitpick: note the double underscore at the beginning to ideally match the definition of the superclass.
    def __parse_python(...):
        try:
            return super().__parse_python(...)
        except AssertionError:
            # Thrown by the superclass method in some circumstances: https://github.com/rwth-i6/i6_core/blob/main/returnn/config.py#L299.
            # We ignore the assertion on purpose because TODO.
            return self.__parse_python(code.get())


class ReturnnConfigWithNewSerialization(ReturnnConfig):
"""
Overwrites the serialization behavior of ReturnnConfig
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Overwrites the serialization behavior of ReturnnConfig
V2 RETURNN config with simpler and better serialization. This class is completely retrocompatible with respect to V1.
Allows serializing lambda functions as well as sisyphus `Path` and `Variable`, among the supported data types in the base RETURNN config class.
Migration is recommended.

If possible add more details to motivate people to migrate to this better serialization config class.

python_prolog_code,
serialized.as_serialized_code(),
python_epilog_code,
"# -*- mode: python; tab-width: 4 -*-\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this supposed to be at the top of the file?

known_modules={"returnn"},
extra_sys_paths=extra_sys_paths,
# instanciate_delayed should already have handled it (e.g. Path),
# or if it has not, then we want to keep it as it is (e.g. PtCheckpoint),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding Path serialization (2/2): from the comments here I see that you already thought of Path being handled by instanciate_delayed. Can you provide an example of an object (e.g. PtCheckpoint as per the comments) that might not be properly handled?

return "".join(code.py_code for code in self.code_list)


class _Serializer:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I strongly prefer having basic code as much to the top of the file as possible (e.g. _Serializer), and then code which builds from such code (e.g. serialize_config).

This also includes methods from the classes, e.g. _handle_next_queue_item would precede work_queue.

But that's my preference, maybe yours differs.

_Ref(value): name for name, value in self._internal_reserved_names.items()
}

def work_queue(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming the function for clarity and adding a few more details as a TL;DR for the docstring?

Suggested change
def work_queue(self):
def resolve_serialization_values(self):
"""
Runs the serialization queue, making sure that all values can be properly serialized.
If a value depends on another value, the latter is serialized first.



@dataclass
class SerializedConfig:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that as_serialization_collection is not used (isn't it?). I propose dropping this class and using "".join(...) instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants