Skip to content

Optional aggregate processors - a pain point #27

@acarapetis

Description

@acarapetis

I'm confused about the behaviour of aggregate processors (dictionary and in particular user_object) when using required=False. Consider the following example:

>>> import declxml as xml
>>>
>>> person = xml.dictionary('person', [
...     xml.string('name', required=True),
... ], required=False)
>>> root = xml.dictionary('root', [
...     person,
... ])
>>>
>>> xml.parse_from_string(root, '<root></root>')
{'person': {}}
>>> xml.serialize_to_string(root, {'person': {}})
'<root />'

I find the default value {} for missing optional dictionaries a bit strange. It's consistent with the primitive processors (that return e.g. '' instead of None or a missing dict key for strings), but it's counterintuitive to me that {'person': {}} is a valid input to the root processor, but {} is invalid for the person processor:

>>> xml.serialize_to_string(person, {})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/centos/declxml/declxml.py", line 356, in serialize_to_string
    root = root_processor.serialize(value, state)
  File "/home/centos/declxml/declxml.py", line 1009, in serialize
    self._serialize(end_element, value, state)
  File "/home/centos/declxml/declxml.py", line 1041, in _serialize
    child.serialize_on_parent(element, child_value, state)
  File "/home/centos/declxml/declxml.py", line 1261, in serialize_on_parent
    state.raise_error(MissingValue, self._missing_value_message(parent))
  File "/home/centos/declxml/declxml.py", line 1369, in raise_error
    raise exception_type(error_message)
declxml.MissingValue: Missing required value for element "name" at person/name

This becomes particularly annoying when it comes to optional user_object processors: when the corresponding element is omitted from the XML, the parser produces an instance of the custom class with no attributes set, which I would think is rarely meaningful.

In my current codebase I'm using the following workaround in my base model class, which results in missing objects parsing to None:

    def __new__(cls, *args, **kwargs):
        if all(v is None for v in kwargs.values()):
            return None
        return super().__new__(cls)

This could alternatively be achieved with an after_parse hook. Either way, though, it's not perfect: in the case where the subprocessor validates against an empty element, empty elements appear as if they are missing, and as far as I can tell there's no way to distinguish them. For example, if we set required=False on person.name, the documents <root /> and <root><person/></root> will parse to the same structure.

Changing the way optional processors work would obviously be a large breaking change, but what do you think about adding a default= parameter to dictionary, user_object (and maybe array)? Then users could simply specify default=None.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions