Make declarations array optional in data model #620
I don't think this is the right direction. I suggest a wontfix.
The data model is not normative:
Implementations are not required to use this data model for their internal representation of messages.
Implementations are free to make declarations optional for the purpose of optimizing memory allocation or otherwise. If, however, they want to translate their internal data model to the canonical one recommended (but not enforced) by the spec, then I think they should include an empty declarations array. It's semantically more correct to do so, and it makes the canonical data model easier to work with.
Note that in many languages, an empty vector doesn't allocate memory, so the overhead is minimal (e.g. C++, Rust).
Consider, however, our stated primary purpose for the data model:
Are there any other message formats that provide for something like declarations? I'm not aware of any, but I'd be happy to be corrected here. My point here is that as the data model is defining an interface for messages in other syntaxes to be expressed as MF2, or for MF2 content to be expressed in such syntaxes, it should be reasonable for the interface to take into account both sides here, and not require the empty
I won't block this, but I agree with @stasm -- having two different representations for a message with no declarations seems weird. It also complicates the implementation with extra special cases: consider (pseudocode) vs. To avoid the special case, the implementation could canonicalize the data model to an internal form that requires the
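The pseudocode from this comment did not survive on this page. As a rough illustration of the kind of special case being described, here is a minimal TypeScript sketch. The interfaces `LooseMessage` and `StrictMessage` and all three functions are hypothetical simplifications made up for this example; they are not the spec's actual data model definitions.

```typescript
// Hypothetical, simplified shapes (assumptions for illustration only).
interface Declaration {
  name: string;
}

// Variant A: `declarations` may be absent.
interface LooseMessage {
  declarations?: Declaration[];
  pattern: string[];
}

// Variant B: `declarations` is always present, possibly empty.
interface StrictMessage {
  declarations: Declaration[];
  pattern: string[];
}

// With an optional field, each consumer has to handle the missing case.
function declaredNamesLoose(msg: LooseMessage): string[] {
  if (msg.declarations === undefined) {
    return [];
  }
  return msg.declarations.map((d) => d.name);
}

// With a required field, an empty array needs no special handling.
function declaredNamesStrict(msg: StrictMessage): string[] {
  return msg.declarations.map((d) => d.name);
}

// Canonicalizing once at the boundary moves the special case to one place.
function canonicalize(msg: LooseMessage): StrictMessage {
  return { declarations: msg.declarations ?? [], pattern: msg.pattern };
}
```

Under these assumptions, the cost of the optional form is one extra branch per consumer, or one canonicalization step at the point where messages enter the implementation.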
Both objections seem reasonable to me? @eemeli would you be okay to keep as is? We can seek feedback on the data model in general in tech preview.
The issues raised by @stasm and @catamorphism seem to be elevating the concerns of the implementation needing to include a single Allowing the
Expression attributes are optional: message-format-wg/spec/data-model/README.md Lines 135 to 140 in 551b5f7
FunctionAnnotation options are optional: message-format-wg/spec/data-model/README.md Lines 199 to 203 in 551b5f7
Markup options and attributes are optional: message-format-wg/spec/data-model/README.md Lines 249 to 255 in 551b5f7
It's not about just a single
What I mean by (b) is that thanks to the guarantee of there always being an array, working with the data model becomes easier, more expressive, and less error prone. For example, it's trivial to chain operations like The canonical data model should be, well, "canonical". I don't think we should introduce optimizations to it.
Expression attributes are not part of the proposed spec; we only reserved the syntax for them. Why are they in the data model?
We should fix both of these; they should be empty arrays, for the same reasons as declarations.
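The chaining example in this comment was lost on this page. A minimal TypeScript sketch of the point about chaining might look like the following; the `Declaration` shape, its `kind` field, and both function names are assumptions made up for illustration, not taken from the spec.

```typescript
// Hypothetical, simplified declaration shape (assumption for illustration).
interface Declaration {
  name: string;
  kind: "input" | "local";
}

// The array is guaranteed to exist, so operations chain directly.
interface Message {
  declarations: Declaration[];
}

function localNames(msg: Message): string[] {
  return msg.declarations
    .filter((d) => d.kind === "local")
    .map((d) => d.name);
}

// If the array may be absent, the chain needs `?.` and a fallback value.
interface MessageWithOptional {
  declarations?: Declaration[];
}

function localNamesOptional(msg: MessageWithOptional): string[] {
  return (
    msg.declarations
      ?.filter((d) => d.kind === "local")
      .map((d) => d.name) ?? []
  );
}
```

Both functions return the same result for the same declarations; the difference is only in how much defensive syntax the caller has to remember.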
@eemeli So I take it that you don't think we can park this for ldml45?
A general point is that the data model should be a reference model. Few if any implementations will implement it precisely as described. And we cannot make them do that, especially in environments where different structures are more efficient. And a reference model should favor reader comprehension over "memory savings". That being said, the data model doesn't have to be solid for the tech preview, so changes can come after; it can be parked.
I filed #632 to remove expression attributes from the data model (and keep them reserved in the syntax).
I filed #633 to make all arrays in the data model non-optional. This is the opposite of what this PR proposes.
This is not necessarily the case. The JS
As it's looking likely that syntax parsing will need to be initially left out of the proposal, this is placing the data model in a rather prominent position. Much like the syntax, the JS data model API will need to be convenient to use, so that e.g. a message like

    { type: 'message', pattern: [
      'Hello ',
      { type: 'expression',
        arg: { type: 'variable', name: 'place' },
        annotation: { type: 'function', name: 'string' } }] }

won't be thrown out because it does not include an empty

The intent with the current PR is to ensure that the MF2 data model definition meets that bar, and so avoid a need or an interest for the ECMA-402 spec to define its own less-strict data model specification, as that would be unfortunate.

This change also makes future changes to the data model easier to implement. Consider, for instance, if we accept #632 as currently proposed, and remove attributes from the data model. It is possible that their definition (or the definition of any similar feature that will require representation in the data model) will happen in MF 2.1 or some later spec version. What should we do then about the requirement for any such added fields; will they be optional, or required?

One intended purpose of the data model is interchange between systems, in APIs. In such use, I do not think it's reasonable to expect and require both endpoints to be updated simultaneously, as would be necessitated by any new required field. Instead, they should be optional. Does it not make sense then to make all such fields (declarations, attributes, and options) optional, so that we won't have a mix of them later on?
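To make the interchange argument concrete, here is a minimal TypeScript sketch of a receiver that accepts the message shown above even though it carries no declarations field. The interface names and the two helper functions are hypothetical and simplified for this example; they are not the ECMA-402 proposal's or the MF2 spec's actual definitions.

```typescript
// Hypothetical, simplified input types (assumptions for illustration).
interface VariableRef {
  type: "variable";
  name: string;
}

interface FunctionAnnotation {
  type: "function";
  name: string;
  options?: unknown[]; // optional, so senders may omit it
}

interface Expression {
  type: "expression";
  arg?: VariableRef;
  annotation?: FunctionAnnotation;
}

interface MessageInput {
  type: "message";
  declarations?: unknown[]; // optional, so senders may omit it
  pattern: Array<string | Expression>;
}

// The message from the comment above, with no `declarations` field.
const hello: MessageInput = {
  type: "message",
  pattern: [
    "Hello ",
    {
      type: "expression",
      arg: { type: "variable", name: "place" },
      annotation: { type: "function", name: "string" },
    },
  ],
};

// A receiver that treats an absent array the same as an empty one.
function declarationCount(msg: MessageInput): number {
  return (msg.declarations ?? []).length;
}

// Collects the variable names referenced in the pattern.
function variableNames(msg: MessageInput): string[] {
  const names: string[] = [];
  for (const part of msg.pattern) {
    if (typeof part !== "string" && part.arg) {
      names.push(part.arg.name);
    }
  }
  return names;
}
```

In this sketch, a sender built against an older shape that never emits `declarations` or `options` still produces input the receiver accepts, which is the compatibility property the comment argues for.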
This deserves a disclaimer: the quoted spec is a proposal, of which you are the author. It's possible that the proposed spec for

Early on we made a decision to make the syntax, not the data model, the canonical representation for messages, based on the assumption that the syntax is a good common denominator: it's relatively simple, it's text, and we can guarantee its stability over time. The recent development in TC39 deserves a dedicated discussion since it has impact on the strategy we chose for MF2. In the meantime, #633 aligns the data model closer to what I'd consider the "reference" data model.
Obsoleted by #633
Most messages are simple, and have no declarations. They should be optional in the data model, so that we don't need to include an empty declarations: [] for every one of them.