Merged
103 changes: 103 additions & 0 deletions exploration/error-handling.md
@@ -0,0 +1,103 @@
# Error Handling

Status: **Proposed**

<details>
<summary>Metadata</summary>
<dl>
<dt>Contributors</dt>
<dd>@echeran</dd>
<dt>First proposed</dt>
<dd>2024-06-02</dd>
<dt>Issues</dt>
<dd><a href="https://github.com/unicode-org/message-format-wg/issues/782">#782</a></dd>
<dt>Pull Requests</dt>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/795">#795</a></dd>
</dl>
</details>

## Objective

Decide whether and what implementations "MUST" / "SHOULD" / "MAY" perform after a runtime error, regarding:

1. information about error(s)
   - including, if relevant, the minimum number of errors for which such information is expected
1. a fallback representation of the message

## Background

In practice,
runtime errors happen when formatting messages.
It is useful to provide information about any formatting error(s) back to the callsite.
It is useful to the end user to provide a best-effort fallback representation of the message.
Specifying the behavior in such cases promotes consistent results across conformant implementations.

However, implementations of MessageFormat 2.0 face different constraints, for several reasons:

* Programming language: the language of the implementation informs idiomatic patterns of error handling.
In Java, errors are thrown and subsequently caught in a `try...catch` block.
In Rust, fallible callsites (those which can return errors) should return a `Result<T, E>` monad.
In both languages, built-in error handling assumes a single error.
* Environment constraints: as mentioned in [feedback from ICU4X](https://github.com/unicode-org/message-format-wg/issues/782#issuecomment-2103177417),
ICU4X operates in low resource environments for which returning at most 1 error is desirable
because returning more than 1 error would require heap allocation.
* Programming conventions and idioms: in [feedback from ICU-TC](https://docs.google.com/document/d/11yJUWedBIpmq-YNSqqDfgUxcREmlvV0NskYganXkQHA/edit#bookmark=id.lx4ls9eelh99),
over the 25 years of maintaining the library, they found that providing a default best-effort return value at the same time as error information had more cost than benefit.
ICU4C's C++ style adds a further constraint: it uses error codes rather than throwing errors using the STL, which makes the combination less useful and less likely to be used correctly during nested calls.
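
To make the contrast between reporting a single error and reporting several more concrete, here is a minimal Python sketch. It is not a MessageFormat 2.0 implementation: the `{$name}` placeholder regex, `FormatError`, `format_first_error`, and `format_all_errors` are hypothetical names invented for this illustration. The exception-style function stops at the first unresolved variable, while the collecting style needs a growable list, which is the kind of allocation the ICU4X feedback is concerned about.

```python
import re

# Toy placeholder syntax for illustration only; this is not an MF2 parser.
PLACEHOLDER = re.compile(r"\{\$(\w+)\}")


class FormatError(Exception):
    """Hypothetical error type for an unresolved variable."""

    def __init__(self, variable):
        super().__init__(f"unresolved variable: ${variable}")
        self.variable = variable


def format_first_error(pattern, args):
    """Exception style: formatting stops at the first error."""

    def substitute(match):
        name = match.group(1)
        if name not in args:
            raise FormatError(name)
        return str(args[name])

    return PLACEHOLDER.sub(substitute, pattern)


def format_all_errors(pattern, args):
    """Collecting style: keeps going and returns every error in a list."""
    errors = []

    def substitute(match):
        name = match.group(1)
        if name not in args:
            errors.append(FormatError(name))
            return match.group(0)  # leave the placeholder as a fallback
        return str(args[name])

    text = PLACEHOLDER.sub(substitute, pattern)
    return text, errors
```

With two missing variables, `format_first_error` reports only the first one, whereas `format_all_errors` returns both errors alongside a fallback string.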

## Proposed Design

The following spec text is proposed:

> In all cases, when encountering an error during formatting,
> a message formatter MUST provide some representation of the message,
> or MUST provide an informative error or errors.
> An implementation MAY provide both.
Member

To me, the fact of an error is more important than the fallback representation. Showing the fallback representation to end users is less good than having good error recovery (or no error at all). This wording makes the fallback representation first and thus, seemingly, the more important thing.

I also find the "MUST or MUST" formulation to be weird. I think this is trying to say "MUST provide some representation of the message or an informative error or errors or both"

I think it would be weird if an implementation never failed, that is, if it returned the fallback representation silently with no signal of failure. When would this be a good thing? How would the caller find out about the error?

Note that we do not specify how errors are signaled. Just because (for example) Java often throws Throwable does not mean that the MF2 implementation has to use that as the error mechanism. See, for example, the use of ParsePosition in NumberFormat.

Thus I'd suggest an additional alternative:

Suggested change
- > In all cases, when encountering an error during formatting,
- > a message formatter MUST provide some representation of the message,
- > or MUST provide an informative error or errors.
- > An implementation MAY provide both.
+ > In all cases, when encountering an error,
+ > a message formatter MUST provide an informative error or errors.
+ > It MAY also provide the appropriate fallback representation of the _message_ defined
+ > in this specification.

Collaborator Author

I applied most of the suggested change in order to address the simplification of the "MUST or MUST" construction.

Now, the reason for that verbose wording was to be as unrestrictive as possible. It's basically saying, "You do something when you get an error. You must provide a best effort message or an error, maybe both. You can't do nothing."

The point about requiring signaling that an error occurred is an extra constraint beyond the existing text. I think it is a reasonable constraint, so I will incorporate that. But I'm not comfortable strengthening that constraint any further because, as we discussed in Monday's meeting, there is a big difference between an "error" (ex: instance of java.lang.Exception) and a "signal that an error occurred" (ex: have just a boolean, or have a "strict version alternate API"). In order to avoid ambiguity about that, I much prefer "signal an error" rather than "provide an error" (this potential ambiguity is why I prodded us in Monday's meeting to be specific about what we mean by "provide/return an error").

I also think it would help avoid ambiguity to say "be able to signal" rather than "signal" so that we more clearly support implementations that choose to meet the requirement with an alternative that uses 2 APIs (ex: an infallible best-effort and a strict fallible one that signals an error). I think that would also address Tim's concern in his review comment since we have all been comfortable allowing choice by implementations since the time it was brought up 2 months ago.


This solution requires implementations to return _something_,
but it leaves the decision to the implementation whether to:

* return an error (or errors)
Member

I would probably say:

Suggested change
- * return an error (or errors)
+ * return or emit an error or errors

One reason is that throwing an exception (in languages that do that) is not the same as returning a return value.

Collaborator Author

How about "signal" instead of "emit"?

Collaborator Author

I used the word "signal" instead of "emit" because that better accommodates all of the alternatives in the solution space.

Note: I had to rewrite this section to reflect the extra constraint that I am taking on from your suggestion in the other conversation thread, since this section existed to clarify the spec text concretely and in plain words.

* return a representative message
* return both

This does not give implementations the freedom to return _nothing_ or to exhibit some other behavior.
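
One way an implementation might offer this choice is a pair of entry points: an infallible best-effort one and a strict one that signals an error. The Python sketch below is only an illustration under stated assumptions. The `{$name}` placeholder syntax is a toy stand-in rather than real MF2 parsing, and `MessageFormatError`, `format_best_effort`, and `format_strict` are hypothetical names that do not come from any existing library.

```python
import re

# Toy placeholder syntax for illustration only; this is not an MF2 parser.
PLACEHOLDER = re.compile(r"\{\$(\w+)\}")


class MessageFormatError(Exception):
    """Hypothetical error raised by the strict entry point."""


def _resolve(pattern, args):
    """Shared helper: returns the formatted text and the missing variable names."""
    missing = []

    def substitute(match):
        name = match.group(1)
        if name in args:
            return str(args[name])
        missing.append(name)
        return match.group(0)  # keep the placeholder as the fallback text

    text = PLACEHOLDER.sub(substitute, pattern)
    return text, missing


def format_best_effort(pattern, args):
    """Infallible: always returns some representation of the message."""
    text, _ = _resolve(pattern, args)
    return text


def format_strict(pattern, args):
    """Fallible: signals an error instead of returning a fallback."""
    text, missing = _resolve(pattern, args)
    if missing:
        raise MessageFormatError("unresolved variables: " + ", ".join(missing))
    return text
```

A caller that only wants something to display can use `format_best_effort`, while a caller that needs to know about failures can use `format_strict` and handle the exception.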

## Alternatives Considered

### Current spec: require information from error(s) and a representative best effort message

The current spec text says:

> In all cases, when encountering a runtime error,
> a message formatter MUST provide some representation of the message.
> An informative error or errors MUST also be separately provided.

This alternative places constraints on implementations to provide multiple avenues of useful information (to the callsite and user).

This alternative establishes requirements that would conflict with the constraints of projects that have implemented MF 2.0 (or likely will soon), based on:
* programming language idioms/constraints
* execution environment constraints
Collaborator

What execution environment constraints does this alternative contravene? The only such mentioned in this document is that the cost of returning more than one error may be prohibitive in some cases, and the current text explicitly says "error or errors" to allow for an implementation signaling a single error to be valid.

Suggested change
- * execution environment constraints

Collaborator Author

The text says "an informative error". As written, that implies to me an error object, like a java.lang.Throwable instance. Despite discussion in last week's meeting to interpret that phrase as equivalent to "whether or not an error occurred", it comes across distinctly as something more and thus should be rewritten if the intent is not so. There are applications like ICU and potentially browsers that might only want to provide a best effort message and signal an error, but not pay the cost of creating "informative error" objects each time.

As I mentioned in the 2024-04-09 meeting, another paradigm from which to look at this, besides "whether returning an error is possible", is "how actionable is returning the error object". ICU & browsers need to be performant and might not want to pay the cost of creating a full error object.

* experience-based programming guidelines
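
The difference between providing informative errors and merely signaling that an error occurred can be sketched in Python as follows. This is a hedged illustration, not any implementation's actual API: the `{$name}` placeholder syntax is a toy stand-in for real MF2 parsing, and `FormatResult`, `format_with_errors`, and `format_with_flag` are hypothetical names. The first function satisfies the "message plus informative errors" reading by building an error description for every failure; the second returns the same fallback text with only a boolean.

```python
import re
from dataclasses import dataclass

# Toy placeholder syntax for illustration only; this is not an MF2 parser.
PLACEHOLDER = re.compile(r"\{\$(\w+)\}")


@dataclass(frozen=True)
class FormatResult:
    """Hypothetical result carrying both the message and error details."""
    text: str
    errors: tuple  # one description string per error encountered


def format_with_errors(pattern, args):
    """Returns a fallback message and an informative entry for each error."""
    errors = []

    def substitute(match):
        name = match.group(1)
        if name in args:
            return str(args[name])
        errors.append(f"unresolved variable: ${name}")
        return match.group(0)  # keep the placeholder as the fallback text

    text = PLACEHOLDER.sub(substitute, pattern)
    return FormatResult(text, tuple(errors))


def format_with_flag(pattern, args):
    """Returns a fallback message and only a boolean error signal."""
    had_error = False

    def substitute(match):
        nonlocal had_error
        name = match.group(1)
        if name in args:
            return str(args[name])
        had_error = True  # no error object is created
        return match.group(0)

    text = PLACEHOLDER.sub(substitute, pattern)
    return text, had_error
```

Both functions produce the same fallback text; they differ only in how much error information the caller receives and what it costs to build.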

### Allow implementations to determine all details

> When encountering an error during formatting,
> a message formatter MAY provide some representation of the message,
> or it MAY provide an informative error or errors.
> An implementation MAY provide both.

This alternative places no expectations on implementations,
which supports the constraints we know now,
as well as any possible constraints in the future
(ex: new programming languages, new execution environments).

This alternative does not assume or assert that some type of useful information
(error info, representative message)
will be possible and should be returned.

### Alternate wording

> When an error is encountered during formatting,
> a message formatter can provide an informative error (or errors)
> or some representation of the message or both.