Commit 52998e7
Merge pull request #24 from goldfirere/error-messages-2
Coordination for structured error messages
2 parents 1541e2f + bbc9201 commit 52998e7

1 file changed

+237
-0
lines changed
proposals/000-error-messages.md

Lines changed: 237 additions & 0 deletions
@@ -0,0 +1,237 @@
1+
## Introduction
2+
3+
This proposal seeks technical coordination from the Haskell Foundation
4+
for improving the interop story between GHC and HLS. Once that is
5+
done well, we might imagine taking our gained knowledge to improve
6+
other interop around error messages.
7+
While much of the work may be doable by volunteers, the HF would play
8+
a role in harnessing and corralling the volunteers, as well as coordinating
9+
common APIs between tools that are both easy to implement and easy to use.
10+
The HF may also be instrumental in managing an error code namespace, shared
11+
among all tooling central to Haskell.
12+
13+
## Background
14+
15+
Currently, there is no discipline around error messages. This lack of
16+
structure manifests itself in a number of ways:
17+
18+
- HLS must parse the error messages that GHC
19+
produces. This is fragile, wasteful, and hard to keep up-to-date. For
20+
example, HLS looks to see if a GHC extension name appears in an error
21+
message, in order to allow the user to automatically enable it via a pragma.
22+
But since `KindSignatures` is a substring of `StandaloneKindSignatures`, any
23+
message mentioning the latter causes HLS to suggest enabling both `KindSignatures`
24+
and `StandaloneKindSignatures` -- even though only `StandaloneKindSignatures`
25+
would actually work. While there is a workaround here, we can see that
26+
better communication between GHC and HLS would avoid this class of problem.
27+
28+
- Many GHC error messages refer to advanced concepts. This is unavoidable,
29+
as Haskell has advanced features. However, telling a user that their rigid
30+
type variable does not unify with a type because there is a kind mismatch
31+
is utterly bewildering to Haskell learners. Applying structure to error messages
32+
would allow for the creation of an error-message index that could explain
33+
what the messages mean -- and how to fix the errors.
34+
35+
- Given that tools are increasingly working with one another and invoking
36+
one another, it can be hard to know who exactly is producing an error
37+
message. In one recent example that happened to me, I was trying to get
38+
GHC to work with a GHC plugin, and I got a baffling error. It took me
39+
more than an hour, if I recall, to discover that the problem was with
40+
Haddock (I forget the details) and that I just needed to pass `--disable-documentation`.
41+
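The substring problem from the first bullet above can be reproduced in a few lines. The following is a minimal sketch, not code taken from HLS or GHC: the names `knownExtensions`, `suggestFromText`, `Diagnostic`, `MissingExtension`, and `suggestFromDiagnostic` are all hypothetical, and the message text used to exercise it is invented for illustration.

```haskell
import Data.List (isInfixOf)

-- Hypothetical, abbreviated list of extension names a tool might scan for.
knownExtensions :: [String]
knownExtensions = ["KindSignatures", "StandaloneKindSignatures"]

-- String-based approach: suggest every extension whose name occurs
-- anywhere in the rendered message text.
suggestFromText :: String -> [String]
suggestFromText msg = filter (`isInfixOf` msg) knownExtensions

-- Structured approach: the diagnostic itself names the missing extension.
-- 'Diagnostic' and its constructors are illustrative, not GHC's actual API.
data Diagnostic
  = MissingExtension String
  | OtherDiagnostic String

-- No text scanning is needed; the payload is matched exactly.
suggestFromDiagnostic :: Diagnostic -> [String]
suggestFromDiagnostic (MissingExtension ext) = [ext]
suggestFromDiagnostic (OtherDiagnostic _)    = []
```

Under these assumptions, `suggestFromText` applied to a message that mentions only `StandaloneKindSignatures` returns both extension names, because `KindSignatures` is a substring of the longer name. `suggestFromDiagnostic (MissingExtension "StandaloneKindSignatures")` returns just the one extension the producer intended.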
42+
There is already work in this area, within GHC. For the past few years, GHC
43+
has slowly been converting its error messages to be encoded in data constructors,
44+
not just as (fancy) strings. [This wiki page](https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values)
45+
and [this blog post](https://well-typed.com/blog/2021/08/the-new-ghc-diagnostic-infrastructure/) describe
46+
roughly the state of play. However, this work currently lacks a very important
47+
ingredient: clients. That is, if GHC is exporting new, fancy datatypes encoding
48+
its error messages, are these datatypes of use to HLS? We've reached out
49+
to potential clients for feedback, but the best response we've gotten is something
50+
along the lines of "sure, looks good". That's encouraging, but I would want to have
51+
a little more coordination to make sure that the interface GHC is building is one
52+
that can be easily consumed. The HF could help here by coordinating this
53+
communication between projects.
54+
55+
The website describing error messages and error-generator identification
56+
are both new in this proposal.
57+
58+
## Motivation
59+
60+
- It is better for tools to collaborate by passing structured data than
61+
by sending strings back and forth. Structured error messages will thus
62+
accelerate the development of powerful editor integrations and other
63+
code analysis tools.
64+
65+
- Establishing a website describing error messages will make it a standard
66+
reference in the Haskell community and flatten the learning curve for
67+
new Haskellers.
68+
69+
While the two main goals of this proposal (conversion of all error messages
70+
to use datatypes; assigning error codes / creating a website) could be
71+
considered separately, I think they make sense together in this proposal.
72+
The second goal depends on the first, and it seems likely that many of
73+
the same potential volunteers will be interested in both. That said, it would
74+
be fine for, e.g., the HFTT to accept only one part of this proposal without
75+
the other, or simply not to commit resources until a mid-way review is conducted.
76+
77+
## Goals
78+
79+
1. When compiling a program, HLS queries GHC for error messages and receives
80+
structured errors, not strings. HLS can then use the information in these structures
81+
to offer repairs to the user or other options.
82+
83+
1. All GHC error messages include a code. These codes can be searched for on a website
84+
that explains the error message, with examples of what causes it and how the error
85+
might be fixed.
86+
87+
1. Stretch goal: Building on the success of the HLS/GHC integration around error
88+
messages, other central tooling adopts a similar approach. This would, for example,
89+
enable the possibility that HLS can report more informative configuration errors
90+
to users, or even to repair some of the problems itself.
91+
92+
1. Stretch goal: The HF would establish a global namespace for Haskell-tool
93+
error message codes, where each tool is assigned (say) a prefix it should use
94+
for any error codes. The HF would then encourage tools to use these prefixes
95+
in error codes when presenting messages.
96+
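To make the prefix idea in the last goal concrete, here is one possible shape for a namespaced error code. This is only a sketch under stated assumptions: the `PREFIX-NNNNN` layout, the type `ErrorCode`, the functions `renderCode`, `parseCode`, and `docUrl`, and the `errors.example.org` base URL are all hypothetical, since the proposal leaves the actual scheme to the Coordinator.

```haskell
import Data.Char (isDigit, isUpper)

-- A namespaced error code: a per-tool prefix plus a number.
-- The "PREFIX-NNNNN" layout is an assumption for illustration only.
data ErrorCode = ErrorCode
  { codeTool   :: String  -- e.g. "GHC" or "HLS" (hypothetical prefixes)
  , codeNumber :: Int
  } deriving (Eq, Show)

-- Render a code for display, zero-padding the number to five digits.
renderCode :: ErrorCode -> String
renderCode (ErrorCode tool n) = tool ++ "-" ++ pad (show n)
  where
    pad s = replicate (5 - length s) '0' ++ s

-- Parse a rendered code back; returns Nothing for malformed input.
parseCode :: String -> Maybe ErrorCode
parseCode s =
  case break (== '-') s of
    (tool, '-' : digits)
      | not (null tool), all isUpper tool
      , not (null digits), all isDigit digits
      -> Just (ErrorCode tool (read digits))
    _ -> Nothing

-- Build a link into an error-message index; the base URL is a placeholder.
docUrl :: ErrorCode -> String
docUrl code = "https://errors.example.org/" ++ renderCode code
```

With this layout, a flat prefix is enough to tell a user which tool produced a message, and the same rendered string can serve as the search key on the error-message website.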
97+
## What the Haskell Foundation Would Do
98+
99+
This section is meant to be suggestive of the concrete activity that would support
100+
this proposal. It is possible the HFTT or other HF people would have an alternative
101+
approach, which is fine, too.
102+
103+
1. Devote the time of an HF person, hereafter called the Coordinator, to stay on top of
104+
this project. The Coordinator could be an HF employee, an in-kind donation of labor,
105+
or perhaps a dedicated and trustworthy volunteer.
106+
I think it would be reasonable to timebox this work at 5 hours / week from
107+
the Coordinator.
108+
109+
1. A key task of the Coordinator is to source volunteers to help with this initiative.
110+
Accordingly, the Coordinator would be responsible for publicity around this plan, as well
111+
as thinking creatively about ways to attract volunteers. For example, it might be a fun
112+
idea to plan a virtual hackathon with potential volunteers or to reward contributions
113+
with t-shirts. I would expect the Coordinator to think creatively about how to source
114+
the volunteers. Volunteer management is a primary requirement of the Coordinator; it is
115+
assumed that the Coordinator is managing volunteers in parallel with all other tasks here.
116+
117+
1. The Coordinator would start by getting an exact handle on the state of structured
118+
error messages in GHC, by working with current contributors (e.g. Alfredo di Napoli, Sam
119+
Derbyshire, Richard Eisenberg) and looking at the GHC source code. The Coordinator
120+
would then identify an area within GHC that would be an appropriate next step to add
121+
similar structured error messages and source volunteers to contribute to that area.
122+
The Coordinator would help to shepherd any GHC MRs that would arise as part of this work.
123+
124+
1. In parallel with the previous item, the Coordinator would work with representatives
125+
from the HLS team to figure out how HLS might take advantage of the structured error messages
126+
GHC already has. Even if HLS is not ready to merge yet, the Coordinator and HLS would
127+
work out a way to build a proof-of-concept based on the structured errors GHC already
128+
has. This would validate the current API and increase the confidence in building on it.
129+
130+
1. Having established that the API is usable, the Coordinator would systematically work
131+
through remaining error messages in GHC, directing volunteers to convert them to the
132+
structured format.
133+
134+
1. As capacity is available, the Coordinator would also organize (or encourage a volunteer
135+
to organize) a website where error messages could be explained. This might be a wiki,
136+
or a git repository, or something exportable to e.g. readthedocs.io. This might even be
137+
incorporated into the user manual. Figuring out a good
138+
format would be the responsibility of the Coordinator, possibly by contacting stakeholders
139+
with a survey or looking at other language communities.
140+
141+
1. The Coordinator would devise a scheme for assigning error codes to messages. These might
142+
be terse, inscrutable alphanumeric identifiers, or perhaps they would be human-readable.
143+
The namespace would include the possibility of covering tools beyond just GHC, though
144+
a recursive hierarchy seems likely unnecessary. With the help of volunteers, the Coordinator
145+
would add these error codes into the error-message API.
146+
147+
1. The Coordinator would continue to encourage volunteers to document error messages on
148+
the error-message website, learning from early successes and failures.
149+
150+
1. If the project is going well and with community support, the Coordinator could look at
151+
extending this idea to other tools. For example, perhaps Cabal or Stack could start to
152+
deliver similar structured error messages -- with buy-in from those maintainers, of course.
153+
154+
## People
155+
156+
- **Performers:** The Coordinator, someone who will have time dedicated to this project. This person
157+
would ideally be an HF employee or part of the portfolio of an in-kind donation of labor.
158+
159+
- **Reviewers:** The GHC team would review changes to GHC, while the HLS team would review changes to HLS.
160+
The GHC and HLS teams would work together, coordinated by the Coordinator, to make an API that is useful
161+
to both. Community volunteers would review the text of the website describing error messages. The Coordinator
162+
would review the uptake of any website by examining analytics.
163+
164+
- **Stakeholders:** This would affect anyone who uses GHC, as the error codes would appear there. Key stakeholders
165+
include the GHC and HLS maintainers, as well as educators, who would have access to Haskell learners who
166+
would benefit from the results of this work.
167+
168+
## Resources
169+
170+
- The Coordinator would need to devote 5 hours / week.
171+
- There would be a set of volunteers who would do much of the labor. If the volunteer pool runs low, the Coordinator
172+
can do some of the technical work, as well.
173+
- The GHC and HLS teams would have to devote some of their time to help support this initiative.
174+
175+
## Timeline
176+
177+
The timeline is highly dependent on the availability of volunteers to do the work. It thus seems
178+
more sensible to timebox this effort at 5 hours of Coordination / week than to set a deadline for
179+
completion. It would be sensible to review progress after 3 months to decide whether this project
180+
is producing benefits (or is likely to soon).
181+
182+
## Lifecycle
183+
184+
I don't think this really applies here. There would be a warm-up period at the beginning where the goal
185+
is to source volunteers, but afterwards, it's all about keeping people moving forwards.
186+
187+
## Deliverables
188+
189+
1. A release of GHC where all of its error messages are structured.
190+
191+
1. A release of HLS which consumes the structured error messages of GHC.
192+
193+
1. A website explaining at least 20 different errors produced by GHC. (More is better!)
194+
I think we should set a modest goal of having 100 unique visitors to this website
195+
over the course of a month.
196+
197+
1. A blog post (ideally written by the Coordinator) describing this process, as a way
198+
of creating publicity for the HF.
199+
200+
## Outcomes
201+
202+
- With the structured interface to errors, tools such as HLS will be better equipped to
203+
offer more power to users to manipulate and reason about code.
204+
205+
- The error-message cataloguing website will help Haskell learners (and, likely, some
206+
old hands) understand error messages better.
207+
208+
## Risks
209+
210+
- It is currently unclear who would best serve as the Coordinator, which is why this
211+
proposal leaves this role abstract. Accordingly, a risk is that there is no one suitable.
212+
However, I still believe this proposal is worth considering (and perhaps approving) in this
213+
state: it would then serve as a concrete task the HF could have when an appropriate
214+
Coordinator arises. In the meantime, it could be used as an idea to show potential sponsors
215+
who might want to know what initiatives the HF is considering or to use as part of a motivation
216+
for expanding the HF employment base.
217+
218+
- Much of the work in this proposal is designed to be done by volunteers, working in parallel.
219+
It is possible we will not find the right volunteers for this work. It is then possible
220+
for the Coordinator to do more work themselves. In any case, trying to source volunteers for
221+
this work could be an important learning experience in the lifetime of the HF, and it would inform
222+
the design of future initiatives.
223+
224+
- One risk is that the API being built around error messages is not useful to consumers.
225+
This risk is intended to be mitigated by an early consultation with HLS.
226+
227+
- It is possible that the structured error messages will provide no opportunity for
228+
improvement over the status quo. This is a risk the HFTT should consider. It might also
229+
be worthwhile to reach out to HLS now to see what they think.
230+
231+
- It is possible that no one will find their way to the error-index website, or that
232+
the format chosen for the site will not resonate with users. The Coordinator would ideally
233+
reach out to users to understand their needs better as the website is being designed
234+
in order to mitigate this risk.
235+
236+
- It is possible that the extra structure will provide an obstacle to evolution within
237+
GHC and slow development down there. I do not think this is likely, but it is conceivable.