Commit 7ebe7d3

Rewrite the error-messages proposal
1 parent 352647a commit 7ebe7d3

1 file changed

+220
-0
lines changed

proposals/000-error-messages.md

Lines changed: 220 additions & 0 deletions
@@ -0,0 +1,220 @@
1+
## Introduction
2+
3+
This proposal seeks technical coordination from the Haskell Foundation
4+
for improving the interop story around Haskell tooling error messages.
5+
While much of the work may be doable by volunteers, the HF would play
6+
a role in harnessing and corralling the volunteers, as well as coordinating
7+
common APIs between tools that are both easy to implement and easy to use.
8+
The HF may also be instrumental in managing an error code namespace, shared
9+
among all tooling central to Haskell.
10+
11+
## Background
12+
13+
Currently, there is no discipline around error messages. This lack of
14+
structure manifests itself in a number of ways:
15+
16+
- Some tools parse the error messages that other tools
17+
produce. This is fragile, wasteful, and hard to keep up-to-date. For
18+
example, the HLS looks to see if a GHC extension name appears in an error
19+
message, in order to allow the user to automatically enable it via a pragma.
20+
But since `KindSignatures` is a substring of `StandaloneKindSignatures`, any
21+
message mentioning the latter causes HLS to suggest enabling both `KindSignatures`
22+
and `StandaloneKindSignatures` -- even though only `StandaloneKindSignatures`
23+
would actually work. While there is a workaround here, we can see that
24+
better communication between GHC and HLS would avoid this class of problem.
25+
26+
- Many Haskell error messages refer to advanced concepts. This is unavoidable,
27+
as Haskell has advanced features. However, telling a user that their rigid
28+
type variable does not unify with a type because there is a kind mismatch
29+
is utterly bewildering to Haskell learners. Applying structure to error messages
30+
would allow for the creation of an error-message index that could explain
31+
what the messages mean -- and how to fix the errors.
32+
33+
- Given that tools are increasingly working with one another and invoking
34+
one another, it can be hard to know who exactly is producing an error
35+
message. In one recent example that happened to me, I was trying to get
36+
GHC to work with a GHC plugin, and I got a baffling error. It took me
37+
more than an hour, if I recall, to discover that the problem was with
38+
Haddock (I forget the details) and that I just needed to pass `--disable-documentation`.
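
To make the first bullet concrete, here is a minimal sketch of how substring matching over rendered messages goes wrong. This is not HLS's actual code: the names `knownExtensions`, `suggestBySubstring`, and `exampleMessage` are hypothetical, and the message text is invented for illustration.

```haskell
import Data.List (isInfixOf)

-- A deliberately tiny list of extension names; the real list is much longer.
knownExtensions :: [String]
knownExtensions = ["KindSignatures", "StandaloneKindSignatures"]

-- Fragile approach: suggest every extension whose name occurs anywhere
-- in the rendered message text.
suggestBySubstring :: String -> [String]
suggestBySubstring msg = filter (`isInfixOf` msg) knownExtensions

-- Hypothetical message text, written for this example only.
exampleMessage :: String
exampleMessage = "Perhaps you intended to use StandaloneKindSignatures"
```

Evaluating `suggestBySubstring exampleMessage` returns both names, because `KindSignatures` occurs inside `StandaloneKindSignatures`. Any fix at this level is a patch on string handling; the underlying information (which extension GHC meant) was lost when the message was rendered.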
39+
40+
There is already work in this area, within GHC. For the past few years, GHC
41+
has slowly been converting its error messages to be encoded in data constructors,
42+
not just as (fancy) strings. [This wiki page](https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values)
43+
and [this blog post](https://well-typed.com/blog/2021/08/the-new-ghc-diagnostic-infrastructure/) describe
44+
roughly the state of play. However, this work currently lacks a very important
45+
ingredient: clients. That is, if GHC is exporting new, fancy datatypes encoding
46+
its error messages, are these datatypes of use to, say, HLS? We've reached out
47+
to potential clients for feedback, but the best response we've gotten is something
48+
along the lines of "sure, looks good". That's encouraging, but I would want to see
49+
a little more coordination to make sure that the interface GHC is building is one
50+
that can be easily consumed. The HF could help here by coordinating this
51+
communication between projects.
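
As a rough illustration of what "error messages encoded in data constructors" could look like from a client's point of view, consider the sketch below. The types `Extension` and `Diagnostic` and the function `suggestedPragmas` are hypothetical and much simpler than GHC's real diagnostic types; they only show the shape of the idea.

```haskell
-- Hypothetical types for illustration; GHC's real diagnostic types differ.
data Extension = KindSignatures | StandaloneKindSignatures
  deriving (Eq, Show)

data Diagnostic
  = MissingExtension Extension  -- enabling this extension would fix the error
  | KindMismatch String String  -- expected kind, actual kind
  deriving (Eq, Show)

-- A client such as an editor plugin pattern-matches on the constructor
-- instead of searching the rendered text of the message.
suggestedPragmas :: Diagnostic -> [String]
suggestedPragmas (MissingExtension ext) = ["{-# LANGUAGE " ++ show ext ++ " #-}"]
suggestedPragmas (KindMismatch _ _)     = []
```

With this shape, `suggestedPragmas (MissingExtension StandaloneKindSignatures)` yields exactly one pragma, and no amount of overlap between extension names can produce a spurious suggestion. Whether GHC's actual interface is this convenient for HLS is precisely the question that needs client feedback.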
52+
53+
Furthermore, if this is successful in increasing interop between (say) GHC
54+
and HLS, then we can expand the idea to other tooling, as well.
55+
56+
The website describing error messages and error-generator identification
57+
are both new in this proposal.
58+
59+
## Motivation
60+
61+
- It is better for tools to collaborate by passing structured data than
62+
by sending strings back and forth. Structured error messages will thus
63+
accelerate the development of powerful editor integrations and other
64+
code analysis tools.
65+
66+
- Establishing a website describing error messages will make it a standard
67+
reference in the Haskell community and flatten the learning curve for
68+
new Haskellers.
69+
70+
While the two main goals of this proposal (conversion of all error messages
71+
to use datatypes; assigning error codes / creating a website) could be
72+
considered separately, I think they make sense together in this proposal.
73+
The second goal depends on the first, and it seems likely that many of
74+
the same potential volunteers will be interested in both. That said, it would
75+
be fine for, e.g. the HFTT to accept only one part of this proposal without
76+
the other, or simply not to commit resources until a mid-way review has been conducted.
77+
78+
## Goals
79+
80+
1. When compiling a program, HLS queries GHC for error messages and receives
81+
structured errors, not strings. HLS can then use the information in these structures
82+
to offer repairs to the user or other options.
83+
84+
1. All GHC error messages include a code. These codes can be searched for on a website
85+
that explains the error message, with examples of what causes it and how the error
86+
might be fixed.
87+
88+
1. Stretch goal: Building on the success of the HLS/GHC integration around error
89+
messages, other central tooling adopts a similar approach. This would, for example,
90+
enable the possibility that HLS can report more informative configuration errors
91+
to users, or even to repair some of the problems itself.
92+
93+
1. Stretch goal: The HF would establish a global namespace for Haskell-tool error
94+
message codes, where each tool includes a code in each message. This would both
95+
broaden the domain of the website index of error messages and also serve to identify
96+
the producer of error messages.
97+
98+
## What the Haskell Foundation Would Do
99+
100+
This section is meant to be suggestive of the concrete activity that would support
101+
this proposal. It is possible the HFTT or other HF people would have an alternative
102+
approach, which is fine, too.
103+
104+
1. Devote the time of an HF employee (hereafter called the Coordinator) to stay on top of
105+
this project. I think it would be reasonable to timebox this work at 5 hours / week from
106+
the Coordinator.
107+
108+
1. A key task of the Coordinator is to source volunteers to help with this initiative.
109+
Accordingly, the Coordinator would be responsible for publicity around this plan, as well
110+
as thinking creatively about ways to attract volunteers. For example, it might be a fun
111+
idea to plan a virtual hackathon with potential volunteers or to reward contributions
112+
with t-shirts.
113+
Volunteer management is a primary responsibility of the Coordinator; it is
114+
assumed that the Coordinator is managing volunteers in parallel with all other tasks here.
115+
116+
1. The Coordinator would start by getting an exact handle on the state of structured
117+
error messages in GHC, by working with current contributors (e.g. Alfredo di Napoli, Sam
118+
Derbyshire, Richard Eisenberg) and looking at the GHC source code. The Coordinator
119+
would then identify an area within GHC that would be an appropriate next step to add
120+
similar structured error messages and source volunteers to contribute to that area.
121+
122+
1. In parallel with the previous item, the Coordinator would work with representatives
123+
from the HLS team to figure out how HLS might take advantage of the structured error messages
124+
GHC already has. Even if HLS is not ready to merge yet, the Coordinator and HLS would
125+
work out a way to build a proof-of-concept based on the structured errors GHC already
126+
has. This would validate the current API and increase the confidence in building on it.
127+
128+
1. Having established that the API is usable, the Coordinator would systematically work
129+
through remaining error messages in GHC, directing volunteers to convert them to the
130+
structured format.
131+
132+
1. As capacity is available, the Coordinator would also organize (or encourage a volunteer
133+
to organize) a website where error messages could be explained. This might be a wiki,
134+
or a git repository, or something exportable to e.g. readthedocs.io. Figuring out a good
135+
format would be the responsibility of the Coordinator, possibly by contacting stakeholders
136+
with a survey or looking at other language communities.
137+
138+
1. The Coordinator would devise a scheme for assigning error codes to messages. These might
139+
be terse, inscrutable alphanumeric identifiers, or perhaps they would be human-readable.
140+
The namespace would include the possibility of covering tools beyond just GHC, though
141+
a recursive hierarchy seems likely to be unnecessary. With the help of volunteers, the Coordinator
142+
would add these error codes into the error-message API.
143+
144+
1. The Coordinator would continue to encourage volunteers to document error messages on
145+
the error-message website, learning from early successes and failures.
146+
147+
1. If the project is going well and has community support, the Coordinator could look at
148+
extending this idea to other tools. For example, perhaps Cabal or Stack could start to
149+
deliver similar structured error messages -- with buy-in from those maintainers, of course.
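
One possible shape for the error-code scheme mentioned in the list above is sketched here, purely as an illustration. Everything in it is an assumption: the `Tool` tags, the zero-padded five-digit rendering, and the `errors.example.org` URL are placeholders, not a proposed standard.

```haskell
import Text.Printf (printf)

-- Hypothetical namespace: one flat tag per tool, with no recursive hierarchy.
data Tool = GHC | Cabal | Stack | HLS
  deriving (Eq, Show)

-- A code identifies both the producing tool and the specific message.
data ErrorCode = ErrorCode Tool Int
  deriving (Eq, Show)

-- Render a code as text, e.g. "GHC-00042".
-- The zero-padded five-digit format is an assumption made for this sketch.
renderCode :: ErrorCode -> String
renderCode (ErrorCode tool n) = printf "%s-%05d" (show tool) n

-- Placeholder URL for the page explaining a code on the error-message website.
indexUrl :: ErrorCode -> String
indexUrl code = "https://errors.example.org/" ++ renderCode code
```

Here `renderCode (ErrorCode GHC 42)` gives `"GHC-00042"`. Putting the tool tag in the code serves both stretch goals at once: the prefix identifies the producer of a message, and the full code is a stable key into the website index.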
150+
151+
## People
152+
153+
- **Performers:** The Coordinator, someone who will have time dedicated to this project. This person
154+
would ideally be an HF employee or part of the portfolio of an in-kind donation of labor.
155+
156+
- **Reviewers:** The GHC team would review changes to GHC, while the HLS team would review changes to HLS.
157+
The GHC and HLS teams would work together, coordinated by the Coordinator, to make an API that is useful
158+
to both. Community volunteers would review the text of the website describing error messages. The Coordinator
159+
would review the uptake of any website by examining analytics.
160+
161+
- **Stakeholders:** This would affect anyone who uses GHC, as the error codes would appear there. Key stakeholders
162+
include the GHC and HLS maintainers, as well as educators, who would have access to Haskell learners who
163+
would benefit from the results of this work.
164+
165+
## Resources
166+
167+
- The Coordinator would need to devote 5 hours / week.
168+
- There would be a set of volunteers who would do much of the labor. If the volunteer pool runs low, the Coordinator
169+
can do some of the technical work, as well.
170+
- The GHC and HLS teams would have to devote some of their time to help support this initiative.
171+
172+
## Timeline
173+
174+
The timeline is highly dependent on the availability of volunteers to do the work. It thus seems
175+
more sensible to timebox this effort at 5 hours of coordination / week than to set a deadline for
176+
completion. It would be sensible to review progress after 3 months to decide whether this project
177+
is producing benefits (or is likely to soon).
178+
179+
## Lifecycle
180+
181+
I don't think this really applies here. There would be a warm-up period at the beginning where the goal
182+
is to source volunteers, but afterwards, it's all about keeping people moving forwards.
183+
184+
## Deliverables
185+
186+
1. A release of GHC where all of its error messages are structured.
187+
188+
1. A release of HLS which consumes the structured error messages of GHC.
189+
190+
1. A website explaining at least 20 different errors produced by GHC. (More is better!)
191+
I think we should set a modest goal of having 100 unique visitors to this website
192+
over the course of a month.
193+
194+
1. A blog post (ideally written by the Coordinator) describing this process, as a way
195+
of creating publicity for the HF.
196+
197+
## Outcomes
198+
199+
- With the structured interface to errors, tools such as HLS will be better equipped to
200+
offer more power to users to manipulate and reason about code.
201+
202+
- The error-message cataloguing website will help Haskell learners (and, likely, some
203+
old hands) understand error messages better.
204+
205+
## Risks
206+
207+
- One risk is that the API being built around error messages is not useful to consumers.
208+
This risk is intended to be mitigated by an early consultation with HLS.
209+
210+
- It is possible that the structured error messages will provide no opportunity for
211+
improvement over the status quo. This is a risk the HFTT should consider. It might also
212+
be worthwhile to reach out to HLS now to see what they think.
213+
214+
- It is possible that no one will find their way to the error-index website, or that
215+
the format chosen for the site will not resonate with users. The Coordinator would ideally
216+
reach out to users to understand their needs better as the website is being designed
217+
in order to mitigate this risk.
218+
219+
- It is possible that the extra structure will pose an obstacle to evolution within
220+
GHC and slow development down there. I do not think this is likely, but it is conceivable.
