|
| 1 | +## Introduction |
| 2 | + |
| 3 | +This proposal seeks technical coordination from the Haskell Foundation |
| 4 | +for improving the interop story around Haskell tooling error messages. |
| 5 | +While much of the work may be doable by volunteers, the HF would play |
| 6 | +a role in harnessing and corralling the volunteers, as well as coordinating |
| 7 | +common APIs between tools that are both easy to implement and easy to use. |
| 8 | +The HF may also be instrumental in managing an error code namespace, shared |
| 9 | +among all tooling central to Haskell. |
| 10 | + |
| 11 | +## Background |
| 12 | + |
| 13 | +Currently, there is no discipline around error messages. This lack of |
| 14 | +structure manifests itself in a number of ways: |
| 15 | + |
| 16 | + - Some tools parse the error messages that other tools |
| 17 | + produce. This is fragile, wasteful, and hard to keep up-to-date. For |
| 18 | + example, the HLS looks to see if a GHC extension name appears in an error |
| 19 | + message, in order to allow the user to automatically enable it via a pragma. |
| 20 | + But since `KindSignatures` is a substring of `StandaloneKindSignatures`, any |
| 21 | + message mentioning the latter causes HLS to suggest enabling both `KindSignatures` |
| 22 | + and `StandaloneKindSignatures` -- even though only `StandaloneKindSignatures` |
| 23 | + would actually work. While there is a workaround here, we can see that |
| 24 | + better communication between GHC and HLS would avoid this class of problem. |
| 25 | + |
| 26 | + - Many Haskell error messages refer to advanced concepts. This is unavoidable, |
| 27 | + as Haskell has advanced features. However, telling a user that their rigid |
| 28 | + type variable does not unify with a type because there is a kind mismatch |
| 29 | + is utterly bewildering to Haskell learners. Applying structure to error messages |
| 30 | + would allow for the creation of an error-message index that could explain |
| 31 | + what the messages mean -- and how to fix the errors. |
| 32 | + |
| 33 | + - Given that tools are increasingly working with one another and invoking |
| 34 | + one another, it can be hard to know who exactly is producing an error |
| 35 | + message. In one recent example that happened to me, I was trying to get |
| 36 | + GHC to work with a GHC plugin, and I got a baffling error. It took me |
| 37 | + more than an hour, if I recall, to discover that the problem was with |
| 38 | + Haddock (I forget the details) and that I just needed to pass `--disable-documentation`. |
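
The substring problem described in the first item above can be reproduced in a few lines. This is only a sketch under stated assumptions, not HLS's actual code: `knownExtensions`, `suggestByText`, `suggestByToken`, and the sample message text are all hypothetical names and data invented for illustration.

```haskell
import Data.Char (isAlphaNum)
import Data.List (isInfixOf)

-- Hypothetical list of extensions a tool might offer to enable.
knownExtensions :: [String]
knownExtensions = ["KindSignatures", "StandaloneKindSignatures"]

-- Fragile approach: suggest every extension whose name occurs anywhere
-- in the rendered message text.
suggestByText :: String -> [String]
suggestByText msg = filter (`isInfixOf` msg) knownExtensions

-- A possible workaround: split the message into identifier-like tokens
-- and compare whole tokens only. It still depends on the message wording.
suggestByToken :: String -> [String]
suggestByToken msg = filter (`elem` tokens msg) knownExtensions
  where
    tokens s = case dropWhile (not . isAlphaNum) s of
      "" -> []
      s' -> let (tok, rest) = span isAlphaNum s' in tok : tokens rest

-- Made-up message text, for illustration only.
sampleMessage :: String
sampleMessage = "Perhaps you intended to use StandaloneKindSignatures"
```

On the sample message, `suggestByText` returns both extension names, while `suggestByToken` returns only `StandaloneKindSignatures`. Either way the consumer is still guessing from prose, which is the fragility that structured errors are meant to remove.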
| 39 | + |
| 40 | +There is already work in this area, within GHC. For the past few years, GHC |
| 41 | +has slowly been converting its error messages to be encoded in data constructors, |
| 42 | +not just as (fancy) strings. [This wiki page](https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values) |
| 43 | +and [this blog post](https://well-typed.com/blog/2021/08/the-new-ghc-diagnostic-infrastructure/) describe |
| 44 | +roughly the state of play. However, this work currently lacks a very important |
| 45 | +ingredient: clients. That is, if GHC is exporting new, fancy datatypes encoding |
| 46 | +its error messages, are these datatypes of use to, say, HLS? We've reached out |
| 47 | +to potential clients for feedback, but the best response we've gotten is something |
| 48 | +along the lines of "sure, looks good". That's encouraging, but I would want |
| 49 | +a little more coordination to make sure that the interface GHC is building is one |
| 50 | +that can be easily consumed. The HF could help here by coordinating this |
| 51 | +communication between projects. |
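
To make the idea of "errors as data constructors" concrete, here is a minimal sketch of how a producer and a consumer might share a diagnostic type. The types and functions below (`Diagnostic`, `Repair`, `render`, `suggestRepairs`, `pragmaText`) are hypothetical and are not GHC's or HLS's actual API; they only illustrate the shape of the interaction.

```haskell
-- Hypothetical types for illustration; NOT GHC's actual constructors.
data Extension = KindSignatures | StandaloneKindSignatures
  deriving (Eq, Show)

data Diagnostic
  = MissingExtension Extension  -- enabling this extension would fix the error
  | KindMismatch String String  -- expected kind, actual kind
  deriving (Eq, Show)

-- Producer side: rendering to text is just one use of the value.
render :: Diagnostic -> String
render (MissingExtension ext) =
  "Perhaps you intended to use " ++ show ext
render (KindMismatch expected actual) =
  "Expected kind " ++ expected ++ ", but found kind " ++ actual

-- Consumer side: a repair an editor integration could offer.
data Repair = EnablePragma Extension
  deriving (Eq, Show)

-- Pattern matching on the constructor replaces parsing the message text.
suggestRepairs :: Diagnostic -> [Repair]
suggestRepairs (MissingExtension ext) = [EnablePragma ext]
suggestRepairs (KindMismatch _ _)     = []

-- Text of the pragma that the repair would insert.
pragmaText :: Repair -> String
pragmaText (EnablePragma ext) = "{-# LANGUAGE " ++ show ext ++ " #-}"
```

With a type like this, a consumer that receives `MissingExtension StandaloneKindSignatures` can offer exactly one pragma, with no ambiguity about substrings. Whether the constructors GHC exports are this convenient for HLS is precisely the question that client feedback would answer.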
| 52 | + |
| 53 | +Furthermore, if this is successful in increasing interop between (say) GHC |
| 54 | +and HLS, then we can expand the idea to other tooling, as well. |
| 55 | + |
| 56 | +The website describing error messages and error-generator identification |
| 57 | +are both new in this proposal. |
| 58 | + |
| 59 | +## Motivation |
| 60 | + |
| 61 | +- It is better for tools to collaborate by passing structured data than |
| 62 | +by sending strings back and forth. Structured error messages will thus |
| 63 | +accelerate the development of powerful editor integrations and other |
| 64 | +code analysis tools. |
| 65 | + |
| 66 | +- Establishing a website describing error messages will make it a standard |
| 67 | +reference in the Haskell community and flatten the learning curve for |
| 68 | +new Haskellers. |
| 69 | + |
| 70 | +While the two main goals of this proposal (conversion of all error messages |
| 71 | +to use datatypes; assigning error codes / creating a website) could be |
| 72 | +considered separately, I think they make sense together in this proposal. |
| 73 | +The second goal depends on the first, and it seems likely that many of |
| 74 | +the same potential volunteers will be interested in both. That said, it would |
| 75 | +be fine for, e.g., the HFTT to accept only one part of this proposal without |
| 76 | +the other, or simply not to commit resources until a mid-way review has been conducted. |
| 77 | + |
| 78 | +## Goals |
| 79 | + |
| 80 | +1. When compiling a program, HLS queries GHC for error messages and receives |
| 81 | +structured errors, not strings. HLS can then use the information in these structures |
| 82 | +to offer repairs to the user or other options. |
| 83 | + |
| 84 | +1. All GHC error messages include a code. These codes can be searched for on a website |
| 85 | +that explains the error message, with examples of what causes it and how the error |
| 86 | +might be fixed. |
| 87 | + |
| 88 | +1. Stretch goal: Building on the success of the HLS/GHC integration around error |
| 89 | +messages, other central tooling adopts a similar approach. This would, for example, |
| 90 | +enable the possibility that HLS can report more informative configuration errors |
| 91 | +to users, or even to repair some of the problems itself. |
| 92 | + |
| 93 | +1. Stretch goal: The HF would establish a global namespace for Haskell-tool error |
| 94 | +message codes, where each tool includes a code in each message. This would both |
| 95 | +broaden the domain of the website index of error messages and also serve to identify |
| 96 | +the producer of error messages. |
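
As an illustration of the last goal, a flat, shared namespace could pair a tool identifier with a number, so that every code also names its producer. The sketch below is one possible encoding, not a proposed standard: the `Tool` list, the `TOOL-NNNNN` format, and the function names are all assumptions made for this example.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- Hypothetical set of tools sharing one namespace.
data Tool = GHC | HLS | Cabal | Stack
  deriving (Eq, Show, Read)

-- A code identifies both the producer and the specific message.
data ErrorCode = ErrorCode { producer :: Tool, number :: Int }
  deriving (Eq, Show)

-- Render as TOOL-NNNNN, zero-padded to five digits (the format is an assumption).
renderCode :: ErrorCode -> String
renderCode (ErrorCode tool n) = show tool ++ "-" ++ pad (show n)
  where
    pad s = replicate (5 - length s) '0' ++ s

-- Parse a rendered code back; unknown tools and malformed numbers are rejected.
parseCode :: String -> Maybe ErrorCode
parseCode s = case break (== '-') s of
  (toolPart, '-' : digits)
    | not (null digits), all isDigit digits ->
        ErrorCode <$> readMaybe toolPart <*> readMaybe digits
  _ -> Nothing
```

Here `renderCode (ErrorCode GHC 42)` yields `GHC-00042`, and `parseCode` recovers both the producer and the number from that string. A code with an unregistered prefix fails to parse, which is one way a central registry of tools could be enforced.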
| 97 | + |
| 98 | +## What the Haskell Foundation Would Do |
| 99 | + |
| 100 | +This section is meant to be suggestive of the concrete activity that would support |
| 101 | +this proposal. It is possible the HFTT or other HF people would have an alternative |
| 102 | +approach, which is fine, too. |
| 103 | + |
| 104 | +1. Devote the time of an HF employee (hereafter called the Coordinator) to stay on top of |
| 105 | +this project. I think it would be reasonable to timebox this work at 5 hours / week from |
| 106 | +the Coordinator. |
| 107 | + |
| 108 | +1. A key task of the Coordinator is to source volunteers to help with this initiative. |
| 109 | +Accordingly, the Coordinator would be responsible for publicity around this plan, as well |
| 110 | +as thinking creatively about ways to attract volunteers. For example, it might be a fun |
| 111 | +idea to plan a virtual hackathon with potential volunteers or to reward contributions |
| 112 | +with t-shirts. Volunteer management is a primary responsibility of the |
| 113 | +Coordinator; it is assumed that the Coordinator is managing volunteers in |
| 114 | +parallel with all other tasks here. |
| 115 | + |
| 116 | +1. The Coordinator would start by getting an exact handle on the state of structured |
| 117 | +error messages in GHC, by working with current contributors (e.g. Alfredo di Napoli, Sam |
| 118 | +Derbyshire, Richard Eisenberg) and looking at the GHC source code. The Coordinator |
| 119 | +would then identify an area within GHC that would be an appropriate next step to add |
| 120 | +similar structured error messages and source volunteers to contribute to that area. |
| 121 | + |
| 122 | +1. In parallel with the previous item, the Coordinator would work with representatives |
| 123 | +from the HLS team to figure out how HLS might take advantage of the structured error messages |
| 124 | +GHC already has. Even if HLS is not ready to merge yet, the Coordinator and HLS would |
| 125 | +work out a way to build a proof-of-concept based on the structured errors GHC already |
| 126 | +has. This would validate the current API and increase the confidence in building on it. |
| 127 | + |
| 128 | +1. Having established that the API is usable, the Coordinator would systematically work |
| 129 | +through remaining error messages in GHC, directing volunteers to convert them to the |
| 130 | +structured format. |
| 131 | + |
| 132 | +1. As capacity is available, the Coordinator would also organize (or encourage a volunteer |
| 133 | +to organize) a website where error messages could be explained. This might be a wiki, |
| 134 | +or a git repository, or something exportable to e.g. readthedocs.io. Figuring out a good |
| 135 | +format would be the responsibility of the Coordinator, possibly by contacting stakeholders |
| 136 | +with a survey or looking at other language communities. |
| 137 | + |
| 138 | +1. The Coordinator would devise a scheme for assigning error codes to messages. These might |
| 139 | +be terse, inscrutable alphanumeric identifiers, or perhaps they would be human-readable. |
| 140 | +The namespace would include the possibility of covering tools beyond just GHC, though |
| 141 | +a recursive hierarchy seems unlikely to be necessary. With the help of volunteers, the Coordinator |
| 142 | +would add these error codes into the error-message API. |
| 143 | + |
| 144 | +1. The Coordinator would continue to encourage volunteers to document error messages on |
| 145 | +the error-message website, learning from early successes and failures. |
| 146 | + |
| 147 | +1. If the project is going well and with community support, the Coordinator could look at |
| 148 | +extending this idea to other tools. For example, perhaps Cabal or Stack could start to |
| 149 | +deliver similar structured error messages -- with buy-in from those maintainers, of course. |
| 150 | + |
| 151 | +## People |
| 152 | + |
| 153 | +- **Performers:** The Coordinator, someone who will have time dedicated to this project. This person |
| 154 | + would ideally be an HF employee or part of the portfolio of an in-kind donation of labor. |
| 155 | + |
| 156 | +- **Reviewers:** The GHC team would review changes to GHC, while the HLS team would review changes to HLS. |
| 157 | + The GHC and HLS teams would work together, coordinated by the Coordinator, to make an API that is useful |
| 158 | + to both. Community volunteers would review the text of the website describing error messages. The Coordinator |
| 159 | + would review the uptake of any website by examining analytics. |
| 160 | + |
| 161 | +- **Stakeholders:** This would affect anyone who uses GHC, as the error codes would appear there. Key stakeholders |
| 162 | + include the GHC and HLS maintainers, as well as educators, who would have access to Haskell learners who |
| 163 | + would benefit from the results of this work. |
| 164 | + |
| 165 | +## Resources |
| 166 | + |
| 167 | +- The Coordinator would need to devote 5 hours / week. |
| 168 | +- There would be a set of volunteers who would do much of the labor. If the volunteer pool runs low, the Coordinator |
| 169 | +can do some of the technical work, as well. |
| 170 | +- The GHC and HLS teams would have to devote some of their time to help support this initiative. |
| 171 | + |
| 172 | +## Timeline |
| 173 | + |
| 174 | +The timeline is highly dependent on the availability of volunteers to do the work. It thus seems |
| 175 | +more sensible to timebox this effort at 5 hours of Coordination / week than to set a deadline for |
| 176 | +completion. It would be sensible to review progress after 3 months to decide whether this project |
| 177 | +is producing benefits (or is likely to soon). |
| 178 | + |
| 179 | +## Lifecycle |
| 180 | + |
| 181 | +I don't think this really applies here. There would be a warm-up period at the beginning where the goal |
| 182 | +is to source volunteers, but afterwards, it's all about keeping people moving forwards. |
| 183 | + |
| 184 | +## Deliverables |
| 185 | + |
| 186 | +1. A release of GHC where all of its error messages are structured. |
| 187 | + |
| 188 | +1. A release of HLS which consumes the structured error messages of GHC. |
| 189 | + |
| 190 | +1. A website explaining at least 20 different errors produced by GHC. (More is better!) |
| 191 | +I think we should set a modest goal of having 100 unique visitors to this website |
| 192 | +over the course of a month. |
| 193 | + |
| 194 | +1. A blog post (ideally written by the Coordinator) describing this process, as a way |
| 195 | +of creating publicity for the HF. |
| 196 | + |
| 197 | +## Outcomes |
| 198 | + |
| 199 | +- With the structured interface to errors, tools such as HLS will be better equipped to |
| 200 | +offer more power to users to manipulate and reason about code. |
| 201 | + |
| 202 | +- The error-message cataloguing website will help Haskell learners (and, likely, some |
| 203 | +old hands) understand error messages better. |
| 204 | + |
| 205 | +## Risks |
| 206 | + |
| 207 | +- One risk is that the API being built around error messages is not useful to consumers. |
| 208 | +This risk is intended to be mitigated by an early consultation with HLS. |
| 209 | + |
| 210 | +- It is possible that the structured error messages will provide no opportunity for |
| 211 | +improvement over the status quo. This is a risk the HFTT should consider. It might also |
| 212 | +be worthwhile to reach out to HLS now to see what they think. |
| 213 | + |
| 214 | +- It is possible that no one will find their way to the error-index website, or that |
| 215 | +the format chosen for the site will not resonate with users. The Coordinator would ideally |
| 216 | +reach out to users to understand their needs better as the website is being designed |
| 217 | +in order to mitigate this risk. |
| 218 | + |
| 219 | +- It is possible that the extra structure will provide an obstacle to evolution within |
| 220 | +GHC and slow development down there. I do not think this is likely, but it is conceivable. |