Skip to content

Commit f02ea2b

Browse files
Create 2022-11-10.md
1 parent da9b39c commit f02ea2b

File tree

1 file changed

+108
-0
lines changed

1 file changed

+108
-0
lines changed

meetings/2022-11-10.md

Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,108 @@
1+
# Technical Working Group: 2022-11-10
2+
3+
## Present
4+
David Thrane Christiansen
5+
John Ericson
6+
Laurent
7+
Hécate
8+
Gershom
9+
Noon
10+
Davean
11+
Luke
12+
13+
## Agenda
14+
1. Vote on #43
15+
2. Discussion of Haddock intermediate representation proposal
16+
3. Update on implementation of vulnerabilities proposal
17+
4. Update on error index
18+
19+
## Minutes
20+
### Vote on #43
21+
It passed unanimously
22+
23+
### Discussion of Haddock intermediate representation proposal
24+
https://github.com/LaurentRDC/tech-proposals/blob/main/proposals/000-haddock-serialized-ast.md
25+
26+
Laurent: Hackage contains rendered documentation files. When the package is uploaded, Hackage generates docs (or the maintainer may upload them), but only the HTML is saved. This means that, over time, we get a variety of different appearances for docs over time.
27+
Rust had a different issue: Rustdoc generated non-scrapable HTML, and they wanted to make it easier for tools to use documentation.
28+
29+
The goal here is to solve both of these problems by having a canonical IR for Haddock documentation as a kind of AST. This output format is also an input format: source code goes to GHC and Haddock, and the IR is save. A different Haddock version can convert this JSON IR to HTML (or LaTeX or whatever). There is a decoupling of GHC and Haddock from the perspective of Haddock, but also tools (e.g. Hoogle) could load this IR to parse/query/manipulate docs without needing the whole transitive dependency chain of Haddock itself (which includes GHC).
30+
31+
The proposal intentionally does not include a specific format. Should it be part of the HF proposal, or leave that to the implementation?
32+
33+
Hécate: Is the PM of Haddock and thus may be able to shed light on it. Having other tools use the IR would be useful (e.g. build a glossary for some projects). Fully support the idea. Regarding the format, it's important that it be specified, documented, and versioned, but we should not do so here. Esp. because there is something happening in the next year or so where we want GHC to produce an intermediate format for Haddock to parse and then render the AST form. The hi-haddock project produces docs as part of the .hi files, but .hi files are version-specific and don't migrate across GHC versions (it's a private binary format). GHC could also export this data in a compiler-independent format.
34+
35+
Gershom: it sounds like these ideas overlap - if GHC emits one format and Haddock another, then these are very similar. A lot of the work is coming up with the right format, and thus the format doesn't belong in the proposal itself.
36+
37+
Hécate: We can't produce it without implementing it.
38+
39+
Davean: Do we need docs for it or are we happy enough with an independent Hackage library with compatibility guarantees? That would be as good as a spec.
40+
41+
Laurent: It would be nice to have the ability to look at this IR from a non-Haskell codebase.
42+
43+
Gershom: David is a fan of Scribble, and one of the obstacles to using something like Scribble or other system is that it's integrated with the other system. But a bridge could be written using this as an intermediate format.
44+
45+
David: documenting the IR format is nice so that editors can take advantage of capabilities beyond LSP capabilities.
46+
47+
John: That's definitely a nice goal, but in the larger context: Haddock is confusing right now. There's the version that is developed as part of GHC HEAD, there's the Haddock repo itself, there's the hi-haddoc overhaul which perhaps effects the interface between haddock and GHC. It feels like there are lots of things up in the air but we don't really know what's going on here.
48+
49+
Gershom: hi-haddock seems irrelevant to this discussion
50+
51+
John: Wish he had a clear idea of what ends up where
52+
53+
Gershom: Missing in this proposal that could be fleshed out more is the actual concrete architecture of the tools surrounding a specified intermediate format. It sounds like architecture stays basically the same, but with this extra step along the way.
54+
55+
John: this proposal doesn't change the architecture, but the other stuff will. We want the things consuming the file to be not built against GHC. It's really hard to make sure that the GHC dep doesn't leak into the API. We want to consume the format with no GHC anywhere.
56+
57+
Gershom: this is a good point. He doesn't know if any of the other proposals also change the architecture. There's an ongoing confusion to sort out WRT branches and the GHC fork and responsibility. But there's no proposal to change the architecture right now. A stretch goal could be to have the library and format into "haddock-base" that is AST and parsing of this format, plus renderers to HTML etc. Everything non-GHC.
58+
59+
Hécate: isn't this haddock-api and haddock lib?
60+
61+
Laurent: In the background, he is trying to undersatnd the architecture of the system today:
62+
* Haddock-api - uses GHC to make an AST
63+
* Haddock - calls haddock-api to produce the HTML (and with this proposal would also emit the IR)
64+
65+
John/Hécate: The architecture today is not so clean. It grew rather than being defined.
66+
67+
Problem with depending on GHC is that we inherit complexity like DynFlags, which makes working on the code difficult. If we want to get a cleaner Haddock design then we need to cut out the dependency on GHC's internals (which changes all the time). Ideall we can get healthy boundaries by building a clean interface to GHC that this format can be.
68+
69+
70+
David: seems odd to have something non-GHC that is coupled to the details of GHC. GHC also maintains its own version of Haddock. Slight modification suggested: generation of well-defined fileformat seems like a plausible thing for GHC to do. Then Haddock consumes the IR and does not further engage with GHC internals.
71+
72+
John: poked around the repo, and found that e.g. the XHTML renderer “backend” depends on GHC, which seems backwards. Ideally we can work by first dividing responsibility (architecture fixes) and then do the IR. PErhaps we should swap the stretch goal and first goal.
73+
74+
Noon: Curious about the proposal draft's combining of multiple docs together. Is there an existing format that we can build on, rather than making our own? Is that the plan?
75+
76+
Laurent: Rust has a JSON IR for Rust, but not an inter-linguisitic one.
77+
78+
Noon: What if there's something a la CTAGS that can support lots of languages? A common format rather than rolling our own?
79+
80+
Davean: Isn't the value that we get docs that are good for us? E.g. Python's does none of the things that we use Haddock for. Other tools don't represent the things that matter for Haskell, so we probably can't share with something that's fundamentally different from Haskell.
81+
82+
John: To second Davean's point, there are some really language-agnostic formatting bits (bold, italic, etc) and then language-specific things. Deduplication is good, but hard/impossible for the latter stuff.
83+
84+
Gershom: Wanted to respond to John. Disagrees that we should try to sort out the relationship first - that's a recipe for not moving forward. We can hit a middle ground here between David's and John's proposals: haddock-api has the GHC interface and the emitters in one library. An architectural goal can be to split this into the part that talks to GHC and emits the IR, and the part that reads the IR and emits the HTML etc. Hopefully the piece that integrates with GHC is small and self-contained. We need not have a goal to merge it into GHC per se, but the smaller and more self-contained it is, the easier it would be to move it into the GHC repo (where it makes the most sense for it to live). We can do this now. The IR allows executables to be decoupled rather than just APIs, and by doing the split this way, we enable a productive discussion.
85+
86+
John: Agrees. Maybe generalize moving out the backends to looking over the API and figuring out what's there and why. Would also like to say that GHC developers have seemed generally supportive of making GHC changes affect Haddock less (and thus easier). Doesn't expect pushback here.
87+
88+
David: Is Laurent still interested?
89+
90+
Laurent: It seems like the proposal is as it is. He'd be interested in the other steps as well, though the design of the format still makes sense.
91+
92+
Gershom: PErhaps the proposal should be extended to include the code decoupling bit inside HAddock as well?
93+
94+
John: Agrees, and also thinks it will be somewhat non-trivial. The backend modules are importing all kinds of stuff from GHC. We'll likely hit some hair-pulling moments detangling it.
95+
96+
Laurent: Happy to update the proposal and to do the extended work. This seems to lose some focus: we're now proposal two serial activities that could be separate.
97+
98+
David: the haskell language has changed over time, and therefore the collection of documentable “things” has grown. As more things appear, how to keep the JSON IR backwards-compatible? Does it make sense to version the IR format? Should we include a description of the Haskell language of the module being documented in the JSON IR?
99+
100+
Gershom: Action items: Laurent updates the proposal draft and submits it before the next meeting.
101+
102+
### Update on implementation of vulnerabilities proposal
103+
David: Having a hard time making this happen. Would like someone to spend some synchronous time on this with me.
104+
105+
Gershom: volunteered
106+
107+
### Update on error index
108+
David: errors.haskell.org is live. David is planning a "week of error docs" on the release of 9.6.

0 commit comments

Comments
 (0)