
Commit aacb0df

Meta: mark up algorithms and var-scopes
See the new documentation in CONTRIBUTING.md. This notably gives "var click highlighting", i.e. the ability to click on a variable and see all other usages of it within the algorithm.
1 parent b0b40cc commit aacb0df

3 files changed

+5360
-95
lines changed

CONTRIBUTING.md

Lines changed: 118 additions & 2 deletions
@@ -109,6 +109,122 @@ is.

End tags must not be omitted (except where it is consistent to do so) and attribute values must be quoted (use double quotes).

### Algorithms

[The Infra Standard](https://infra.spec.whatwg.org/#algorithms) sets out the basics of algorithms, but the HTML spec goes way beyond that.

When contributing to HTML, we attempt to mark up algorithms and variable scopes. The main visible benefit of this is that it gives "var highlighting", where clicking on a `<var>` element highlights all other references to it. Behind the scenes, it also enables various static analysis checks. Do your best to follow the guidelines below when introducing new algorithms or modifying existing ones.

HTML's algorithm system is based on, and intended to be compatible with, [that of Bikeshed](https://speced.github.io/bikeshed/#var-and-algorithms) (a build tool often used for other specifications).

#### The markup

Each algorithm should be wrapped in `<div algorithm> ... </div>`. The contents of this `<div>` should not be indented.

The 'body' of an algorithm will normally be preceded by a 'preamble', some text that gives:

* the name of the algorithm, or some indication of how/when it is invoked;
* the names and/or types of any parameters; and
* maybe the type of the return value, if any.

Include this preamble within the `<div algorithm>`. Sometimes the preamble will be preceded by other stuff (not specific to the algorithm) in the same `<p>`. It's generally okay to include the other stuff within the `<div>`, but consider splitting it off into its own `<p>`, so that the `<div>` can be focused on the algorithm.

If the algorithm is followed by one or more paragraphs that refer to any of the algorithm's variables, include those paragraphs within the `<div>`, so that they can participate in var-highlighting.
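
As a rough illustration of the layout described above, the following sketch wraps a preamble, a body, and a trailing paragraph in a single `<div algorithm>`. The algorithm itself, its name ("frobnicate a widget"), and its variables are hypothetical and do not come from the HTML Standard.

```html
<!-- Hypothetical algorithm, for illustration only. -->
<div algorithm>
<p>To <dfn>frobnicate a widget</dfn>, given a widget <var>widget</var> and a non-negative
integer <var>count</var>:</p>

<ol>
<li><p>Let <var>result</var> be an empty list.</p></li>

<li><p>Repeat <var>count</var> times: append <var>widget</var> to <var>result</var>.</p></li>

<li><p>Return <var>result</var>.</p></li>
</ol>

<p>The <var>result</var> list returned above always has <var>count</var> items.</p>
</div>
```

Because the trailing paragraph sits inside the `<div>`, its `<var>` elements should take part in the same var-highlighting as the steps.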

Sometimes, a set of related algorithms (e.g., the 4 associated algorithms of a reflected target) are presented in a `<dl>`, where each `<dt>/<dd>` pair is (roughly speaking) the preamble and body of an algorithm. In these cases, each `<dt>/<dd>` pair is wrapped in `<div algorithm> ... </div>`.
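
A minimal sketch of the `<dl>` case, assuming two made-up definitions ("get the widget label" and "set the widget label") that are not part of the HTML Standard:

```html
<!-- Hypothetical definitions, for illustration only. -->
<dl>
<div algorithm>
<dt><dfn>get the widget label</dfn>, given a widget <var>widget</var></dt>
<dd><p>Return <var>widget</var>'s label.</p></dd>
</div>

<div algorithm>
<dt><dfn>set the widget label</dfn>, given a widget <var>widget</var> and a string
<var>value</var></dt>
<dd><p>Set <var>widget</var>'s label to <var>value</var>.</p></dd>
</div>
</dl>
```

Each `<div algorithm>` gives its `<dt>/<dd>` pair a separate scope, so the two uses of the name `widget` are highlighted independently.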

---

According to the Infra standard, "very short algorithms can be declared and specified using a single sentence". (The HTML spec sometimes strains the idea of "very short".) So an algorithm might be contained by a single `<p>` element, and you might be tempted to just add the `algorithm` attribute to the `<p>`. But we prefer

```html
<div algorithm>
<p>...</p>
</div>
```

over

```html
<p algorithm>...</p>
```

as it makes refactoring easier, and is easy to spot.

In fact, a single `<p>` can contain two or more single-sentence algorithms. For instance, this sometimes happens with the getter and setter steps of an IDL attribute. You might think that each algorithm should get its own markup, but it's okay to put a single `<div algorithm>` around the multiple algorithms in the `<p>`.
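
For instance, getter and setter steps that share one `<p>` might look roughly like the following. The `Widget` interface and its `label` attribute are hypothetical, and the `<dfn>` markup is simplified compared with what the spec source normally uses.

```html
<!-- Hypothetical IDL attribute, for illustration only. -->
<div algorithm>
<p>The <dfn attribute for="Widget"><code>label</code></dfn> getter steps are to return
<span>this</span>'s label. The <code>label</code> setter steps are to set <span>this</span>'s
label to the given value.</p>
</div>
```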

---

In Bikeshed, the `algorithm` attribute has an optional value, which supplies the name of the algorithm. In the HTML spec, don't give the `algorithm` attribute a value.

#### What qualifies as an algorithm?

Algorithms are easy to spot when the body is a block element like `<ol>` or `<dl>` (when used like a 'switch' statement). But the existence of single-sentence algorithms (see above) can make it harder to know when you've written an algorithm.

Here are some categories of algorithms (roughly from commonest to rarest):

* Generally, if you have a term in a `<dfn>` element, followed by a description of how to 'implement' that term, that's probably an algorithm. Likewise if the term is in a `<span>` element; the `<dfn>` might be elsewhere in the spec, or even in a different spec.

* Most Web IDL interface members (attributes and operations) have associated behavior. Any text that defines such behavior is an algorithm, even if it just says that an IDL attribute reflects a content attribute, or that a method does nothing.

* Text of roughly the form

  ```html
  When [something happens], the user agent must [do something].
  ```

  or

  ```html
  When [something happens], [do something].
  ```

  is probably an algorithm.

* The behavior of each tokenization state is an algorithm. Similarly for the behavior of each insertion mode.

* The JavaScript spec declares some internal methods and implementation-defined abstract operations, but leaves their definitions to the 'host'. Any text that defines such JavaScript-related behavior is an algorithm. Typically, the method/operation's signature (name and parameter list) is given in an `<hN>` element; include this in the `<div algorithm>`.

* There are format-definitions, which typically start with wording such as:

  ```html
  A string is a <dfn>foo</dfn> if it consists of...
  ```

  or

  ```html
  A <dfn>foo</dfn> is a string containing...
  ```

  These aren't algorithms per se, but they're wrapped in `<div algorithm>` by special dispensation.

* Even algorithms that appear in examples should be marked up.

Note that this list isn't exhaustive. There are things that are clearly algorithms that don't fit into any of the above categories. There are cases where it's unclear.

And it's possible that we'll change our minds about what should be marked as an algorithm.
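
To make two of the less obvious categories above concrete, here is a sketch of a "When [something happens]" sentence and a format-definition, each wrapped as an algorithm. The terms "widget", "active flag", and "valid widget name" are invented for this example.

```html
<!-- Hypothetical examples, for illustration only. -->

<!-- "When [something happens], the user agent must [do something]." -->
<div algorithm>
<p>When a widget <var>widget</var> is activated, the user agent must set <var>widget</var>'s
active flag to true.</p>
</div>

<!-- A format-definition, wrapped by special dispensation. -->
<div algorithm>
<p>A string is a <dfn>valid widget name</dfn> if it consists of one or more ASCII
alphanumerics.</p>
</div>
```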

### `<var>` and `var-scope`

For every `<var>` element, one or more of the following should be true:

* It has the `ignore` attribute.
* It is within an element with the `var-scope` attribute.
* It is within an element with the `algorithm` attribute.
* It is within a `<dl>` element with `class="domintro"`.

The build process will complain if it finds an 'unscoped' `<var>`, one for which none of the above is true.
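
The four conditions above could be satisfied along the following lines. This is only a sketch: the widget-related prose and the `clear()` method are hypothetical, and the `domintro` markup is simplified.

```html
<!-- Hypothetical content, for illustration only. -->

<!-- 1. The <var> itself has the ignore attribute. -->
<p>The user agent may limit a widget's size to some maximum <var ignore>max</var>.</p>

<!-- 2. The <var>s are within an element with the var-scope attribute. -->
<p var-scope>A widget of size <var>n</var> contains <var>n</var> slots.</p>

<!-- 3. The <var>s are within an element with the algorithm attribute. -->
<div algorithm>
<p>To <dfn>clear a widget</dfn> <var>widget</var>, remove all of <var>widget</var>'s slots.</p>
</div>

<!-- 4. The <var>s are within a <dl class="domintro">. -->
<dl class="domintro">
<dt><code><var>widget</var>.clear()</code></dt>
<dd><p>Removes all of <var>widget</var>'s slots.</p></dd>
</dl>
```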

Most of the time, any `<var>` element that you introduce will be within a `<div algorithm>` or a `<dl class="domintro">`. But for other cases, the question arises as to whether to mark a `<var>` with `ignore` or mark an ancestor with `var-scope` (possibly creating a `<div>` to have the `var-scope`). Here are some guidelines:

- When you have a set of consecutive algorithms that share variables, put `<div var-scope> ... </div>` around the algorithms and any preamble that mentions the shared variables.
- In any context that has two or more `<var>` elements with the same variable-name, mark the context with `var-scope`, or put a `<div var-scope>` around it, so that the `<var>`s will participate in var-highlighting.
- Even when a context has only single-use `<var>`s, it can be easier (if there are enough of them) to mark the context `var-scope` rather than mark each `<var>` as `ignore`.
- But if a context has only one `<var>`, or two with different variable-names, probably use `ignore`.
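
A sketch of the first guideline, assuming two invented algorithms ("show the label" and "hide the label") that share variables introduced in a common preamble:

```html
<!-- Hypothetical content, for illustration only. -->
<div var-scope>
<p>Each widget <var>widget</var> has an associated string <var>label</var>, used by the
following two algorithms.</p>

<div algorithm>
<p>To <dfn>show the label</dfn>, display <var>label</var> next to <var>widget</var>.</p>
</div>

<div algorithm>
<p>To <dfn>hide the label</dfn>, stop displaying <var>label</var> next to
<var>widget</var>.</p>
</div>
</div>
```

The outer `<div var-scope>` is what brings the preamble's `<var>`s into scope. Without it, they would be unscoped and would each need `ignore`.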

But there's an additional situation in which to use `ignore`. In addition to looking for unscoped `<var>`s, the build process will examine the `<var>`s within each algorithm. Typically, a given variable-name will appear at least twice in an algorithm: once when it's declared/defined, and one or more times when it's used. So it's suspicious if a variable-name appears only *once* within an algorithm, and the build process will raise a warning about it. If you have a `<var>` that should be ignored by this check, mark it with `ignore`.
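
For example, a parameter that is declared but deliberately never used appears only once, so it would presumably trigger the warning unless it is marked with `ignore`. The algorithm and its `options` parameter below are hypothetical.

```html
<!-- Hypothetical algorithm, for illustration only. -->
<div algorithm>
<p>To <dfn>reset a widget</dfn> <var>widget</var>, given an unused map
<var ignore>options</var>:</p>

<ol>
<li><p>Remove all of <var>widget</var>'s slots.</p></li>

<li><p>Set <var>widget</var>'s label to the empty string.</p></li>
</ol>
</div>
```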

### Common mistakes around prose style

Most of the style conventions in this section are covered by Infra or the WHATWG style guide, but the editors often have to correct them in contributions anyway.
@@ -123,7 +239,7 @@ Most of the style conventions in this section are covered by Infra or the WHATWG

<li>
  <p>If (all|any) of the following are true:</p>

  <ul class="brief">
    <li><p>condition 1;</p></li>

@@ -136,7 +252,7 @@ Most of the style conventions in this section are covered by Infra or the WHATWG

  <p>then…</p>
</li>

<li><p>Baz.</p></li>
```
- **Conjugate algorithm invocations inline** so they read more naturally in English, instead of more procedurally. For [example](https://github.com/whatwg/html/pull/9778#discussion_r1574075112), use `the result of <span data-x="get the popcorn">getting the popcorn</span>` instead of `the result of running <span>get the popcorn</span>`.
