Update README

mbostock · mbostock · commit b1d8fa5a58e4 · 2021-05-02T17:37:38.000-07:00
diff --git a/README.md b/README.md
@@ -940,11 +940,11 @@ Plot’s option transforms, listed below, do more than populate the **transform*
 
 [<img src="./img/bin.png" width="320" height="198" alt="a histogram of athletes by weight">](https://observablehq.com/@data-workflows/plot-bin)
 
-[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@data-workflows/plot-bin) · Aggregates continuous data — quantitative or temporal values such as temperatures or times — into discrete bins, and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms. There are separate transforms depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both.
+[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@data-workflows/plot-bin) · Aggregates continuous data — quantitative or temporal values such as temperatures or times — into discrete bins, and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms. There are separate transforms depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both *x* and *y*.
 
-Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting binned data is an array of arrays where each inner array is a subset of the input data [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order, while the outer array is in natural order according to the associated dimension (*x* then *y*). Empty bins are skipped. By specifying a different aggregation method for the *data* output, as described next, you can change how the binned data is computed.
+Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting binned data is an array of arrays where each inner array is a subset of the input data [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order. The outer array is in ascending order according to the associated dimension (*x* then *y*). Empty bins are skipped. By specifying a different aggregation method for the *data* output, as described below, you can change how the binned data is computed.
 
-While it is possible to compute channel values on the binned data by defining channel values as a function, more commonly channel values are computed by the bin transform, either implicitly or explicitly. The following channels are automatically computed by the bin transform:
+While it is possible to compute channel values on the binned data by defining channel values as a function, more commonly channel values are computed directly by the bin transform, either implicitly or explicitly. In addition to data, the following channels are automatically binned:
 
 * **x1** - the starting horizontal position of the bin
 * **x2** - the ending horizontal position of the bin
@@ -958,7 +958,7 @@ While it is possible to compute channel values on the binned data by defining ch
 
 The **x1**, **x2**, and **x** output channels are only computed by the Plot.binX and Plot.bin transform; similarly the **y1**, **y2**, and **y** output channels are only computed by the Plot.binY and Plot.bin transform.
 
-In addition to the automatically binned channels, you can declare additional channels to bin by specifying the desired aggregation method in the *outputs* object which is the first argument to the transform. For example, to use [Plot.binX](#plotbinxoutputs-options) to generate a **y** channel of bin counts as in a frequency histogram:
+You can declare additional channels to bin by specifying the desired aggregation method in the *outputs* object which is the first argument to the transform. For example, to use [Plot.binX](#plotbinxoutputs-options) to generate a **y** channel of bin counts as in a frequency histogram:
 
 ```js
 Plot.binX({y: "count"}, {x: "culmen_length_mm"})
@@ -987,29 +987,32 @@ Most aggregation methods require binding the output channel to an input channel;
 Plot.binX({y: "sum"}, {x: "culmen_length_mm", y: "body_mass_g"})
 ```
 
-TODO Describe binning options:
+To control how the quantitative dimensions *x* and *y* are divided into bins, the following options are supported:
 
-* **thresholds** -
-* **domain** -
-* **cumulative** -
+* **thresholds** - the threshold values; see below
+* **domain** - values outside the domain will be omitted
+* **cumulative** - if positive, each bin will contain all lesser bins
 
-For separate dimensions *x* and *y*:
+If **cumulative** is negative (typically -1 by convention), each bin will contain all *greater* bins rather than all *lesser* bins, representing the [complementary cumulative distribution](https://en.wikipedia.org/wiki/Cumulative_distribution_function#Complementary_cumulative_distribution_function_.28tail_distribution.29).
 
-* *scale*.**thresholds** -
-* *scale*.**domain** -
-* *scale*.**cumulative** -
-* *scale*.**value** -
+For Plot.bin, you may wish to specify separate binning options for *x* and *y*. To allow this, the **x** and **y** input channels can each be specified as an object with the options above and a **value** option to specify the input channel values.
 
-TODO Describe threshold functions:
+```js
+Plot.binX({y: "count"}, {x: {thresholds: 20, value: "culmen_length_mm"}})
+```
 
-* *freedman-diaconis* -
-* *scott* -
-* *sturges* -
+The **thresholds** option may be specified as a named method or in a variety of other ways:
+
+* *freedman-diaconis* - the [Freedman–Diaconis rule](https://en.wikipedia.org/wiki/Freedman–Diaconis_rule)
+* *scott* - [Scott’s normal reference rule](https://en.wikipedia.org/wiki/Histogram#Scott.27s_normal_reference_rule)
+* *sturges* - [Sturges’ formula](https://en.wikipedia.org/wiki/Histogram#Sturges.27_formula)
 * a count (hint) representing the desired number of bins
 * an array of *n* threshold values for *n* + 1 bins
 * a time interval (for temporal binning)
 * a function that returns an array, count, or time interval
 
+If the **thresholds** option is not specified, it defaults to *freedman-diaconis*. If **thresholds** is a function, it is passed three arguments: the array of input values, the domain minimum, and the domain maximum. If **thresholds** is a number, [d3.ticks](https://github.com/d3/d3-array/blob/master/README.md#ticks) or [d3.utcTicks](https://github.com/d3/d3-time/blob/master/README.md#ticks) is used to choose suitable nice thresholds.
+
 TODO Describe grouping and faceting. Describe what happens to the group-eligible channels (*z*, *fill*, *stroke*).
 
 TODO Describe default insets.