Commit b1d8fa5

Update README
1 parent d699ec0 commit b1d8fa5

1 file changed

+20
-17
lines changed

README.md

Lines changed: 20 additions & 17 deletions
@@ -940,11 +940,11 @@ Plot’s option transforms, listed below, do more than populate the **transform*
940940

941941
[<img src="./img/bin.png" width="320" height="198" alt="a histogram of athletes by weight">](https://observablehq.com/@data-workflows/plot-bin)
942942

943-
[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@data-workflows/plot-bin) · Aggregates continuous data — quantitative or temporal values such as temperatures or times — into discrete bins, and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms. There are separate transforms depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both.
943+
[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@data-workflows/plot-bin) · Aggregates continuous data — quantitative or temporal values such as temperatures or times — into discrete bins, and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms. There are separate transforms depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both *x* and *y*.
944944

945-
Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting binned data is an array of arrays where each inner array is a subset of the input data [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order, while the outer array is in natural order according to the associated dimension (*x* then *y*). Empty bins are skipped. By specifying a different aggregation method for the *data* output, as described next, you can change how the binned data is computed.
945+
Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting binned data is an array of arrays where each inner array is a subset of the input data [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order. The outer array is in ascending order according to the associated dimension (*x* then *y*). Empty bins are skipped. By specifying a different aggregation method for the *data* output, as described below, you can change how the binned data is computed.
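
As a rough illustration of the default *data* output described above, the following plain-JavaScript sketch bins an array by a numeric accessor. The `binData` helper and its arguments are hypothetical and not part of Plot; the sketch also assumes that a value equal to a threshold falls into the higher bin.

```js
// Hypothetical helper, not part of Plot: bins data by a numeric accessor
// using an ascending array of threshold values (n thresholds, n + 1 bins).
function binData(data, value, thresholds) {
  const bins = Array.from({length: thresholds.length + 1}, () => []);
  for (const d of data) {
    const v = value(d);
    // The bin index is the number of thresholds less than or equal to v
    // (assumption: values equal to a threshold go to the higher bin).
    let i = 0;
    while (i < thresholds.length && thresholds[i] <= v) ++i;
    bins[i].push(d); // each inner array keeps input order
  }
  // The outer array is already in ascending order; drop empty bins.
  return bins.filter((bin) => bin.length > 0);
}

const athletes = [{w: 72}, {w: 55}, {w: 91}, {w: 58}];
const binned = binData(athletes, (d) => d.w, [60, 70, 80, 90]);
// binned is [[{w: 55}, {w: 58}], [{w: 72}], [{w: 91}]]
```

Here the two empty bins (60 to 70 and 80 to 90) are skipped, and the first inner array lists 55 before 58 because that is their order in the input.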
946946

947-
While it is possible to compute channel values on the binned data by defining channel values as a function, more commonly channel values are computed by the bin transform, either implicitly or explicitly. The following channels are automatically computed by the bin transform:
947+
While it is possible to compute channel values on the binned data by defining channel values as a function, more commonly channel values are computed directly by the bin transform, either implicitly or explicitly. In addition to data, the following channels are automatically binned:
948948

949949
* **x1** - the starting horizontal position of the bin
950950
* **x2** - the ending horizontal position of the bin
@@ -958,7 +958,7 @@ While it is possible to compute channel values on the binned data by defining ch
958958

959959
The **x1**, **x2**, and **x** output channels are only computed by the Plot.binX and Plot.bin transforms; similarly, the **y1**, **y2**, and **y** output channels are only computed by the Plot.binY and Plot.bin transforms.
960960

961-
In addition to the automatically binned channels, you can declare additional channels to bin by specifying the desired aggregation method in the *outputs* object which is the first argument to the transform. For example, to use [Plot.binX](#plotbinxoutputs-options) to generate a **y** channel of bin counts as in a frequency histogram:
961+
You can declare additional channels to bin by specifying the desired aggregation method in the *outputs* object, which is the first argument to the transform. For example, to use [Plot.binX](#plotbinxoutputs-options) to generate a **y** channel of bin counts as in a frequency histogram:
962962

963963
```js
964964
Plot.binX({y: "count"}, {x: "culmen_length_mm"})
@@ -987,29 +987,32 @@ Most aggregation methods require binding the output channel to an input channel;
987987
Plot.binX({y: "sum"}, {x: "culmen_length_mm", y: "body_mass_g"})
988988
```
989989

990-
TODO Describe binning options:
990+
To control how the quantitative dimensions *x* and *y* are divided into bins, the following options are supported:
991991

992-
* **thresholds** -
993-
* **domain** -
994-
* **cumulative** -
992+
* **thresholds** - the threshold values; see below
993+
* **domain** - values outside the domain will be omitted
994+
* **cumulative** - if positive, each bin will contain all lesser bins
995995

996-
For separate dimensions *x* and *y*:
996+
If **cumulative** is negative (typically -1 by convention), each bin will contain all *greater* bins rather than all *lesser* bins, representing the [complementary cumulative distribution](https://en.wikipedia.org/wiki/Cumulative_distribution_function#Complementary_cumulative_distribution_function_.28tail_distribution.29).
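
The effect of the sign of **cumulative** can be sketched on an array of per-bin counts. This is an illustration only: `cumulate` is a hypothetical helper, not a Plot function, and it operates on plain counts rather than on Plot's internal bins.

```js
// Hypothetical helper, not part of Plot: accumulates per-bin counts
// according to the sign of `cumulative`.
function cumulate(counts, cumulative) {
  const out = counts.slice(); // leave the input untouched
  if (cumulative > 0) {
    // Each bin includes all lesser bins.
    for (let i = 1; i < out.length; ++i) out[i] += out[i - 1];
  } else if (cumulative < 0) {
    // Each bin includes all greater bins (complementary cumulative).
    for (let i = out.length - 2; i >= 0; --i) out[i] += out[i + 1];
  }
  return out;
}

const counts = [2, 5, 3];
const lesser = cumulate(counts, 1); // [2, 7, 10]
const greater = cumulate(counts, -1); // [10, 8, 3]
```

In both directions the largest value equals the total count, at the last bin for a positive value and at the first bin for a negative one.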
997997

998-
* *scale*.**thresholds** -
999-
* *scale*.**domain** -
1000-
* *scale*.**cumulative** -
1001-
* *scale*.**value** -
998+
For Plot.bin, you may wish to separate binning options for *x* and *y*. To allow this, the **x** and **y** input channels can be specified as an object with the options above and a **value** option to specify the input channel values.
1002999

1003-
TODO Describe threshold functions:
1000+
```js
1001+
Plot.binX({y: "count"}, {x: {thresholds: 20, value: "culmen_length_mm"}})
1002+
```
10041003

1005-
* *freedman-diaconis* -
1006-
* *scott* -
1007-
* *sturges* -
1004+
The **thresholds** option may be specified as a named method or in a variety of other ways:
1005+
1006+
* *freedman-diaconis* - the [Freedman–Diaconis rule](https://en.wikipedia.org/wiki/Freedman–Diaconis_rule)
1007+
* *scott* - [Scott’s normal reference rule](https://en.wikipedia.org/wiki/Histogram#Scott.27s_normal_reference_rule)
1008+
* *sturges* - [Sturges’ formula](https://en.wikipedia.org/wiki/Histogram#Sturges.27_formula)
10081009
* a count (hint) representing the desired number of bins
10091010
* an array of *n* threshold values for *n* + 1 bins
10101011
* a time interval (for temporal binning)
10111012
* a function that returns an array, count, or time interval
10121013

1014+
If the **thresholds** option is not specified, it defaults to *freedman-diaconis*. If a function, it is passed three arguments: the array of input values, the domain minimum, and the domain maximum. If a number, [d3.ticks](https://github.com/d3/d3-array/blob/master/README.md#ticks) or [d3.utcTicks](https://github.com/d3/d3-time/blob/master/README.md#ticks) is used to choose suitable nice thresholds.
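
A function-valued **thresholds** option might look like the following sketch, which relies only on the three-argument signature described above. The `fixedWidth` factory is hypothetical and not part of Plot; it returns thresholds at multiples of `step` strictly inside the domain.

```js
// Hypothetical factory, not part of Plot: returns a threshold function
// that places a threshold at every multiple of `step` strictly between
// the domain minimum and maximum.
function fixedWidth(step) {
  // The returned function takes (values, min, max) as described above;
  // this sketch ignores `values` and uses only the domain extent.
  return (values, min, max) => {
    const thresholds = [];
    for (let t = (Math.floor(min / step) + 1) * step; t < max; t += step) {
      thresholds.push(t);
    }
    return thresholds;
  };
}

const everyTen = fixedWidth(10)([], 32, 60); // [40, 50]

// Assumed usage with the options documented above:
// Plot.binX({y: "count"}, {x: "culmen_length_mm", thresholds: fixedWidth(5)})
```

Returning an array gives exact control over bin edges, whereas returning a count leaves the choice of nice threshold values to the tick generator.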
1015+
10131016
TODO Describe grouping and faceting. Describe what happens to the group-eligible channels (*z*, *fill*, *stroke*).
10141017

10151018
TODO Describe default insets.
