Commit b06949e

committed
Update README
1 parent 86be65e commit b06949e

1 file changed

+47
-17
lines changed

README.md

Lines changed: 47 additions & 17 deletions
@@ -940,9 +940,54 @@ Plot’s option transforms, listed below, do more than populate the **transform*
940940

941941
[<img src="./img/bin.png" width="320" height="198" alt="a histogram of athletes by weight">](https://observablehq.com/@data-workflows/plot-bin)
942942

943-
[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@data-workflows/plot-bin) · Aggregates continuous, quantitative data — such as temperatures or times — into discrete bins. You can then compute summary statistics for each bin, such as a count or sum. The bin transform is like a [group transform](#group) for quantitative data, and is most often used to make histograms or heatmaps.
943+
[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@data-workflows/plot-bin) · Aggregates continuous data — quantitative or temporal values such as temperatures or times — into discrete bins, and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms.
944944

945-
TODO Describe how the binning dimensions and output channels are specified. Describe the resulting binned data.
945+
There are several variants of the bin transform depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both.
946+
947+
Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting binned data is an array of arrays, where each inner array is a subset of the input data: [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order, while the outer array is in natural order according to the associated dimension (*x* then *y*). Empty bins are skipped. By specifying a different aggregation method for the *data* output, as described next, you can change how the binned data is computed.
948+
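The shape of the default binned data can be illustrated with a small standalone sketch. It does not use Plot's implementation; `binData` is a hypothetical helper, and the fixed-width thresholds (`start`, `step`) are an assumption chosen for simplicity.

```js
// Hypothetical helper (not part of Plot): bin elements into fixed-width
// intervals [start + i * step, start + (i + 1) * step).
function binData(data, value, start, step) {
  const bins = new Map(); // bin index -> elements, kept in input order
  for (const d of data) {
    const i = Math.floor((value(d) - start) / step);
    if (!bins.has(i)) bins.set(i, []);
    bins.get(i).push(d);
  }
  // Order the outer array by bin position; bins that received no
  // elements were never created, so empty bins are skipped.
  return Array.from(bins.keys())
    .sort((a, b) => a - b)
    .map((i) => bins.get(i));
}

const data = [{w: 72}, {w: 55}, {w: 58}, {w: 91}, {w: 70}];
const binned = binData(data, (d) => d.w, 50, 10);
// binned holds three inner arrays: [55, 58], [72, 70], [91] (by w);
// the empty intervals [60, 70) and [80, 90) do not appear.
```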
949+
While it is possible to compute channel values on the binned data by defining channel values as a function, more commonly channel values are computed by the bin transform, either implicitly or explicitly. The following channels are automatically computed by the bin transform:
950+
951+
* **x1** - the starting horizontal position of the bin
952+
* **x2** - the ending horizontal position of the bin
953+
* **x** - the horizontal center of the bin
954+
* **y1** - the starting vertical position of the bin
955+
* **y2** - the ending vertical position of the bin
956+
* **y** - the vertical center of the bin
957+
* **z** - the first value of the *z* channel, if any
958+
* **fill** - the first value of the *fill* channel, if any
959+
* **stroke** - the first value of the *stroke* channel, if any
960+
961+
The **x1**, **x2**, and **x** output channels are only computed by the Plot.binX and Plot.bin transforms; similarly the **y1**, **y2**, and **y** output channels are only computed by the Plot.binY and Plot.bin transforms.
962+
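As a rough illustration of the automatically computed position channels, the sketch below derives **x1**, **x2**, and **x** for each non-empty bin. The helper name `binExtents`, the fixed-width bins, and the use of the midpoint as the bin center are assumptions made for this example, not a description of Plot's internals.

```js
// Hypothetical helper (not part of Plot): compute the extent and center
// of every non-empty fixed-width bin.
function binExtents(values, start, step) {
  const indexes = new Set(values.map((v) => Math.floor((v - start) / step)));
  return Array.from(indexes)
    .sort((a, b) => a - b)
    .map((i) => {
      const x1 = start + i * step; // starting position of the bin
      const x2 = x1 + step; // ending position of the bin
      return {x1, x2, x: (x1 + x2) / 2}; // x: midpoint, assumed to be the center
    });
}

const extents = binExtents([72, 55, 58, 91, 70], 50, 10);
// → [{x1: 50, x2: 60, x: 55}, {x1: 70, x2: 80, x: 75}, {x1: 90, x2: 100, x: 95}]
```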
963+
In addition to the automatically binned channels, you can declare further channels to bin by specifying the desired aggregation method in the *outputs* object, which is the first argument to the transform. For example, to use [Plot.binX](#plotbinxoutputs-options) to generate a **y** channel of bin counts, as in a frequency histogram:
964+
965+
```js
966+
Plot.binX({y: "count"}, {x: "culmen_length_mm"})
967+
```
968+
969+
The following aggregation methods are supported:
970+
971+
* *first* - the first value, in input order
972+
* *last* - the last value, in input order
973+
* *count* - the number of elements (frequency)
974+
* *sum* - the sum of values
975+
* *proportion* - the sum proportional to the overall total (weighted frequency)
976+
* *proportion-facet* - the sum proportional to the facet total
977+
* *deviation* - the standard deviation
978+
* *min* - the minimum value
979+
* *max* - the maximum value
980+
* *mean* - the mean value (average)
981+
* *median* - the median value
982+
* *variance* - the variance per [Welford’s algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
983+
* a function - passed the array of values for each bin
984+
* an object with a *reduce* method - passed the index for each bin, and all values
985+
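To make the list above concrete, here is a minimal standalone sketch of several of these aggregation methods and of the three ways a reducer can be specified (name, function, or object with a *reduce* method). The `reducers` table and `applyReducer` are hypothetical names for this example. The variance uses Welford's algorithm with an n − 1 denominator, which is an assumption about the intended definition.

```js
// Hypothetical reducer table (not Plot's source); each takes a bin's values.
const reducers = {
  first: (v) => v[0],
  last: (v) => v[v.length - 1],
  count: (v) => v.length,
  sum: (v) => v.reduce((a, b) => a + b, 0),
  min: (v) => Math.min(...v),
  max: (v) => Math.max(...v),
  mean: (v) => v.reduce((a, b) => a + b, 0) / v.length,
  median(v) {
    const s = [...v].sort((a, b) => a - b);
    const m = s.length >> 1;
    return s.length % 2 ? s[m] : (s[m - 1] + s[m]) / 2;
  },
  variance(v) {
    // Welford's online algorithm; sample variance (n - 1) is assumed here.
    let n = 0, mean = 0, m2 = 0;
    for (const x of v) {
      const delta = x - mean;
      mean += delta / ++n;
      m2 += delta * (x - mean);
    }
    return n > 1 ? m2 / (n - 1) : undefined;
  },
  deviation(v) {
    const s2 = reducers.variance(v);
    return s2 === undefined ? undefined : Math.sqrt(s2);
  }
};

// index: positions of one bin's elements; values: the whole input channel.
function applyReducer(reducer, index, values) {
  if (typeof reducer === "string") return reducers[reducer](index.map((i) => values[i]));
  if (typeof reducer === "function") return reducer(index.map((i) => values[i]));
  return reducer.reduce(index, values); // object form sees the index and all values
}

applyReducer("sum", [0, 2], [1, 2, 3]); // → 4
applyReducer((v) => v.length, [0, 2], [1, 2, 3]); // → 2
applyReducer({reduce: (I, V) => I.length / V.length}, [0, 2], [1, 2, 3, 4]); // → 0.5
```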
986+
Most aggregation methods require binding the output channel to an input channel; for example, if you want the **y** output channel to be a *sum* (not merely a count), there should be a corresponding **y** input channel specifying which values to sum. If there is not, *sum* will be equivalent to *count*.
987+
988+
```js
989+
Plot.binX({y: "sum"}, {x: "culmen_length_mm", y: "body_mass_g"})
990+
```
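
The fallback from *sum* to *count* can be pictured with a small standalone sketch. `sumOrCount` is a hypothetical helper for illustration only, and the sample masses are made-up values: when an input channel is bound, the bin's values are summed; when it is not, each element effectively contributes 1.

```js
// Hypothetical helper (not part of Plot): sum the bound channel over a bin,
// or fall back to counting the bin's elements when no channel is bound.
function sumOrCount(bin, channel) {
  return channel === undefined
    ? bin.length
    : bin.reduce((total, d) => total + channel(d), 0);
}

const bin = [{body_mass_g: 3750}, {body_mass_g: 3800}, {body_mass_g: 3250}];
sumOrCount(bin, (d) => d.body_mass_g); // → 10800
sumOrCount(bin); // → 3
```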
946991

947992
TODO Describe binning options:
948993

@@ -967,21 +1012,6 @@ TODO Describe threshold functions:
9671012
* a time interval (for temporal binning)
9681013
* a function that returns an array, count, or time interval
9691014

970-
TODO Describe output aggregation. Supported reducers:
971-
972-
* *first* -
973-
* *last* -
974-
* *count* -
975-
* *sum* -
976-
* *proportion* -
977-
* *proportion-facet* -
978-
* *deviation* -
979-
* *min* -
980-
* *max* -
981-
* *mean* -
982-
* *median* -
983-
* *variance* -
984-
9851015
TODO Describe grouping and faceting. Describe what happens to the group-eligible channels (*z*, *fill*, *stroke*).
9861016

9871017
TODO Describe default insets.

0 commit comments