
Commit 9962804

Polish docs for GenStage.PartitionDispatcher (#292)
1 parent a95d3be commit 9962804

File tree: 1 file changed (+37, −24 lines)


lib/gen_stage/dispatchers/partition_dispatcher.ex

Lines changed: 37 additions & 24 deletions
@@ -2,28 +2,12 @@ defmodule GenStage.PartitionDispatcher do
   @moduledoc """
   A dispatcher that sends events according to partitions.
 
-  This dispatcher assumes that partitions are evenly distributed.
-  If the data is uneven for long periods of time, then you may
-  buffer excessive data from busy partitions for long periods of
-  time. This happens because the producer is unable to distinguish
-  from which particular consumer/partition demand arrives.
-
-  Let's see an example. Imagine you have three consumers/partitions:
-  A, B, and C. Let's assume 60% of the data goes to A, 20% to B, and
-  20% to C. Let's also say `max_demand` is 10 and `min_demand` is 5.
-  When they initially request data, 10 events for each, A will receive
-  18 while B and C receive 6 each. After processing 5 events (min demand),
-  they request additional 5 events each, which at this point will be 9
-  additional elements for A, and 3 additional elements for B and C.
-  At the end of these two rounds, we will have:
-
-      A = 18 - 5 + 9 = 22 events
-      B = 6 - 5 + 3 = 4 events
-      C = 6 - 5 + 3 = 4 events
+  This dispatcher assumes that partitions are *evenly distributed*.
+  See the ["Even distribution"](#module-even-distribution) section for
+  more information.
 
-  Furthermore, as B and C request more items, A will only go further
-  behind. This behaviour is fine for spikes that should quickly
-  resolve, but it can be problematic if the data is consistently uneven.
+  When multiple consumers subscribe to one partition, the producer
+  behaves like a `GenStage.DemandDispatcher` *within that partition*.
 
   ## Options
 
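As a sketch of the new wording above, assuming hypothetical `MyProducer` (configured with this dispatcher) and `MyConsumer` modules: with two consumers subscribed to the same partition, events hashed to partition `0` are balanced between them according to their demand.

    {:ok, producer} = GenStage.start_link(MyProducer, 0)
    {:ok, c1} = GenStage.start_link(MyConsumer, :ok)
    {:ok, c2} = GenStage.start_link(MyConsumer, :ok)

    # Both consumers attach to partition 0; within that partition the
    # producer behaves like a demand dispatcher between c1 and c2.
    GenStage.sync_subscribe(c1, to: producer, partition: 0)
    GenStage.sync_subscribe(c2, to: producer, partition: 0)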
@@ -35,14 +19,14 @@ defmodule GenStage.PartitionDispatcher do
       is named from 0 up to `integer - 1`. For example, `partitions: 4`
       will contain four partitions named `0`, `1`, `2` and `3`.
 
-      It may also be an *enumerable* that specifies the name of every partition.
+      It may also be an *enumerable* that specifies the name of each partition.
       For instance, `partitions: [:odd, :even]` will build two partitions,
       named `:odd` and `:even`.
 
     * `:hash` - the hashing algorithm. It's a function of type
       `t:hash_function/0`, which receives the event and returns a tuple with two
-      elements, the event to be dispatched as first argument and the partition
-      as second. The function can also return `:none`, in which case the event
+      elements: the event to be dispatched and the partition to dispatch it to.
+      The function can also return `:none`, in which case the event
       is discarded. The partition must be one of the partitions specified in
       `:partitions` above. The default uses:
 
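A minimal sketch of how the `:partitions` and `:hash` options fit together, assuming a hypothetical `NumberProducer` module; the `[:odd, :even]` partition names and the `rem/2`-based hash are illustrative only:

    defmodule NumberProducer do
      use GenStage

      def init(counter) do
        # Route even numbers to :even and odd numbers to :odd. Returning
        # :none from the hash function would discard the event instead.
        hash = fn event ->
          {event, if(rem(event, 2) == 0, do: :even, else: :odd)}
        end

        {:producer, counter,
         dispatcher: {GenStage.PartitionDispatcher, partitions: [:odd, :even], hash: hash}}
      end

      def handle_demand(demand, counter) do
        events = Enum.to_list(counter..(counter + demand - 1))
        {:noreply, events, counter + demand}
      end
    end

Consumers would then subscribe with `partition: :odd` or `partition: :even`.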
@@ -76,6 +60,35 @@ defmodule GenStage.PartitionDispatcher do
 
       GenStage.sync_subscribe(consumer, to: producer, partition: 0)
 
+  ## Even distribution
+
+  This dispatcher assumes that partitions are *evenly distributed*.
+  If the data is uneven for long periods of time, then you may
+  buffer excessive data from busy partitions for long periods of
+  time. This happens because the producer is unable to distinguish
+  from which particular consumer/partition demand arrives.
+
+  Let's see an example. Imagine you have three consumers, each
+  for one partition: `A`, `B`, and `C`.
+
+  Let's assume 60% of the data goes to `A`, 20% to `B`, and 20% to
+  `C`. Let's also say that `max_demand` is `10` and `min_demand` is
+  `5`. When the consumers initially request data (`10` events each),
+  the producer receives a total demand of `30`. A will receive `18` of
+  those (60%), while `B` and `C` receive `6` each (20%). After
+  processing `5` events (the `min_demand`), each consumer requests
+  additional `5` events, for a total of `15` additional events. At
+  this point, that will be `9` additional elements for A, and 3
+  additional elements for B and C. At the end of these two rounds, we
+  will have:
+
+      A = 18 - 5 + 9 = 22 events
+      B = 6 - 5 + 3 = 4 events
+      C = 6 - 5 + 3 = 4 events
+
+  Furthermore, as B and C request more items, A will only go further
+  behind. This behaviour is fine for spikes that should quickly
+  resolve, but it can be problematic if the data is consistently uneven.
   """
 
   @typedoc """
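The three-consumer scenario in the new "Even distribution" section could be wired up roughly as in the following sketch, where `producer`, `a`, `b`, and `c` are hypothetical pids and the producer is assumed to use `partitions: [:a, :b, :c]`; the demand values mirror the example:

    for {consumer, partition} <- [{a, :a}, {b, :b}, {c, :c}] do
      GenStage.sync_subscribe(consumer,
        to: producer,
        partition: partition,
        max_demand: 10,
        min_demand: 5
      )
    end

If 60% of the events hash to `:a`, consumer `a` accumulates buffered events over successive demand rounds exactly as the arithmetic above shows.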
