specification/changes/2025-08-25-bucket-beacons/background.md
26 additions & 22 deletions
@@ -62,9 +62,10 @@ with four of the queries returning nothing.
 Maybe you need a different number of buckets for different standard beacons.
 
 For example, maybe
-* `LastName` has a terribly uneven distribution, and so needs 5 buckets.
-* `Phone` is pretty much unique in each item, and so only needs one bucket.
-* `Precinct` is just a little bit uneven, and so needs two buckets.
+
+- `LastName` has a terribly uneven distribution, and so needs 5 buckets.
+- `Phone` is pretty much unique in each item, and so only needs one bucket.
+- `Precinct` is just a little bit uneven, and so needs two buckets.
 
 To accomplish this, you set the number of buckets for the whole table to `5`,
 and then constrain the number of buckets for the other two beacons.
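
The per-beacon bucket counts in this hunk can be illustrated with a minimal sketch. It assumes the modulo rule that the document uses in a later hunk (`4 % 2`); the names `beacon_bucket`, `constraints`, and `bucket_table` are hypothetical and are not part of any library API.

```python
# Hypothetical helper: derive a beacon's bucket from the item's bucket.
# Assumes the modulo rule shown later in this document (e.g. `4 % 2`).
def beacon_bucket(item_bucket: int, constraint: int) -> int:
    return item_bucket % constraint

TABLE_BUCKETS = 5
# Buckets per beacon, taken from the example in the hunk above.
constraints = {"LastName": 5, "Phone": 1, "Precinct": 2}

def bucket_table() -> dict[str, list[int]]:
    """For each beacon, list the bucket it uses for item buckets 0..4."""
    return {
        name: [beacon_bucket(b, limit) for b in range(TABLE_BUCKETS)]
        for name, limit in constraints.items()
    }

for name, buckets in bucket_table().items():
    print(f"{name:9} {buckets}")
```

Under these assumptions, `LastName` keeps all five buckets, `Phone` always lands in bucket 0, and `Precinct` alternates between buckets 0 and 1.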
@@ -75,13 +76,14 @@ The ensures that the values are reasonably well distributed among the constraine
 while still guaranteeing that, given the bucket for the item,
 we can uniquely identify the bucket for the constrained beacon.
 
-***WARNING*** Adding or changing the constraint on a beacon is difficult, sometimes impossible,
+**_WARNING_** Adding or changing the constraint on a beacon is difficult, sometimes impossible,
 once any items have been written.
 
 The only situation in which you might consider adding a constraint to a beacon is if **all** of the following apply:
-* You're going to use that beacon in an index (GSI).
-* The queries you make against that index are expected to return a very small number of results.
-* Your security people have agreed that reducing the number of buckets for this beacon is acceptable.
+
+- You're going to use that beacon in an index (GSI).
+- The queries you make against that index are expected to return a very small number of results.
+- Your security people have agreed that reducing the number of buckets for this beacon is acceptable.
 
 ### Behind the scenes
 
@@ -170,9 +172,10 @@ This would allow one to set `numberOfBuckets = 5` on the LastName beacon as abov
 and then choose the bucket for each item explicitly, based on the LastName value.
 Perhaps "Smith" would be divided among all 5 buckets, while "Svaboda" was always placed in bucket zero.
 This has two benefits:
-* This can do a much better job of overcoming the [shortcoming of Beacons](#a-shortcoming-of-beacons),
-  by only splitting the popular names, making the distribution of hashes even more regular.
-* When querying on the LastName "Svaboda", only one query is needed, rather than five.
+
+- This can do a much better job of overcoming the [shortcoming of Beacons](#a-shortcoming-of-beacons),
+  by only splitting the popular names, making the distribution of hashes even more regular.
+- When querying on the LastName "Svaboda", only one query is needed, rather than five.
 
 ## Changing Beacon Constraints
 
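
The explicit bucket choice described in the hunk above might look like the following sketch. Everything here is an assumption made for illustration: `POPULAR_LAST_NAMES`, `choose_bucket`, and `buckets_to_query` are hypothetical names, and the source does not say how popular names would be identified.

```python
import random

NUMBER_OF_BUCKETS = 5
# Hypothetical set of names common enough to be worth splitting.
POPULAR_LAST_NAMES = {"Smith"}

def choose_bucket(last_name: str, rng: random.Random) -> int:
    """Pick the bucket to write an item into, based on its LastName."""
    if last_name in POPULAR_LAST_NAMES:
        return rng.randrange(NUMBER_OF_BUCKETS)  # spread across all buckets
    return 0  # rare names always land in bucket zero

def buckets_to_query(last_name: str) -> list[int]:
    """Buckets a reader must query to find every item with this LastName."""
    if last_name in POPULAR_LAST_NAMES:
        return list(range(NUMBER_OF_BUCKETS))
    return [0]

print(buckets_to_query("Smith"))    # five queries needed
print(buckets_to_query("Svaboda"))  # a single query suffices
```

The reader and the writer must agree on which names are split, so in this sketch both functions consult the same set.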
@@ -188,16 +191,16 @@ Why not? Because when we write, the item has a bucket, and each beacon calculate
 When we query, the query has a bucket and each beacon calculates its bucket from the query's bucket.
 If you change a constraint, then the beacon bucket calculation at query time will not produce the same results as were used when the item was written. For example:
 
-* `maximumNumberOfBuckets` for a table is 5.
-* Beacon A has no constraint.
-* Beacon B is constrained to 2 buckets.
-* We write an item. Its bucket is 4.
-* Beacon A is put in bucket 4, beacon B is put in bucket `4 % 2` or 0.
-* We search with ":aws_dbe_bucket = 4". We look for beacon A in bucket 4 and beacon B in bucket 0.
-* We find the item.
-* Now we change beacon B's constraint to `3`.
-* We search with ":aws_dbe_bucket = 4". We look for beacon A in bucket 4 and beacon B in bucket `4 % 3` or 1.
-* We do not find the item.
+- `maximumNumberOfBuckets` for a table is 5.
+- Beacon A has no constraint.
+- Beacon B is constrained to 2 buckets.
+- We write an item. Its bucket is 4.
+- Beacon A is put in bucket 4, beacon B is put in bucket `4 % 2` or 0.
+- We search with ":aws_dbe_bucket = 4". We look for beacon A in bucket 4 and beacon B in bucket 0.
+- We find the item.
+- Now we change beacon B's constraint to `3`.
+- We search with ":aws_dbe_bucket = 4". We look for beacon A in bucket 4 and beacon B in bucket `4 % 3` or 1.
+- We do not find the item.
 
 ### Long answer
 
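
The walkthrough in the hunk above can be reproduced with a short sketch. It relies only on the modulo rule stated in the list (`4 % 2`, `4 % 3`); the function names are hypothetical and do not correspond to any library API.

```python
MAXIMUM_NUMBER_OF_BUCKETS = 5

def beacon_bucket(bucket: int, constraint: int) -> int:
    """Bucket a constrained beacon uses, given the item or query bucket."""
    return bucket % constraint

def query_finds_item(bucket: int, write_constraint: int, query_constraint: int) -> bool:
    """True if a query for `bucket` looks where beacon B was written."""
    stored = beacon_bucket(bucket, write_constraint)    # fixed at write time
    searched = beacon_bucket(bucket, query_constraint)  # recomputed per query
    return stored == searched

def lost_buckets(write_constraint: int, query_constraint: int) -> list[int]:
    """Item buckets whose items can no longer be found after the change."""
    return [
        b for b in range(MAXIMUM_NUMBER_OF_BUCKETS)
        if not query_finds_item(b, write_constraint, query_constraint)
    ]

print(query_finds_item(4, write_constraint=2, query_constraint=2))  # True
print(query_finds_item(4, write_constraint=2, query_constraint=3))  # False
print(lost_buckets(2, 3))  # [2, 3, 4]
```

In this sketch, changing beacon B's constraint from 2 to 3 hides the items written under item buckets 2, 3, and 4, not only the one from the walkthrough.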
@@ -213,8 +216,9 @@ then all of the items can still be found.
 and include both plain standard beacons and standard beacons used as part of a compound beacon.
 
 Compatible in this context means that, for any ":aws_dbe_bucket = N",
-* before the change, all involved beacons were put in the same bucket
-* after the change, all involved beacons are put in the same bucket
+
+- before the change, all involved beacons were put in the same bucket
+- after the change, all involved beacons are put in the same bucket
 
 but the bucket before might be different from the bucket after.
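
One literal reading of this compatibility rule can be sketched as below. The sketch assumes the modulo rule from the earlier example and assumes that "involved beacons" can be represented by a list of their constraints; `same_bucket` and `is_compatible` are hypothetical names, and the surrounding text (not shown in this diff) may define the rule more precisely.

```python
def same_bucket(query_bucket: int, constraints: list[int]) -> bool:
    """True if every involved beacon maps the query bucket to one bucket."""
    return len({query_bucket % c for c in constraints}) == 1

def is_compatible(max_buckets: int, before: list[int], after: list[int]) -> bool:
    """Check the rule for every possible ":aws_dbe_bucket = N"."""
    return all(
        same_bucket(n, before) and same_bucket(n, after)
        for n in range(max_buckets)
    )

# Both beacons move from 2 buckets to 3 together: they agree with each
# other before and after, although the shared bucket itself may change.
print(is_compatible(5, before=[2, 2], after=[3, 3]))  # True
# Beacons with different constraints disagree for some N (e.g. N = 2).
print(is_compatible(5, before=[2, 3], after=[3, 3]))  # False
```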