Commit cf53b25

Update bandit action context examples to use attributes only related to subject-action interaction (#504)
* update examples
* update general page on action contexts
* feedback from PR
* changes from self-review
1 parent e52db59 commit cf53b25

5 files changed, with 61 additions and 60 deletions.

docs/contextual-bandits/actions_contexts.md

Lines changed: 16 additions & 12 deletions
@@ -11,7 +11,7 @@ At a high level, to select a particular action, the Contextual Bandit performs t

 There are two processes at play that make the Contextual Bandit work:

-1. **Real-time decision making**: based on provided actions and contexts, select an action
+1. **Real-time decision-making**: based on provided actions and contexts, select an action
 2. **Model updating**: periodically, update the parameters of the Contextual Bandit based on outcome data.

 ## Bandit objective
@@ -32,7 +32,7 @@ Consider optimizing across multiple product promotions that we can show the user
 We can leverage the built-in [experiment analysis](/contextual-bandits/analysis) to verify that the selected objective is well-aligned with improving long term outcomes.
 :::

-## Real-time decision making
+## Real-time decision-making

 In order to make real-time decisions, you provide Eppo with actions and contexts that are relevant to personalize the experience.
 The Eppo SDK then uses the underlying Contextual Bandit model to select an action, balancing exploration and exploitation; learning and optimizing at the same time.
@@ -61,31 +61,35 @@ Contextual Bandits automatically explore new actions as they are encountered and

 To create a bandit policy that personalizes, you need to provide context.

-Generally, there are three types of context:
-1. Subject context: For example, the subjects's past behavior, demographic information, are they on a mobile device, etc.
-2. Action context: For example, the size of the discount, the type of product, the price, etc.
-3. Subject-action interaction context: For example, the brand affinity of the subject to the product (based on past purchases, user reviews, etc.)
+Generally, there are two types of context:
+1. Subject context: For example, the subject's past behavior, demographic information, whether they are on a mobile device, etc.
+2. Action (Subject-action interaction) context: For example, the number of previous purchases of the action's brand or product category.

-Note that the first of these is independent of the actions, while the other two are action dependent.
-For convenience, you can supply the subject attributes once and they will be used across all actions, while
-separately, you can supply action specific attributes.
+Note that the first of these is independent of the actions, while the other is action dependent.
+The subject attributes are provided directly and not tied to a specific action, while separately, you can supply action-specific attributes for each action.
 Behind the scenes, we combine the two to create a single context per action that is used by the underlying model to select which action to pick.

+:::info Currently, we build bandit models on a per-action basis
+Be sure action attributes are subject-specific. Any action attributes that are not specific to the subject will end up
+being the same for all subjects and be ignored as they will not be predictive.
+:::
+
+
 #### Subject attributes

 Subject attributes capture information about the subject that is relevant to the actions.
 Generic attributes such as age (bucket), gender, and device information can be helpful, but the most salient attributes are product specific.

 #### Action attributes

-Action attributes capture information that is unique to a particular action. For example, the discount offered in a promotion, the price of a product, etc.
-Furthermore, you can also leverage action attributes to include information that is specific to the subject-action pair. For example, you can leverage an internal Machine Learning model that measures brand affinity.
+Action attributes capture information that is unique to a particular action for the subject. For example, the number of
+previous purchases the subject has made of the action's brand or product category.

 :::info What context attributes to use
 Selection of which attributes to include in the context is a bit of an art.

 In general, there are two types of attributes that will help the Contextual Bandit efficiently learn a strong policy:
-1. Attributes that provide a strong signal on personalization: For example, the brand affinity of the subject to the product, the price, etc.
+1. Attributes that provide a strong signal on personalization: For example, the brand affinity of the subject to the product.
 2. Attributes that predict the outcome of an action: For example, when optimizing for purchases, whether a user is a new or returning user might not affect which action is best, but it can help reduce variance (similar to CUPED), helping the contextual bandit learn more efficiently
 :::

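The changed text in this file says that subject attributes and per-action attributes are combined into a single context per action. The docs do not describe how that combination is done, so the following is only a minimal Python sketch of the idea. The helper name `build_action_contexts` and the `subject.` / `action.` key prefixes are hypothetical and are not part of the Eppo SDK.

```python
from typing import Dict, Union

AttributeValue = Union[int, float, str, bool]
Attributes = Dict[str, AttributeValue]


def build_action_contexts(
    subject_attributes: Attributes,
    actions: Dict[str, Attributes],
) -> Dict[str, Attributes]:
    """Merge the shared subject attributes with each action's own attributes.

    Hypothetical helper for illustration only; the real SDK does this
    internally and may represent contexts differently.
    """
    contexts: Dict[str, Attributes] = {}
    for action_name, action_attributes in actions.items():
        # The subject attributes are supplied once and reused for every action.
        merged: Attributes = {
            f"subject.{name}": value for name, value in subject_attributes.items()
        }
        # Action attributes describe the subject-action interaction.
        merged.update(
            {f"action.{name}": value for name, value in action_attributes.items()}
        )
        contexts[action_name] = merged
    return contexts


subject = {"age": 30, "loyalty_tier": "gold"}
actions = {
    "nike": {"brand_affinity": 2.3, "previously_purchased": True},
    "adidas": {"brand_affinity": 0.2, "previously_purchased": False},
}
contexts = build_action_contexts(subject, actions)
```

Each entry of `contexts` carries the same subject keys plus the attributes that differ per action, which is the shape a per-action model would score.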
docs/sdks/sdk-features/bandits.md

Lines changed: 2 additions & 2 deletions
@@ -38,11 +38,11 @@ subject_attributes = eppo_client.bandit.ContextAttributes(
 actions = {
     "nike": eppo_client.bandit.ContextAttributes(
         numeric_attributes={"brand_affinity": 2.3},
-        categorical_attributes={"aspect_ratio": "16:9"}
+        categorical_attributes={"previously_purchased": true}
     ),
     "adidas": eppo_client.bandit.ContextAttributes(
         numeric_attributes={"brand_affinity": 0.2},
-        categorical_attributes={"aspect_ratio": "16:9"}
+        categorical_attributes={"previously_purchased": false}
     )
 }

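This hunk swaps `aspect_ratio` for `previously_purchased`. A plausible reading, given the per-action note added in `actions_contexts.md`, is that an attribute with the same value for every subject who sees a given action carries no signal for that action's model. The Python sketch below illustrates one way to spot such attributes in logged data. The function `constant_action_attributes` and the sample observations are hypothetical and are not taken from the SDK.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Observation = Tuple[str, Dict[str, Hashable]]


def constant_action_attributes(
    observations: Iterable[Observation],
) -> Dict[str, List[str]]:
    """Return, per action, the attribute names that never vary across subjects.

    Hypothetical diagnostic: each observation is (action_name, action_attributes)
    as supplied for one subject.
    """
    seen: Dict[str, Dict[str, set]] = defaultdict(lambda: defaultdict(set))
    for action_name, attributes in observations:
        for name, value in attributes.items():
            seen[action_name][name].add(value)
    return {
        action_name: sorted(
            name for name, values in by_name.items() if len(values) == 1
        )
        for action_name, by_name in seen.items()
    }


# Made-up observations: four subjects, two per action.
observations = [
    ("nike", {"aspect_ratio": "16:9", "previously_purchased": True}),
    ("nike", {"aspect_ratio": "16:9", "previously_purchased": False}),
    ("adidas", {"aspect_ratio": "16:9", "previously_purchased": False}),
    ("adidas", {"aspect_ratio": "16:9", "previously_purchased": True}),
]
flagged = constant_action_attributes(observations)
```

In this made-up data only `aspect_ratio` is flagged for both actions, while `previously_purchased` varies by subject and is kept.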
docs/sdks/server-sdks/java.md

Lines changed: 7 additions & 8 deletions
@@ -186,8 +186,8 @@ The properties of the event object passed to the bandit logger, accessible via g
 | `subjectNumericAttributes` (Attributes) | Metadata about numeric attributes of the subject. Map of the name of attributes their provided values | {age=60} |
 | `subjectCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values | {country=FR} |
 | `action` (String) | The action assigned by the bandit | "promo-20%-off" |
-| `actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values | {discount=0.2} |
-| `actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values | {promoTextColor=white} |
+| `actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values | {brandAffinity=0.3} |
+| `actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values | {previouslyPurchased=false} |
 | `actionProbability` (Double) | The weight between 0 and 1 the bandit valued the assigned action | 0.25 |
 | `optimalityGap` (Double) | The difference between the score of the selected action and the highest-scored action | 456 |
 | `modelVersion` (String) | The key for the version (iteration) of the bandit parameters used to determine the action probability | "v123" |
@@ -233,14 +233,14 @@ Actions actions = new BanditActions(
   new Attributes(
     Map.of(
       "brandAffinity", EppoValue.valueOf(2.3),
-      "imageAspectRatio", EppoValue.valueOf("16:9")
+      "previouslyPurchased", EppoValue.valueOf(true)
     )
   ),
   "adidas",
   new Attributes(
     Map.of(
       "brandAffinity", EppoValue.valueOf(0.2),
-      "imageAspectRatio", EppoValue.valueOf("16:9")
+      "previouslyPurchased", EppoValue.valueOf(false)
     )
   )
 )
@@ -318,10 +318,9 @@ Similar to subject context, action contexts can be provided as `Attributes`--whi
 is a numeric attribute, and everything else is a categorical attribute--or as `ContextAttributes`, which have explicit
 bucketing into `numericAttributes` and `categoricalAttributes`.

-Note that action contexts can contain two kinds of information:
-- Action-specific context (e.g., the image aspect ratio of image corresponding to this action)
-- Subject-action interaction context (e.g., there could be a "brand-affinity" model that computes brand affinities of users
-to brands, and scores of that model can be added to the action context to provide additional context for the bandit)
+Note that relevant action contexts are subject-action interactions. For example, there could be a "brand-affinity" model
+that computes brand affinities of users to brands, and scores of that model can be added to the action context to provide
+additional context for the bandit.

 If there is no action context, you can use a `Set<String>` of all the action names when constructing `BanditActions` to
 pass in.

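The context lines above distinguish free-form `Attributes`, where something "is a numeric attribute, and everything else is a categorical attribute", from `ContextAttributes` with explicit buckets. The start of that sentence is cut off in the hunk header, so the Python sketch below assumes it refers to number-valued entries. The helper `bucket_attributes` is hypothetical, and treating booleans as categorical is an assumption that merely matches the `previouslyPurchased=false` example in the table.

```python
from numbers import Number
from typing import Dict, Tuple, Union

AttributeValue = Union[int, float, str, bool]


def bucket_attributes(
    attributes: Dict[str, AttributeValue],
) -> Tuple[Dict[str, float], Dict[str, AttributeValue]]:
    """Split free-form attributes into (numeric, categorical) buckets.

    Hypothetical illustration of the inference rule; not the SDK's code.
    """
    numeric: Dict[str, float] = {}
    categorical: Dict[str, AttributeValue] = {}
    for name, value in attributes.items():
        # bool is a subclass of int in Python, so it is excluded explicitly
        # (assumption: booleans are meant to be categorical).
        if isinstance(value, Number) and not isinstance(value, bool):
            numeric[name] = value
        else:
            categorical[name] = value
    return numeric, categorical


numeric, categorical = bucket_attributes(
    {"brandAffinity": 2.3, "previouslyPurchased": True, "loyaltyTier": "gold"}
)
```

Supplying the two buckets explicitly avoids relying on this kind of type inference, which is presumably why both forms are offered.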
docs/sdks/server-sdks/node.md

Lines changed: 18 additions & 19 deletions
@@ -484,21 +484,21 @@ await init({

 The SDK will invoke the `logBanditAction()` function with an `IBanditEvent` object that contains the following fields:

-| Field | Description | Example |
-|---------------------------------------------|-------------------------------------------------------------------------------------------------------------------|--------------------------------|
-| `timestamp` (string) | The time when the action is taken in UTC as an ISO string | "2024-03-22T14:26:55.000Z" |
-| `featureFlag` (string) | The key of the feature flag corresponding to the bandit | "bandit-test-allocation-4" |
-| `bandit` (string) | The key (unique identifier) of the bandit | "ad-bandit-1" |
-| `subject` (string) | An identifier of the subject or user assigned to the experiment variation | "ed6f85019080" |
-| `subjectNumericAttributes` (Attributes) | Metadata about numeric attributes of the subject. Map of the name of attributes their provided values | `{"age": 30}` |
-| `subjectCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values | `{"loyalty_tier": "gold"}` |
-| `action` (string) | The action assigned by the bandit | "promo-20%-off" |
-| `actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values | `{"discount": 0.2}` |
-| `actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values | `{"promoTextColor": "white"}` |
-| `actionProbability` (number) | The weight between 0 and 1 the bandit valued the assigned action | 0.25 |
-| `optimalityGap` (number) | The difference between the score of the selected action and the highest-scored action | 456 |
-| `modelVersion` (string) | Unique identifier for the version (iteration) of the bandit parameters used to determine the action probability | "v123" |
-| `metaData` Record<string, unknown> | Any additional freeform meta data, such as the version of the SDK | `{ "sdkLibVersion": "3.5.1" }` |
+| Field | Description | Example |
+|---------------------------------------------|-------------------------------------------------------------------------------------------------------------------|----------------------------------|
+| `timestamp` (string) | The time when the action is taken in UTC as an ISO string | "2024-03-22T14:26:55.000Z" |
+| `featureFlag` (string) | The key of the feature flag corresponding to the bandit | "bandit-test-allocation-4" |
+| `bandit` (string) | The key (unique identifier) of the bandit | "ad-bandit-1" |
+| `subject` (string) | An identifier of the subject or user assigned to the experiment variation | "ed6f85019080" |
+| `subjectNumericAttributes` (Attributes) | Metadata about numeric attributes of the subject. Map of the name of attributes their provided values | `{"age": 30}` |
+| `subjectCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values | `{"loyalty_tier": "gold"}` |
+| `action` (string) | The action assigned by the bandit | "promo-20%-off" |
+| `actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values | `{"brandAffinity": 0.2}` |
+| `actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values | `{"previouslyPurchased": false}` |
+| `actionProbability` (number) | The weight between 0 and 1 the bandit valued the assigned action | 0.25 |
+| `optimalityGap` (number) | The difference between the score of the selected action and the highest-scored action | 456 |
+| `modelVersion` (string) | Unique identifier for the version (iteration) of the bandit parameters used to determine the action probability | "v123" |
+| `metaData` Record<string, unknown> | Any additional freeform meta data, such as the version of the SDK | `{ "sdkLibVersion": "3.5.1" }` |

 ### Querying the bandit for an action

@@ -579,10 +579,9 @@ Similar to subject context, action contexts can be provided as `Attributes`--whi
 attribute, and everything else is a categorical attribute--or as `ContextAttributes`, which have explicit bucketing into `numericAttributes`
 and `categoricalAttributes`.

-Note that action contexts can contain two kinds of information:
-- Action-specific context (e.g., the image aspect ratio of image corresponding to this action)
-- Subject-action interaction context (e.g., there could be a "brand-affinity" model that computes brand affinities of users to brands,
-and scores of that model can be added to the action context to provide additional context for the bandit)
+Note that relevant action contexts are subject-action interactions. For example, there could be a "brand-affinity" model
+that computes brand affinities of users to brands, and scores of that model can be added to the action context to provide
+additional context for the bandit.

 If there is no action context, an array of strings comprising only the actions names can also be passed in.
