File: docs/contextual-bandits/actions_contexts.md (16 additions, 12 deletions)
@@ -11,7 +11,7 @@ At a high level, to select a particular action, the Contextual Bandit performs t
 
 There are two processes at play that make the Contextual Bandit work:
 
-1. **Real-time decisionmaking**: based on provided actions and contexts, select an action
+1. **Real-time decision-making**: based on provided actions and contexts, select an action
 2. **Model updating**: periodically, update the parameters of the Contextual Bandit based on outcome data.
 
 ## Bandit objective
@@ -32,7 +32,7 @@ Consider optimizing across multiple product promotions that we can show the user
 We can leverage the built-in [experiment analysis](/contextual-bandits/analysis) to verify that the selected objective is well-aligned with improving long term outcomes.
 :::
 
-## Real-time decisionmaking
+## Real-time decision-making
 
 In order to make real-time decisions, you provide Eppo with actions and contexts that are relevant to personalize the experience.
 The Eppo SDK then uses the underlying Contextual Bandit model to select an action, balancing exploration and exploitation; learning and optimizing at the same time.
@@ -61,31 +61,35 @@ Contextual Bandits automatically explore new actions as they are encountered and
 
 To create a bandit policy that personalizes, you need to provide context.
 
-Generally, there are three types of context:
-1. Subject context: For example, the subjects's past behavior, demographic information, are they on a mobile device, etc.
-2. Action context: For example, the size of the discount, the type of product, the price, etc.
-3. Subject-action interaction context: For example, the brand affinity of the subject to the product (based on past purchases, user reviews, etc.)
+Generally, there are two types of context:
+1. Subject context: For example, the subject's past behavior, demographic information, whether they are on a mobile device, etc.
+2. Action (Subject-action interaction) context: For example, the number of previous purchases of the action's brand or product category.
 
-Note that the first of these is independent of the actions, while the other two are action dependent.
-For convenience, you can supply the subject attributes once and they will be used across all actions, while
-separately, you can supply action specific attributes.
+Note that the first of these is independent of the actions, while the other is action dependent.
+The subject attributes are provided directly and not tied to a specific action, while separately, you can supply action-specific attributes for each action.
 Behind the scenes, we combine the two to create a single context per action that is used by the underlying model to select which action to pick.
 
+:::info Currently, we build bandit models on a per-action basis
+Be sure action attributes are subject-specific. Any action attributes that are not specific to the subject will end up
+being the same for all subjects and be ignored as they will not be predictive.
+:::
+
+
 #### Subject attributes
 
 Subject attributes capture information about the subject that is relevant to the actions.
 Generic attributes such as age (bucket), gender, and device information can be helpful, but the most salient attributes are product specific.
 
 #### Action attributes
 
-Action attributes capture information that is unique to a particular action. For example, the discount offered in a promotion, the price of a product, etc.
-Furthermore, you can also leverage action attributes to include information that is specific to the subject-action pair. For example, you can leverage an internal Machine Learning model that measures brand affinity.
+Action attributes capture information that is unique to a particular action for the subject. For example, the number of
+previous purchases the subject has made of the action's brand or product category.
 
 :::info What context attributes to use
 Selection of which attributes to include in the context is a bit of an art.
 
 In general, there are two types of attributes that will help the Contextual Bandit efficiently learn a strong policy:
-1. Attributes that provide a strong signal on personalization: For example, the brand affinity of the subject to the product, the price, etc.
+1. Attributes that provide a strong signal on personalization: For example, the brand affinity of the subject to the product.
 2. Attributes that predict the outcome of an action: For example, when optimizing for purchases, whether a user is a new or returning user might not affect which action is best, but it can help reduce variance (similar to CUPED), helping the contextual bandit learn more efficiently
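
The statement in the hunk above that subject and action attributes are combined into a single context per action can be pictured with a small sketch. This is not Eppo's implementation: the class `ContextSketch`, the method `combine`, and the `subject.` / `action.` key prefixes are all hypothetical names chosen for illustration, and the real model may encode the combined context quite differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of the Eppo SDK: illustrates one way to merge a
// single set of subject attributes with each action's own attributes.
class ContextSketch {

    // Returns one combined context map per action name.
    static Map<String, Map<String, Object>> combine(
            Map<String, Object> subjectAttributes,
            Map<String, Map<String, Object>> actionAttributes) {
        Map<String, Map<String, Object>> contexts = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> entry : actionAttributes.entrySet()) {
            Map<String, Object> context = new LinkedHashMap<>();
            // Prefix keys (an assumption of this sketch) so that a subject attribute
            // and an action attribute with the same name cannot overwrite each other.
            subjectAttributes.forEach((name, value) -> context.put("subject." + name, value));
            entry.getValue().forEach((name, value) -> context.put("action." + name, value));
            contexts.put(entry.getKey(), context);
        }
        return contexts;
    }
}
```

With subject attributes `{age=30}` and action attributes `brandAffinity` for `nike` and `adidas`, each action ends up with its own context holding the shared subject values plus that action's subject-specific values, which is why action attributes that never vary by subject add no signal.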
File: docs/sdks/server-sdks/java.md (7 additions, 8 deletions)
@@ -186,8 +186,8 @@ The properties of the event object passed to the bandit logger, accessible via g
 |`subjectNumericAttributes` (Attributes) | Metadata about numeric attributes of the subject. Map of the name of attributes their provided values | {age=60} |
 |`subjectCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values | {country=FR} |
 |`action` (String) | The action assigned by the bandit | "promo-20%-off" |
-|`actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values | {discount=0.2} |
-|`actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values | {promoTextColor=white} |
+|`actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values | {brandAffinity=0.3}|
+|`actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values | {previouslyPurchased=false}|
 |`actionProbability` (Double) | The weight between 0 and 1 the bandit valued the assigned action | 0.25 |
 |`optimalityGap` (Double) | The difference between the score of the selected action and the highest-scored action | 456 |
 |`modelVersion` (String) | The key for the version (iteration) of the bandit parameters used to determine the action probability | "v123" |
@@ -233,14 +233,14 @@ Actions actions = new BanditActions(
     new Attributes(
       Map.of(
         "brandAffinity", EppoValue.valueOf(2.3),
-        "imageAspectRatio", EppoValue.valueOf("16:9")
+        "previouslyPurchased", EppoValue.valueOf(true)
       )
     ),
     "adidas",
     new Attributes(
       Map.of(
         "brandAffinity", EppoValue.valueOf(0.2),
-        "imageAspectRatio", EppoValue.valueOf("16:9")
+        "previouslyPurchased", EppoValue.valueOf(false)
       )
     )
   )
@@ -318,10 +318,9 @@ Similar to subject context, action contexts can be provided as `Attributes`--whi
 is a numeric attribute, and everything else is a categorical attribute--or as `ContextAttributes`, which have explicit
 bucketing into `numericAttributes` and `categoricalAttributes`.
 
-Note that action contexts can contain two kinds of information:
-- Action-specific context (e.g., the image aspect ratio of image corresponding to this action)
-- Subject-action interaction context (e.g., there could be a "brand-affinity" model that computes brand affinities of users
-to brands, and scores of that model can be added to the action context to provide additional context for the bandit)
+Note that relevant action contexts are subject-action interactions. For example, there could be a "brand-affinity" model
+that computes brand affinities of users to brands, and scores of that model can be added to the action context to provide
+additional context for the bandit.
 
 If there is no action context, you can use a `Set<String>` of all the action names when constructing `BanditActions` to
-|`timestamp` (string) | The time when the action is taken in UTC as an ISO string | "2024-03-22T14:26:55.000Z" |
-|`featureFlag` (string) | The key of the feature flag corresponding to the bandit | "bandit-test-allocation-4" |
-|`bandit` (string) | The key (unique identifier) of the bandit | "ad-bandit-1" |
-|`subject` (string) | An identifier of the subject or user assigned to the experiment variation | "ed6f85019080" |
-|`subjectNumericAttributes` (Attributes) | Metadata about numeric attributes of the subject. Map of the name of attributes their provided values |`{"age": 30}`|
-|`subjectCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values |`{"loyalty_tier": "gold"}`|
-|`action` (string) | The action assigned by the bandit | "promo-20%-off" |
-|`actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values |`{"discount": 0.2}`|
-|`actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values |`{"promoTextColor": "white"}`|
-|`actionProbability` (number) | The weight between 0 and 1 the bandit valued the assigned action | 0.25 |
-|`optimalityGap` (number) | The difference between the score of the selected action and the highest-scored action | 456 |
-|`modelVersion` (string) | Unique identifier for the version (iteration) of the bandit parameters used to determine the action probability | "v123" |
-|`metaData` Record<string, unknown> | Any additional freeform meta data, such as the version of the SDK |`{ "sdkLibVersion": "3.5.1" }`|
+|`timestamp` (string) | The time when the action is taken in UTC as an ISO string | "2024-03-22T14:26:55.000Z" |
+|`featureFlag` (string) | The key of the feature flag corresponding to the bandit | "bandit-test-allocation-4" |
+|`bandit` (string) | The key (unique identifier) of the bandit | "ad-bandit-1" |
+|`subject` (string) | An identifier of the subject or user assigned to the experiment variation | "ed6f85019080" |
+|`subjectNumericAttributes` (Attributes) | Metadata about numeric attributes of the subject. Map of the name of attributes their provided values |`{"age": 30}`|
+|`subjectCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values |`{"loyalty_tier": "gold"}`|
+|`action` (string) | The action assigned by the bandit | "promo-20%-off" |
+|`actionNumericAttributes` (Attributes) | Metadata about numeric attributes of the assigned action. Map of the name of attributes their provided values |`{"brandAffinity": 0.2}`|
+|`actionCategoricalAttributes` (Attributes) | Metadata about non-numeric attributes of the assigned action. Map of the name of attributes their provided values |`{"previouslyPurchased": false}`|
+|`actionProbability` (number) | The weight between 0 and 1 the bandit valued the assigned action | 0.25 |
+|`optimalityGap` (number) | The difference between the score of the selected action and the highest-scored action | 456 |
+|`modelVersion` (string) | Unique identifier for the version (iteration) of the bandit parameters used to determine the action probability | "v123" |
+|`metaData` Record<string, unknown> | Any additional freeform meta data, such as the version of the SDK |`{ "sdkLibVersion": "3.5.1" }`|
 
 ### Querying the bandit for an action
 
@@ -579,10 +579,9 @@ Similar to subject context, action contexts can be provided as `Attributes`--whi
 attribute, and everything else is a categorical attribute--or as `ContextAttributes`, which have explicit bucketing into `numericAttributes`
 and `categoricalAttributes`.
 
-Note that action contexts can contain two kinds of information:
-- Action-specific context (e.g., the image aspect ratio of image corresponding to this action)
-- Subject-action interaction context (e.g., there could be a "brand-affinity" model that computes brand affinities of users to brands,
-and scores of that model can be added to the action context to provide additional context for the bandit)
+Note that relevant action contexts are subject-action interactions. For example, there could be a "brand-affinity" model
+that computes brand affinities of users to brands, and scores of that model can be added to the action context to provide
+additional context for the bandit.
 
 If there is no action context, an array of strings comprising only the actions names can also be passed in.
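
Both SDK hunks describe splitting a flat attribute map into `numericAttributes` and `categoricalAttributes`. The sketch below illustrates that bucketing in plain Java. It is an assumption-laden illustration, not SDK code: the class `AttributeBuckets` and the method `fromAttributes` are hypothetical, and the rule used here (any `Number` is numeric, every other non-null value is categorical and stored as its string form) is inferred from the truncated sentence in the hunk header and may not match the SDKs exactly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, NOT the Eppo SDK's Attributes/ContextAttributes classes.
class AttributeBuckets {
    final Map<String, Double> numericAttributes = new LinkedHashMap<>();
    final Map<String, String> categoricalAttributes = new LinkedHashMap<>();

    // Assumed rule: numbers are numeric attributes; everything else is categorical.
    static AttributeBuckets fromAttributes(Map<String, Object> attributes) {
        AttributeBuckets buckets = new AttributeBuckets();
        attributes.forEach((name, value) -> {
            if (value instanceof Number) {
                buckets.numericAttributes.put(name, ((Number) value).doubleValue());
            } else if (value != null) {
                // Booleans and strings alike are kept as categorical labels.
                buckets.categoricalAttributes.put(name, String.valueOf(value));
            }
        });
        return buckets;
    }
}
```

Under this rule, the example attributes used in the updated docs would split as `brandAffinity` into the numeric bucket and `previouslyPurchased` into the categorical bucket, which matches how the logged event tables above present them.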