articles/cognitive-services/personalizer/concept-active-inactive-events.md (15 additions, 2 deletions)
@@ -7,7 +7,17 @@ ms.date: 02/20/2020
# Active and inactive events

-When your application calls the Rank API, you receive the action the application should show in the **rewardActionId** field. From that moment, Personalizer expects a Reward call that has the same eventId. The reward score will be used to train the model for future Rank calls. If no Reward call is received for the eventId, a default reward is applied. Default rewards are set in the Azure portal.
+An **active** event is any call to Rank where you know you are going to show the result to the customer and determine the reward score.
+
+An **inactive** event is a call to Rank where you will not show the result to the customer or determine the reward score. Inactive events should not call the Reward API.
+
+It is important that the learning loop knows the actual type of event. An inactive event will not have a Reward call. An active event should have a Reward call, but if the API call is never made, the default reward score is applied. You don't want the default reward score to impact training if the customer never saw the content that Rank selected.
+
+## Typical active events scenario
+
+When your application calls the Rank API, you receive the action that the application should show in the **rewardActionId** field. From that moment, Personalizer expects a Reward call with a reward score that has the same eventId. The reward score is used to train the model for future Rank calls. If no Reward call is received for the eventId, a default reward is applied. [Default rewards](how-to-settings.md#configure-rewards-for-the-feedback-loop-based-on-use-case) are set on your Personalizer resource in the Azure portal.
+
+## Other event type scenarios

In some scenarios, the application might need to call Rank before it even knows if the result will be used or displayed to the user. This might happen in situations where, for example, the page rendering of promoted content is overwritten by a marketing campaign. If the result of the Rank call was never used and the user never saw it, don't send a corresponding Reward call.
@@ -17,11 +27,14 @@ Typically, these scenarios happen when:
* Your application is doing predictive personalization in which Rank calls are made with little real-time context and the application might or might not use the output.

In these cases, use Personalizer to call Rank, requesting the event to be _inactive_. Personalizer won't expect a reward for this event, and it won't apply a default reward.
+
Later in your business logic, if the application uses the information from the Rank call, just _activate_ the event. As soon as the event is active, Personalizer expects an event reward. If no explicit call is made to the Reward API, Personalizer applies a default reward.

## Inactive events

-To disable training for an event, call Rank by using `learningEnabled = False`. For an inactive event, learning is implicitly activated if you send a reward for the eventId or call the `activate` API for that eventId.
+To disable training for an event, call Rank by using `learningEnabled = False`.
+
+For an inactive event, learning is implicitly activated if you send a reward for the eventId or call the `activate` API for that eventId.
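As a minimal sketch of the inactive-event flow described above, the following Python snippet calls the REST endpoints directly. It is an illustration, not content from the changed articles: the endpoint, key, action IDs, and features are placeholders, and the `deferActivation` field used to request an inactive event is an assumption to verify against the current Rank API reference.

```python
# Sketch: create an inactive event, then activate and reward it only if the content is shown.
import uuid
import requests

ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"  # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>", "Content-Type": "application/json"}

event_id = str(uuid.uuid4())

# 1. Call Rank, requesting an inactive event: no reward is expected yet,
#    and no default reward is applied if the result is never shown.
rank_request = {
    "eventId": event_id,
    "deferActivation": True,  # assumption: check the Rank API reference for the exact field
    "contextFeatures": [{"timeOfDay": "morning", "device": "mobile"}],  # example features
    "actions": [
        {"id": "article-a", "features": [{"topic": "sports"}]},
        {"id": "article-b", "features": [{"topic": "politics"}]},
    ],
}
rank_response = requests.post(f"{ENDPOINT}/personalizer/v1.0/rank",
                              headers=HEADERS, json=rank_request).json()
best_action = rank_response["rewardActionId"]

user_saw_content = True  # your business logic decides this later

if user_saw_content:
    # 2. Activate the event: Personalizer now expects a reward (or applies the default).
    requests.post(f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/activate", headers=HEADERS)

    # 3. Send the reward score determined by your business logic.
    requests.post(f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
                  headers=HEADERS, json={"value": 1.0})
# If the content was never shown, do nothing: the inactive event does not affect training.
```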
articles/cognitive-services/personalizer/concept-active-learning.md (3 additions, 1 deletion)
@@ -9,11 +9,13 @@ ms.date: 02/20/2020
Learning settings determine the *hyperparameters* of the model training. Two models of the same data that are trained on different learning settings will end up different.

+[Learning policy and settings](how-to-settings.md#configure-rewards-for-the-feedback-loop-based-on-use-case) are set on your Personalizer resource in the Azure portal.
+

### Import and export learning policies

You can import and export learning-policy files from the Azure portal. Use this method to save existing policies, test them, replace them, and archive them in your source code control as artifacts for future reference and audit.

-Learn [how to](how-to-manage-model.md) import and export a learning policy.
+Learn [how to](how-to-manage-model.md#import-a-new-learning-policy) import and export a learning policy in the Azure portal for your Personalizer resource.
articles/cognitive-services/personalizer/concept-rewards.md (3 additions, 1 deletion)
@@ -11,9 +11,11 @@ The reward score indicates how well the personalization choice, [RewardActionID]
Personalizer trains its machine learning models by evaluating the rewards.

+Learn [how to](how-to-settings.md#configure-rewards-for-the-feedback-loop-based-on-use-case) configure the default reward score in the Azure portal for your Personalizer resource.
+

## Use Reward API to send reward score to Personalizer

-Rewards are sent to Personalizer by the [Reward API](https://docs.microsoft.com/rest/api/cognitiveservices/personalizer/events/reward). Typically, a reward is a number from 0 and 1. A negative reward, with the value of -1, is possible in certain scenarios and should only be used if you are experienced with reinforcement learning (RL). Personalizer trains the model to achieve the highest possible sum of rewards over time.
+Rewards are sent to Personalizer by the [Reward API](https://docs.microsoft.com/rest/api/cognitiveservices/personalizer/events/reward). Typically, a reward is a number from 0 to 1. A negative reward, with the value of -1, is possible in certain scenarios and should only be used if you are experienced with reinforcement learning (RL). Personalizer trains the model to achieve the highest possible sum of rewards over time.

Rewards are sent after the user behavior has happened, which could be days later. The maximum amount of time Personalizer will wait until an event is considered to have no reward or a default reward is configured with the [Reward Wait Time](#reward-wait-time) in the Azure portal.
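A short sketch of the pattern described above, assuming hypothetical business logic that maps clicks and purchases to a score from 0 to 1 before sending it with the Reward API. The mapping values, event ID, endpoint, and key are placeholders.

```python
# Sketch: turn a user outcome into a reward score and send it within the Reward Wait Time.
import requests

ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"  # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>", "Content-Type": "application/json"}

def reward_from_outcome(clicked: bool, purchased: bool) -> float:
    """Map user behavior to a 0-to-1 reward; the mapping itself is a business decision."""
    if purchased:
        return 1.0   # strongest signal that the personalization choice worked
    if clicked:
        return 0.5   # partial success
    return 0.0       # no engagement

def send_reward(event_id: str, score: float) -> None:
    # Must arrive before the configured Reward Wait Time expires, or the default reward is used.
    requests.post(f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
                  headers=HEADERS, json={"value": score})

send_reward("<event-id-from-rank-call>", reward_from_outcome(clicked=True, purchased=False))
```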
articles/cognitive-services/personalizer/how-personalizer-works.md (2 additions, 2 deletions)
@@ -7,14 +7,14 @@ ms.date: 02/18/2020
# How Personalizer works

-The Personalizer _learning loop_ uses machine learning to build the model that predicts the top action for your content. The model is trained exclusively on your data that you sent to it with the **Rank** and **Reward** calls. Every loop is completely independent of each other.
+The Personalizer resource, your _learning loop_, uses machine learning to build the model that predicts the top action for your content. The model is trained exclusively on the data that you send to it with the **Rank** and **Reward** calls. Every loop is completely independent of the others.

## Rank and Reward APIs impact the model

You send _actions with features_ and _context features_ to the Rank API. The **Rank** API decides to use either:

* _Exploit_: The current model to decide the best action based on past data.
-* _Explore_: Select a different action instead of the top action. This percentage is configured for your Personalizer resource in the Azure portal.
+* _Explore_: Select a different action instead of the top action. You [configure this percentage](how-to-settings.md#configure-exploration-to-allow-the-learning-loop-to-adapt) for your Personalizer resource in the Azure portal.

You determine the reward score and send that score to the Reward API. The **Reward** API:
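To make the exploration percentage concrete, here is a toy illustration only. It is not Personalizer's internal algorithm (the service decides when to explore based on your configuration and the features you send); it just shows what a hypothetical 20% exploration setting means for the actions users are served.

```python
# Toy illustration of exploit vs. explore with a configured exploration percentage.
import random

EXPLORATION_PERCENTAGE = 0.20  # hypothetical value set in the Azure portal

def choose_action(model_best_action: str, all_actions: list) -> str:
    if random.random() < EXPLORATION_PERCENTAGE:
        # Explore: serve a different action than the current model's top choice,
        # so the model keeps gathering data about alternatives.
        alternatives = [a for a in all_actions if a != model_best_action]
        return random.choice(alternatives)
    # Exploit: serve the action the current model predicts is best.
    return model_best_action

served = [choose_action("article-a", ["article-a", "article-b", "article-c"]) for _ in range(1000)]
explored_pct = sum(a != "article-a" for a in served) / 10
print(f"explored on ~{explored_pct:.0f}% of calls")
```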
articles/cognitive-services/personalizer/how-to-create-resource.md (4 additions, 1 deletion)
@@ -9,6 +9,9 @@ ms.date: 02/19/2020
A Personalizer resource is the same thing as a Personalizer learning loop. A single resource, or learning loop, is created for each subject domain or content area you have. Do not use multiple content areas in the same loop because this will confuse the learning loop and provide poor predictions.

+If you want Personalizer to select the best content for more than one content area of a web page, use a different learning loop for each.
+

## Create a resource in the Azure portal

Create a Personalizer resource for each feedback loop.
@@ -21,7 +24,7 @@ Create a Personalizer resource for each feedback loop.
1. Select **Create** to create the resource.

-1.While still in the Azure portal, go to the **Configuration** page for the new resource to [configure the learning loop](how-to-settings.md).
+1. After your resource is deployed, select the **Go to Resource** button to go to your Personalizer resource. Then go to the **Configuration** page for the new resource to [configure the learning loop](how-to-settings.md).
articles/cognitive-services/personalizer/how-to-offline-evaluation.md (1 addition, 1 deletion)
@@ -27,7 +27,7 @@ Read about [Offline Evaluations](concepts-offline-evaluation.md) to learn more.
## Run an offline evaluation

-1. In the [Azure portal](https://azure.microsoft.com/free/), locate your Personalization resource.
+1. In the [Azure portal](https://azure.microsoft.com/free/), locate your Personalizer resource.

1. In the Azure portal, go to the **Evaluations** section and select **Create Evaluation**.

[Image: In the Azure portal, go to the **Evaluations** section and select **Create Evaluation**.]
articles/cognitive-services/personalizer/how-to-settings.md (3 additions, 1 deletion)
@@ -9,10 +9,12 @@ ms.date: 02/19/2020
Service configuration includes how the service treats rewards, how often the service explores, how often the model is retrained, and how much data is stored.

+Configure the learning loop on the **Configuration** page in the Azure portal for that Personalizer resource.
+
-## Configure rewards for the feedback loop based on use case
+## Configure rewards for the feedback loop

Configure the service for your learning loop's use of rewards. Changes to the following values will reset the current Personalizer model and retrain it with the last 2 days of data.
articles/cognitive-services/personalizer/terminology.md (1 addition, 1 deletion)
@@ -55,7 +55,7 @@ Personalizer is configured from the [Azure portal](https://portal.azure.com).
* **Inactive Events**: An inactive event is one where you called Rank, but you're not sure the user will ever see the result, due to client application decisions. Inactive events allow you to create and store personalization results, then decide to discard them later without impacting the machine learning model.

-* **Reward**: A measure of how the user responded to the Rank API's returned reward action ID, as a score between 0 and 1. The 0 to 1 value is set by your business logic, based on how the choice helped achieve your business goals of personalization. The learning loop doesn't store this reward as individual user history.
+* **Reward**: A measure of how the user responded to the Rank API's returned reward action ID, as a score from 0 to 1. The 0 to 1 value is set by your business logic, based on how the choice helped achieve your business goals of personalization. The learning loop doesn't store this reward as individual user history.