
Commit 8e0a0fd

edits from Mary's feedback
1 parent ac7a96a commit 8e0a0fd

File tree

8 files changed: 32 additions, 10 deletions


articles/cognitive-services/personalizer/concept-active-inactive-events.md

Lines changed: 15 additions & 2 deletions
@@ -7,7 +7,17 @@ ms.date: 02/20/2020
# Active and inactive events

- When your application calls the Rank API, you receive the action the application should show in the **rewardActionId** field. From that moment, Personalizer expects a Reward call that has the same eventId. The reward score will be used to train the model for future Rank calls. If no Reward call is received for the eventId, a default reward is applied. Default rewards are set in the Azure portal.

+ An **active** event is any call to Rank where you know you are going to show the result to the customer and determine the reward score.

+ An **inactive** event is a call to Rank where you will not show the result to the customer or determine the reward score. Don't call the Reward API for inactive events.

+ It is important that the learning loop knows the actual type of each event. An inactive event will not have a Reward call. An active event should have a Reward call, but if the API call is never made, the default reward score is applied. You don't want that default reward score to impact training if the customer never saw the content that Rank selected.

+ ## Typical active events scenario

+ When your application calls the Rank API, you receive the action that the application should show in the **rewardActionId** field. From that moment, Personalizer expects a Reward call with a reward score for the same eventId. The reward score is used to train the model for future Rank calls. If no Reward call is received for the eventId, a default reward is applied. [Default rewards](how-to-settings.md#configure-rewards-for-the-feedback-loop-based-on-use-case) are set on your Personalizer resource in the Azure portal.

+ ## Other event type scenarios

In some scenarios, the application might need to call Rank before it even knows if the result will be used or displayed to the user. This might happen in situations where, for example, the page rendering of promoted content is overwritten by a marketing campaign. If the result of the Rank call was never used and the user never saw it, don't send a corresponding Reward call.
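To make the typical active-event flow concrete, here is a minimal sketch against the Personalizer REST endpoints using Python's `requests` library. The endpoint, key, action IDs, features, and reward value are illustrative placeholders, and the routes should be verified against your API version.

```python
import uuid
import requests

# Placeholders -- substitute your own Personalizer endpoint and key.
ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>", "Content-Type": "application/json"}

event_id = str(uuid.uuid4())

# 1. Call Rank with actions (each with their own features) and context features.
rank_request = {
    "eventId": event_id,
    "contextFeatures": [{"timeOfDay": "morning"}, {"device": "mobile"}],
    "actions": [
        {"id": "article-a", "features": [{"topic": "sports"}]},
        {"id": "article-b", "features": [{"topic": "finance"}]},
    ],
}
rank_response = requests.post(
    f"{ENDPOINT}/personalizer/v1.0/rank", headers=HEADERS, json=rank_request
).json()

# 2. Show the content identified by rewardActionId to the user.
print("Show action:", rank_response["rewardActionId"])

# 3. After observing the user's behavior, send a reward score for the same
#    eventId. If this call is never made, the default reward configured in
#    the Azure portal is applied instead.
requests.post(
    f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
    headers=HEADERS,
    json={"value": 1.0},  # example: the user engaged with the content
)
```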

@@ -17,11 +27,14 @@ Typically, these scenarios happen when:
* Your application is doing predictive personalization in which Rank calls are made with little real-time context and the application might or might not use the output.

In these cases, use Personalizer to call Rank, requesting the event to be _inactive_. Personalizer won't expect a reward for this event, and it won't apply a default reward.

Later in your business logic, if the application uses the information from the Rank call, just _activate_ the event. As soon as the event is active, Personalizer expects an event reward. If no explicit call is made to the Reward API, Personalizer applies a default reward.

## Inactive events

- To disable training for an event, call Rank by using `learningEnabled = False`. For an inactive event, learning is implicitly activated if you send a reward for the eventId or call the `activate` API for that eventId.
+ To disable training for an event, call Rank by using `learningEnabled = False`.

+ For an inactive event, learning is implicitly activated if you send a reward for the eventId or call the `activate` API for that eventId.
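A minimal sketch of the inactive-event flow follows. The Rank request assumes a `deferActivation` flag as the REST equivalent of `learningEnabled = False`; treat that flag name, the routes, and all values as assumptions to verify against your API version.

```python
import uuid
import requests

ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"  # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>", "Content-Type": "application/json"}

event_id = str(uuid.uuid4())

# Ask for a ranking but keep the event inactive for now, so no default
# reward is applied if the result is never shown to the user.
rank_request = {
    "eventId": event_id,
    "deferActivation": True,  # assumption: REST equivalent of learningEnabled = False
    "contextFeatures": [{"device": "desktop"}],
    "actions": [
        {"id": "promo-1", "features": [{"category": "seasonal"}]},
        {"id": "promo-2", "features": [{"category": "clearance"}]},
    ],
}
requests.post(f"{ENDPOINT}/personalizer/v1.0/rank", headers=HEADERS, json=rank_request)

# Later, if the application actually uses the Rank result, activate the
# event; from that point on, Personalizer expects a reward for this eventId.
requests.post(f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/activate", headers=HEADERS)

# Finally, report the observed reward score (or let the default apply).
requests.post(
    f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
    headers=HEADERS,
    json={"value": 0.5},
)
```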

## Next steps

articles/cognitive-services/personalizer/concept-active-learning.md

Lines changed: 3 additions & 1 deletion
@@ -9,11 +9,13 @@ ms.date: 02/20/2020
Learning settings determine the *hyperparameters* of the model training. Two models trained on the same data but with different learning settings will end up different.

+ [Learning policy and settings](how-to-settings.md#configure-rewards-for-the-feedback-loop-based-on-use-case) are set on your Personalizer resource in the Azure portal.

### Import and export learning policies

You can import and export learning-policy files from the Azure portal. Use this method to save existing policies, test them, replace them, and archive them in your source code control as artifacts for future reference and audit.

- Learn [how to](how-to-manage-model.md) import and export a learning policy.
+ Learn [how to](how-to-manage-model.md#import-a-new-learning-policy) import and export a learning policy in the Azure portal for your Personalizer resource.

### Understand learning policy settings

articles/cognitive-services/personalizer/concept-rewards.md

Lines changed: 3 additions & 1 deletion
@@ -11,9 +11,11 @@ The reward score indicates how well the personalization choice, [RewardActionID]
Personalizer trains its machine learning models by evaluating the rewards.

+ Learn [how to](how-to-settings.md#configure-rewards-for-the-feedback-loop-based-on-use-case) configure the default reward score in the Azure portal for your Personalizer resource.

## Use Reward API to send reward score to Personalizer

- Rewards are sent to Personalizer by the [Reward API](https://docs.microsoft.com/rest/api/cognitiveservices/personalizer/events/reward). Typically, a reward is a number from 0 and 1. A negative reward, with the value of -1, is possible in certain scenarios and should only be used if you are experienced with reinforcement learning (RL). Personalizer trains the model to achieve the highest possible sum of rewards over time.
+ Rewards are sent to Personalizer by the [Reward API](https://docs.microsoft.com/rest/api/cognitiveservices/personalizer/events/reward). Typically, a reward is a number from 0 to 1. A negative reward, with the value of -1, is possible in certain scenarios and should only be used if you are experienced with reinforcement learning (RL). Personalizer trains the model to achieve the highest possible sum of rewards over time.

Rewards are sent after the user behavior has happened, which could be days later. The maximum amount of time Personalizer waits until an event is considered to have no reward (or a default reward) is configured with the [Reward Wait Time](#reward-wait-time) in the Azure portal.
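As an illustration, business logic might map observed behavior to a score from 0 to 1 and report it through the Reward API. The endpoint, key, and scoring scheme below are placeholders, not values from this article.

```python
import requests

ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"  # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>", "Content-Type": "application/json"}

def send_reward(event_id: str, clicked: bool, purchased: bool) -> None:
    """Map user behavior to a score from 0 to 1 and report it for the eventId."""
    if purchased:
        score = 1.0   # strongest positive outcome in this example scheme
    elif clicked:
        score = 0.5   # partial success
    else:
        score = 0.0   # no engagement
    requests.post(
        f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
        headers=HEADERS,
        json={"value": score},
    )
```

Whatever scheme you choose, keep it consistent so that a higher score always means a better outcome for your personalization goal.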

articles/cognitive-services/personalizer/how-personalizer-works.md

Lines changed: 2 additions & 2 deletions
@@ -7,14 +7,14 @@ ms.date: 02/18/2020
# How Personalizer works

- The Personalizer _learning loop_ uses machine learning to build the model that predicts the top action for your content. The model is trained exclusively on your data that you sent to it with the **Rank** and **Reward** calls. Every loop is completely independent of each other.
+ The Personalizer resource, your _learning loop_, uses machine learning to build the model that predicts the top action for your content. The model is trained exclusively on the data you send to it with the **Rank** and **Reward** calls. Every loop is completely independent of the others.

## Rank and Reward APIs impact the model

You send _actions with features_ and _context features_ to the Rank API. The **Rank** API decides to use either:

* _Exploit_: The current model to decide the best action based on past data.
- * _Explore_: Select a different action instead of the top action. This percentage is configured for your Personalizer resource in the Azure portal.
+ * _Explore_: Select a different action instead of the top action. You [configure this percentage](how-to-settings.md#configure-exploration-to-allow-the-learning-loop-to-adapt) for your Personalizer resource in the Azure portal.

You determine the reward score and send that score to the Reward API. The **Reward** API:

articles/cognitive-services/personalizer/how-to-create-resource.md

Lines changed: 4 additions & 1 deletion
@@ -9,6 +9,9 @@ ms.date: 02/19/2020
A Personalizer resource is the same thing as a Personalizer learning loop. A single resource, or learning loop, is created for each subject domain or content area you have. Do not use multiple content areas in the same loop because this will confuse the learning loop and provide poor predictions.

+ If you want Personalizer to select the best content for more than one content area of a web page, use a different learning loop for each.

## Create a resource in the Azure portal

Create a Personalizer resource for each feedback loop.
@@ -21,7 +24,7 @@ Create a Personalizer resource for each feedback loop.
1. Select **Create** to create the resource.

- 1. While still in the Azure portal, go to the **Configuration** page for the new resource to [configure the learning loop](how-to-settings.md).
+ 1. Once your resource has deployed, select the **Go to Resource** button to go to your Personalizer resource. Go to the **Configuration** page for the new resource to [configure the learning loop](how-to-settings.md).

## Create a resource with the Azure CLI
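For reference, a sketch of creating a Personalizer resource with the Azure CLI, assuming the Cognitive Services account command; the resource name, resource group, SKU, and region are placeholders.

```azurecli
az cognitiveservices account create \
    --name my-personalizer-loop \
    --resource-group my-resource-group \
    --kind Personalizer \
    --sku F0 \
    --location westus2 \
    --yes
```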

articles/cognitive-services/personalizer/how-to-offline-evaluation.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ Read about [Offline Evaluations](concepts-offline-evaluation.md) to learn more.
## Run an offline evaluation

- 1. In the [Azure portal](https://azure.microsoft.com/free/), locate your Personalization resource.
+ 1. In the [Azure portal](https://azure.microsoft.com/free/), locate your Personalizer resource.
1. In the Azure portal, go to the **Evaluations** section and select **Create Evaluation**.
   ![In the Azure portal, go to the **Evaluations** section and select **Create Evaluation**.](./media/offline-evaluation/create-new-offline-evaluation.png)
1. Configure the following values:

articles/cognitive-services/personalizer/how-to-settings.md

Lines changed: 3 additions & 1 deletion
@@ -9,10 +9,12 @@ ms.date: 02/19/2020
Service configuration includes how the service treats rewards, how often the service explores, how often the model is retrained, and how much data is stored.

+ Configure the learning loop on the **Configuration** page in the Azure portal for that Personalizer resource.

<a name="configure-service-settings-in-the-azure-portal"></a>
<a name="configure-reward-settings-for-the-feedback-loop-based-on-use-case"></a>

- ## Configure rewards for the feedback loop based on use case
+ ## Configure rewards for the feedback loop

Configure the service for your learning loop's use of rewards. Changes to the following values will reset the current Personalizer model and retrain it with the last 2 days of data.

articles/cognitive-services/personalizer/terminology.md

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ Personalizer is configured from the [Azure portal](https://portal.azure.com).
* **Inactive Events**: An inactive event is one where you called Rank, but you're not sure the user will ever see the result, due to client application decisions. Inactive events allow you to create and store personalization results, then decide to discard them later without impacting the machine learning model.

- * **Reward**: A measure of how the user responded to the Rank API's returned reward action ID, as a score between 0 and 1. The 0 to 1 value is set by your business logic, based on how the choice helped achieve your business goals of personalization. The learning loop doesn't store this reward as individual user history.
+ * **Reward**: A measure of how the user responded to the Rank API's returned reward action ID, as a score from 0 to 1. The 0 to 1 value is set by your business logic, based on how the choice helped achieve your business goals of personalization. The learning loop doesn't store this reward as individual user history.

## Offline evaluations

0 commit comments
