
Commit 02a0689

Merge pull request #78906 from diberry/0605-personalizer
[Cogsvcs] Personalizer - moving docs over
2 parents 9042d1c + d58297a commit 02a0689

4 files changed: +178 −3 lines changed

articles/cognitive-services/personalizer/concept-rewards.md

Lines changed: 14 additions & 2 deletions
@@ -1,14 +1,14 @@
---
title: Reward score - Personalizer
titleSuffix: Azure Cognitive Services
-description: The reward score indicates how well the personalization choice, RewardActionID, resulted for the user. The value of the reward score is determined by your business logic, based on observations of user behavior. Personalizer trains it's machine learning models by evaluating the rewards.
+description: The reward score indicates how well the personalization choice, RewardActionID, resulted for the user. The value of the reward score is determined by your business logic, based on observations of user behavior. Personalizer trains its machine learning models by evaluating the rewards.
services: cognitive-services
author: edjez
manager: nitinme
ms.service: cognitive-services
ms.subservice: personalizer
ms.topic: overview
-ms.date: 05/13/2019
+ms.date: 06/07/2019
ms.author: edjez
---

@@ -26,6 +26,18 @@ Rewards are sent after the user behavior has happened, which could be days later

If the reward score for an event hasn't been received within the **Reward Wait Time**, then the **Default Reward** will be applied. Typically, the **[Default Reward](how-to-settings.md#configure-reward-settings-for-the-feedback-loop-based-on-use-case)** is configured to be zero.

## Behaviors and data to consider for rewards

Consider these signals and behaviors when determining the reward score (a sketch of combining them follows the list):

* Direct user input for suggestions when options are involved ("Do you mean X?").
* Session length.
* Time between sessions.
* Sentiment analysis of the user's interactions.
* Direct questions and mini surveys where the bot asks the user for feedback about usefulness or accuracy.
* Response to alerts, or delay in responding to alerts.
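As a rough illustration, here is a minimal sketch (Python; the signal names, weights, and caps are assumptions, not anything defined by Personalizer) of how a few of these behaviors might be folded into one score in the range Personalizer expects:

```python
# Hypothetical example only: how observed behaviors could be combined into a
# single reward score. The weights and signals below are placeholders; your
# business logic defines the real mapping.
def compute_reward(accepted_suggestion: bool,
                   session_seconds: float,
                   sentiment: float) -> float:
    """Return a reward score between -1 and 1."""
    score = 0.0
    score += 0.6 if accepted_suggestion else -0.2        # answer to "Do you mean X?"
    score += 0.3 * min(session_seconds / 600.0, 1.0)     # session length, capped at 10 minutes
    score += 0.1 * sentiment                             # sentiment already scaled to [-1, 1]
    return max(-1.0, min(1.0, score))                    # clamp to the range Personalizer accepts

print(compute_reward(accepted_suggestion=True, session_seconds=300, sentiment=0.4))  # 0.79
```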

## Composing reward scores

A reward score must be computed in your business logic. The score can be represented as:

articles/cognitive-services/personalizer/concepts-scalability-performance.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
---
title: Reinforcement Learning - Personalizer
titleSuffix: Azure Cognitive Services
description: "High-performance and high-traffic websites and applications have two main factors to consider with Personalizer for scalability and performance: latency and training throughput."
services: cognitive-services
author: edjez
manager: nitinme
ms.service: cognitive-services
ms.subservice: personalizer
ms.topic: overview
ms.date: 06/07/2019
ms.author: edjez
---
# Scalability and Performance

High-performance and high-traffic websites and applications have two main factors to consider with Personalizer for scalability and performance:

* Keeping latency low when making Rank API calls
* Making sure training throughput keeps up with event input

Personalizer can return a rank very rapidly, with most of the call duration dedicated to communication through the REST API. Azure automatically scales the service's ability to respond to requests rapidly.

## Low-latency scenarios

Some applications require low latencies when returning a rank. Low latencies are necessary:

* To keep the user from waiting a noticeable amount of time before displaying ranked content.
* To help a server that is experiencing extreme traffic avoid tying up scarce compute time and network connections.

<!--
If your web site is scaled on your own infrastructure, you can avoid making HTTP calls by hosting the Personalizer API in your own servers running a Docker container.

This change would be transparent to your application, other than using an endpoint URL that refers to the running Docker instances instead of an online service in the cloud.

### Extreme Low Latency Scenarios

If you require latencies under a millisecond, and have already tested using Personalizer via containers, please contact our support team so we can assess your scenario and provide guidance suited to your needs.
-->

## Scalability and training throughput

Personalizer works by updating a model that is retrained based on messages sent asynchronously by Personalizer after the Rank and Reward APIs are called. These messages are sent using an Azure Event Hub for the application.

Most applications are unlikely to reach the maximum joining and training throughput of Personalizer. While reaching this maximum won't slow down the application, it would imply that Event Hub queues are filling internally faster than they can be cleaned up.

## How to estimate your throughput requirements

* Estimate the average number of bytes per ranking event by adding the lengths of the context and action JSON documents.
* Divide 20 MB/sec by this estimated average number of bytes.

For example, if your average payload has 500 features and each is an estimated 20 characters, then each event is approximately 10 KB. With these estimates, 20,000,000 / 10,000 = 2,000 events/sec, which is about 173 million events/day.
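The same arithmetic as a small sketch (the 20 MB/sec figure is from this article; the payload numbers are the example estimates above):

```python
# Back-of-the-envelope estimate of Personalizer event throughput, using the
# 20 MB/sec figure above and the example payload size.
MAX_BYTES_PER_SECOND = 20_000_000

features_per_event = 500          # example estimate
chars_per_feature = 20            # example estimate of serialized size per feature
bytes_per_event = features_per_event * chars_per_feature      # ~10 KB

events_per_second = MAX_BYTES_PER_SECOND / bytes_per_event    # ~2,000
events_per_day = events_per_second * 60 * 60 * 24             # ~172.8 million

print(f"{events_per_second:,.0f} events/sec, about {events_per_day:,.0f} events/day")
```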

If you are reaching these limits, please contact our support team for architecture advice.

## Next steps

[Create and configure Personalizer](how-to-settings.md).

articles/cognitive-services/personalizer/how-personalizer-works.md

Lines changed: 101 additions & 1 deletion
@@ -7,7 +7,7 @@ manager: nitinme
ms.service: cognitive-services
ms.subservice: personalizer
ms.topic: overview
-ms.date: 05/07/2019
+ms.date: 06/07/2019
ms.author: edjez
---

@@ -75,6 +75,106 @@ Personalizer is based on cutting-edge science and research in the area of [Reinf

* **Model**: A Personalizer model captures all data learned about user behavior, getting training data from the combination of the arguments you send to Rank and Reward calls, and with a training behavior determined by the Learning Policy.

## Example use cases for Personalizer

* Intent clarification & disambiguation: help your users have a better experience when their intent is not clear, by providing an option that is personalized to each user.
* Default suggestions for menus & options: have the bot suggest the most likely item in a personalized way as a first step, instead of presenting an impersonal menu or list of alternatives.
* Bot traits & tone: for bots that can vary tone, verbosity, and writing style, consider varying these traits in personalized ways.
* Notification & alert content: decide what text to use for alerts in order to engage users more.
* Notification & alert timing: learn when to send notifications to users, in a personalized way, to engage them more.

## Checklist for Applying Personalizer

You can apply Personalizer in situations where:

* You have a business or usability goal for your application.
* You have a place in your application where making a contextual decision of what to show to users will improve that goal.
* The best choice can and should be learned from collective user behavior and total reward score.
* The use of machine learning for personalization follows [responsible use guidelines](ethics-responsible-use.md) and the choices made by your team.
* The decision can be expressed as ranking the best option ([action](concepts-features.md#actions-represent-a-list-of-options)) from a limited set of choices.
* How well that choice worked can be computed by your business logic, by measuring some aspect of user behavior and expressing it in a number between -1 and 1.
* The reward score doesn't bring in too many confounding or external factors; specifically, the experiment duration is short enough that the reward score can be computed while it's still relevant.
* You can express the context for the rank as a dictionary of at least 5 features that you think would help make the right choice, and that doesn't include personally identifiable information.
* You have information about each action as a dictionary of at least 5 attributes or features that you think will help Personalizer make the right choice.
* You can retain data for long enough to accumulate a history of at least 100,000 interactions.

## Machine learning considerations for applying Personalizer

Personalizer is based on reinforcement learning, an approach to machine learning that is taught by the feedback you give it.

Personalizer will learn best in situations where:

* There are enough events to stay on top of optimal personalization if the problem drifts over time (such as preferences in news or fashion). Personalizer will adapt to continuous change in the real world, but results won't be optimal if there aren't enough events and data to learn from to discover and settle on new patterns. You should choose a use case that happens often enough. Consider looking for use cases that happen at least 500 times per day.
* Context and actions have enough features to facilitate learning.
* There are fewer than 50 actions to rank per call.
* Your data retention settings allow Personalizer to collect enough data to perform offline evaluations and policy optimization. This is typically at least 50,000 data points.

## How to use Personalizer in a web application

Adding a loop to a web application includes the following steps (see the sketch after this list):

1. Determine which experience to personalize, what actions and features you have, what context features to use, and what reward you'll set.
1. Add a reference to the Personalizer SDK in your application.
1. Call the Rank API when you are ready to personalize.
1. Store the eventId. You send a reward with the Reward API later.
1. Call Activate for the event once you're sure the user has seen your personalized page.
1. Wait for the user to select the ranked content.
1. Call the Reward API to specify how well the output of the Rank API did.
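As a rough sketch only (not the official sample), the following shows what the Rank and Reward calls in those steps could look like against the v1.0 REST endpoints using Python and `requests`; the endpoint, key, feature values, and reward value are placeholders, so verify the exact request shapes against the current API reference or SDK:

```python
import uuid
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"   # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>"}               # placeholder

# Call the Rank API with context features and candidate actions, and keep the eventId.
event_id = str(uuid.uuid4())
rank_request = {
    "eventId": event_id,
    "contextFeatures": [{"timeOfDay": "morning", "device": "mobile"}],   # example features
    "actions": [
        {"id": "article-sports", "features": [{"topic": "sports", "length": "short"}]},
        {"id": "article-finance", "features": [{"topic": "finance", "length": "long"}]},
    ],
}
rank_response = requests.post(
    f"{ENDPOINT}/personalizer/v1.0/rank", headers=HEADERS, json=rank_request
).json()

best_action = rank_response["rewardActionId"]   # render this action to the user

# If you ranked with "deferActivation": true, also call the activate endpoint once the
# user has actually seen the personalized page, so the event counts for training.

# After observing how well the choice worked, send the reward for the same eventId.
requests.post(
    f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
    headers=HEADERS,
    json={"value": 1.0},   # your business logic computes this value
)
```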
## How to use Personalizer with a chat bot

In this example, you will see how to use Personalizer to make a default suggestion instead of sending the user down a series of menus or choices every time. (A sketch of the Rank call follows this list.)

* Get the [code](https://github.com/Azure-Samples/cognitive-services-personalizer-samples/tree/master/samples/ChatbotExample) for this sample.
* Set up your bot solution. Make sure to publish your LUIS application.
* Manage Rank and Reward API calls for the bot.
* Add code to manage LUIS intent processing. If **None** is returned as the top intent, or the top intent's score is below your business logic threshold, send the intents list to Personalizer to rank the intents.
* Show the intent list to the user as selectable links, with the first intent being the top-ranked intent from the Rank API response.
* Capture the user's selection and send it in the Reward API call.
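A compact sketch of the ranking step in that flow (intent names, the confidence threshold, endpoint, and key are placeholders; the LUIS result shape is simplified):

```python
import uuid
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"   # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>"}               # placeholder

def order_intents(luis_intents, score_threshold=0.5):
    """Return intents in display order, using Personalizer when LUIS is not confident."""
    top = max(luis_intents, key=lambda i: i["score"])
    if top["intent"] != "None" and top["score"] >= score_threshold:
        return [top["intent"]]                      # confident match: no ranking needed

    actions = [{"id": i["intent"], "features": [{"luisScore": i["score"]}]}
               for i in luis_intents if i["intent"] != "None"]
    event_id = str(uuid.uuid4())
    response = requests.post(
        f"{ENDPOINT}/personalizer/v1.0/rank",
        headers=HEADERS,
        json={"eventId": event_id,
              "contextFeatures": [{"channel": "webchat"}],   # example context
              "actions": actions},
    ).json()

    chosen = response["rewardActionId"]             # show this intent first
    rest = [a["id"] for a in response["ranking"] if a["id"] != chosen]
    # After the user picks a link, send a reward for event_id: for example, 1 if
    # they picked the top-ranked intent, a smaller value otherwise.
    return [chosen] + rest
```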
### Recommended bot patterns

* Make Personalizer Rank API calls every time a disambiguation is needed, as opposed to caching results for each user. The result of disambiguating intent may change over time for one person, and allowing the Rank API to explore variances will accelerate overall learning.
* Choose an interaction that is common with many users so that you have enough data to personalize. For example, introductory questions may be better fits than smaller clarifications deep in the conversation graph that only a few users may reach.
* Use Rank API calls to enable "first suggestion is right" conversations, where the user is asked "Would you like X?" or "Did you mean X?" and can simply confirm, as opposed to presenting a menu the user must choose from. For example: User: "I'd like to order a coffee." Bot: "Would you like a double espresso?" This way the reward signal is also strong, as it pertains directly to the one suggestion.

## How to use Personalizer with a recommendation solution

Use your recommendation engine to filter a large catalog down to a few items, which can then be presented as 30 possible actions sent to the Rank API.

You can use recommendation engines with Personalizer as follows (a sketch follows this list):

* Set up the [recommendation solution](https://github.com/Microsoft/Recommenders/).
* When displaying a page, invoke the recommendation model to get a short list of recommendations.
* Call Personalizer to rank the output of the recommendation solution.
* Send feedback about your user's action with the Reward API call.
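As a sketch under the same placeholder endpoint and key as the earlier examples, with a hypothetical `recommender.get_top_items` call standing in for whatever your recommendation solution exposes:

```python
import uuid
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"   # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>"}               # placeholder

def personalize_recommendations(recommender, user_context):
    # The recommendation engine narrows the catalog to a short candidate list.
    candidates = recommender.get_top_items(user_context, count=30)   # hypothetical API

    # Personalizer ranks those candidates for this specific context.
    event_id = str(uuid.uuid4())
    actions = [{"id": item["id"], "features": [item["features"]]} for item in candidates]
    response = requests.post(
        f"{ENDPOINT}/personalizer/v1.0/rank",
        headers=HEADERS,
        json={"eventId": event_id, "contextFeatures": [user_context], "actions": actions},
    ).json()

    # Display response["rewardActionId"] most prominently; send the reward for
    # event_id after observing what the user did.
    return event_id, response["rewardActionId"]
```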
## Pitfalls to avoid

* Don't use Personalizer where the personalized behavior isn't something that can be discovered across all users, but rather something that should be remembered for specific users or comes from a user-specific list of alternatives. For example, using Personalizer to suggest a first pizza order from a list of 20 possible menu items is useful, but which contact to call from the user's contact list when they need help with childcare (such as "Grandma") is not something that is personalizable across your user base.

## Adding content safeguards to your application

If your application allows for large variances in the content shown to users, and some of that content may be unsafe or inappropriate for some users, plan ahead to make sure the right safeguards are in place to prevent your users from seeing unacceptable content. The best pattern to implement safeguards is:

* Obtain the list of actions to rank.
* Filter out the ones that are not viable for the audience.
* Only rank these viable actions.
* Display the top-ranked action to the user.

In some architectures, the above sequence may be hard to implement. In that case, there is an alternative approach: implement safeguards after ranking, but make a provision so that actions that fall outside the safeguard are not used to train the Personalizer model (a sketch follows this list).

* Obtain the list of actions to rank, with learning deactivated.
* Rank the actions.
* Check whether the top action is viable.
* If the top action is viable, activate learning for this rank, then show it to the user.
* If the top action is not viable, do not activate learning for this ranking, and decide through your own logic or alternative approaches what to show to the user. Even if you use the second-best ranked option, do not activate learning for this ranking.
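One way this could look, assuming the v1.0 Rank request accepts a `deferActivation` flag and a separate activate call exists for the event (check these names against the current API reference; `is_viable`, the endpoint, the key, and the fallback logic are your own placeholders):

```python
import uuid
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"   # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>"}               # placeholder

def rank_with_safeguard(context_features, actions, is_viable):
    """Rank without training, and only activate the event if the top action is safe to show."""
    event_id = str(uuid.uuid4())
    response = requests.post(
        f"{ENDPOINT}/personalizer/v1.0/rank",
        headers=HEADERS,
        json={
            "eventId": event_id,
            "contextFeatures": context_features,
            "actions": actions,
            "deferActivation": True,   # learning stays off unless we activate the event
        },
    ).json()
    top_action = response["rewardActionId"]

    if is_viable(top_action):
        # Safe to display: activate the event so it is used for training,
        # then send the reward later as usual.
        requests.post(f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/activate",
                      headers=HEADERS)
        return top_action

    # Not viable: never activate this event, so it won't train the model.
    # Decide what to show through your own logic instead.
    return None
```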
## Verifying adequate effectiveness of Personalizer

You can monitor the effectiveness of Personalizer periodically by performing [offline evaluations](how-to-offline-evaluation.md).

## Next steps

Understand [where you can use Personalizer](where-can-you-use-personalizer.md).

articles/cognitive-services/personalizer/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -43,6 +43,8 @@
  - name: Offline evaluation
    href: concepts-offline-evaluation.md
    displayName: learning policy, Counterfactual, features
  - name: Scalability and performance
    href: concepts-scalability-performance.md
- name: How-to guides
  items:
  - name: Create and configure Personalizer
