Commit 30065c1

Merge pull request #217765 from rmca14/11-8edits
[CogSvcs] Additional edits from PR 215826
2 parents 4850cf8 + f2c4ba2 commit 30065c1

1 file changed

+21
-40
lines changed

articles/cognitive-services/personalizer/concepts-features.md

Lines changed: 21 additions & 40 deletions
@@ -11,20 +11,10 @@ ms.topic: conceptual
1111
ms.date: 10/25/2022
1212
---
1313

14-
# Context and Actions
14+
# Context and actions
1515

1616
Personalizer works by learning what your application should show to users in a given context. Context and actions are the two most important pieces of information that you pass into Personalizer. The **context** represents the information you have about the current user or the state of your system, and the **actions** are the options to be chosen from.
1717

18-
## Table of Contents
19-
20-
* [Context](#context) Information about the current user or state of the system
21-
* [Actions](#actions) A list of options to choose from
22-
* [Features](#features) Attributes describing the Context and Actions
23-
* [Feature Engineering](#feature-engineering) Tips for constructing impactful features
24-
* [Namespaces](#namespaces) Grouping Features
25-
* [Examples](#json-examples) Examples of Context and Action features in JSON format
26-
27-
2818
## Context
2919

3020
Information for the _context_ depends on each application and use case, but it typically includes information such as:
@@ -34,18 +24,16 @@ Information for the _context_ depends on each application and use case, but it t
3424
* Information about the current time, such as day of the week, weekend or not, morning or afternoon, holiday season or not, etc.
3525
* Information extracted from mobile applications, such as location, movement, or battery level.
3626
* Historical aggregates of user behavior, such as the movie genres this user has viewed the most.
37-
* Information about the state of the system.
27+
* Information about the state of the system.
3828

3929
Your application is responsible for loading the information about the context from the relevant databases, sensors, and systems you may have. If your context information doesn't change, you can add logic in your application to cache this information, before sending it to the Rank API.
4030

41-
4231
## Actions
4332

4433
Actions represent a list of options.
4534

4635
Don't send more than 50 actions when ranking actions. These may be the same 50 actions every time, or they may change. For example, if you have a product catalog of 10,000 items for an e-commerce application, you may use a recommendation or filtering engine to determine the top 40 a customer may like, and use Personalizer to find the one that will generate the most reward (for example, the user adds the item to the basket) for the current context.
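
The pre-filtering step described above can be sketched as follows. This is a minimal illustration, not part of the Personalizer SDK: the `select_candidates` helper and the `(item_id, score)` input shape are assumptions, and the scores stand in for whatever your own recommendation or filtering engine produces.

```python
MAX_ACTIONS = 50  # upper bound on actions per Rank call, as stated in the text


def select_candidates(scored_items, limit=40):
    """Return the IDs of the `limit` highest-scoring items.

    `scored_items` is a hypothetical list of (item_id, score) pairs
    produced by an upstream recommender or filter.
    """
    if limit > MAX_ACTIONS:
        raise ValueError(f"limit must not exceed {MAX_ACTIONS} actions")
    # Sort by score, highest first, and keep only the top `limit` entries.
    ranked = sorted(scored_items, key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _score in ranked[:limit]]


# Example: a catalog of 10,000 scored items reduced to 40 candidates.
catalog = [(f"item{i}", float(i)) for i in range(10_000)]
candidates = select_candidates(catalog)
```

The resulting candidate IDs would then be turned into actions, together with their features, before calling the Rank API.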
4736

48-
4937
### Examples of actions
5038

5139
The actions you send to the Rank API will depend on what you are trying to personalize.
@@ -61,18 +49,15 @@ Here are some examples:
6149
|Choose a chat bot's response to clarify user intent or suggest an action.|Each action is an option of how to interpret the response.|
6250
|Choose what to show at the top of a list of search results|Each action is one of the top few search results.|
6351

64-
6552
### Load actions from the client application
6653

6754
Action features typically come from content management systems, catalogs, and recommender systems. Your application is responsible for loading the information about the actions from the relevant databases and systems you have. If your actions don't change, or loading them every time has an unnecessary impact on performance, you can add logic in your application to cache this information.
6855

69-
7056
### Prevent actions from being ranked
7157

7258
In some cases, there are actions that you don't want to display to users. The best way to prevent an action from being ranked is by adding it to the [Excluded Actions](https://learn.microsoft.com/dotnet/api/microsoft.azure.cognitiveservices.personalizer.models.rankrequest.excludedactions) list, or not passing it to the Rank Request.
7359

74-
In some cases, you might not want events to be trained on by default, i.e., you only want to train events when a specific condition is met. For example, The personalized part of your webpage is below the fold (users have to scroll before interacting with the personalized content). In this case you will render the entire page, but only want an event to be trained on when the user scrolls and has a chance to interact with the personalized content. For these cases, you should [Defer Event Activation](concept-active-inactive-events.md) to avoid assigning default reward (and training) events which the end user did not have a chance to interact with.
75-
60+
In some cases, you might not want events to be trained on by default. In other words, you only want to train on events when a specific condition is met. For example, the personalized part of your webpage is below the fold (users have to scroll before interacting with the personalized content). In this case, you render the entire page, but only want an event to be trained on when the user scrolls and has a chance to interact with the personalized content. For these cases, you should [Defer Event Activation](concept-active-inactive-events.md) to avoid assigning a default reward to (and training on) events that the end user did not have a chance to interact with.
7661

7762
## Features
7863

@@ -91,14 +76,14 @@ Personalizer does not prescribe, limit, or fix what features you can send for ac
9176

9277
It's OK and natural for features to change over time. However, keep in mind that Personalizer's machine learning model adapts based on the features it sees. If you send a request containing all new features, Personalizer's model will not be able to use past events to select the best action for the current event. Having a 'stable' feature set (with recurring features) will help the performance of Personalizer's machine learning algorithms.
9378

94-
### Context Features
79+
### Context features
9580
* Some context features may only be available part of the time. For example, if a user is logged into the online grocery store website, the context will contain features describing purchase history. These features will not be available for a guest user.
9681
* There must be at least one context feature. Personalizer does not support an empty context.
9782
* If the context features are identical for every request, Personalizer will choose the globally best action.
9883

99-
### Action Features
84+
### Action features
10085
* Not all actions need to contain the same features. For example, in the online grocery store scenario, microwavable popcorn will have a "cooking time" feature, while a cucumber will not.
101-
* Features for a certain action ID may be available one day, but later on become unavailable.
86+
* Features for a certain action ID may be available one day, but later on become unavailable.
10287

10388
Examples:
10489

@@ -112,17 +97,16 @@ The following are good examples for action features. These will depend a lot on
11297

11398
Personalizer supports features of string, numeric, and boolean types. It's very likely that your application will mostly use string features, with a few exceptions.
11499

115-
### How feature types affects the Machine Learning in Personalizer
100+
### How feature types affect machine learning in Personalizer
116101

117-
* **Strings**: For string types, every key-value (feature name, feature value) combination is treated as a One-Hot feature (e.g. category:"Produce" and category:"Meat" would internally be represented as different features in the machine learning model.
102+
* **Strings**: For string types, every key-value (feature name, feature value) combination is treated as a One-Hot feature (for example, category:"Produce" and category:"Meat" would internally be represented as different features in the machine learning model).
118103
* **Numeric**: Only use numeric values when the number is a magnitude that should proportionally affect the personalization result. This is very scenario dependent. Features that are based on numeric units but where the meaning isn't linear - such as Age, Temperature, or Person Height - are best encoded as categorical strings. For example, Age could be encoded as "Age":"0-5", "Age":"6-10", etc. Height could be bucketed as "Height": "<5'0", "Height": "5'0-5'4", "Height": "5'5-5'11", "Height":"6'0-6'4", "Height":">6'4".
119104
* **Boolean**
120-
* **Arrays** ONLY numeric arrays are supported.
121-
105+
* **Arrays** Only numeric arrays are supported.
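
As a rough sketch of the bucketing suggested for numeric features, the following helpers convert an age and a height into the categorical strings used in the examples above. The function names are hypothetical, the five-year age ranges beyond "6-10" are an assumed continuation of the pattern, and the height is assumed to be given in whole inches.

```python
import bisect

# Inclusive upper bounds (in inches) for the first four height buckets.
_HEIGHT_UPPER_BOUNDS = [59, 64, 71, 76]
_HEIGHT_LABELS = ["<5'0", "5'0-5'4", "5'5-5'11", "6'0-6'4", ">6'4"]


def bucket_age(age_years: int) -> str:
    """Map an age to a range label such as "0-5" or "6-10"."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years <= 5:
        return "0-5"
    # Later buckets are five years wide: 6-10, 11-15, 16-20, ...
    lower = ((age_years - 1) // 5) * 5 + 1
    return f"{lower}-{lower + 4}"


def bucket_height(height_inches: int) -> str:
    """Map a height in whole inches to one of the labels listed above."""
    index = bisect.bisect_left(_HEIGHT_UPPER_BOUNDS, height_inches)
    return _HEIGHT_LABELS[index]


features = {"Age": bucket_age(8), "Height": bucket_height(66)}
```

Sending the bucket label as a string lets each range be treated as its own categorical value rather than as a magnitude.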
122106

123-
## Feature Engineering
107+
## Feature engineering
124108

125-
* Use categorical and string types for features that are not a magnitude.
109+
* Use categorical and string types for features that are not a magnitude.
126110
* Make sure there are enough features to drive personalization. The more precisely targeted the content needs to be, the more features are needed.
127111
* There are features of diverse *densities*. A feature is *dense* if many items are grouped in a few buckets. For example, thousands of videos can be classified as "Long" (over 5 min long) and "Short" (under 5 min long). This is a *very dense* feature. On the other hand, the same thousands of items can have an attribute called "Title", which will almost never have the same value from one item to another. This is a very non-dense or *sparse* feature.
128112

@@ -134,21 +118,21 @@ Having features of high density helps Personalizer extrapolate learning from one
134118
* **Sending user IDs** With large numbers of users, it's unlikely that this information is relevant to Personalizer learning to maximize the average reward score. Sending user IDs (even if non-PII) will likely add more noise to the model and is not recommended.
135119
* **Sending unique values that will rarely occur more than a few times**. It's recommended to bucket your features to a higher level-of-detail. For example, having features such as `"Context.TimeStamp.Day":"Monday"` or `"Context.TimeStamp.Hour":13` can be useful as there are only 7 and 24 unique values, respectively. However, `"Context.TimeStamp":"1985-04-12T23:20:50.52Z"` is very precise and has an extremely large number of unique values, which makes it very difficult for Personalizer to learn from it.
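
The timestamp bucketing described above might look like the following sketch. The `timestamp_features` helper is hypothetical; the feature names are taken from the example in the text, and the input is assumed to be a UTC timestamp with fractional seconds, in the same format as the example value.

```python
from datetime import datetime

# weekday() returns 0 for Monday through 6 for Sunday.
_DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
              "Friday", "Saturday", "Sunday"]


def timestamp_features(timestamp: str) -> dict:
    """Reduce a precise timestamp to two low-cardinality features."""
    # Assumes the "YYYY-MM-DDTHH:MM:SS.ffZ" shape; other shapes need another format.
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%f%z")
    return {
        "Context.TimeStamp.Day": _DAY_NAMES[parsed.weekday()],  # 7 possible values
        "Context.TimeStamp.Hour": parsed.hour,                  # 24 possible values
    }


bucketed = timestamp_features("1985-04-12T23:20:50.52Z")
# bucketed == {"Context.TimeStamp.Day": "Friday", "Context.TimeStamp.Hour": 23}
```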
136120

137-
### Improve feature sets
121+
### Improve feature sets
138122

139123
Analyze the user behavior by running a [Feature Evaluation Job](how-to-feature-evaluation.md). This allows you to look at past data to see what features are heavily contributing to positive rewards versus those that are contributing less. You can see what features are helping, and it will be up to you and your application to find better features to send to Personalizer to improve results even further.
140124

141125
### Expand feature sets with artificial intelligence and cognitive services
142126

143-
Artificial Intelligence and ready-to-run Cognitive Services can be a very powerful addition to Personalizer.
127+
Artificial Intelligence and ready-to-run Cognitive Services can be a very powerful addition to Personalizer.
144128

145129
By preprocessing your items using artificial intelligence services, you can automatically extract information that is likely to be relevant for personalization.
146130

147131
For example:
148132

149-
* You can run a movie file via [Video Indexer](https://azure.microsoft.com/services/media-services/video-indexer/) to extract scene elements, text, sentiment, and many other attributes. These attributes can then be made more dense to reflect characteristics that the original item metadata didn't have.
133+
* You can run a movie file via [Video Indexer](https://azure.microsoft.com/services/media-services/video-indexer/) to extract scene elements, text, sentiment, and many other attributes. These attributes can then be made more dense to reflect characteristics that the original item metadata didn't have.
150134
* Images can be run through object detection, faces through sentiment, etc.
151-
* Information in text can be augmented by extracting entities, sentiment, expanding entities with Bing knowledge graph, etc.
135+
* Information in text can be augmented by extracting entities, sentiment, and expanding entities with Bing knowledge graph.
152136

153137
You can use several other [Azure Cognitive Services](https://www.microsoft.com/cognitive-services), like
154138

@@ -157,17 +141,16 @@ You can use several other [Azure Cognitive Services](https://www.microsoft.com/c
157141
* [Emotion](../face/overview.md)
158142
* [Computer Vision](../computer-vision/overview.md)
159143

160-
### Use Embeddings as Features
144+
### Use embeddings as features
161145

162146
Embeddings from various Machine Learning models have proven to be effective features for Personalizer:
163147

164148
* Embeddings from Large Language Models
165149
* Embeddings from Computer Vision Models
166150

167-
168151
## Namespaces
169152

170-
Optionally, features can be organized using namespaces (relevant for both context and action features). Namespaces can be used to group features by topic, by source, or any other grouping that makes sense in your application. You determine if namespaces are used and what they should be. Namespaces organize features into distinct sets, and disambiguate features with similar names. You can think of namespaces as a 'prefix' that is added to feature names. Namespaces should not be nested.
153+
Optionally, features can be organized using namespaces (relevant for both context and action features). Namespaces can be used to group features by topic, by source, or any other grouping that makes sense in your application. You determine if namespaces are used and what they should be. Namespaces organize features into distinct sets, and disambiguate features with similar names. You can think of namespaces as a 'prefix' that is added to feature names. Namespaces should not be nested.
171154

172155
The following are examples of feature namespaces used by applications:
173156

@@ -191,13 +174,12 @@ The following are examples of feature namespaces used by applications:
191174
* The following characters cannot be used: codes < 32 (not printable), 32 (space), 58 (colon), 124 (pipe), and 126–140.
192175
* All namespaces starting with an underscore `_` will be ignored.
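
The naming rules above can be checked before sending a request. The following is a minimal sketch, not an official validator: `check_namespaces` is a hypothetical helper, it covers only the character and underscore rules listed here plus the first-character collision mentioned in the note in the JSON examples, and it assumes first characters are compared case-sensitively.

```python
def _is_disallowed(ch: str) -> bool:
    """True for codes < 32, 32 (space), 58 (colon), 124 (pipe), and 126-140."""
    code = ord(ch)
    return code <= 32 or code in (58, 124) or 126 <= code <= 140


def check_namespaces(names):
    """Return a list of human-readable problems found in `names`."""
    problems = []
    first_chars = {}  # first character -> namespace that claimed it
    for name in names:
        if any(_is_disallowed(ch) for ch in name):
            problems.append(f"{name!r} contains a disallowed character")
        if name.startswith("_"):
            problems.append(f"{name!r} starts with '_' and will be ignored")
        first = name[:1]
        if first in first_chars:
            problems.append(
                f"{name!r} shares its first character with {first_chars[first]!r}"
            )
        else:
            first_chars[first] = name
    return problems


issues = check_namespaces(["user", "environment", "device", "activity"])
```

An empty result means none of the checked rules were violated; it does not guarantee that the service accepts the names.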
193176

194-
195-
## JSON Examples
177+
## JSON examples
196178

197179
### Actions
198180
When calling Rank, you will send multiple actions to choose from:
199181

200-
JSON objects can include nested JSON objects and simple property/values. An array can be included only if the array items are numbers.
182+
JSON objects can include nested JSON objects and simple property/values. An array can be included only if the array items are numbers.
201183

202184
```json
203185
{
@@ -266,7 +248,7 @@ JSON objects can include nested JSON objects and simple property/values. An arra
266248

267249
Context is expressed as a JSON object that is sent to the Rank API:
268250

269-
JSON objects can include nested JSON objects and simple property/values. An array can be included only if the array items are numbers.
251+
JSON objects can include nested JSON objects and simple property/values. An array can be included only if the array items are numbers.
270252

271253
```JSON
272254
{
@@ -290,12 +272,11 @@ JSON objects can include nested JSON objects and simple property/values. An arra
290272

291273
### Namespaces
292274

293-
In the following JSON, `user`, `environment`, `device`, and `activity` are namespaces.
275+
In the following JSON, `user`, `environment`, `device`, and `activity` are namespaces.
294276

295277
> [!Note]
296278
> We strongly recommend using names for feature namespaces that are UTF-8 based and start with different letters. For example, `user`, `environment`, `device`, and `activity` start with `u`, `e`, `d`, and `a`. Currently, having namespaces with the same first character could result in collisions.
297279
298-
299280
```JSON
300281
{
301282
"contextFeatures": [
