articles/machine-learning/algorithm-module-reference/module-reference.md
Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ For help with choosing algorithms, see
| R language | Write code and embed it in a module to integrate R with your pipeline. |[Execute R Script](execute-r-script.md)|
| Text Analytics | Provide specialized computational tools for working with both structured and unstructured text. |[Extract N Gram Features from Text](extract-n-gram-features-from-text.md) <br/> [Feature Hashing](feature-hashing.md) <br/> [Preprocess Text](preprocess-text.md)|
articles/machine-learning/algorithm-module-reference/pca-based-anomaly-detection.md
Lines changed: 8 additions & 8 deletions
@@ -18,7 +18,7 @@ This article describes how to use the **PCA-Based Anomaly Detection** module in
This module helps you build a model in scenarios where it is easy to obtain training data from one class, such as valid transactions, but difficult to obtain sufficient samples of the targeted anomalies.
-For example, to detect fraudulent transactions, very often you don't have enough examples of fraud to train on, but have many examples of good transactions. The **PCA-Based Anomaly Detection** module solves the problem by analyzing available features to determine what constitutes a "normal" class, and applying distance metrics to identify cases that represent anomalies. This let you train a model using existing imbalanced data.
+For example, to detect fraudulent transactions, very often you don't have enough examples of fraud to train on, but have many examples of good transactions. The **PCA-Based Anomaly Detection** module solves the problem by analyzing available features to determine what constitutes a "normal" class, and applying distance metrics to identify cases that represent anomalies. This lets you train a model using existing imbalanced data.
## More about Principal Component Analysis
@@ -28,17 +28,17 @@ PCA works by analyzing data that contains multiple variables. It looks for corre
For anomaly detection, each new input is analyzed, and the anomaly detection algorithm computes its projection on the eigenvectors, together with a normalized reconstruction error. The normalized error is used as the anomaly score. The higher the error, the more anomalous the instance is.
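The scoring idea in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the module's implementation: the function names (`mean_center`, `top_components`, `anomaly_score`) are hypothetical, plain power iteration stands in for the randomized PCA described in the cited papers, and dividing the reconstruction error by the squared norm of the centered input is only one possible normalization.

```python
import math
import random


def mean_center(rows):
    """Return the rows with per-feature means subtracted, plus the means."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return centered, means


def top_components(centered, k, iters=200, seed=0):
    """Approximate the top-k covariance eigenvectors.

    Uses power iteration with deflation (an assumption made for brevity;
    the module itself is described as using randomized PCA).
    """
    n, d = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Remove the directions that were already extracted.
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm < 1e-12:
                break
            v = [a / norm for a in w]
        components.append(v)
    return components


def anomaly_score(x, means, components):
    """Normalized reconstruction error of x in the learned subspace.

    Returns a value in [0, 1]: the share of the centered input's squared
    norm that the projection onto the components fails to reconstruct.
    """
    xc = [a - m for a, m in zip(x, means)]
    coeffs = [sum(a * b for a, b in zip(xc, c)) for c in components]
    recon = [sum(coef * c[i] for coef, c in zip(coeffs, components))
             for i in range(len(xc))]
    err = sum((a - b) ** 2 for a, b in zip(xc, recon))
    total = sum(a * a for a in xc)
    return err / total if total > 0 else 0.0
```

Trained only on "normal" rows that lie close to one direction, a new point along that direction gets a score near 0, while a point far off that direction gets a noticeably higher score.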
-For additional information about how PCA works, and about the implementation for anomaly detection, see these papers:
+For more information about how PCA works, and about the implementation for anomaly detection, see these papers:
-- [A randomized algorithm for principal component analysis](https://arxiv.org/abs/0809.2274). Rokhlin, Szlan and Tygert
+- [A randomized algorithm for principal component analysis](https://arxiv.org/abs/0809.2274). Rokhlin, Szlan, and Tygert
-- [Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions](http://users.cms.caltech.edu/~jtropp/papers/HMT11-Finding-Structure-SIREV.pdf) (PDF download). Halko, Martinsson and Tropp.
+- [Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions](http://users.cms.caltech.edu/~jtropp/papers/HMT11-Finding-Structure-SIREV.pdf) (PDF download). Halko, Martinsson, and Tropp.
## How to configure PCA Anomaly Detection
1. Add the **PCA-Based Anomaly Detection** module to your pipeline in the designer. You can find this module in the **Anomaly Detection** category.
-2. In the **Properties** pane for the **PCA-Based Anomaly Detection** module, click the **Training mode** option, and indicate whether you want to train the model using a specific set of parameters, or use a parameter sweep to find the best parameters.
+2. In the right panel of the **PCA-Based Anomaly Detection** module, click the **Training mode** option, and indicate whether you want to train the model using a specific set of parameters, or use a parameter sweep to find the best parameters.
- **Single Parameter**: Select this option if you know how you want to configure the model, and provide a specific set of values as arguments.
@@ -57,7 +57,7 @@ For additional information about how PCA works, and about the implementation for
- **Oversampling parameter for randomized PCA**: Type a single whole number that represents the ratio of oversampling of the minority class over the normal class. (Available when using the **Single parameter** training method.)
> [!NOTE]
-> You cannot view the oversampled data set. For additional details of how oversampling is used with PCA, see [Technical notes](#technical-notes).
+> You cannot view the oversampled data set. For more information of how oversampling is used with PCA, see [Technical notes](#technical-notes).
5. **Enable input feature mean normalization**: Select this option to normalize all input features to a mean of zero. Normalization or scaling to zero is generally recommended for PCA, because the goal of PCA is to maximize variance among variables.
@@ -73,7 +73,7 @@ For additional information about how PCA works, and about the implementation for
When training is complete, you can either save the trained model, or connect it to the [Score Model](score-model.md) module to predict anomaly scores.
-To evaluate the results of an anomaly detection models requires some additional steps:
+Evaluating the results of an anomaly detection model requires some additional steps:
1. Ensure that a score column is available in both datasets
@@ -103,4 +103,4 @@ This algorithm uses PCA to approximate the subspace containing the normal class.
See the [set of modules available](module-reference.md) to Azure Machine Learning.
-See [Exceptions and error codes for the designer (preview)](designer-error-codes.md) for a list of errors specific to the designer modules.
+See [Exceptions and error codes for the designer (preview)](designer-error-codes.md) for a list of errors specific to the designer modules.''