articles/hdinsight/r-server/r-server-overview.md
9 additions & 9 deletions
@@ -7,7 +7,7 @@ ms.reviewer: jasonh
7
7
ms.service: hdinsight
8
8
ms.topic: overview
9
9
ms.custom: hdinsightactive
10
-
ms.date: 04/03/2020
10
+
ms.date: 04/20/2020
11
11
#Customer intent: As a developer I want to have a basic understanding of Microsoft's implementation of machine learning in Azure HDInsight so I can decide if I want to use it rather than build my own cluster.
12
12
---
13
13
@@ -19,7 +19,7 @@ ML Services on HDInsight provides the latest capabilities for R-based analytics
19
19
20
20
The edge node provides a convenient place to connect to the cluster and run your R scripts. On the edge node, you can run the ScaleR parallelized distributed functions across the cores of the server. You can also run them across the nodes of the cluster by using ScaleR's Hadoop MapReduce or Apache Spark compute contexts.
21
21
22
-
The models or predictions that result from analysis can be downloaded for on-premises use. They can also be operationalized elsewhere in Azure, in particular through [Azure Machine Learning Studio (classic)](https://studio.azureml.net) and a [web service](../../machine-learning/studio/deploy-a-machine-learning-web-service.md).
22
+
The models or predictions that result from analysis can be downloaded for on-premises use. They can also be `operationalized` elsewhere in Azure, in particular through [Azure Machine Learning Studio (classic)](https://studio.azureml.net) and a [web service](../../machine-learning/studio/deploy-a-machine-learning-web-service.md).
23
23
24
24
## Get started with ML Services on HDInsight
25
25
@@ -59,16 +59,16 @@ The following features are included in ML Services on HDInsight.
59
59
| R-enabled |[R packages](https://docs.microsoft.com/machine-learning-server/r-reference/introducing-r-server-r-package-reference) for solutions written in R, with an open-source distribution of R, and run-time infrastructure for script execution. |
60
60
| Python-enabled | [Python modules](https://docs.microsoft.com/machine-learning-server/python-reference/introducing-python-package-reference) for solutions written in Python, with an open-source distribution of Python, and run-time infrastructure for script execution. |
61
61
|[Pre-trained models](https://docs.microsoft.com/machine-learning-server/install/microsoftml-install-pretrained-models)| For visual analysis and text sentiment analysis, ready to score data you provide. |
62
-
|[Deploy and consume](r-server-operationalize.md)| Operationalize your server and deploy solutions as a web service. |
62
+
|[Deploy and consume](r-server-operationalize.md)|`Operationalize` your server and deploy solutions as a web service. |
63
63
|[Remote execution](r-server-hdinsight-manage.md#connect-remotely-to-microsoft-ml-services)| Start remote sessions on ML Services cluster on your network from your client workstation. |
64
64
65
65
## Data storage options for ML Services on HDInsight
66
66
67
-
Default storage for the HDFS file system can be an Azure Storage account or Azure Data Lake Storage. Data uploaded to cluster storage during analysis is persisted and remains available even after the cluster is deleted. Various tools can handle the data transfer to storage, including the portal-based upload facility of the storage account and the [AzCopy](../../storage/common/storage-use-azcopy.md) utility.
67
+
Default storage for the HDFS file system can be an Azure Storage account or Azure Data Lake Storage. Data uploaded to cluster storage during analysis is persisted and remains available even after the cluster is deleted. Various tools can handle the data transfer to storage, including the portal-based upload facility of the storage account and the AzCopy utility.
68
68
69
69
You can enable access to additional Blob and Data Lake stores during cluster creation. You aren't limited by the primary storage option in use. To learn more about using multiple storage accounts, see the [Azure Storage options for ML Services on HDInsight](./r-server-storage.md) article.
70
70
71
-
You can also use [Azure Files](../../storage/files/storage-how-to-use-files-linux.md) as a storage option for use on the edge node. Azure Files lets you mount file shares created in Azure Storage to the Linux file system. For more information, see [Azure Storage options for ML Services on HDInsight](r-server-storage.md).
71
+
You can also use Azure Files as a storage option for use on the edge node. Azure Files lets you mount file shares created in Azure Storage to the Linux file system. For more information, see [Azure Storage options for ML Services on HDInsight](r-server-storage.md).
72
72
73
73
## Access ML Services edge node
74
74
@@ -78,9 +78,9 @@ You can connect to Microsoft ML Server on the edge node using a browser, or SSH/
78
78
79
79
Your R scripts can use any of the 8000+ open-source R packages. You can also use the parallelized and distributed routines from the ScaleR library. Scripts that you run on the edge node execute within the R interpreter on that node, except for steps that call ScaleR functions with a MapReduce (RxHadoopMR) or Spark (RxSpark) compute context. Those functions run in a distributed fashion across the data nodes that are associated with the data. For more information about context options, see [Compute context options for ML Services on HDInsight](r-server-compute-contexts.md).
80
80
81
-
## Operationalize a model
81
+
## `Operationalize` a model
82
82
83
-
When your data modeling is complete, you can operationalize the model to make predictions for new data either from Azure or on-premises. This process is known as scoring. Scoring can be done in HDInsight, Azure Machine Learning, or on-premises.
83
+
When your data modeling is complete, `operationalize` the model to make predictions for new data either from Azure or on-premises. This process is known as scoring. Scoring can be done in HDInsight, Azure Machine Learning, or on-premises.
84
84
85
85
### Score in HDInsight
86
86
@@ -92,7 +92,7 @@ To score using Azure Machine Learning, use the open-source Azure Machine Learnin
92
92
93
93
### Score on-premises
94
94
95
-
To score on-premises after creating your model: serialize the model in R, download it, de-serialize it, then use it for scoring new data. You can score new data by using the approach described earlier in [Score in HDInsight](#score-in-hdinsight) or by using [web services](https://docs.microsoft.com/machine-learning-server/operationalize/concept-what-are-web-services).
95
+
To score on-premises after creating your model: serialize the model in R, download it, de-serialize it, then use it for scoring new data. You can score new data by using the approach described earlier in Score in HDInsight or by using [web services](https://docs.microsoft.com/machine-learning-server/operationalize/concept-what-are-web-services).
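The serialize, download, de-serialize, and score round trip described above can be sketched as follows. This is a minimal illustration in Python (ML Services is also Python-enabled), not the article's own procedure: the toy one-variable linear model, the `fit`, `save_model`, `load_model`, and `predict` helpers, and the `model.pkl` file name are all hypothetical, and a temporary directory stands in for the download step. In R, the base functions `saveRDS` and `readRDS` play the role that `pickle` plays here.

```python
import os
import pickle
import tempfile


def fit(xs, ys):
    """Ordinary least squares for a single predictor (hypothetical toy model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The "model" is plain data, so it serializes without custom classes.
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    """Score new observations with a fitted model."""
    return [model["slope"] * x + model["intercept"] for x in xs]


def save_model(model, path):
    """Serialize the fitted model to a file."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """De-serialize a model file. Only load files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# 1. Train the model (on the cluster, in the real workflow).
model = fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# 2. Serialize it; a temporary directory stands in for the downloaded file.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
save_model(model, model_path)

# 3. On-premises: de-serialize the model and score new data.
restored = load_model(model_path)
predictions = predict(restored, [5.0, 6.0])
print(predictions)
```

Storing the fitted coefficients as plain data keeps the serialized file independent of the code that produced it, which matters when the training and scoring environments differ.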
96
96
97
97
## Maintain the cluster
98
98
@@ -126,7 +126,7 @@ Running jobs might slow down during maintenance. However, they should still run
126
126
127
127
The Linux edge node of an HDInsight cluster is the landing zone for R-based analysis. Recent versions of HDInsight provide a browser-based IDE of RStudio Server on the edge node. RStudio Server is more productive than the R console for development and execution.
128
128
129
-
A desktop IDE can access the cluster through a remote MapReduce or Spark compute context. Options include: Microsoft's [R Tools for Visual Studio](https://marketplace.visualstudio.com/items?itemName=MikhailArkhipov007.RTVS2019) (RTVS), RStudio, and Walware's Eclipse-based [StatET](http://www.walware.de/goto/statet).
129
+
A desktop IDE can access the cluster through a remote MapReduce or Spark compute context. Options include: Microsoft's [R Tools for Visual Studio](https://marketplace.visualstudio.com/items?itemName=MikhailArkhipov007.RTVS2019) (RTVS), RStudio, and Walware's Eclipse-based StatET.
130
130
131
131
Access the R console on the edge node by typing **R** at the command prompt. When using the console interface, it's convenient to develop your R script in a text editor, and then cut and paste sections of the script into the R console as needed.