Commit b8c5bd2

Merge pull request #101481 from sdgilley/sdg-r-tutorial
remove DAAG package
2 parents 9fd3961 + 023175e commit b8c5bd2

2 files changed, 29 additions and 28 deletions

articles/machine-learning/tutorial-1st-r-experiment.md

Lines changed: 29 additions & 28 deletions
@@ -1,30 +1,31 @@
 ---
-title: "Tutorial: Your first ML model with R"
+title: "Tutorial: Logistic regression model in R"
 titleSuffix: Azure Machine Learning
-description: In this tutorial, you learn the foundational design patterns in Azure Machine Learning, and train a logistic regression model model using R packages azuremlsdk and caret to predict likelihood of a fatality in an automobile accident.
+description: In this tutorial, you create a logistic regression model using R packages azuremlsdk and caret to predict likelihood of a fatality in an automobile accident.
 services: machine-learning
 ms.service: machine-learning
 ms.subservice: core
 ms.topic: tutorial
 ms.reviewer: sgilley
 author: revodavid
 ms.author: davidsmi
-ms.date: 11/04/2019
+ms.date: 02/07/2020
 ---
 
-# Tutorial: Train and deploy your first model in R with Azure Machine Learning
+# Tutorial: Create a logistic regression model in R with Azure Machine Learning
 [!INCLUDE [applies-to-skus](../../includes/aml-applies-to-basic-enterprise-sku.md)]
 
-In this tutorial, you learn the foundational design patterns in Azure Machine Learning. You'll train and deploy a **caret** model to predict the likelihood of a fatality in an automobile accident. After completing this tutorial, you'll have the practical knowledge of the R SDK to scale up to developing more-complex experiments and workflows.
-
-In this tutorial, you learn the following tasks:
+In this tutorial you'll use R and Azure Machine Learning to create a logistic regression model that predicts the likelihood of a fatality in an automobile accident. After completing this tutorial, you'll have the practical knowledge of the Azure Machine Learning R SDK to scale up to developing more-complex experiments and workflows.
 
+In this tutorial, you perform the following tasks:
 > [!div class="checklist"]
-> * Connect your workspace
+> * Create an Azure Machine Learning workspace
+> * Clone a notebook folder with the files necessary to run this tutorial into your workspace
+> * Open RStudio from your workspace
 > * Load data and prepare for training
-> * Upload data to the datastore so it is available for remote training
-> * Create a compute resource
-> * Train a caret model to predict probability of fatality
+> * Upload data to a datastore so it is available for remote training
+> * Create a compute resource to train the model remotely
+> * Train a `caret` model to predict probability of fatality
 > * Deploy a prediction endpoint
 > * Test the model from R
 
@@ -61,13 +62,11 @@ You complete the following experiment set-up and run steps in Azure Machine Lear
 
 1. Open the folder with a version number on it. This number represents the current release for the R SDK.
 
-1. Open the **vignettes** folder.
-
-1. Select the **"..."** at the right of the **train-and-deploy-to-aci** folder and then select **Clone**.
+1. Select the **"..."** at the right of the **vignettes** folder and then select **Clone**.
 
 ![Clone folder](media/tutorial-1st-r-experiment/clone-folder.png)
 
-1. A list of folders displays showing each user who accesses the workspace. Select your folder to clone the **train-and-deploy-to-aci** folder there.
+1. A list of folders displays showing each user who accesses the workspace. Select your folder to clone the **vignettes** folder there.
 
 ## <a name="open">Open RStudio
 
@@ -79,12 +78,13 @@ Use RStudio on a compute instance or Notebook VM to run this tutorial.
 
 1. Once the compute is running, use the **RStudio** link to open RStudio.
 
-1. In RStudio, your **train-and--deploy-to-aci** folder is a few levels down from **Users** in the **Files** section on the lower right. Select the **train-and-deploy-to-aci** folder to find the files needed in this tutorial.
+1. In RStudio, your *vignettes* folder is a few levels down from *Users* in the **Files** section on the lower right. Under *vignettes*, select the *train-and-deploy-to-aci* folder to find the files needed in this tutorial.
 
 > [!Important]
-> The rest of this article contains the same content as you see in the **train-and-deploy-to-aci.Rmd** file.
+> The rest of this article contains the same content as you see in the *train-and-deploy-to-aci.Rmd* file.
 > If you are experienced with RMarkdown, feel free to use the code from that file. Or you can copy/paste the code snippets from there, or from this article into an R script or the command line.
 
+
 ## Set up your development environment
 The setup for your development work in this tutorial includes the following actions:
 
@@ -100,12 +100,6 @@ This tutorial assumes you already have the Azure ML SDK installed. Go ahead and
 library(azuremlsdk)
 ```
 
-The tutorial uses data from the [**DAAG** package](https://cran.r-project.org/package=DAAG). Install the package if you don't have it.
-
-```R
-install.packages("DAAG")
-```
-
 The training and scoring scripts (`accidents.R` and `accident_predict.R`) have some additional dependencies. If you plan on running those scripts locally, make sure you have those required packages as well.
 
 ### Load your workspace
@@ -143,15 +137,21 @@ wait_for_provisioning_completion(compute_target)
 ```
 
 ## Prepare data for training
-This tutorial uses data from the **DAAG** package. This dataset includes data from over 25,000 car crashes in the US, with variables you can use to predict the likelihood of a fatality. First, import the data into R and transform it into a new dataframe `accidents` for analysis, and export it to an `Rdata` file.
+This tutorial uses data from the US [National Highway Traffic Safety Administration](https://cdan.nhtsa.gov/tsftables/tsfar.htm) (with thanks to [Mary C. Meyer and Tremika Finney](https://www.stat.colostate.edu/~meyer/airbags.htm)).
+This dataset includes data from over 25,000 car crashes in the US, with variables you can use to predict the likelihood of a fatality. First, import the data into R and transform it into a new dataframe `accidents` for analysis, and export it to an `Rdata` file.
 
 ```R
-library(DAAG)
-data(nassCDS)
-
+nassCDS <- read.csv("nassCDS.csv",
+                     colClasses=c("factor","numeric","factor",
+                                  "factor","factor","numeric",
+                                  "factor","numeric","numeric",
+                                  "numeric","character","character",
+                                  "numeric","numeric","character"))
 accidents <- na.omit(nassCDS[,c("dead","dvcat","seatbelt","frontal","sex","ageOFocc","yearVeh","airbag","occRole")])
 accidents$frontal <- factor(accidents$frontal, labels=c("notfrontal","frontal"))
 accidents$occRole <- factor(accidents$occRole)
+accidents$dvcat <- ordered(accidents$dvcat,
+                           levels=c("1-9km/h","10-24","25-39","40-54","55+"))
 
 saveRDS(accidents, file="accidents.Rd")
 ```
@@ -390,5 +390,6 @@ You can also keep the resource group but delete a single workspace. Display the
 
 ## Next steps
 
-Now that you've completed your first Azure Machine Learning experiment in R, learn more about the [Azure Machine Learning SDK for R](https://azure.github.io/azureml-sdk-for-r/index.html).
+* Now that you've completed your first Azure Machine Learning experiment in R, learn more about the [Azure Machine Learning SDK for R](https://azure.github.io/azureml-sdk-for-r/index.html).
 
+* Learn more about Azure Machine Learning with R from the examples in the other *vignettes* folders.
