Commit 959fba4

Update class type for freshness
1 parent 41aae8a commit 959fba4

1 file changed

+37
-38
lines changed

articles/lab-services/class-type-big-data-analytics.md

Lines changed: 37 additions & 38 deletions
Original file line number | Diff line number | Diff line change
@@ -1,59 +1,61 @@
11
---
2-
title: Set up a lab to teach big data analytics using Azure Lab Services | Microsoft Docs
3-
description: Learn how to set up a lab to teach the big data analytics using Docker deployment of Hortonworks Data Platform (HDP).
4-
author: nicolela
5-
ms.topic: how-to
6-
ms.date: 03/08/2022
7-
ms.custom: devdivchpfy22
2+
title: Set up big data analytics lab
3+
titleSuffix: Azure Lab Services
4+
description: Learn how to set up a lab in Azure Lab Services to teach big data analytics using a Docker deployment of Hortonworks Data Platform (HDP).
5+
services: lab-services
86
ms.service: lab-services
7+
author: ntrogh
8+
ms.author: nicktrog
9+
ms.topic: how-to
10+
ms.date: 04/25/2023
911
---
1012

11-
# Set up a lab for big data analytics using Docker deployment of HortonWorks Data Platform
13+
# Set up a lab for big data analytics in Azure Lab Services using Docker deployment of HortonWorks Data Platform
1214

1315
[!INCLUDE [preview note](./includes/lab-services-new-update-focused-article.md)]
1416

15-
This article shows you how to set up a lab to teach a big data analytics class. A big data analytics class teaches students to learn how to handle large volumes of data. It also teaches them to apply machine and statistical learning algorithms to derive data insights. A key objective for students is to learn how to use data analytics tools, such as [Apache Hadoop's open-source software package](https://hadoop.apache.org/). The software package provides tools for storing, managing, and processing big data.
17+
This article shows you how to set up a lab to teach a big data analytics class. A big data analytics class teaches users how to handle large volumes of data. It also teaches them to apply machine and statistical learning algorithms to derive data insights. A key objective is to learn how to use data analytics tools, such as [Apache Hadoop's open-source software package](https://hadoop.apache.org/). The software package provides tools for storing, managing, and processing big data.
1618

17-
In this lab, students will use a popular commercial version of Hadoop provided by [Cloudera](https://www.cloudera.com/), called [Hortonworks Data Platform (HDP)](https://www.cloudera.com/products/hdp.html). Specifically, students will use [HDP Sandbox 3.0.1](https://www.cloudera.com/tutorials/getting-started-with-hdp-sandbox/1.html) that's a simplified, easy-to-use version of the platform. HDP Sandbox 3.0.1 is also free of cost and is intended for learning and experimentation. Although this class may use either Windows or Linux virtual machines (VM) with HDP Sandbox deployed. This article will show you how to use Windows.
19+
In this lab, lab users work with a popular commercial version of Hadoop provided by [Cloudera](https://www.cloudera.com/), called [Hortonworks Data Platform (HDP)](https://www.cloudera.com/products/hdp.html). Specifically, lab users use [HDP Sandbox 3.0.1](https://www.cloudera.com/tutorials/getting-started-with-hdp-sandbox/1.html), which is a simplified, easy-to-use version of the platform. HDP Sandbox 3.0.1 is also free of cost and is intended for learning and experimentation. Although this class can use either Windows or Linux virtual machines (VMs) with HDP Sandbox deployed, this article shows you how to use Windows.
1820

19-
Another interesting aspect is that we'll deploy HDP Sandbox on the lab VMs using [Docker](https://www.docker.com/) containers. Each Docker container provides its own isolated environment for software applications to run inside. Conceptually, Docker containers are like nested VMs and can be used to easily deploy and run a wide variety of software applications based on container images provided on [Docker Hub](https://www.docker.com/products/docker-hub). Cloudera's deployment script for HDP Sandbox automatically pulls the [HDP Sandbox 3.0.1 Docker image](https://hub.docker.com/r/hortonworks/sandbox-hdp) from Docker Hub and runs two Docker containers:
21+
Another interesting aspect is that you deploy the HDP Sandbox on the lab VMs using [Docker](https://www.docker.com/) containers. Each Docker container provides its own isolated environment for software applications to run inside. Conceptually, Docker containers are like nested VMs and can be used to easily deploy and run a wide variety of software applications based on container images provided on [Docker Hub](https://www.docker.com/products/docker-hub). Cloudera's deployment script for HDP Sandbox automatically pulls the [HDP Sandbox 3.0.1 Docker image](https://hub.docker.com/r/hortonworks/sandbox-hdp) from Docker Hub and runs two Docker containers:
2022

2123
- sandbox-hdp
2224
- sandbox-proxy
2325

24-
## Lab configuration
26+
## Prerequisites
2527

26-
To set up this lab, you need an Azure subscription to get started. If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/) before you begin.
28+
[!INCLUDE [must have subscription](./includes/lab-services-class-type-subscription.md)]
29+
30+
## Lab configuration
2731

2832
### Lab plan settings
2933

30-
Once you've an Azure subscription, you can create a new lab plan in Azure Lab Services. For more information about creating a new lab plan, see the tutorial on [how to set up a lab plan](./quick-create-resources.md). You can also use an existing lab plan.
34+
[!INCLUDE [must have lab plan](./includes/lab-services-class-type-lab-plan.md)]
3135

32-
Enable your lab plan settings as described in the following table. For more information about how to enable Azure Marketplace images, see [Specify the Azure Marketplace images available to lab creators](./specify-marketplace-images.md).
36+
This lab uses a Windows 10 Pro Azure Marketplace image as the base VM image. You first need to enable this image in your lab plan. This lets lab creators then select the image as a base image for their lab.
3337

34-
| Lab plan setting | Instructions |
35-
| ------------------- | ------------ |
36-
|Marketplace image| Enable the **Windows 10 Pro** image.|
38+
Follow these steps to [make these Azure Marketplace images available to lab creators](specify-marketplace-images.md). Select one of the **Windows 10** Azure Marketplace images.
3739

3840
### Lab settings
3941

40-
For instructions on how to create a lab, see [Tutorial: Set up a lab](tutorial-setup-lab.md). Use the following settings when creating the lab.
42+
Create a lab for your lab plan. [!INCLUDE [create lab](./includes/lab-services-class-type-lab.md)] Use the following settings when creating the lab.
4143

4244
| Lab settings | Value/instructions |
4345
| ------------ | ------------------ |
4446
|Virtual Machine Size| **Medium (Nested Virtualization)**. This VM size is best suited for relational databases, in-memory caching, and analytics. The size also supports nested virtualization.|
45-
|Virtual Machine Image| Windows 10 Pro|
46-
47+
|Virtual Machine Image| **Windows 10 Pro**|
48+
4749
> [!NOTE]
48-
> We need to use Medium (Nested Virtualization) since deploying HDP Sandbox using Docker requires Windows Hyper-V with nested virtualization and at least 10 GB of RAM.
50+
> Use the Medium (Nested Virtualization) VM size because deploying the HDP Sandbox using Docker requires Windows Hyper-V with nested virtualization and at least 10 GB of RAM.
4951
5052
## Template machine configuration
5153

52-
To set up the template machine, we'll:
54+
To set up the template machine:
5355

54-
- Install Docker
55-
- Deploy HDP Sandbox
56-
- Use PowerShell and Windows Task Scheduler to automatically start the Docker containers
56+
1. Install Docker
57+
1. Deploy HDP Sandbox
58+
1. Use PowerShell and Windows Task Scheduler to automatically start the Docker containers
5759

5860
### Install Docker
5961

@@ -77,7 +79,7 @@ To use Docker containers, you must first install Docker Desktop on the template
7779
7880
### Deploy HDP Sandbox
7981

80-
In this section, you'll deploy HDP Sandbox and then access HDP Sandbox using the browser.
82+
Next, deploy HDP Sandbox and then access HDP Sandbox using the browser.
8183

8284
1. Ensure that you have installed [Git Bash](https://gitforwindows.org/) as listed in the [Prerequisites section](https://www.cloudera.com/tutorials/sandbox-deployment-and-install-guide/3.html#prerequisites) of the guide. It's recommended for completing the next steps.
8385

@@ -97,26 +99,23 @@ In this section, you'll deploy HDP Sandbox and then access HDP Sandbox using the
9799
> [!NOTE]
98100
> These instructions assume that you have first mapped the local IP address of the sandbox environment to sandbox-hdp.hortonworks.com in the hosts file on your template VM. If you **don't** do this mapping, you can access the Sandbox Welcome page by navigating to `http://localhost:8080`.
99101
100-
### Automatically start Docker containers when students log in
102+
### Automatically start Docker containers when lab users sign in
101103

102-
To provide an easy to use, experience for students, we'll use a PowerShell script that automatically:
104+
To provide an easy-to-use experience for lab users, create a PowerShell script that automatically:
103105

104-
- Starts the HDP Sandbox Docker containers when a student starts and connects to their lab VM.
105-
- Launches the browser and navigates to the Sandbox Welcome Page.
106+
1. Starts the HDP Sandbox Docker containers when a lab user starts and connects to their lab VM.
107+
1. Launches the browser and navigates to the Sandbox Welcome page.
106108

107-
We'll also use Windows Task Scheduler to automatically run this script when a student logs into their VM.
108-
To set up a Task Scheduler, follow these steps: [Big Data Analytics scripting](https://aka.ms/azlabs/scripts/BigDataAnalytics).
109+
Use Windows Task Scheduler to automatically run this script when a lab user signs in to their VM. To set up the scheduled task, follow these steps: [Big Data Analytics scripting](https://aka.ms/azlabs/scripts/BigDataAnalytics).
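
The two actions the startup script performs can be sketched as follows. This is an illustrative Python sketch, not the PowerShell script behind the link above. It assumes that Docker Desktop is already running, that the `sandbox-hdp` and `sandbox-proxy` containers were already created by the deployment script, and that the Welcome page is reachable at `http://localhost:8080`. The function names `build_start_command` and `start_sandbox` are hypothetical.

```python
import subprocess
import webbrowser

# Container names are taken from the article; the URL is the localhost
# address the article gives for the Sandbox Welcome page. Both are
# assumptions about how your template VM is configured.
CONTAINERS = ["sandbox-hdp", "sandbox-proxy"]
WELCOME_URL = "http://localhost:8080"


def build_start_command(containers):
    """Return the `docker start` argument list for the given containers."""
    return ["docker", "start", *containers]


def start_sandbox(containers=CONTAINERS, url=WELCOME_URL):
    """Start the existing containers, then open the Sandbox Welcome page."""
    # check=True raises CalledProcessError if the Docker CLI reports a failure.
    subprocess.run(build_start_command(containers), check=True)
    # Opens the URL in the user's default browser.
    webbrowser.open(url)
```

Calling `start_sandbox()` from a sign-in task would start both containers and open the browser. The sandbox services can take some time to become responsive after the containers start, so a real script might wait or retry before opening the page.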
109110

110111
## Cost estimate
111112

112-
If you would like to estimate the cost of this lab, you can use the following example:
113+
This section provides a cost estimate for running this class for 25 users. There are 20 hours of scheduled class time. Also, each user gets a 10-hour quota for homework or assignments outside scheduled class time. The virtual machine size is **Medium (Nested Virtualization)**, which is 55 lab units.
113114

114-
For a class of 25 students with 20 hours of scheduled class time and 10 hours of quota for homework or assignments, the price for the lab would be:
115-
116-
25 students \* (20 + 10) hours \* 55 Lab Units \* 0.01 USD per hour = 412.50 USD
115+
- 25 lab users × (20 scheduled hours + 10 quota hours) × 55 lab units
117116

118-
>[!IMPORTANT]
119-
>Cost estimate is for example purposes only. For current details on pricing, see [Azure Lab Services Pricing](https://azure.microsoft.com/pricing/details/lab-services/).
117+
> [!IMPORTANT]
118+
> The cost estimate is for example purposes only. For current pricing information, see [Azure Lab Services pricing](https://azure.microsoft.com/pricing/details/lab-services/).
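
The arithmetic behind the estimate can be worked through with a short sketch. The rate of 0.01 USD per lab unit per hour is an assumption taken from the line this commit removes, and current pricing may differ. The function name `estimate_lab_cost` is hypothetical.

```python
from decimal import Decimal


def estimate_lab_cost(users, scheduled_hours, quota_hours, lab_units, usd_per_unit_hour):
    """Return the estimated lab cost in USD as a Decimal."""
    # Total VM hours across all users: scheduled time plus per-user quota.
    total_hours = users * (scheduled_hours + quota_hours)
    # Decimal avoids binary floating-point rounding in currency math.
    return total_hours * lab_units * Decimal(usd_per_unit_hour)


# Figures from the article: 25 users, 20 scheduled hours, 10 quota hours,
# 55 lab units. The 0.01 USD rate is an assumption (see the note above).
cost = estimate_lab_cost(25, 20, 10, 55, "0.01")
print(f"{cost:.2f} USD")  # prints "412.50 USD"
```

With these inputs the result matches the 412.50 USD figure in the earlier version of the article.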
120119
121120
## Conclusion
122121
