Merge pull request #156668 from nicolehaugen/gpudrivers

v-andreaco · web-flow · commit 828f857e29be · 2021-05-10T15:45:08.000-06:00
Added tensorflow info for enabling GPU
diff --git a/articles/lab-services/class-type-jupyter-notebook.md b/articles/lab-services/class-type-jupyter-notebook.md
@@ -37,6 +37,7 @@ Configure **Virtual machine size** and **Virtual machine image** settings as sho
 | Virtual machine size | <p>The size you pick here depends on the workload you want to run:</p><ul><li>Small or Medium – good for a basic setup of accessing Jupyter Notebooks</li><li>Small GPU (Compute) – best suited for compute-intensive and network-intensive applications like Artificial Intelligence and Deep Learning</li></ul> | 
 | Virtual machine image | <p>Choose one of the following images based on your operating system needs:</p><ul><li>[Data Science Virtual Machine – Windows Server 2019](https://azuremarketplace.microsoft.com/marketplace/apps/microsoft-dsvm.dsvm-win-2019)</li><li>[Data Science Virtual Machine – Ubuntu 18.04](https://azuremarketplace.microsoft.com/marketplace/apps/microsoft-dsvm.ubuntu-1804?tab=Overview)</li></ul> |
 
+When you create a lab with the **Small GPU (Compute)** size, you have the option to [Install GPU drivers](./how-to-setup-lab-gpu.md#ensure-that-the-appropriate-gpu-drivers-are-installed).  This option installs recent NVIDIA drivers and Compute Unified Device Architecture (CUDA) toolkit which are required to enable high-performance computing with the GPU.  For more information, see the article [Set up a lab with GPU virtual machines](./how-to-setup-lab-gpu.md).
 
 ### Template virtual machine
 Once you create a lab, a template VM will be created based on the virtual machine size and image you chose. You configure the template VM with everything you want to provide to your students for this class. To learn more, see [how to manage the template virtual machine](how-to-create-manage-template.md). 
@@ -46,6 +47,53 @@ The Data Science VM images by default come with many of data science frameworks
 - [Jupyter Notebooks](http://jupyter-notebook.readthedocs.io/): A web application that allows data scientists to take raw data, run computations, and see the results all in the same environment. It will run locally in the template VM.  
 - [Visual Studio Code](https://code.visualstudio.com/): An integrated development environment (IDE) that provides a rich interactive experience when writing and testing a notebook. For more information, see [Working with Jupyter Notebooks in Visual Studio Code](https://code.visualstudio.com/docs/python/jupyter-support).
 
+If you are using the **Small GPU (Compute)** size, we recommend that you verify that the Data Science frameworks and libraries are properly set up with the GPU.  To properly set up the frameworks and libraries, you may need to install a different version of the NVIDIA Drivers and CUDA toolkit.  For example, to validate that the GPU is configured for TensorFlow, you can connect to the template VM and run the following Python-TensorFlow code in Jupyter Notebooks:
+
+```python
+import tensorflow as tf
+from tensorflow.python.client import device_lib
+
+print(device_lib.list_local_devices())
+```
+
+If the output from the above code looks like the following, this means that the GPU isn't configured for TensorFlow:
+
+```python
+[name: "/device:CPU:0"
+device_type: "CPU"
+memory_limit: 268435456
+locality {
+}
+incarnation: 15833696144144374634
+]
+```
+To properly configure the GPU, you should consult the framework's or library's documentation.  Continuing with the above example, TensorFlow provides the following guidance:
+- [TensorFlow GPU Support](https://www.tensorflow.org/install/gpu)
+
+Their guidance covers the required version of the [NVIDIA drivers](https://www.nvidia.com/drivers) and [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit-archive).  Their guidance also includes installing the [NVIDIA CUDA Deep Neural Network library (cudDNN)](https://developer.nvidia.com/cudnn).
+
+After you've followed TensorFlow's steps to configure the GPU, when you rerun the above code, you should see output similar to the following:
+
+```python
+[name: "/device:CPU:0"
+device_type: "CPU"
+memory_limit: 268435456
+locality {
+}
+incarnation: 15833696144144374634
+, name: "/device:GPU:0"
+device_type: "GPU"
+memory_limit: 11154792128
+locality {
+  bus_id: 1
+  links {
+  }
+}
+incarnation: 2659412736190423786
+physical_device_desc: "device: 0, name: NVIDIA Tesla K80, pci bus id: 0001:00:00.0, compute capability: 3.7"
+]
+```
+
 ### Provide notebooks for the class
 The next task is to provide students with notebooks that you want them to use. To provide your own notebooks, you can save notebooks locally on the template VM. 
 
@@ -124,7 +172,6 @@ Now, to connect to the VM, follow these steps:
 2. Enter the password to connect to the VM. (You may have to give X2Go permission to bypass your firewall to finish connecting.)
 3.	You should now see the graphical interface for your Ubuntu Data Science VM.
 
-
 #### SSH tunnel to Jupyter server on the VM
 Some students may want to connect directly from their local computer directly to the Jupyter server inside their VMs. The SSH protocol enables port forwarding between the local computer and a remote server (in our case, the student’s lab VM), so that an application running on a certain port on the server is **tunneled** to the mapping port on the local computer. Students should follow these steps to SSH tunnel to the Jupyter server on their lab VMs:
 
diff --git a/articles/lab-services/how-to-setup-lab-gpu.md b/articles/lab-services/how-to-setup-lab-gpu.md
@@ -42,16 +42,19 @@ To take advantage of the GPU capabilities of your lab VMs, ensure that the appro
 
 ![Screenshot of the "New lab" showing the "Install GPU drivers" option](./media/how-to-setup-gpu/lab-gpu-drivers.png)
 
-As shown in the preceding image, this option is enabled by default, which ensures that the *latest* drivers are installed for the type of GPU and image that you selected.
-- When you select a *compute* GPU size, your lab VMs are powered by the [NVIDIA Tesla K80](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/tesla-product-literature/Tesla-K80-BoardSpec-07317-001-v05.pdf) GPU.  In this case, the latest [Compute Unified Device Architecture (CUDA)](http://developer.download.nvidia.com/compute/cuda/2_0/docs/CudaReferenceManual_2.0.pdf) drivers are installed, which enables high-performance computing.
-- When you select a *visualization* GPU size, your lab VMs are powered by the [NVIDIA Tesla M60](https://images.nvidia.com/content/tesla/pdf/188417-Tesla-M60-DS-A4-fnl-Web.pdf) GPU and [GRID technology](https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/solutions/resources/documents1/NVIDIA_GRID_vPC_Solution_Overview.pdf).  In this case, the latest GRID drivers are installed, which enables the use of graphics-intensive applications.
+As shown in the preceding image, this option is enabled by default, which ensures that recently released drivers are installed for the type of GPU and image that you selected:
+- When you select a *compute* GPU size, your lab VMs are powered by the [NVIDIA Tesla K80](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/tesla-product-literature/Tesla-K80-BoardSpec-07317-001-v05.pdf) GPU.  In this case, recent [Compute Unified Device Architecture (CUDA)](http://developer.download.nvidia.com/compute/cuda/2_0/docs/CudaReferenceManual_2.0.pdf) drivers are installed, which enables high-performance computing.
+- When you select a *visualization* GPU size, your lab VMs are powered by the [NVIDIA Tesla M60](https://images.nvidia.com/content/tesla/pdf/188417-Tesla-M60-DS-A4-fnl-Web.pdf) GPU and [GRID technology](https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/solutions/resources/documents1/NVIDIA_GRID_vPC_Solution_Overview.pdf).  In this case, recent GRID drivers are installed, which enables the use of graphics-intensive applications.
+
+> [!IMPORTANT]
+> The **Install GPU drivers** option only installs the drivers when they aren't present on your lab's image.  For example, the GPU drivers are already installed on the Azure marketplace's [Data Science image](../machine-learning/data-science-virtual-machine/overview.md#whats-included-on-the-dsvm).  If you create a lab using the Data Science image and choose to **Install GPU drivers**, the drivers won't be updated to a more recent version.  To update the drivers, you will need to manually install them as explained in the next section.  
 
 ### Install the drivers manually
-You might need to install a driver version other than the latest version.  This section shows how to manually install the appropriate drivers, depending on whether you're using a *compute* GPU or a *visualization* GPU.
+You might need to install a different version of the drivers than the version that Azure Lab Services installs for you.  This section shows how to manually install the appropriate drivers, depending on whether you're using a *compute* GPU or a *visualization* GPU.
 
 #### Install the compute GPU drivers
 
-To manually install drivers for the compute GPU size, do the following:
+To manually install drivers for the *compute* GPU size, do the following:
 
 1. In the lab creation wizard, when you're [creating your lab](./how-to-manage-classroom-labs.md), disable the **Install GPU drivers** setting.
 
@@ -75,7 +78,7 @@ To manually install drivers for the compute GPU size, do the following:
 
 #### Install the visualization GPU drivers
 
-To manually install drivers for the visualization GPU size, do the following:
+To manually install drivers for the *visualization* GPU sizes, do the following:
 
 1. In the lab creation wizard, when you're [creating your lab](./how-to-manage-classroom-labs.md), disable the **Install GPU drivers** setting.
 1. After your lab is created, connect to the template VM to install the appropriate drivers.
@@ -104,6 +107,8 @@ This section describes how to validate that your GPU drivers are properly instal
       > [!IMPORTANT]
       > The NVIDIA Control Panel settings can be accessed only for *visualization* GPUs.  If you attempt to open the NVIDIA Control Panel for a compute GPU, you'll get the following error: "NVIDIA Display settings are not available.  You are not currently using a display attached to an NVIDIA GPU."  Similarly, the GPU performance information in Task Manager is provided only for visualization GPUs.
 
+ Depending on your scenario, you may also need to do additional validation to ensure the GPU is properly configured.  Read the class type about [Python and Jupyter Notebooks](./class-type-jupyter-notebook.md#template-virtual-machine) that explains an example where specific versions of drivers are needed.
+
 #### Linux images
 Follow the instructions in the "Verify driver installation" section of [Install NVIDIA GPU drivers on N-series VMs running Linux](../virtual-machines/linux/n-series-driver-setup.md#verify-driver-installation).