articles/hdinsight/spark/apache-spark-jupyter-notebook-install-locally.md
author: hrasheed-msft
ms.author: hrasheed
ms.reviewer: jasonh
ms.service: hdinsight
ms.topic: conceptual
ms.custom: hdinsightactive
ms.date: 04/02/2020
---
# Install Jupyter notebook on your computer and connect to Apache Spark on HDInsight
In this article, you learn how to install Jupyter notebook with the custom PySpark (for Python) and Apache Spark (for Scala) kernels with Spark magic. You then connect the notebook to an HDInsight cluster.
There are four key steps involved in installing Jupyter and connecting to Apache Spark on HDInsight.
* Install the PySpark and Spark kernels with the Spark magic.
* Configure Spark magic to access Spark cluster on HDInsight.
For more information about custom kernels and Spark magic, see [Kernels available for Jupyter notebooks with Apache Spark Linux clusters on HDInsight](apache-spark-jupyter-notebook-kernels.md).
## Prerequisites
* An Apache Spark cluster on HDInsight. For instructions, see [Create Apache Spark clusters in Azure HDInsight](apache-spark-jupyter-spark-sql.md). The local notebook connects to the HDInsight cluster.
* Familiarity with using Jupyter Notebooks with Spark on HDInsight.
## Install Jupyter notebook on your computer
Install Python before you install Jupyter notebooks. The [Anaconda distribution](https://www.anaconda.com/download/) installs both Python and Jupyter Notebook.
Download the [Anaconda installer](https://www.anaconda.com/download/) for your platform and run the setup. While running the setup wizard, make sure you select the option to add Anaconda to your PATH variable. See also [Installing Jupyter using Anaconda](https://jupyter.readthedocs.io/en/latest/install.html).
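As a quick check that the setup succeeded, open a new command prompt and confirm that both of the following commands print a version number. If they don't, Anaconda wasn't added to your PATH variable.

```
python --version
jupyter --version
```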
```
pip show sparkmagic
```
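The output resembles the following; the version and path shown here are illustrative and will differ on your machine:

```
Name: sparkmagic
Version: 0.12.9
Location: c:\users\you\anaconda3\lib\site-packages
```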
Then change your working directory to the **location** identified with the above command.
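For example, if the `Location:` value reported by `pip show sparkmagic` were the hypothetical path shown above, you would switch to it with:

```
cd c:\users\you\anaconda3\lib\site-packages
```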
1. From your new working directory, enter one or more of the commands below to install the wanted kernel(s):
|Kernel | Command |
|---|---|
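As an illustration, the sparkmagic project documents kernel installation commands of the following form; check the sparkmagic README for the exact kernel names shipped with your installed version:

```
jupyter-kernelspec install sparkmagic/kernels/sparkkernel
jupyter-kernelspec install sparkmagic/kernels/pysparkkernel
```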
In this section, you configure the Spark magic that you installed earlier to connect to an Apache Spark cluster.

```
python
```
2. The Jupyter configuration information is typically stored in the user's home directory. Enter the following command to identify the home directory, and create a folder called **\.sparkmagic**. The full path will be displayed.
```python
import os
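# The remainder of this snippet is a sketch (an assumption, reconstructed from
# the step description above): build the \.sparkmagic path under the home
# directory, create the folder, and print the full path.
path = os.path.join(os.path.expanduser("~"), ".sparkmagic")
os.makedirs(path, exist_ok=True)  # create the folder if it doesn't exist yet
print(path)                       # the full path is displayed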
```

```
jupyter notebook
```
6. Verify that you can use the Spark magic available with the kernels. Complete the following steps.
a. Create a new notebook. From the right-hand corner, select **New**. You should see the default kernel **Python 2** or **Python 3** and the kernels you installed. The actual values may vary depending on your installation choices. Select **PySpark**.

> [!IMPORTANT]
> After selecting **New**, review your shell for any errors. If you see the error `TypeError: __init__() got an unexpected keyword argument 'io_loop'`, you may be experiencing a known issue with certain versions of Tornado. If so, stop the kernel and then downgrade your Tornado installation with the following command: `pip install tornado==4.5.3`.
b. Run the following code snippet.
```sql
%%sql
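-- A representative query (an assumption): HDInsight clusters ship with a
-- sample Hive table that is commonly used for this kind of smoke test.
SELECT * FROM hivesampletable LIMIT 5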
```
## Why should I install Jupyter on my computer?
Reasons to install Jupyter on your computer and then connect it to an Apache Spark cluster on HDInsight:
* Provides you the option to create your notebooks locally, test your application against a running cluster, and then upload the notebooks to the cluster. To upload the notebooks to the cluster, you can either upload them using the Jupyter notebook that is running on the cluster, or save them to the `/HdiNotebooks` folder in the storage account associated with the cluster. For more information on how notebooks are stored on the cluster, see [Where are Jupyter notebooks stored](apache-spark-jupyter-notebook-kernels.md#where-are-the-notebooks-stored)?
* With the notebooks available locally, you can connect to different Spark clusters based on your application requirement.
* You can use GitHub to implement a source control system and have version control for the notebooks. You can also have a collaborative environment where multiple users can work with the same notebook.
* You can work with notebooks locally without even having a cluster up. You only need a cluster to test your notebooks against, not to manually manage your notebooks or a development environment.
## Next steps
* [Overview: Apache Spark on Azure HDInsight](apache-spark-overview.md)
* [Apache Spark with BI: Analyze Apache Spark data using Power BI in HDInsight](apache-spark-use-bi-tools.md)
* [Apache Spark with Machine Learning: Use Spark in HDInsight for analyzing building temperature using HVAC data](apache-spark-ipython-notebook-machine-learning.md)