#Customer intent: As a developer for Apache Spark and Apache Spark in Azure HDInsight, I want to learn how to manage my Spark application dependencies and install packages on my HDInsight cluster.
---
When a Spark session starts in Jupyter Notebook on the Spark kernel for Scala, you can configure packages from:
* [Maven Repository](https://search.maven.org/), or community-contributed packages at [Spark Packages](https://spark-packages.org/).
* Jar files stored on your cluster's primary storage.
You can use the `%%configure` magic to configure the notebook to use an external package. In notebooks that use external packages, make sure you call the `%%configure` magic in the first code cell. This ensures that the kernel is configured to use the package before the session starts.
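For example, a minimal sketch of a first-cell `%%configure` call that pulls in a package by its Maven coordinate (the coordinate below is a placeholder; replace it with the group, artifact, and version of the package you need):

```
%%configure -f
{ "conf": { "spark.jars.packages": "groupId:artifactId:version" } }
```

The `-f` flag drops and re-creates the session if one has already started, so the configuration is applied before any other code runs.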
In some cases, you may want to configure the jar dependencies at the cluster level so that every application is set up with the same dependencies by default. The approach is to add your jar paths to the Spark driver and executor class paths (a sketch of the relevant settings follows the script action below).
1. Run the following sample script action to copy jar files from primary storage `wasb://[email protected]/libs/*` to the cluster's local file system `/usr/libs/sparklibs`. This step is needed because Linux uses `:` to separate the class path list, but HDInsight only supports storage paths with a scheme such as `wasb://`. The remote storage path won't work correctly if you add it directly to the class path.
```bash
sudo mkdir -p /usr/libs/sparklibs
sudo hadoop fs -copyToLocal wasb://[email protected]/libs/* /usr/libs/sparklibs
```
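The remaining cluster-level steps aren't shown above; the idea is to point the Spark driver and executor class paths at the local folder. A minimal sketch of the settings, assuming you add them as custom Spark configuration (the property names `spark.driver.extraClassPath` and `spark.executor.extraClassPath` are standard Spark configuration; where you set them, such as a custom `spark2-defaults` section in Ambari, depends on your cluster):

```
spark.driver.extraClassPath /usr/libs/sparklibs/*
spark.executor.extraClassPath /usr/libs/sparklibs/*
```

After changing these settings, restart the affected Spark services so the new class path is picked up.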
HDInsight clusters have built-in jar dependencies, and updates for these jar versions happen from time to time.
## Python packages for one Spark job
### Use Jupyter Notebook
The HDInsight Jupyter Notebook PySpark kernel doesn't support installing Python packages from the PyPI or Anaconda package repositories directly. If you have `.zip`, `.egg`, or `.py` dependencies and want to reference them for one Spark session, follow these steps:
1. Run a sample script action to copy `.zip`, `.egg`, or `.py` files from primary storage `wasb://[email protected]/libs/*` to the cluster's local file system `/usr/libs/pylibs`, as sketched below. This step is needed because Linux uses `:` to separate the search path list, but HDInsight only supports storage paths with a scheme such as `wasb://`. The remote storage path won't work correctly when you use `sys.path.insert`.
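A minimal sketch of such a script action, mirroring the jar example above and assuming the storage path referenced in the step:

```bash
# Copy the Python dependencies from primary storage (path as referenced in the step above) to a local folder
sudo mkdir -p /usr/libs/pylibs
sudo hadoop fs -copyToLocal wasb://[email protected]/libs/* /usr/libs/pylibs
```

In the notebook, you would then add the copied file to the Python search path before importing it, for example with `sys.path.insert(0, '/usr/libs/pylibs/yourpackage.zip')` (the file name here is hypothetical).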