"source": "## Use Case 1: I can have key packages in boxed\r\n - All pacakges that come with spark and hadoop distribution\r\n - Python3.5 and Python 2.7\r\n - Pandas, Sklearn and several other supporting ml packages\r\n - R and supporting pacakges as part of MRO\r\n - sparklyr\r\n\r\n\r\n",
- "metadata": {}
+ "source": [
+ "# Built-in Tools\n",
+ "* Spark and Hadoop base packages\n",
+ "* Python 3.5 and Python 2.7\n",
+ "* Pandas, Sklearn, Numpy, and other data processing packages.\n",
"source": "## Use Case 2: I can install pacakges from maven repo to my spark cluster\r\nMaven central is a source of lot of packages. A lot of spark ecosystem pacakges are availble there. These pacakages can be installed to your spark cluster using notebook cell configuration at the start of your spark session.\r\n",
"# Install Packages from a Maven Repository onto the Spark Cluster at Runtime\r\n",
+ "Maven packages can be installed onto your Spark cluster using notebook cell configuration at the start of your Spark session. Before starting a Spark session in Azure Data Studio, run the following code:\r\n",
"source": "## Use Case 3: I have a local jar that i want to run in the spark cluster\r\nAs a user you may build your own customer pacakges that want to run as part of your spark jobs. These pacakges can be uploaded as HDFS and using a notebook configuration spark can consume these pacakges in a jar.\r\n\r\n\r\n",
- "metadata": {}
+ "source": [
+ "# Import .jar at runtime through Azure Data Studio notebook cell configuration\n",
"One of the key scenarios for big data clusters is the ability to submit Spark jobs for SQL Server. The Spark job submission feature allows you to submit a local Jar or Py files with references to SQL Server 2019 big data cluster. It also enables you to execute a Jar or Py files, which are already located in the HDFS file system.\r\n",
+ "\r\n",
+ "* [Submit Spark jobs on SQL Server Big Data Clusters in Azure Data Studio](https://docs.microsoft.com/en-us/sql/big-data-cluster/spark-submit-job?view=sqlallproducts-allversions)\r\n",
+ "* [Submit Spark jobs on SQL Server Big Data Clusters in IntelliJ](https://docs.microsoft.com/en-us/sql/big-data-cluster/spark-submit-job-intellij-tool-plugin?view=sqlallproducts-allversions)\r\n",
+ "* [Submit Spark jobs on SQL Server big data cluster in Visual Studio Code](https://docs.microsoft.com/en-us/sql/big-data-cluster/spark-hive-tools-vscode?view=sqlallproducts-allversions)\r\n",