
Commit 5d60ac8

connector sample updates for integrated auth with datapools
1 parent e7b9d6b commit 5d60ac8

File tree

1 file changed: +116 -5 lines changed

samples/features/sql-big-data-cluster/spark/data-virtualization/mssql_spark_connector_ad_pyspark.ipynb

Lines changed: 116 additions & 5 deletions
@@ -53,15 +53,15 @@
 "The following section shows how to generate the principal and keytab. This assumes you have a SQL Server 2019 Big Data Cluster installed with a Windows AD controller for the domain AZDATA.LOCAL. One of the users is testusera1@AZDATA.LOCAL, and the user is part of the Domain Admin group.\r\n",
 "\r\n",
 "## Create KeyTab file using ktpass\r\n",
-"1. Login to the Windows AD controller with user1 credentials.\r\n",
+"1. Login to the Windows AD controller with testusera1 credentials.\r\n",
 "2. Open command prompt in Administrator mode.\r\n",
 "3. Use ktpass to create a keytab. Refer [here](https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/ktpass) for documentation on using ktpass.\r\n",
 "\r\n",
 "```sh\r\n",
 "ktpass -out testusera1.keytab -mapUser testusera1@AZDATA.LOCAL -pass <testusera1 password> -mapOp set +DumpSalt -crypto AES256-SHA1 -ptype KRB5_NT_PRINCIPAL -princ testusera1@AZDATA.LOCAL\r\n",
 "```\r\n",
 "\r\n",
-"The command above should generate a keytab file named testusera1.keytab. Transfer this file to hdfs folder in Big Data Cluster. In this sample we transfer the file to /user/testusera1/testusera1.keytab\r\n",
+"Note that the principal name in ktpass is case sensitive. The command above generates a keytab file named testusera1.keytab. Transfer this file to the hdfs folder in the Big Data Cluster. In this sample we transfer the file to /user/testusera1/testusera1.keytab\r\n",
 "\r\n",
 "## Create KeyTab file using kinit\r\n",
 "\r\n",
@@ -72,12 +72,12 @@
 "ktutil : add_entry -password -p testusera1@AZDATA.LOCAL -k 1 -e arcfour-hmac-md5\r\n",
 "Password for testusera1@AZDATA.LOCAL:\r\n",
 "ktutil : add_entry -password -p testusera1@AZDATA.LOCAL -k 1 -e des-cbc-md4\r\n",
-"ktutil : wkt testusera1@AZDATA.keytab \r\n",
+"ktutil : wkt testusera1.keytab\r\n",
 "```\r\n",
 "\r\n",
 "```sh\r\n",
 "## Check if the keytab was generated properly. Any error implies that the keytab was not generated right.\r\n",
-"kinit -kt testusera1.keytab [email protected]\r\n",
+"kinit -kt testusera1.keytab testusera1@AZDATA.LOCAL\r\n",
 "```\r\n",
 "\r\n",
 "Load Keytab to HDFS for use\r\n",
@@ -118,6 +118,38 @@
 "azdata_cell_guid": "453e7b2f-e590-4b95-9fbd-5dc8b9d1f02c"
 }
 },
+{
+"cell_type": "markdown",
+"source": [
+"# Create Data Pool user\r\n",
+"\r\n",
+"```\r\n",
+"-- To create external tables in data pools\r\n",
+"grant alter any external data source to [aris\testusera1];\r\n",
+"\r\n",
+"-- To create external tables\r\n",
+"grant create table to [aris\testusera1];\r\n",
+"grant alter any schema to [aris\testusera1];\r\n",
+"\r\n",
+"ALTER ROLE [db_datareader] ADD MEMBER [aris\testusera1]\r\n",
+"ALTER ROLE [db_datawriter] ADD MEMBER [aris\testusera1]\r\n",
+"```\r\n",
+"\r\n",
+"```\r\n",
+"CREATE EXTERNAL DATA SOURCE connector_ds WITH (LOCATION = 'sqldatapool://controller-svc/default');\r\n",
+"EXECUTE('USE spark_mssql_db; CREATE EXTERNAL TABLE [dummy3] ([number] int, [word] nvarchar(2048)) WITH (DATA_SOURCE = connector_ds, DISTRIBUTION = ROUND_ROBIN)')\r\n",
+"\r\n",
+"-- Create a login in data pools and grant the right permissions to this user\r\n",
+"EXECUTE( ' Use spark_mssql_db; CREATE LOGIN [aris\testusera1] FROM WINDOWS ' ) AT DATA_SOURCE connector_ds;\r\n",
+"\r\n",
+"EXECUTE( ' Use spark_mssql_db; CREATE USER [aris\testusera1] ; ALTER ROLE [db_datareader] ADD MEMBER [aris\testusera1]; ALTER ROLE [db_datawriter] ADD MEMBER [aris\testusera1] ;') AT DATA_SOURCE connector_ds;\r\n",
+"\r\n",
+"```"
+],
+"metadata": {
+"azdata_cell_guid": "15b7588a-d1ad-4e17-84c6-bf4a862d1905"
+}
+},
 {
 "cell_type": "markdown",
 "source": [
@@ -288,7 +320,7 @@
 {
 "cell_type": "markdown",
 "source": [
-"# Write and READ to/from SQL Table ( using Integrated Auth)\r\n",
+"# (Part 1) Write and READ to/from SQL Table (using Integrated Auth)\r\n",
 "- Write dataframe to a SQL table in the Master instance\r\n",
 "- Read the SQL table back into a Spark dataframe\r\n",
 "\r\n",
@@ -380,6 +412,85 @@
 }
 ],
 "execution_count": 7
+},
+{
+"cell_type": "markdown",
+"source": [
+"# (Part 2) Write and READ to/from Data Pools (using Integrated Auth)\r\n",
+"- Write dataframe to a SQL external table in Data Pools in the Big Data Cluster\r\n",
+"- Read the SQL external table back into a Spark dataframe\r\n",
+"\r\n",
+"User creation for the data pool is covered in the Create Data Pool user section above."
+],
+"metadata": {
+"azdata_cell_guid": "99c044b2-6b1f-4b22-97ff-4ef48b5ff8b3"
+}
+},
+{
+"cell_type": "code",
+"source": [
+"#Write from Spark to data pools using the MSSQL Spark Connector\r\n",
+"print(\"MSSQL Spark Connector write(overwrite) start \")\r\n",
+"\r\n",
+"servername = \"jdbc:sqlserver://master-p-svc:1433\"\r\n",
+"dbname = \"spark_mssql_db\"\r\n",
+"security_spec = \";integratedSecurity=true;authenticationScheme=JavaKerberos;\"\r\n",
+"url = servername + \";\" + \"databaseName=\" + dbname + security_spec\r\n",
+"\r\n",
+"datapool_table = \"AdultCensus_DataPoolTable\"\r\n",
+"principal = \"testusera1@AZDATA.LOCAL\"\r\n",
+"keytab = \"/user/testusera1/testusera1.keytab\"\r\n",
+"\r\n",
+"datasource_name = \"connector_ds\"\r\n",
+"\r\n",
+"try:\r\n",
+"    df.write \\\r\n",
+"        .format(\"com.microsoft.sqlserver.jdbc.spark\") \\\r\n",
+"        .mode(\"overwrite\") \\\r\n",
+"        .option(\"url\", url) \\\r\n",
+"        .option(\"dbtable\", datapool_table) \\\r\n",
+"        .option(\"principal\", principal) \\\r\n",
+"        .option(\"keytab\", keytab) \\\r\n",
+"        .option(\"dataPoolDataSource\", datasource_name) \\\r\n",
+"        .save()\r\n",
+"except ValueError as error:\r\n",
+"    print(\"MSSQL Spark Connector write(overwrite) failed\", error)\r\n",
+"\r\n",
+"print(\"MSSQL Connector write(overwrite) done \")"
+],
+"metadata": {
+"azdata_cell_guid": "9cbe5af3-6ddb-4f19-8423-d10cbb7d48a7"
+},
+"outputs": [],
+"execution_count": null
+},
+{
+"cell_type": "code",
+"source": [
+"#Read the data pool external table into a Spark dataframe using the MSSQL Spark Connector\r\n",
+"print(\"MSSQL Spark Connector read data pool external table start \")\r\n",
+"jdbcDF = spark.read \\\r\n",
+"    .format(\"com.microsoft.sqlserver.jdbc.spark\") \\\r\n",
+"    .option(\"url\", url) \\\r\n",
+"    .option(\"dbtable\", datapool_table) \\\r\n",
+"    .option(\"principal\", principal) \\\r\n",
+"    .option(\"keytab\", keytab).load()\r\n",
+"\r\n",
+"jdbcDF.show(5)\r\n",
+"\r\n",
+"print(\"MSSQL Connector read from data pool external table succeeded\")"
+],
+"metadata": {
+"azdata_cell_guid": "5550bfce-4852-4fb8-9caa-1120c05dafb7"
+},
+"outputs": [],
+"execution_count": null
 }
 ]
 }
