
Commit 5d60ac8

connector sample updates for integrated auth with datapools
1 parent e7b9d6b commit 5d60ac8

File tree

1 file changed: +116 -5 lines changed

samples/features/sql-big-data-cluster/spark/data-virtualization/mssql_spark_connector_ad_pyspark.ipynb

Lines changed: 116 additions & 5 deletions
@@ -53,15 +53,15 @@
 "The following section shows how to generate the principal and keytab. This assumes you have a SQL Server 2019 Big Data Cluster installed with a Windows AD controller for the domain AZDATA.LOCAL. One of the users is testusera1@AZDATA.LOCAL, and the user is part of the Domain Admin group.\r\n",
 "\r\n",
 "## Create KeyTab file using ktpass\r\n",
-"1. Login to the Windows AD controller with user1 credentials.\r\n",
+"1. Login to the Windows AD controller with testusera1 credentials.\r\n",
 "2. Open command prompt in Administrator mode.\r\n",
 "3. Use ktpass to create a keytab. Refer [here](https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/ktpass) for documentation on using ktpass.\r\n",
 "\r\n",
 "```sh\r\n",
 "ktpass -out testusera1.keytab -mapUser testusera1@AZDATA.LOCAL -pass <testusera1 password> -mapOp set +DumpSalt -crypto AES256-SHA1 -ptype KRB5_NT_PRINCIPAL -princ testusera1@AZDATA.LOCAL\r\n",
 "```\r\n",
 "\r\n",
-"The command above should generate a keytab file named testusera1.keytab. Transfer this file to hdfs folder in Big Data Cluster. In this sample we transfer the file to /user/testusera1/testusera1.keytab\r\n",
+"Note that the principal name in ktpass is case sensitive. The command above generates a keytab file named testusera1.keytab. Transfer this file to the hdfs folder in the Big Data Cluster. In this sample we transfer the file to /user/testusera1/testusera1.keytab\r\n",
 "\r\n",
 "## Create KeyTab file using kinit\r\n",
 "\r\n",
@@ -72,12 +72,12 @@
 "ktutil : add_entry -password -p testusera1@AZDATA.LOCAL -k 1 -e arcfour-hmac-md5\r\n",
 "Password for testusera1@AZDATA.LOCAL:\r\n",
 "ktutil : add_entry -password -p testusera1@AZDATA.LOCAL -k 1 -e des-cbc-md4\r\n",
-"ktutil : wkt testusera1@AZDATA.keytab \r\n",
+"ktutil : wkt testusera1.keytab\r\n",
 "```\r\n",
 "\r\n",
 "```sh\r\n",
 "## Check if the keytab was generated properly. Any error implies that the keytab was not generated right.\r\n",
-"kinit -kt testusera1.keytab [email protected]\r\n",
+"kinit -kt testusera1.keytab testusera1@AZDATA.LOCAL\r\n",
 "```\r\n",
 "\r\n",
 "Load Keytab to HDFS for use\r\n",
@@ -118,6 +118,38 @@
 "azdata_cell_guid": "453e7b2f-e590-4b95-9fbd-5dc8b9d1f02c"
 }
 },
+{
+"cell_type": "markdown",
+"source": [
+"# Create Data Pool user\r\n",
+"\r\n",
+"```\r\n",
+"-- To create external tables in data pools\r\n",
+"grant alter any external data source to [aris\testusera1];\r\n",
+"\r\n",
+"-- To create external tables\r\n",
+"grant create table to [aris\testusera1];\r\n",
+"grant alter any schema to [aris\testusera1];\r\n",
+"\r\n",
+"ALTER ROLE [db_datareader] ADD MEMBER [aris\testusera1]\r\n",
+"ALTER ROLE [db_datawriter] ADD MEMBER [aris\testusera1]\r\n",
+"```\r\n",
+"\r\n",
+"```\r\n",
+"CREATE EXTERNAL DATA SOURCE connector_ds WITH (LOCATION = 'sqldatapool://controller-svc/default');\r\n",
+"EXECUTE('USE spark_mssql_db; CREATE EXTERNAL TABLE [dummy3] ([number] int, [word] nvarchar(2048)) WITH (DATA_SOURCE = connector_ds, DISTRIBUTION = ROUND_ROBIN)')\r\n",
+"\r\n",
+"-- Create a login in data pools and grant the right permissions to this user\r\n",
+"EXECUTE( ' Use spark_mssql_db; CREATE LOGIN [aris\testusera1] FROM WINDOWS ' ) AT DATA_SOURCE connector_ds;\r\n",
+"\r\n",
+"EXECUTE( ' Use spark_mssql_db; CREATE USER [aris\testusera1] ; ALTER ROLE [db_datareader] ADD MEMBER [aris\testusera1]; ALTER ROLE [db_datawriter] ADD MEMBER [aris\testusera1] ;') AT DATA_SOURCE connector_ds;\r\n",
+"\r\n",
+"```"
+],
+"metadata": {
+"azdata_cell_guid": "15b7588a-d1ad-4e17-84c6-bf4a862d1905"
+}
+},
 {
 "cell_type": "markdown",
 "source": [
@@ -288,7 +320,7 @@
 {
 "cell_type": "markdown",
 "source": [
-"# Write and READ to/from SQL Table ( using Integrated Auth)\r\n",
+"# (Part 1) Write and READ to/from SQL Table (using Integrated Auth)\r\n",
 "- Write dataframe to a SQL table in the Master instance\r\n",
 "- Read the SQL table back into a Spark dataframe\r\n",
 "\r\n",
@@ -380,6 +412,85 @@
 }
 ],
 "execution_count": 7
+},
+{
+"cell_type": "markdown",
+"source": [
+"# (Part 2) Write and READ to/from Data Pools (using Integrated Auth)\r\n",
+"- Write dataframe to a SQL external table in Data Pools in the Big Data Cluster\r\n",
+"- Read the SQL external table back into a Spark dataframe\r\n",
+"\r\n",
+"User creation for the data pool is covered in the Create Data Pool user section above."
+],
+"metadata": {
+"azdata_cell_guid": "99c044b2-6b1f-4b22-97ff-4ef48b5ff8b3"
+}
+},
+{
+"cell_type": "code",
+"source": [
+"#Write from Spark to data pools using the MSSQL Spark Connector\r\n",
+"print(\"MSSQL Spark Connector write(overwrite) start \")\r\n",
+"\r\n",
+"servername = \"jdbc:sqlserver://master-p-svc:1433\"\r\n",
+"dbname = \"spark_mssql_db\"\r\n",
+"security_spec = \";integratedSecurity=true;authenticationScheme=JavaKerberos;\"\r\n",
+"url = servername + \";\" + \"databaseName=\" + dbname + security_spec\r\n",
+"\r\n",
+"datapool_table = \"AdultCensus_DataPoolTable\"\r\n",
+"principal = \"testusera1@AZDATA.LOCAL\"\r\n",
+"keytab = \"/user/testusera1/testusera1.keytab\"\r\n",
+"\r\n",
+"datasource_name = \"connector_ds\"\r\n",
+"\r\n",
+"try:\r\n",
+"    df.write \\\r\n",
+"        .format(\"com.microsoft.sqlserver.jdbc.spark\") \\\r\n",
+"        .mode(\"overwrite\") \\\r\n",
+"        .option(\"url\", url) \\\r\n",
+"        .option(\"dbtable\", datapool_table) \\\r\n",
+"        .option(\"principal\", principal) \\\r\n",
+"        .option(\"keytab\", keytab) \\\r\n",
+"        .option(\"dataPoolDataSource\", datasource_name) \\\r\n",
+"        .save()\r\n",
+"except ValueError as error:\r\n",
+"    print(\"MSSQL Spark Connector write(overwrite) failed\", error)\r\n",
+"\r\n",
+"print(\"MSSQL Connector write(overwrite) done \")"
+],
+"metadata": {
+"azdata_cell_guid": "9cbe5af3-6ddb-4f19-8423-d10cbb7d48a7"
+},
+"outputs": [],
+"execution_count": null
+},
+{
+"cell_type": "code",
+"source": [
+"#Read the data pool external table into a Spark dataframe using the MSSQL Spark Connector\r\n",
+"print(\"MSSQL Spark Connector read data pool external table start \")\r\n",
+"jdbcDF = spark.read \\\r\n",
+"    .format(\"com.microsoft.sqlserver.jdbc.spark\") \\\r\n",
+"    .option(\"url\", url) \\\r\n",
+"    .option(\"dbtable\", datapool_table) \\\r\n",
+"    .option(\"principal\", principal) \\\r\n",
+"    .option(\"keytab\", keytab).load()\r\n",
+"\r\n",
+"jdbcDF.show(5)\r\n",
+"\r\n",
+"print(\"MSSQL Connector read from data pool external table succeeded\")"
+],
+"metadata": {
+"azdata_cell_guid": "5550bfce-4852-4fb8-9caa-1120c05dafb7"
+},
+"outputs": [],
+"execution_count": null
 }
 ]
 }
