|
1 |
| -# Azure Arc Data Controller clusters |
2 |
| - |
3 |
| -Installation instructions for SQL Server 2019 big data clusters can be found [here](https://docs.microsoft.com/en-us/sql/big-data-cluster/deployment-guidance?view=sql-server-ver15). |
| 1 | +# Azure Arc Data Controller cluster |
4 | 2 |
|
5 | 3 | ## Samples Setup
|
6 |
| - |
7 |
| -**Before you begin**, load the sample data into your big data cluster. For instructions, see [Load sample data into a SQL Server 2019 big data cluster](https://docs.microsoft.com/en-us/sql/big-data-cluster/tutorial-load-sample-data). |
8 |
| - |
9 |
| -## Executing the sample scripts |
10 |
| -The scripts should be executed in a specific order to test the various features. Execute the scripts from each folder in the order below: |
11 |
| - |
12 |
| -1. __[spark/data-loading/transform-csv-files.ipynb](spark/data-loading/transform-csv-files.ipynb)__ |
13 |
| -1. __[data-virtualization/generic-odbc](data-virtualization/generic-odbc)__ |
14 |
| -1. __[data-virtualization/hadoop](data-virtualization/hadoop)__ |
15 |
| -1. __[data-virtualization/storage-pool](data-virtualization/storage-pool)__ |
16 |
| -1. __[data-virtualization/oracle](data-virtualization/oracle)__ |
17 |
| -1. __[data-pool](data-pool/)__ |
18 |
| -1. __[machine-learning/sql/r](machine-learning/sql/r)__ |
19 |
| -1. __[machine-learning/sql/python](machine-learning/sql/python)__ |
20 |
| - |
21 |
| -## __[data-pool](data-pool/)__ |
22 |
| - |
23 |
| -SQL Server 2019 big data cluster contains a data pool which consists of many SQL Server instances to store data & query in a scale-out manner. |
24 |
| - |
25 |
| -### Data ingestion using Spark |
26 |
| -The sample script [data-pool/data-ingestion-spark.sql](data-pool/data-ingestion-spark.sql) shows how to perform data ingestion from Spark into data pool table(s). |
27 |
| - |
28 |
| -### Data ingestion using SQL |
29 |
| -The sample script [data-pool/data-ingestion-sql.sql](data-pool/data-ingestion-sql.sql) shows how to perform data ingestion from T-SQL into data pool table(s). |
30 |
| - |
31 |
| -## __[data-virtualization](data-virtualization/)__ |
32 |
| - |
33 |
| -SQL Server 2019 or SQL Server 2019 big data cluster can use PolyBase external tables to connect to other data sources. |
34 |
| - |
35 |
| -### External table over Generic ODBC data source |
36 |
| -The [data-virtualization/generic-odbc](data-virtualization/generic-odbc) folder contains samples that demonstrate how to query data in MySQL & PostgreSQL using external tables and a generic ODBC data source. The generic ODBC data source can be used only in SQL Server 2019 on Windows. |
37 |
| - |
38 |
| -### External table over Hadoop |
39 |
| -The [data-virtualization/hadoop](data-virtualization/hadoop) folder contains samples that demonstrate how to query data in HDFS using external tables. This demonstrates the functionality available from SQL Server 2016 using the HADOOP data source. |
40 |
| - |
41 |
| -### External table over Oracle |
42 |
| -The [data-virtualization/oracle](data-virtualization/oracle) folder contains samples that demonstrate how to query data in Oracle using external tables. |
43 |
| - |
44 |
| -### External table over Storage Pool |
45 |
| -SQL Server 2019 big data cluster contains a storage pool consisting of HDFS, Spark and SQL Server instances. The [data-virtualization/storage-pool](data-virtualization/storage-pool) folder contains samples that demonstrate how to query data in HDFS inside SQL Server 2019 big data cluster. |
46 |
| - |
47 |
| -## __[deployment](deployment/)__ |
48 |
| - |
49 |
| -The [deployment](deployment) folder contains the scripts for deploying a Kubernetes cluster for SQL Server 2019 big data cluster. |
50 |
| - |
51 |
| -## __[machine-learning](machine-learning/)__ |
52 |
| - |
53 |
| -SQL Server 2016 added support for executing R scripts from T-SQL. SQL Server 2017 added support for executing Python scripts from T-SQL. SQL Server 2019 adds support for executing Java code from T-SQL. SQL Server 2019 big data cluster adds support for executing Spark code inside the big data cluster. |
54 |
| - |
55 |
| -### SQL Server Machine Learning Services |
56 |
| -The [machine-learning\sql](machine-learning\sql) folder contains the sample SQL scripts that show how to invoke R, Python, and Java code from T-SQL. |
57 |
| - |
58 |
| -### Spark Machine Learning |
59 |
| -The [machine-learning\spark](machine-learning\spark) folder contains the Spark samples. |
| 4 | +Follow the instructions here: https://raw.githubusercontent.com/ananto-msft/sql-server-samples/master/samples/features/azure-arc-data-controller/deployment/kubeadm/ubuntu-single-node-vm/README.md |