Commit 0ff1ae7

egranet authored and perrysk-msft committed
Adding graph examples (#258)
* Adding graph examples
* Updating README.md
1 parent efa5427 commit 0ff1ae7

11 files changed, +1310 -0 lines changed

samples/features/readme.md

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ Built-in JSON functions enable you to easily parse and query JSON data stored in

 Built-in temporal functions enable you to easily track history of changes in a table, go back in history, and analyze historical data.

+[Graph Tables](sql-graph)
+
+Graph tables enable you to add a non-relational capability to your database.
+
 ## Samples for Business Intelligence features within SQL Server

 [Reporting Services (SSRS)](reporting-services)

Graph Tables

This code sample demonstrates how to populate graph tables from an existing table or a CSV file. It also includes a couple of example queries against the resulting graph tables.

Contents

About this sample
Before you begin
Run this sample
Sample details
Disclaimers
Related links

About this sample

  1. Applies to:
    • Azure SQL Database v12 (or higher) Standard / Premium / Premium RS service tiers.
    • SQL Server 2017 (or higher) Evaluation / Developer / Enterprise editions.
  2. Key feature:
    • populating graph tables
    • graph in filter clauses
  3. Workload: Transactional queries executed on WideWorldImporters
  4. Programming Language: Python, T-SQL
  5. Authors: Estienne Granet [esgranet-msft]

Before you begin

To run this sample, you need the following prerequisites.

Account and Software prerequisites:

  1. Either
    • Azure SQL Database v12 (or higher) and Azure Blob Storage
    • SQL Server 2017 (or higher)
  2. SQL Server Management Studio 17.1 (or higher)
  3. Python 3.6 (or higher)

Azure prerequisites:

  1. Permission to create an Azure SQL Database. For further explanation on how to create an Azure SQL Database and access it from SQL Server Management Studio, you can refer to this tutorial.

  2. Permission to create an Azure blob container.

Run this sample

Setup

Azure SQL Database Setup

  1. Download WideWorldImporters-Standard.bacpac from the WideWorldImporters page.

  2. Log into the Azure portal and go to your Azure Storage account, where you will upload WideWorldImporters-Standard.bacpac to a blob container. On the home page of your storage account, click the Blobs icon under the Services tab and select + Container to create a new blob container. Alternatively, you can choose an existing blob container. Upload WideWorldImporters-Standard.bacpac to that container. When prompted for a blob type and a blob size, choose Block blob and 100MB respectively.

  3. Go to your Azure SQL Server account to import the newly uploaded WideWorldImporters-Standard.bacpac database from your Azure storage account. To do so, click Import database on the top ribbon and fill in the fields required to import a new database. Under the Storage section, select your storage account and the blob container to which you just uploaded WideWorldImporters-Standard.bacpac, then select the WideWorldImporters-Standard.bacpac file. Choose a Standard S0 or higher pricing tier. Name the database "WideWorldImporters-Standard" and provide your credentials. For screenshots of the import operation, please refer to this documentation page: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-import.

  4. Launch SQL Server Management Studio and connect to the newly created WideWorldImporters-Standard database.

SQL Server Setup

  1. Download WideWorldImporters-Full.bak from the WideWorldImporters page.

  2. Launch SQL Server Management Studio, connect to your SQL Server instance and restore WideWorldImporters-Full.bak. For detailed information on how to restore a database backup file from SQL Server Management Studio, you can refer to this documentation page. Finally, run "USE WideWorldImporters; GO" to switch to the WideWorldImporters database.

Create and populate graph tables

The first step is to create the graph tables, which is the purpose of create_graph_tables.sql. It creates node-table equivalents of Sales.Customers, Warehouse.StockItems and Purchasing.Suppliers, plus six edge tables between these node tables. Keep in mind that all graph edges are directed: an edge table from Node Table A to Node Table B is not equivalent to an edge table from Node Table B to Node Table A. The second step is to populate the graph tables that you have just created, which is the purpose of populate_graph_tables.sql.
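To make the point about edge direction concrete, here is a minimal Python sketch that is not part of the sample. It models the six edge tables as (source, destination) pairs of node tables. The table names are taken from create_graph_tables.sql; the EDGE_TABLES dictionary and the tables_from helper are hypothetical names introduced only for illustration.

```python
# Hypothetical in-memory model of the sample's edge tables.
# Each entry maps an edge table name to (source node table, destination node table).
EDGE_TABLES = {
    "OrderLines_CustomersToStockItems": ("Customers", "StockItems"),
    "OrderLines_StockItemsToCustomers": ("StockItems", "Customers"),
    "InvoiceLines_CustomersToStockItems": ("Customers", "StockItems"),
    "InvoiceLines_StockItemsToCustomers": ("StockItems", "Customers"),
    "PurchaseOrderLines_SuppliersToStockItems": ("Suppliers", "StockItems"),
    "PurchaseOrderLines_StockItemsToSuppliers": ("StockItems", "Suppliers"),
}


def tables_from(node_table):
    """Return the names of the edge tables whose edges start at node_table."""
    return sorted(
        name for name, (source, _destination) in EDGE_TABLES.items()
        if source == node_table
    )


if __name__ == "__main__":
    # Only the CustomersTo... tables can be followed starting from a customer.
    print(tables_from("Customers"))
    # The reverse direction needs its own edge tables.
    print(tables_from("StockItems"))
```

Because a traversal can only follow an edge from its source to its destination, the sample defines each relationship twice, once per direction.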

On a side note, a quick way to create a graph table from an existing table is to open the Object Explorer in SQL Server Management Studio and right-click the table that you would like to convert. Select Script Table as, CREATE To and then New Query Editor Window. This opens a window with a T-SQL CREATE script for the table. Add AS NODE at the end of the statement to declare the table as a node table. A live demonstration of this method is presented in this video.

To improve performance, populate_graph_tables.sql disables default indexes during the insert operations and rebuilds them at the end. This avoids the constant update of indexes while data is inserted in the table. Below is an image of the graph layout.

Query graph tables

top_10_buyers.sql finds the top 10 buyers who purchased a specific item ordered by how much they spent. customers_who_bought_this_also_bought.sql lists all the items purchased by customers who bought item X.
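The logic of these two queries can be sketched in plain Python over an in-memory list of purchases. This is only an illustration of what the queries compute, not a translation of the T-SQL in the sample. The PURCHASES data, the function names, and the choice to exclude item X from the "also bought" result are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical purchases: (customer, item, quantity, unit_price).
# This is made-up data, not taken from WideWorldImporters.
PURCHASES = [
    ("Alice", "USB drive", 2, 10.0),
    ("Bob", "USB drive", 5, 10.0),
    ("Alice", "Mug", 1, 4.0),
    ("Carol", "Mug", 3, 4.0),
    ("Bob", "T-shirt", 1, 15.0),
]


def top_buyers(purchases, item, limit=10):
    """Return up to `limit` (customer, amount spent) pairs for `item`, highest spender first."""
    spent = defaultdict(float)
    for customer, bought, quantity, unit_price in purchases:
        if bought == item:
            spent[customer] += quantity * unit_price
    return sorted(spent.items(), key=lambda pair: pair[1], reverse=True)[:limit]


def also_bought(purchases, item):
    """Return the other items purchased by customers who bought `item`."""
    buyers = {customer for customer, bought, _q, _p in purchases if bought == item}
    return sorted(
        {bought for customer, bought, _q, _p in purchases
         if customer in buyers and bought != item}
    )


if __name__ == "__main__":
    print(top_buyers(PURCHASES, "USB drive"))
    print(also_bought(PURCHASES, "USB drive"))
```

In the graph version, each tuple corresponds to an edge between a customer node and a stock item node, and the second query is a two-hop traversal: item to customers, then customers to items.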

Import a csv file as a node table using the bcp utility.

csv_as_node.sql and csv_as_node.py show how to import a csv file into a node table. Specifically, they add a $node_id column to the csv file and export it back into a graph table. For the sake of the example, csv_as_node.sql

In addition to disabling default indexes during the bulk insert operations, we also set the recovery model to BULK_LOGGED to improve performance.
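As a minimal sketch of the $node_id value that csv_as_node.py prepends to each row, the snippet below builds the same JSON text with json.dumps instead of string concatenation. The function names node_id_value and add_node_id_column are hypothetical; the field names and the tab separator follow what the script writes.

```python
import json


def node_id_value(schema, table, row_id):
    """Build the JSON text for a $node_id value, without spaces between tokens."""
    return json.dumps(
        {"type": "node", "schema": schema, "table": table, "id": row_id},
        separators=(",", ":"),
    )


def add_node_id_column(line, schema, table, row_id):
    """Prepend the $node_id value and a tab separator to one line of the csv file."""
    return node_id_value(schema, table, row_id) + "\t" + line


if __name__ == "__main__":
    print(node_id_value("Graph", "Customers", 0))
    print(repr(add_node_id_column("1\tTailspin Toys\r\n", "Graph", "Customers", 1)))
```

One reason to prefer json.dumps here is that it escapes quotes or backslashes in a schema or table name, which plain concatenation would not.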

Import a csv file as a node table using the openrowset bulk.

openrowset_bulk_insert.sql retrieves a table from a csv file and imports it into a node table using OPENROWSET (https://docs.microsoft.com/en-us/sql/t-sql/functions/openrowset-transact-sql).

Disclaimers

The code included in this sample is not intended to be a set of best practices on how to build scalable enterprise grade applications. This is beyond the scope of this quick start sample.

Related Links

For more information, see these articles:

Lines changed: 333 additions & 0 deletions
@@ -0,0 +1,333 @@
-- clean database
DROP TABLE IF EXISTS Graph.Customers;
DROP TABLE IF EXISTS Graph.StockItems;
DROP TABLE IF EXISTS Graph.Suppliers;

DROP TABLE IF EXISTS Graph.OrderLines_CustomersToStockItems;
DROP TABLE IF EXISTS Graph.OrderLines_StockItemsToCustomers;

DROP TABLE IF EXISTS Graph.InvoiceLines_CustomersToStockItems;
DROP TABLE IF EXISTS Graph.InvoiceLines_StockItemsToCustomers;

DROP TABLE IF EXISTS Graph.PurchaseOrderLines_SuppliersToStockItems;
DROP TABLE IF EXISTS Graph.PurchaseOrderLines_StockItemsToSuppliers;

DROP SCHEMA IF EXISTS Graph;
go

-- define schema
CREATE SCHEMA Graph;
go

/* SALES schema */
-- create node table for Sales.Customers
CREATE TABLE Graph.Customers (
    CustomerID int,
    CustomerName nvarchar(100),
    BillToCustomerID int,
    CustomerCategoryID int,
    BuyingGroupID int,
    PrimaryContactPersonID int,
    AlternateContactPersonID int,
    DeliveryMethodID int,
    DeliveryCityID int,
    PostalCityID int,
    CreditLimit decimal(18,2),
    AccountOpenedDate date,
    StandardDiscountPercentage decimal(18,3),
    IsStatementSent bit,
    IsOnCreditHold bit,
    PaymentDays int,
    PhoneNumber nvarchar(20),
    WebsiteURL nvarchar(256),
    DeliveryAddressLine1 nvarchar(60),
    DeliveryAddressLine2 nvarchar(60),
    DeliveryPostalCode nvarchar(10),
    DeliveryLocation geography,
    PostalAddressLine1 nvarchar(60),
    PostalAddressLine2 nvarchar(60),
    PostalPostalCode nvarchar(10),
    LastEditedBy int,
    ValidFrom datetime2,
    ValidTo datetime2
) AS NODE;

-- create edge table for Sales.Orders and Sales.Orderlines
-- LINK: CUSTOMERS -->> STOCKITEMS
CREATE TABLE Graph.OrderLines_CustomersToStockItems (

    -- from Sales.Orders
    OrderID int,
    CustomerID int,
    SalespersonPersonID int,
    PickedByPersonID int,
    ContactPersonID int,
    BackorderOrderID int,
    OrderDate date,
    ExpectedDeliveryDate date,
    CustomerPurchaseOrderNumber nvarchar(20),
    IsUndersupplyBackordered bit,
    Comments nvarchar(max),
    DeliveryInstructions nvarchar(max),
    InternalComments nvarchar(max),

    -- from Sales.OrderLines
    StockItemID int,
    Description nvarchar(100),
    PackageTypeID int,
    Quantity int,
    UnitPrice decimal(18,2),
    TaxRate decimal(18,3),
    PickedQuantity int,
    PickingCompletedWhen datetime2,
    LastEditedBy int,
    LastEditedWhen datetime2
) AS EDGE;

-- create edge table for Sales.Invoices and Sales.InvoiceLines
-- LINK: CUSTOMERS -->> STOCKITEMS
CREATE TABLE Graph.InvoiceLines_CustomersToStockItems (

    -- from Sales.Invoices
    InvoiceID int,
    CustomerID int,
    BillToCustomerID int,
    OrderID int,
    DeliveryMethodID int,
    ContactPersonID int,
    AccountsPersonID int,
    SalespersonPersonID int,
    PackedByPersonID int,
    InvoiceDate date,
    CustomerPurchaseOrderNumber nvarchar(20),
    IsCreditNote bit,
    CreditNoteReason nvarchar(max),
    Comments nvarchar(max),
    DeliveryInstructions nvarchar(max),
    InternalComments nvarchar(max),
    TotalDryItems int,
    TotalChillerItems int,
    DeliveryRun nvarchar(5),
    RunPosition nvarchar(5),
    ReturnedDeliveryData nvarchar(max),
    ConfirmedDeliveryTime datetime2,
    ConfirmedReceivedBy nvarchar(4000),

    -- from Sales.InvoiceLines
    InvoiceLineID int,
    StockItemID int,
    Description nvarchar(100),
    PackageTypeID int,
    Quantity int,
    UnitPrice decimal(18,3),
    TaxRate decimal(18,2),
    TaxAmount decimal(18,2),
    LineProfit decimal(18,2),
    ExtendedPrice decimal(18,2),
    LastEditedBy int,
    LastEditedWhen datetime2
) AS EDGE;


-- create edge table for Sales.Orders and Sales.Orderlines
-- LINK: STOCKITEMS -->> CUSTOMERS
CREATE TABLE Graph.OrderLines_StockItemsToCustomers (

    -- from Sales.Orders
    OrderID int,
    CustomerID int,
    SalespersonPersonID int,
    PickedByPersonID int,
    ContactPersonID int,
    BackorderOrderID int,
    OrderDate date,
    ExpectedDeliveryDate date,
    CustomerPurchaseOrderNumber nvarchar(20),
    IsUndersupplyBackordered bit,
    Comments nvarchar(max),
    DeliveryInstructions nvarchar(max),
    InternalComments nvarchar(max),

    -- from Sales.OrderLines
    StockItemID int,
    Description nvarchar(100),
    PackageTypeID int,
    Quantity int,
    UnitPrice decimal(18,2),
    TaxRate decimal(18,3),
    PickedQuantity int,
    PickingCompletedWhen datetime2,
    LastEditedBy int,
    LastEditedWhen datetime2
) AS EDGE;

-- create edge table for Sales.Invoices and Sales.InvoiceLines
-- LINK: STOCKITEMS -->> CUSTOMERS
CREATE TABLE Graph.InvoiceLines_StockItemsToCustomers (

    -- from Sales.Invoices
    InvoiceID int,
    CustomerID int,
    BillToCustomerID int,
    OrderID int,
    DeliveryMethodID int,
    ContactPersonID int,
    AccountsPersonID int,
    SalespersonPersonID int,
    PackedByPersonID int,
    InvoiceDate date,
    CustomerPurchaseOrderNumber nvarchar(20),
    IsCreditNote bit,
    CreditNoteReason nvarchar(max),
    Comments nvarchar(max),
    DeliveryInstructions nvarchar(max),
    InternalComments nvarchar(max),
    TotalDryItems int,
    TotalChillerItems int,
    DeliveryRun nvarchar(5),
    RunPosition nvarchar(5),
    ReturnedDeliveryData nvarchar(max),
    ConfirmedDeliveryTime datetime2,
    ConfirmedReceivedBy nvarchar(4000),

    -- from Sales.InvoiceLines
    InvoiceLineID int,
    StockItemID int,
    Description nvarchar(100),
    PackageTypeID int,
    Quantity int,
    UnitPrice decimal(18,3),
    TaxRate decimal(18,2),
    TaxAmount decimal(18,2),
    LineProfit decimal(18,2),
    ExtendedPrice decimal(18,2),
    LastEditedBy int,
    LastEditedWhen datetime2
) AS EDGE;


/* WAREHOUSE schema */
-- create node table for Warehouse.StockItems
CREATE TABLE Graph.StockItems (
    StockItemID int,
    StockItemName nvarchar(100),
    SupplierID int,
    ColorID int,
    UnitPackageID int,
    OuterPackageID int,
    Brand nvarchar(50),
    Size nvarchar(20),
    LeadTimeDays int,
    QuantityPerOuter int,
    IsChillerStock bit,
    Barcode nvarchar(50),
    TaxRate decimal(18,3),
    UnitPrice decimal(18,2),
    RecommendedRetailPrice decimal(18,2),
    TypicalWeightPerUnit decimal(5,2),
    MarketingComments nvarchar(max),
    InternalComments nvarchar(max),
    Photo varbinary(max),
    CustomFields nvarchar(max),
    Tags nvarchar(max),
    SearchDetails nvarchar(max),
    LastEditedBy int,
    ValidFrom datetime2,
    ValidTo datetime2
) AS NODE;

/* PURCHASING schema */
-- create node table for Purchasing.Suppliers
CREATE TABLE Graph.Suppliers (
    SupplierID int,
    SupplierName nvarchar(100),
    SupplierCategoryID int,
    PrimaryContactPersonID int,
    AlternateContactPersonID int,
    DeliveryMethodID int,
    DeliveryCityID int,
    PostalCityID int,
    SupplierReference nvarchar(20),
    BankAccountName nvarchar(50),
    BankAccountBranch nvarchar(50),
    BankAccountCode nvarchar(20),
    BankAccountNumber nvarchar(20),
    BankInternationalCode nvarchar(20),
    PaymentDays int,
    InternalComments nvarchar(max),
    PhoneNumber nvarchar(20),
    FaxNumber nvarchar(20),
    WebsiteURL nvarchar(256),
    DeliveryAddressLine1 nvarchar(60),
    DeliveryAddressLine2 nvarchar(60),
    DeliveryPostalCode nvarchar(10),
    DeliveryLocation geography,
    PostalAddressLine1 nvarchar(60),
    PostalAddressLine2 nvarchar(60),
    PostalPostalCode nvarchar(10),
    LastEditedBy int,
    ValidFrom datetime2,
    ValidTo datetime2
) AS NODE;

-- create edge table for Purchasing.PurchaseOrders and Purchasing.PurchaseOrderLines
-- LINK: SUPPLIERS -->> STOCKITEMS
CREATE TABLE Graph.PurchaseOrderLines_SuppliersToStockItems (

    -- from Purchasing.PurchaseOrders
    PurchaseOrderID int,
    SupplierID int,
    OrderDate date,
    DeliveryMethodID int,
    ContactPersonID int,
    ExpectedDeliveryDate date,
    SupplierReference nvarchar(20),
    IsOrderFinalized bit,
    Comments nvarchar(max),
    InternalComments nvarchar(max),

    -- from Purchasing.PurchaseOrderLines
    PurchaseOrderLineID int,
    StockItemID int,
    OrderedOuters int,
    Description nvarchar(100),
    ReceivedOuters int,
    PackageTypeID int,
    ExpectedUnitPricePerOuter decimal(18,2),
    LastReceiptDate date,
    IsOrderLineFinalized bit,
    LastEditedBy int,
    LastEditedWhen datetime2
) AS EDGE;

-- create edge table for Purchasing.PurchaseOrders and Purchasing.PurchaseOrderLines
-- LINK: STOCKITEMS -->> SUPPLIERS
CREATE TABLE Graph.PurchaseOrderLines_StockItemsToSuppliers (

    -- from Purchasing.PurchaseOrders
    PurchaseOrderID int,
    SupplierID int,
    OrderDate date,
    DeliveryMethodID int,
    ContactPersonID int,
    ExpectedDeliveryDate date,
    SupplierReference nvarchar(20),
    IsOrderFinalized bit,
    Comments nvarchar(max),
    InternalComments nvarchar(max),

    -- from Purchasing.PurchaseOrderLines
    PurchaseOrderLineID int,
    StockItemID int,
    OrderedOuters int,
    Description nvarchar(100),
    ReceivedOuters int,
    PackageTypeID int,
    ExpectedUnitPricePerOuter decimal(18,2),
    LastReceiptDate date,
    IsOrderLineFinalized bit,
    LastEditedBy int,
    LastEditedWhen datetime2
) AS EDGE;

go
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# -*- coding: utf-8 -*-
import os
from optparse import OptionParser


def main(input_file_path, schema, table):

    # create the output file in the same directory as the input file
    output_file_path = os.path.splitext(input_file_path)[0] + '_as_node' + '.csv'

    # you may have to change the encoding
    with open(input_file_path, mode='r', encoding='utf-16le', newline='') as input_file, \
         open(output_file_path, mode='w', encoding='utf-16le', newline='') as output_file:

        line = input_file.readline()
        line_number = 0

        # read each line of the input file, prepend a $node_id column and write the result to the output file
        while line:
            node_id = '{"type":"node","schema":"' + schema + '","table":"' + table + '","id":' + str(line_number) + '}'
            if line_number == 0:
                # keep the byte order mark at the very start of the file, ahead of the new column;
                # lstrip only removes the mark if the input actually has one
                newline = '\ufeff' + node_id + '\t' + line.lstrip('\ufeff')
            else:
                newline = node_id + '\t' + line
            output_file.write(newline)
            line_number += 1
            line = input_file.readline()


if __name__ == '__main__':

    parser = OptionParser()

    parser.add_option("-f", "--file", action="store", type="string", dest="input_file_name")
    parser.add_option("-s", "--schema", action="store", type="string", dest="schema")
    parser.add_option("-t", "--table", action="store", type="string", dest="table")

    (options, args) = parser.parse_args()

    # prompt for any option that was not provided on the command line
    if options.input_file_name is None:
        options.input_file_name = input('Please enter the full path to the csv file you will import as a node table:')
    if options.schema is None:
        options.schema = input('Please enter the SQL schema of the node table you will populate:')
    if options.table is None:
        options.table = input('Please enter the SQL name of the node table you will populate:')

    main(options.input_file_name, options.schema, options.table)
