---
title: 'Quickstart: Bulk load data using a single T-SQL statement'
description: Bulk load data using the COPY statement
services: synapse-analytics
author: kevinvngo
manager: craigg
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 04/08/2020
ms.author: kevin
ms.reviewer: jrasnick
ms.custom: azure-synapse
---

# Quickstart: Bulk load data using the COPY statement

In this quickstart, you'll bulk load data into your SQL pool using the simple and flexible [COPY statement](https://docs.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest) for high-throughput data ingestion. The COPY statement is the recommended loading utility because it enables you to:

- Allow lower-privileged users to load without needing strict CONTROL permissions on the data warehouse
- Run a single T-SQL statement without having to create any additional database objects
- Use a finer permission model without exposing storage account keys, by using Shared Access Signatures (SAS)
- Specify a different storage account for the ERRORFILE location (REJECTED_ROW_LOCATION)
- Customize default values for each target column and specify source data fields to load into specific target columns
- Specify a custom row terminator for CSV files
- Escape string, field, and row delimiters for CSV files
- Use SQL Server date formats for CSV files
- Specify wildcards and multiple files in the storage location path

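As an illustrative sketch of a few of these options, the following statement uses a wildcard path, custom CSV delimiters, and a date format. The storage account, container, and path here are placeholders, not part of this quickstart:

```sql
-- Illustrative only: <account>, <container>, and the path are placeholders.
-- The wildcard loads every matching CSV file under trips/2013/.
COPY INTO [dbo].[Trip]
FROM 'https://<account>.blob.core.windows.net/<container>/trips/2013/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    FIELDQUOTE = '"',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0A',
    DATEFORMAT = 'ymd'
);
```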
## Prerequisites

This quickstart assumes you already have a SQL pool. If a SQL pool hasn't been created, use the [Create and Connect - portal](create-data-warehouse-portal.md) quickstart.

## Create the target table

In this example, we'll load data from the New York taxi dataset into a table called Trip that represents taxi trips taken within a single year. Run the following statement to create the table:

```sql
CREATE TABLE [dbo].[Trip]
(
    [DateID] int NOT NULL,
    [MedallionID] int NOT NULL,
    [HackneyLicenseID] int NOT NULL,
    [PickupTimeID] int NOT NULL,
    [DropoffTimeID] int NOT NULL,
    [PickupGeographyID] int NULL,
    [DropoffGeographyID] int NULL,
    [PickupLatitude] float NULL,
    [PickupLongitude] float NULL,
    [PickupLatLong] varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [DropoffLatitude] float NULL,
    [DropoffLongitude] float NULL,
    [DropoffLatLong] varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [PassengerCount] int NULL,
    [TripDurationSeconds] int NULL,
    [TripDistanceMiles] float NULL,
    [PaymentType] varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [FareAmount] money NULL,
    [SurchargeAmount] money NULL,
    [TaxAmount] money NULL,
    [TipAmount] money NULL,
    [TollsAmount] money NULL,
    [TotalAmount] money NULL
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    CLUSTERED COLUMNSTORE INDEX
);
```

## Run the COPY statement

Run the following COPY statement to load data from the Azure Blob storage account into the Trip table:

```sql
COPY INTO [dbo].[Trip] FROM 'https://nytaxiblob.blob.core.windows.net/2013/Trip2013/'
WITH (
    FIELDTERMINATOR='|',
    ROWTERMINATOR='0x0A'
) OPTION (LABEL = 'COPY: dbo.trip');
```
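The nytaxiblob storage account above is publicly readable, so no credential is required. If your own storage account is secured, the same statement can take a SAS credential and redirect rejected rows to a separate account. The following is a hedged sketch only; the accounts, containers, and SAS tokens are placeholders:

```sql
-- Illustrative only: accounts, containers, and SAS secrets are placeholders.
COPY INTO [dbo].[Trip]
FROM 'https://<account>.blob.core.windows.net/<container>/Trip2013/'
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '0x0A',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    ERRORFILE = 'https://<erroraccount>.blob.core.windows.net/<errorcontainer>/errors/',
    ERRORFILE_CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    MAXERRORS = 10
) OPTION (LABEL = 'COPY: dbo.trip');
```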

## Monitor the load

Check whether your load is making progress by periodically running the following query:

```sql
SELECT r.[request_id]
, r.[status]
, r.resource_class
, r.command
, SUM(bytes_processed) AS bytes_processed
, SUM(rows_processed) AS rows_processed
FROM sys.dm_pdw_exec_requests r
    JOIN sys.dm_pdw_dms_workers w
        ON r.[request_id] = w.request_id
WHERE r.[label] = 'COPY: dbo.trip' AND session_id <> session_id() AND type = 'WRITER'
GROUP BY r.[request_id]
, r.[status]
, r.resource_class
, r.command;
```
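Once the query above shows the request has completed, you can sanity-check the target table with standard T-SQL, for example:

```sql
-- Confirm how many rows landed in the table, then inspect a sample.
SELECT COUNT_BIG(*) AS loaded_rows FROM [dbo].[Trip];

SELECT TOP 10 * FROM [dbo].[Trip];
```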

## Next steps

- For best practices on data loading, see [Best Practices for Loading Data](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/guidance-for-loading-data).
- For information on how to manage the resources for your data loads, see [Workload Isolation](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/quickstart-configure-workload-isolation-tsql).