---
title: 'Quickstart: Bulk load data using a single T-SQL statement'
description: Bulk load data using the COPY statement
services: synapse-analytics
author: kevinvngo
manager: craigg
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 04/08/2020
ms.author: kevin
ms.reviewer: jrasnick
ms.custom: azure-synapse
---

# Quickstart: Bulk load data using the COPY statement

In this quickstart, you bulk load data into your SQL pool using the [COPY statement](https://docs.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest), a simple, flexible utility for high-throughput data ingestion. COPY is the recommended loading method because it lets you:
19+
20+
- Allow lower privileged users to load without needing strict CONTROL permissions on the data warehouse
21+
- Leverage only a single T-SQL statement without having to create any additional database objects
22+
- Leverage a finer permission model without exposing storage account keys using Share Access Signatures (SAS)
23+
- Specify a different storage account for the ERRORFILE location (REJECTED_ROW_LOCATION)
24+
- Customize default values for each target column and specify source data fields to load into specific target columns
25+
- Specify a custom row terminator for CSV files
26+
- Escape string, field, and row delimiters for CSV files
27+
- Leverage SQL Server Date formats for CSV files
28+
- Specify wildcards and multiple files in the storage location path
29+
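To illustrate how several of these options combine, the following sketch loads CSV files matching a wildcard, authenticates with a SAS instead of a storage key, and redirects rejected rows to a separate storage account. The storage URLs, SAS token, and MAXERRORS value are placeholders for illustration, not part of this quickstart:

```sql
-- Hypothetical example: storage URLs and the SAS token are placeholders.
COPY INTO [dbo].[Trip]
FROM 'https://<yourstorage>.blob.core.windows.net/csv/trip/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '0x0A',
    -- Rejected rows can land in a different storage account than the source.
    ERRORFILE = 'https://<errorstorage>.blob.core.windows.net/errors/',
    MAXERRORS = 100
);
```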

## Prerequisites

This quickstart assumes you already have a SQL pool. If you haven't created one, use the [Create and Connect - portal](create-data-warehouse-portal.md) quickstart.
33+
34+
## Create the target table
35+
36+
In this example, we will be loading data from the New York taxi dataset. We will be loading a table called Trip which represents taxi trips taken within a single year. Run the following to create the table:

```sql
CREATE TABLE [dbo].[Trip]
(
    [DateID] int NOT NULL,
    [MedallionID] int NOT NULL,
    [HackneyLicenseID] int NOT NULL,
    [PickupTimeID] int NOT NULL,
    [DropoffTimeID] int NOT NULL,
    [PickupGeographyID] int NULL,
    [DropoffGeographyID] int NULL,
    [PickupLatitude] float NULL,
    [PickupLongitude] float NULL,
    [PickupLatLong] varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [DropoffLatitude] float NULL,
    [DropoffLongitude] float NULL,
    [DropoffLatLong] varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [PassengerCount] int NULL,
    [TripDurationSeconds] int NULL,
    [TripDistanceMiles] float NULL,
    [PaymentType] varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [FareAmount] money NULL,
    [SurchargeAmount] money NULL,
    [TaxAmount] money NULL,
    [TipAmount] money NULL,
    [TollsAmount] money NULL,
    [TotalAmount] money NULL
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    CLUSTERED COLUMNSTORE INDEX
);
```
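ROUND_ROBIN distribution is the simplest choice for a loading quickstart, but for large fact tables that are frequently joined, a hash distribution can reduce data movement at query time. A hedged variant is sketched below; the table name and distribution column are illustrative choices, not a recommendation from this quickstart:

```sql
-- Hypothetical variant: hash-distribute on a column that joins often.
-- The column choice here is illustrative only.
CREATE TABLE [dbo].[Trip_Hash]
(
    [DateID] int NOT NULL,
    [MedallionID] int NOT NULL
    -- ... remaining columns as in dbo.Trip ...
)
WITH
(
    DISTRIBUTION = HASH([MedallionID]),
    CLUSTERED COLUMNSTORE INDEX
);
```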

## Run the COPY statement

Run the following COPY statement, which loads data from the Azure Blob storage account into the Trip table:

```sql
COPY INTO [dbo].[Trip]
FROM 'https://nytaxiblob.blob.core.windows.net/2013/Trip2013/'
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '0x0A'
) OPTION (LABEL = 'COPY: dbo.trip');
```
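After the statement completes, a quick sanity check is to count the rows that landed in the target table (the exact count depends on the dataset, so none is asserted here):

```sql
-- Sanity check after the load completes.
SELECT COUNT_BIG(*) AS loaded_rows
FROM [dbo].[Trip];
```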

## Monitor the load

Check whether your load is making progress by periodically running the following query:
```sql
89+
SELECT r.[request_id]
90+
, r.[status]
91+
, r.resource_class
92+
, r.command
93+
, sum(bytes_processed) AS bytes_processed
94+
, sum(rows_processed) AS rows_processed
95+
FROM sys.dm_pdw_exec_requests r
96+
JOIN sys.dm_pdw_dms_workers w
97+
ON r.[request_id] = w.request_id
98+
WHERE [label] = 'COPY: dbo.trip' and session_id <> session_id() and type = 'WRITER'
99+
GROUP BY r.[request_id]
100+
, r.[status]
101+
, r.resource_class
102+
, r.command
103+
, [type];
104+
105+
```
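Because the COPY statement above was tagged with a LABEL, you can also check its final state directly in `sys.dm_pdw_exec_requests` once it finishes. This is a follow-up sketch rather than part of the original quickstart:

```sql
-- Check the final state of the labeled COPY request.
SELECT request_id
     , status
     , submit_time
     , end_time
     , total_elapsed_time
     , command
FROM sys.dm_pdw_exec_requests
WHERE [label] = 'COPY: dbo.trip'
ORDER BY submit_time DESC;
```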

## Next steps

- For best practices on data loading, see [Best Practices for Loading Data](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/guidance-for-loading-data).
- For information on how to manage the resources for your data loads, see [Workload Isolation](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/quickstart-configure-workload-isolation-tsql).
