
Commit f35fd77

Merge pull request #94505 from djpmsft/docUpdates
Adding Wrangling Data Flow documentation
2 parents 39c4459 + ffd8544 commit f35fd77

13 files changed (+349 / -36 lines)

articles/data-factory/TOC.yml

Lines changed: 35 additions & 23 deletions
@@ -80,8 +80,12 @@
   items:
   - name: Copy Data tool
     href: tutorial-incremental-copy-partitioned-file-name-copy-data-tool.md
-  - name: Transform data with mapping data flow
-    href: tutorial-data-flow.md
+  - name: Transform and prepare data with data flows
+    items:
+    - name: Transform data with a mapping data flow
+      href: tutorial-data-flow.md
+    - name: Prepare data with a wrangling data flow
+      href: wrangling-data-flow-tutorial.md
   - name: External transformations
     items:
     - name: HDInsight Spark
@@ -143,28 +147,36 @@
     href: concepts-pipeline-execution-triggers.md
   - name: Integration runtime
     href: concepts-integration-runtime.md
-  - name: Mapping data flows
+  - name: Data flows
     items:
-    - name: Mapping data flow overview
-      href: concepts-data-flow-overview.md
-    - name: Debug mode
-      href: concepts-data-flow-debug-mode.md
-    - name: JSON handling
-      href: concepts-data-flow-json.md
-    - name: Schema drift
-      href: concepts-data-flow-schema-drift.md
-    - name: Column patterns
-      href: concepts-data-flow-column-pattern.md
-    - name: Data flow monitoring
-      href: concepts-data-flow-monitoring.md
-    - name: Data flow performance
-      href: concepts-data-flow-performance.md
-    - name: Move nodes
-      href: concepts-data-flow-move-nodes.md
-    - name: Expression builder
-      href: concepts-data-flow-expression-builder.md
-    - name: Expression language
-      href: data-flow-expression-functions.md
+    - name: Transform data with mapping data flows
+      items:
+      - name: Mapping data flow overview
+        href: concepts-data-flow-overview.md
+      - name: Debug mode
+        href: concepts-data-flow-debug-mode.md
+      - name: JSON handling
+        href: concepts-data-flow-json.md
+      - name: Schema drift
+        href: concepts-data-flow-schema-drift.md
+      - name: Column patterns
+        href: concepts-data-flow-column-pattern.md
+      - name: Data flow monitoring
+        href: concepts-data-flow-monitoring.md
+      - name: Data flow performance
+        href: concepts-data-flow-performance.md
+      - name: Move nodes
+        href: concepts-data-flow-move-nodes.md
+      - name: Expression builder
+        href: concepts-data-flow-expression-builder.md
+      - name: Expression language
+        href: data-flow-expression-functions.md
+    - name: Prepare data with Wrangling data flows
+      items:
+      - name: Wrangling data flow overview
+        href: wrangling-data-flow-overview.md
+      - name: Supported functions
+        href: wrangling-data-flow-functions.md
   - name: Roles and permissions
     href: concepts-roles-permissions.md
   - name: Understanding pricing

articles/data-factory/frequently-asked-questions.md

Lines changed: 76 additions & 0 deletions
@@ -187,6 +187,82 @@ Use the Copy activity to stage data from any of the other connectors, and then e
 
 Self-hosted IR is an ADF pipeline construct that you can use with the Copy activity to acquire or move data to and from on-premises or VM-based data sources and sinks. Stage the data first with a Copy activity, then transform it with a Data Flow activity, and then use a subsequent Copy activity if you need to move the transformed data back to the on-premises store.
 
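The stage, transform, and copy-back sequence described above can be outlined as a three-activity ADF pipeline definition. A minimal sketch as a Python dict, assuming hypothetical dataset and activity names; the source, sink, and data flow settings (typeProperties) that a real pipeline requires are omitted:

```python
# Minimal sketch of the stage -> transform -> copy-back pattern described above.
# Dataset names and the data flow reference are placeholders, not documented examples.
staged_pipeline = {
    "name": "StageTransformCopyBack",
    "properties": {
        "activities": [
            {   # 1. Copy from the on-premises store (via self-hosted IR) into cloud staging.
                "name": "StageFromOnPrem",
                "type": "Copy",
                "inputs": [{"referenceName": "OnPremSqlTable", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "StagingBlob", "type": "DatasetReference"}],
            },
            {   # 2. Transform the staged data with a data flow activity.
                "name": "TransformStagedData",
                "type": "ExecuteDataFlow",
                "dependsOn": [{"activity": "StageFromOnPrem",
                               "dependencyConditions": ["Succeeded"]}],
            },
            {   # 3. Optional copy of the transformed output back to the on-premises store.
                "name": "CopyBackOnPrem",
                "type": "Copy",
                "dependsOn": [{"activity": "TransformStagedData",
                               "dependencyConditions": ["Succeeded"]}],
                "inputs": [{"referenceName": "TransformedOutput", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "OnPremSqlTable", "type": "DatasetReference"}],
            },
        ]
    },
}
```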
+## Wrangling data flows
+
+### What are the supported regions for wrangling data flow?
+
+Wrangling data flow is currently supported in data factories created in the following regions:
+
+* Australia East
+* Canada Central
+* Central India
+* Central US
+* East US
+* East US 2
+* Japan East
+* North Europe
+* Southeast Asia
+* South Central US
+* UK South
+* West Central US
+* West Europe
+* West US
+* West US 2
+
+### What are the limitations and constraints with wrangling data flow?
+
+Dataset names can only contain alphanumeric characters. The following data stores are supported:
+
+* DelimitedText dataset in Azure Blob Storage using account key authentication
+* DelimitedText dataset in Azure Data Lake Storage Gen2 using account key or service principal authentication
+* DelimitedText dataset in Azure Data Lake Storage Gen1 using service principal authentication
+* Azure SQL Database and Azure SQL Data Warehouse using SQL authentication. See the supported SQL types below. There is no PolyBase or staging support for the data warehouse.
+
+At this time, linked service Key Vault integration is not supported in wrangling data flows.
+
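Because the naming constraint above only surfaces when a dataset is rejected, it can be convenient to pre-check candidate names locally. A minimal sketch, assuming only the alphanumeric rule stated above:

```python
import re

# The FAQ above states that wrangling data flow dataset names can only contain
# alphanumeric characters; this helper simply pre-checks a candidate name.
def is_valid_wrangling_dataset_name(name: str) -> bool:
    return re.fullmatch(r"[A-Za-z0-9]+", name) is not None

print(is_valid_wrangling_dataset_name("StagedSales2019"))    # True
print(is_valid_wrangling_dataset_name("staged_sales-2019"))  # False: underscore and hyphen
```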
+### What is the difference between mapping and wrangling data flows?
+
+Mapping data flows provide a way to transform data at scale without any coding required. You design a data transformation job in the data flow canvas by constructing a series of transformations: start with any number of source transformations, follow them with data transformation steps, and complete your data flow with a sink to land your results in a destination. Mapping data flow is great at mapping and transforming data with both known and unknown schemas in the sinks and sources.
+
+Wrangling data flows allow you to do agile data preparation and exploration using the Power Query Online mashup editor at scale via Spark execution. With the rise of data lakes, sometimes you just need to explore a data set or create a dataset in the lake. You aren't mapping to a known target. Wrangling data flows are used for less formal and model-based analytics scenarios.
+
+### What is the difference between Power Platform Dataflows and wrangling data flows?
+
+Power Platform Dataflows allow users to import and transform data from a wide range of data sources into the Common Data Service and Azure Data Lake to build PowerApps applications, Power BI reports, or Flow automations. Power Platform Dataflows use the established Power Query data preparation experience, similar to Power BI and Excel. Power Platform Dataflows also enable easy reuse within an organization and automatically handle orchestration (for example, automatically refreshing dataflows that depend on another dataflow when that dataflow is refreshed).
+
+Azure Data Factory (ADF) is a managed data integration service that allows data engineers and citizen data integrators to create complex hybrid extract-transform-load (ETL) and extract-load-transform (ELT) workflows. Wrangling data flow in ADF empowers users with a code-free, serverless environment that simplifies data preparation in the cloud and scales to any data size with no infrastructure management required. It uses the Power Query data preparation technology (also used in Power Platform Dataflows, Excel, and Power BI) to prepare and shape the data. Built to handle the complexities and scale challenges of big data integration, wrangling data flows let users quickly prepare data at scale via Spark execution. Users can build resilient data pipelines in an accessible visual environment with the browser-based interface and let ADF handle the complexities of Spark execution. Build schedules for your pipelines and monitor your data flow executions from the ADF monitoring portal. Manage data availability SLAs with ADF's availability monitoring and alerts, and use the built-in continuous integration and deployment capabilities to save and manage your flows in a managed environment. Establish alerts and view execution plans to validate that your logic is performing as planned as you tune your data flows.
+
+### Supported SQL types
+
+Wrangling data flow supports the following data types in SQL. You will get a validation error if you use a data type that isn't supported.
+
+* short
+* double
+* real
+* float
+* char
+* nchar
+* varchar
+* nvarchar
+* integer
+* int
+* bit
+* boolean
+* smallint
+* tinyint
+* bigint
+* long
+* text
+* date
+* datetime
+* datetime2
+* smalldatetime
+* timestamp
+* uniqueidentifier
+* xml
+
+Other data types will be supported in the future.
+
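Since an unsupported type is reported only as a validation error, it can help to screen source column types against this list before building the flow. A small sketch using the types enumerated above; the column-metadata shape is hypothetical:

```python
# Types listed in the FAQ above; anything else currently fails wrangling data flow validation.
SUPPORTED_SQL_TYPES = {
    "short", "double", "real", "float", "char", "nchar", "varchar", "nvarchar",
    "integer", "int", "bit", "boolean", "smallint", "tinyint", "bigint", "long",
    "text", "date", "datetime", "datetime2", "smalldatetime", "timestamp",
    "uniqueidentifier", "xml",
}

def unsupported_columns(columns: dict[str, str]) -> list[str]:
    """Return column names whose declared SQL type is not in the supported set."""
    return [name for name, sql_type in columns.items()
            if sql_type.lower() not in SUPPORTED_SQL_TYPES]

# Hypothetical source table metadata: {column name: SQL type}.
print(unsupported_columns({"id": "int", "payload": "varbinary", "price": "decimal"}))
# ['payload', 'price'] - both would need casting before use in a wrangling data flow.
```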
 ## Next steps
 For step-by-step instructions to create a data factory, see the following tutorials:

7 image files changed (271 KB, 153 KB, 276 KB, 233 KB, 276 KB, 598 KB, 337 KB); previews not shown.

articles/data-factory/transform-data.md

Lines changed: 25 additions & 13 deletions
@@ -32,52 +32,64 @@ This article explains data transformation activities in Azure Data Factory that
 
 Data Factory supports the following data transformation activities that can be added to [pipelines](concepts-pipelines-activities.md) either individually or chained with another activity.
 
-## HDInsight Hive activity
+## Transform natively in Azure Data Factory with data flows
+
+### Mapping data flows
+
+Mapping data flows are visually designed data transformations in Azure Data Factory. Data flows allow data engineers to develop graphical data transformation logic without writing code. The resulting data flows are executed as activities within Azure Data Factory pipelines that use scaled-out Spark clusters. Data flow activities can be operationalized via existing Data Factory scheduling, control flow, and monitoring capabilities. For more information, see [mapping data flows](concepts-data-flow-overview.md).
+
+### Wrangling data flows
+
+Wrangling data flows in Azure Data Factory allow you to do code-free data preparation at cloud scale iteratively. Wrangling data flows integrate with [Power Query Online](https://docs.microsoft.com/power-query/) and make Power Query M functions available for data wrangling at cloud scale via Spark execution. For more information, see [wrangling data flows](wrangling-data-flow-overview.md).
+
+## External transformations
+
+### HDInsight Hive activity
 The HDInsight Hive activity in a Data Factory pipeline executes Hive queries on your own or on-demand Windows/Linux-based HDInsight cluster. See the [Hive activity](transform-data-using-hadoop-hive.md) article for details about this activity.
 
-## HDInsight Pig activity
+### HDInsight Pig activity
 The HDInsight Pig activity in a Data Factory pipeline executes Pig queries on your own or on-demand Windows/Linux-based HDInsight cluster. See the [Pig activity](transform-data-using-hadoop-pig.md) article for details about this activity.
 
-## HDInsight MapReduce activity
+### HDInsight MapReduce activity
 The HDInsight MapReduce activity in a Data Factory pipeline executes MapReduce programs on your own or on-demand Windows/Linux-based HDInsight cluster. See the [MapReduce activity](transform-data-using-hadoop-map-reduce.md) article for details about this activity.
 
-## HDInsight Streaming activity
+### HDInsight Streaming activity
 The HDInsight Streaming activity in a Data Factory pipeline executes Hadoop Streaming programs on your own or on-demand Windows/Linux-based HDInsight cluster. See [HDInsight Streaming activity](transform-data-using-hadoop-streaming.md) for details about this activity.
 
-## HDInsight Spark activity
+### HDInsight Spark activity
 The HDInsight Spark activity in a Data Factory pipeline executes Spark programs on your own HDInsight cluster. For details, see [Invoke Spark programs from Azure Data Factory](transform-data-using-spark.md).
 
-## Machine Learning activities
+### Machine Learning activities
 Azure Data Factory enables you to easily create pipelines that use a published Azure Machine Learning web service for predictive analytics. Using the [Batch Execution activity](transform-data-using-machine-learning.md) in an Azure Data Factory pipeline, you can invoke a Machine Learning web service to make predictions on the data in batch.
 
 Over time, the predictive models in the Machine Learning scoring experiments need to be retrained using new input datasets. After you are done with retraining, you want to update the scoring web service with the retrained Machine Learning model. You can use the [Update Resource activity](update-machine-learning-models.md) to update the web service with the newly trained model.
 
 See [Use Machine Learning activities](transform-data-using-machine-learning.md) for details about these Machine Learning activities.
 
-## Stored procedure activity
+### Stored procedure activity
 You can use the SQL Server Stored Procedure activity in a Data Factory pipeline to invoke a stored procedure in one of the following data stores: Azure SQL Database, Azure SQL Data Warehouse, or a SQL Server database in your enterprise or on an Azure VM. See the [Stored Procedure activity](transform-data-using-stored-procedure.md) article for details.
 
-## Data Lake Analytics U-SQL activity
+### Data Lake Analytics U-SQL activity
 The Data Lake Analytics U-SQL activity runs a U-SQL script on an Azure Data Lake Analytics cluster. See the [Data Analytics U-SQL activity](transform-data-using-data-lake-analytics.md) article for details.
 
-## Databricks Notebook activity
+### Databricks Notebook activity
 
 The Azure Databricks Notebook activity in a Data Factory pipeline runs a Databricks notebook in your Azure Databricks workspace. Azure Databricks is a managed platform for running Apache Spark. See [Transform data by running a Databricks notebook](transform-data-databricks-notebook.md).
 
-## Databricks Jar activity
+### Databricks Jar activity
 
 The Azure Databricks Jar activity in a Data Factory pipeline runs a Spark Jar in your Azure Databricks cluster. Azure Databricks is a managed platform for running Apache Spark. See [Transform data by running a Jar activity in Azure Databricks](transform-data-databricks-jar.md).
 
-## Databricks Python activity
+### Databricks Python activity
 
 The Azure Databricks Python activity in a Data Factory pipeline runs a Python file in your Azure Databricks cluster. Azure Databricks is a managed platform for running Apache Spark. See [Transform data by running a Python activity in Azure Databricks](transform-data-databricks-python.md).
 
-## Custom activity
+### Custom activity
 If you need to transform data in a way that is not supported by Data Factory, you can create a custom activity with your own data processing logic and use the activity in the pipeline. You can configure the custom .NET activity to run using either an Azure Batch service or an Azure HDInsight cluster. See the [Use custom activities](transform-data-using-dotnet-custom-activity.md) article for details.
 
 You can create a custom activity to run R scripts on your HDInsight cluster with R installed. See [Run R Script using Azure Data Factory](https://github.com/Azure/Azure-DataFactory/tree/master/Samples/RunRScriptUsingADFSample).
 
-## Compute environments
+### Compute environments
 You create a linked service for the compute environment and then use the linked service when defining a transformation activity. There are two types of compute environments supported by Data Factory.
 
 - **On-Demand**: In this case, the computing environment is fully managed by Data Factory. It is automatically created by the Data Factory service before a job is submitted to process data and removed when the job is completed. You can configure and control granular settings of the on-demand compute environment for job execution, cluster management, and bootstrapping actions.
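The data flow sections above note that data flow activities are operationalized through existing Data Factory scheduling and monitoring. A minimal sketch of triggering and polling a pipeline run with the azure-mgmt-datafactory Python SDK, assuming a current SDK version and an existing pipeline (for example, one that wraps a mapping or wrangling data flow activity); the subscription, tenant, resource group, factory, and pipeline names are placeholders:

```python
import time
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholder identifiers - substitute your own subscription, tenant, and factory details.
credential = ClientSecretCredential(tenant_id="<tenant-id>",
                                    client_id="<app-id>",
                                    client_secret="<secret>")
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

resource_group, factory_name = "myResourceGroup", "myDataFactory"

# Trigger a run of an existing pipeline that contains a data flow activity.
run = adf_client.pipelines.create_run(resource_group, factory_name, "TransformWithDataFlow")

# Poll the run status; the same information is surfaced in the ADF monitoring portal.
status = adf_client.pipeline_runs.get(resource_group, factory_name, run.run_id).status
while status in ("Queued", "InProgress"):
    time.sleep(30)
    status = adf_client.pipeline_runs.get(resource_group, factory_name, run.run_id).status
print(f"Pipeline run {run.run_id} finished with status: {status}")
```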
