articles/data-factory/frequently-asked-questions.md (1 addition & 76 deletions)

@@ -221,87 +221,12 @@ Use the Copy activity to stage data from any of the other connectors, and then e

### Is the self-hosted integration runtime available for data flows?

-Self-hosted IR is an ADF pipeline construct that you can use with the Copy activity to acquire or move data to and from on-premises or VM-based data sources and sinks. Stage the data first with a Copy activity, then transform it with a Data Flow activity, and then run a subsequent Copy activity if you need to move the transformed data back to the on-premises store.
+Self-hosted IR is an ADF pipeline construct that you can use with the Copy activity to acquire or move data to and from on-premises or VM-based data sources and sinks. The virtual machines that you use for a self-hosted IR can also be placed inside the same VNET as your protected data stores, which gives ADF access to those data stores. With data flows, you achieve the same results by using the Azure IR with a managed VNET instead.
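
For illustration, here is a minimal sketch of the staged pattern described above: a Copy activity (running on the self-hosted IR) stages the on-premises data, an Execute Data Flow activity transforms it on the Azure IR, and a second Copy activity writes the result back. This is not part of the original article or the diff; it assumes a recent `azure-mgmt-datafactory` package, and all names (resource group, factory, datasets `OnPremSqlTable`/`StagingBlob`/`TransformedBlob`, data flow `CleanAndShape`) are hypothetical placeholders for objects that would already exist in the factory.

```python
# Illustrative sketch only: Copy (self-hosted IR) -> Data Flow (Azure IR) -> Copy back.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ActivityDependency, BlobSink, BlobSource, CopyActivity, DataFlowReference,
    DatasetReference, ExecuteDataFlowActivity, PipelineResource, SqlServerSink,
    SqlServerSource,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# 1. Stage: on-premises SQL Server -> staging blob (the source dataset's linked
#    service is bound to the self-hosted IR).
stage = CopyActivity(
    name="StageFromOnPrem",
    inputs=[DatasetReference(type="DatasetReference", reference_name="OnPremSqlTable")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="StagingBlob")],
    source=SqlServerSource(),
    sink=BlobSink(),
)

# 2. Transform the staged data with a data flow (runs on the Azure IR,
#    optionally with a managed VNET).
transform = ExecuteDataFlowActivity(
    name="TransformStagedData",
    data_flow=DataFlowReference(type="DataFlowReference", reference_name="CleanAndShape"),
    depends_on=[ActivityDependency(activity="StageFromOnPrem", dependency_conditions=["Succeeded"])],
)

# 3. Copy the transformed output back to the on-premises store (self-hosted IR again).
write_back = CopyActivity(
    name="CopyBackToOnPrem",
    inputs=[DatasetReference(type="DatasetReference", reference_name="TransformedBlob")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="OnPremSqlTable")],
    source=BlobSource(),
    sink=SqlServerSink(),
    depends_on=[ActivityDependency(activity="TransformStagedData", dependency_conditions=["Succeeded"])],
)

client.pipelines.create_or_update(
    "<resource-group>", "<factory-name>", "StagedDataFlowPipeline",
    PipelineResource(activities=[stage, transform, write_back]),
)
```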

### Does the data flow compute engine serve multiple tenants?

Clusters are never shared. We guarantee isolation for each job run in production. In debug scenarios, each user gets their own cluster, and all debug runs initiated by that user go to that cluster.

-## Wrangling data flows
-
-### What are the supported regions for wrangling data flow?
-
-Wrangling data flow is currently supported in data factories created in the following regions:
-
-* Australia East
-* Canada Central
-* Central India
-* East US
-* East US 2
-* Japan East
-* North Europe
-* Southeast Asia
-* South Central US
-* UK South
-* West Central US
-* West Europe
-* West US
-* West US 2
-
-### What are the limitations and constraints with wrangling data flow?
-
-Dataset names can only contain alphanumeric characters. The following data stores are supported:
-
-* DelimitedText dataset in Azure Blob Storage using account key authentication
-* DelimitedText dataset in Azure Data Lake Storage Gen2 using account key or service principal authentication
-* DelimitedText dataset in Azure Data Lake Storage Gen1 using service principal authentication
-* Azure SQL Database and Data Warehouse using SQL authentication. See the supported SQL types below. There is no PolyBase or staging support for the data warehouse.
-
-At this time, linked service Key Vault integration is not supported in wrangling data flows.
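
As an illustration of the first supported combination in the list above (a DelimitedText dataset in Azure Blob Storage with account key authentication) and of the alphanumeric dataset-name constraint, here is a hedged sketch using the Python management SDK. It is not part of the original article; it assumes a recent `azure-mgmt-datafactory` package, and the storage account, container, factory, and object names are hypothetical placeholders.

```python
# Illustrative sketch: register an Azure Blob Storage linked service that uses
# account key authentication, then define a DelimitedText dataset on top of it.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLinkedService, AzureBlobStorageLocation, DatasetResource,
    DelimitedTextDataset, LinkedServiceReference, LinkedServiceResource,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<factory-name>"

# Linked service with the account key carried in the connection string.
blob_ls = AzureBlobStorageLinkedService(
    connection_string="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
)
client.linked_services.create_or_update(
    rg, factory, "StagingBlobStorage", LinkedServiceResource(properties=blob_ls)
)

# The dataset name below contains only alphanumeric characters, per the constraint above.
csv_ds = DelimitedTextDataset(
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="StagingBlobStorage"
    ),
    location=AzureBlobStorageLocation(container="staging", file_name="input.csv"),
    column_delimiter=",",
    first_row_as_header=True,
)
client.datasets.create_or_update(rg, factory, "StagingCsv1", DatasetResource(properties=csv_ds))
```
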
-
-### What is the difference between mapping and wrangling data flows?
-
-Mapping data flows provide a way to transform data at scale without writing any code. You design a data transformation job in the data flow canvas by constructing a series of transformations: start with any number of source transformations, follow with data transformation steps, and complete your data flow with a sink to land your results in a destination. Mapping data flows are great at mapping and transforming data with both known and unknown schemas in the sinks and sources.
-
-Wrangling data flows allow you to do agile data preparation and exploration using the Power Query Online mashup editor at scale via Spark execution. With the rise of data lakes, sometimes you just need to explore a data set or create a dataset in the lake, without mapping to a known target. Wrangling data flows are used for less formal and model-based analytics scenarios.
-
-### What is the difference between Power Platform Dataflows and wrangling data flows?
-
-Power Platform Dataflows allow users to import and transform data from a wide range of data sources into the Common Data Service and Azure Data Lake to build PowerApps applications, Power BI reports, or Flow automations. Power Platform Dataflows use the established Power Query data preparation experiences, similar to Power BI and Excel. Power Platform Dataflows also enable easy reuse within an organization and automatically handle orchestration (for example, automatically refreshing dataflows that depend on another dataflow when that dataflow is refreshed).
-
-Azure Data Factory (ADF) is a managed data integration service that allows data engineers and citizen data integrators to create complex hybrid extract-transform-load (ETL) and extract-load-transform (ELT) workflows. Wrangling data flow in ADF empowers users with a code-free, serverless environment that simplifies data preparation in the cloud and scales to any data size with no infrastructure management required. It uses the Power Query data preparation technology (also used in Power Platform Dataflows, Excel, and Power BI) to prepare and shape the data. Built to handle the complexities and scale challenges of big data integration, wrangling data flows let users quickly prepare data at scale via Spark execution. Users can build resilient data pipelines in an accessible visual environment with the browser-based interface and let ADF handle the complexities of Spark execution. Build schedules for your pipelines and monitor your data flow executions from the ADF monitoring portal. Easily manage data availability SLAs with ADF's rich availability monitoring and alerts, and use the built-in continuous integration and deployment capabilities to save and manage your flows in a managed environment. Establish alerts and view execution plans to validate that your logic is performing as planned as you tune your data flows.
-
-### Supported SQL Types
-
-Wrangling data flow supports the following data types in SQL. You will get a validation error if you use a data type that isn't supported.
-
-* short
-* double
-* real
-* float
-* char
-* nchar
-* varchar
-* nvarchar
-* integer
-* int
-* bit
-* boolean
-* smallint
-* tinyint
-* bigint
-* long
-* text
-* date
-* datetime
-* datetime2
-* smalldatetime
-* timestamp
-* uniqueidentifier
-* xml
-
-Other data types will be supported in the future.
-

## Next steps

For step-by-step instructions to create a data factory, see the following tutorials: