|
1 | 1 | ---
|
2 |
| -title: Azure Data Factory mapping data flow Exists transformation |
3 |
| -description: How to check for existing rows using data factory mapping data flows with Exists transformation |
| 2 | +title: Exists transformation in Azure Data Factory mapping data flow | Microsoft Docs |
| 3 | +description: Check for existing rows using the exists transformation in Azure Data Factory mapping data flow |
4 | 4 | author: kromerm
|
5 | 5 | ms.author: makromer
|
| 6 | +ms.reviewer: daperlov |
6 | 7 | ms.service: data-factory
|
7 | 8 | ms.topic: conceptual
|
8 |
| -ms.date: 01/30/2019 |
| 9 | +ms.date: 10/16/2019 |
9 | 10 | ---
|
10 | 11 |
|
11 | 12 | # Mapping data flow exists transformation
|
12 | 13 |
|
| 14 | +The exists transformation is a row filtering transformation that checks whether your data exists in another source or stream. The output stream includes all rows in the left stream that either exist or don't exist in the right stream. The exists transformation is similar to ```SQL WHERE EXISTS``` and ```SQL WHERE NOT EXISTS```. |
13 | 15 |
|
| 16 | +## Configuration |
14 | 17 |
|
15 |
| -The Exists transformation is a row filtering transformation that stops or allows rows in your data to flow through. The Exists Transform is similar to ```SQL WHERE EXISTS``` and ```SQL WHERE NOT EXISTS```. After the Exists Transformation, the resulting rows from your data stream will either include all rows where column values from source 1 exist in source 2 or do not exist in source 2. |
| 18 | +Choose which data stream you're checking for existence in the **Right stream** dropdown. |
| 19 | + |
| 20 | +Specify whether you're looking for the data to exist or not exist in the **Exist type** setting. |
| 21 | + |
| 22 | +Choose which key columns you want to compare as your exists conditions. By default, data flow looks for equality between one column in each stream. To compare via a computed value, hover over the column dropdown and select **Computed column**. |
16 | 23 |
|
17 | 24 | 
|
18 | 25 |
|
19 |
| -Choose the second source for your Exists so that Data Flow can compare values from Stream 1 against Stream 2. |
| 26 | +### Multiple exists conditions |
20 | 27 |
|
21 |
| -Select the column from Source 1 and from Source 2 whose values you wish to check against for Exists or Not Exists. |
| 28 | +To compare multiple columns from each stream, add a new exists condition by clicking the plus icon next to an existing row. Each additional condition is joined by an "and" statement. Comparing two columns is the same as the following expression: |
22 | 29 |
|
23 |
| -## Multiple exists conditions |
| 30 | +`source1@column1 == source2@column1 && source1@column2 == source2@column2` |
24 | 31 |
|
25 |
| -Next to each row in your column conditions for Exists, you'll find a + sign available when you hover over reach row. This will allow you to add multiple rows for Exists conditions. Each additional condition is an "And". |
| 32 | +### Custom expression |
26 | 33 |
|
27 |
| -## Custom expression |
| 34 | +To create a free-form expression that contains operators other than "and" and "equals", select the **Custom expression** field. Enter a custom expression via the data flow expression builder by clicking the blue box. |
28 | 35 |
|
29 | 36 | 
|
30 | 37 |
|
31 |
| -You can click "Custom Expression" to instead create a free-form expression as your exists or not-exists condition. Checking this box will allow you to type in your own expression as a condition. |
| 38 | +## Data flow script |
| 39 | + |
| 40 | +### Syntax |
| 41 | + |
| 42 | +``` |
| 43 | +<leftStream>, <rightStream>
| 44 | + exists( |
| 45 | + <conditionalExpression>, |
| 46 | + negate: true | <false>, |
| 47 | + broadcast: 'none' | 'left' | 'right' | 'both' |
| 48 | + ) ~> <existsTransformationName> |
| 49 | +``` |
| 50 | + |
| 51 | +### Example |
| 52 | + |
| 53 | +The example below is an exists transformation named `checkForChanges` that takes left stream `NameNorm2` and right stream `TypeConversions`. The exists condition is the expression `NameNorm2@EmpID == TypeConversions@EmpID && NameNorm2@Region == TypeConversions@Region`, which returns true if both the `EmpID` and `Region` columns in each stream match. Because we're checking for existence, `negate` is false. No broadcasting is enabled in the optimize tab, so `broadcast` has the value `'none'`. |
| 54 | +
| 55 | +In the Data Factory UX, this transformation looks like the below image:
| 56 | +
| 57 | +
| 58 | +
| 59 | +The data flow script for this transformation is in the snippet below:
| 60 | +
| 61 | +```
| 62 | +NameNorm2, TypeConversions
| 63 | + exists(
| 64 | + NameNorm2@EmpID == TypeConversions@EmpID && NameNorm2@Region == TypeConversions@Region,
| 65 | + negate:false, |
| 66 | + broadcast: 'none' |
| 67 | + ) ~> checkForChanges |
| 68 | +``` |
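
To make the row-filtering semantics concrete, here is a minimal Python sketch of what an exists check with a `negate` flag does. It is an illustration only and says nothing about how Data Factory executes the transformation. The function name `exists_filter`, the `condition` callback, and the sample rows are all hypothetical and do not come from the article.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def exists_filter(
    left: Iterable[Row],
    right: Iterable[Row],
    condition: Callable[[Row, Row], bool],
    negate: bool = False,
) -> List[Row]:
    """Keep left rows that have (negate=False) or lack (negate=True)
    at least one right row satisfying the condition."""
    right_rows = list(right)  # materialize so it can be scanned per left row
    result = []
    for left_row in left:
        matched = any(condition(left_row, r) for r in right_rows)
        if matched != negate:
            result.append(left_row)
    return result


# Hypothetical sample data standing in for the two streams.
name_norm = [
    {"EmpID": 1, "Region": "East"},
    {"EmpID": 2, "Region": "West"},
    {"EmpID": 3, "Region": "East"},
]
type_conversions = [
    {"EmpID": 1, "Region": "East"},
    {"EmpID": 3, "Region": "West"},
]


# Two equality conditions joined by "and", as in the example above.
def same_emp_and_region(l: Row, r: Row) -> bool:
    return l["EmpID"] == r["EmpID"] and l["Region"] == r["Region"]


existing = exists_filter(name_norm, type_conversions, same_emp_and_region)
missing = exists_filter(name_norm, type_conversions, same_emp_and_region, negate=True)
```

With this sample data, `existing` holds only the row for `EmpID` 1, because `EmpID` 3 matches on ID but not on `Region`. `missing` holds the rows for `EmpID` 2 and 3. Only rows from the left stream ever appear in the output.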
32 | 69 |
|
33 | 70 | ## Next steps
|
34 | 71 |
|