|
1 | 1 | ---
|
2 |
| -title: Mapping data flow Surrogate Key Transformation |
| 2 | +title: Surrogate key transformation in mapping data flow |
3 | 3 | description: How to use Azure Data Factory's mapping data flow Surrogate Key Transformation to generate sequential key values
|
4 | 4 | author: kromerm
|
5 | 5 | ms.author: makromer
|
6 |
| -ms.reviewer: douglasl |
| 6 | +ms.reviewer: daperlov |
7 | 7 | ms.service: data-factory
|
8 | 8 | ms.topic: conceptual
|
9 | 9 | ms.custom: seo-lt-2019
|
10 |
| -ms.date: 02/12/2019 |
| 10 | +ms.date: 04/08/2020 |
11 | 11 | ---
|
12 | 12 |
|
13 |
| -# Mapping data flow Surrogate Key Transformation |
| 13 | +# Surrogate key transformation in mapping data flow |
14 | 14 |
|
| 15 | +Use the surrogate key transformation to add an incrementing key value to each row of data. This is useful when designing dimension tables in a star schema analytical data model. In a star schema, each member in your dimension tables requires a unique key that is a non-business key. |
15 | 16 |
|
16 |
| - |
17 |
| -Use the Surrogate Key Transformation to add an incrementing non-business arbitrary key value to your data flow rowset. This is useful when designing dimension tables in a star schema analytical data model where each member in your dimension tables needs to have a unique key that is a non-business key, part of the Kimball DW methodology. |
| 17 | +## Configuration |
18 | 18 |
|
19 | 19 | 
|
20 | 20 |
|
21 |
| -"Key Column" is the name that you will give to your new surrogate key column. |
| 21 | +**Key column:** The name of the generated surrogate key column. |
22 | 22 |
|
23 |
| -"Start Value" is the beginning point of the incremental value. |
| 23 | +**Start value:** The lowest key value that will be generated. |
24 | 24 |
|
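As a rough illustration of what the two settings control, the following Python sketch mimics the behavior described above outside of Data Factory. The function name `add_surrogate_key` and the sample rows are hypothetical and are not part of the product; only the "key column" and "start value" concepts come from the article.

```python
from typing import Iterable, Iterator


def add_surrogate_key(
    rows: Iterable[dict], key_column: str = "key", start_value: int = 1
) -> Iterator[dict]:
    """Yield a copy of each row with an incrementing key added.

    key_column  -> plays the role of the "Key column" setting
    start_value -> plays the role of the "Start value" setting
    """
    for offset, row in enumerate(rows):
        # The first row receives start_value, the next start_value + 1, and so on.
        yield {**row, key_column: start_value + offset}


# Hypothetical dimension rows that have no key yet.
rows = [{"name": "Alice"}, {"name": "Bob"}, {"name": "Carol"}]
keyed = list(add_surrogate_key(rows, key_column="customer_sk", start_value=100))
```

In this sketch the keys depend only on row order and the start value, never on business attributes, which is the property a star schema dimension key needs.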
25 | 25 | ## Increment keys from existing sources
|
26 | 26 |
|
27 |
| -If you'd like to start your sequence from a value that exists in a Source, you can use a Derived Column transformation immediately following your Surrogate Key transformation and add the two values together: |
| 27 | +To start your sequence from a value that exists in a source, use a derived column transformation following your surrogate key transformation to add the two values together: |
28 | 28 |
|
29 | 29 | 
|
30 | 30 |
|
31 |
| -To seed the key value with the previous max, there are two techniques that you can use: |
| 31 | +### Increment from existing maximum value |
| 32 | + |
| 33 | +To seed the key value with the previous max, use one of two techniques, depending on where your source data is stored.
32 | 34 |
|
33 |
| -### Database sources |
| 35 | +#### Database sources |
34 | 36 |
|
35 |
| -Use the "Query" option to select MAX() from your source using the Source transformation: |
| 37 | +Use a SQL query option to select MAX() from your source. For example, `Select MAX(<surrogateKeyName>) as maxval from <sourceTable>`.
36 | 38 |
|
37 | 39 | 
|
38 | 40 |
|
39 |
| -### File sources |
| 41 | +#### File sources |
40 | 42 |
|
41 |
| -If your previous max value is in a file, you can use your Source transformation together with an Aggregate transformation and use the MAX() expression function to get the previous max value: |
| 43 | +If your previous max value is in a file, use the `max()` function in the aggregate transformation to get the previous max value: |
42 | 44 |
|
43 | 45 | 
|
44 | 46 |
|
45 |
| -In both cases, you must Join your incoming new data together with your source that contains the previous max value: |
| 47 | +In both cases, you must join your incoming new data together with your source that contains the previous max value. |
46 | 48 |
|
47 | 49 | 
|
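The aggregate, join, and derived column steps described above can be sketched in plain Python to show the arithmetic involved. This is only an approximation of the logic, not the data flow engine itself; the function name `seed_keys_from_max`, the column name `customer_sk`, and the sample data are assumptions made for illustration.

```python
def seed_keys_from_max(existing_rows, new_rows, key_column="customer_sk"):
    """Assign keys to new_rows that continue after the largest existing key."""
    # Aggregate step: find the previous max (0 when no rows exist yet).
    max_val = max((row[key_column] for row in existing_rows), default=0)

    # Surrogate key step: number the incoming rows starting at 1.
    keyed = [{**row, key_column: i} for i, row in enumerate(new_rows, start=1)]

    # Join + derived column step: every incoming row sees the single max value,
    # and the two values are added together.
    return [{**row, key_column: row[key_column] + max_val} for row in keyed]


# Hypothetical existing dimension rows and incoming rows.
existing = [{"customer_sk": 7, "name": "Alice"}, {"customer_sk": 9, "name": "Bob"}]
incoming = [{"name": "Carol"}, {"name": "Dave"}]
seeded = seed_keys_from_max(existing, incoming)
```

With a start value of 1 and a previous max of 9, the incoming rows receive keys 10 and 11, so new keys never collide with existing ones.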
48 | 50 |
|
| 51 | +## Data flow script |
| 52 | + |
| 53 | +### Syntax |
| 54 | + |
| 55 | +``` |
| 56 | +<incomingStream> |
| 57 | + keyGenerate( |
| 58 | + output(<surrogateColumnName> as long), |
| 59 | + startAt: <number>L |
| 60 | + ) ~> <surrogateKeyTransformationName> |
| 61 | +``` |
| 62 | + |
| 63 | +### Example |
| 64 | + |
| 65 | + |
| 66 | + |
| 67 | +The data flow script for the above surrogate key configuration is in the code snippet below. |
| 68 | + |
| 69 | +``` |
| 70 | +AggregateDayStats |
| 71 | + keyGenerate( |
| 72 | + output(key as long), |
| 73 | + startAt: 1L |
| 74 | + ) ~> SurrogateKey1 |
| 75 | +``` |
| 76 | + |
49 | 77 | ## Next steps
|
50 | 78 |
|
51 | 79 | These examples use the [Join](data-flow-join.md) and [Derived Column](data-flow-derived-column.md) transformations.
|