Commit ff88f08

surrogate key update
1 parent dab6b19 commit ff88f08

6 files changed: 43 additions and 15 deletions
@@ -1,51 +1,79 @@
 ---
-title: Mapping data flow Surrogate Key Transformation
+title: Surrogate key transformation in mapping data flow
 description: How to use Azure Data Factory's mapping data flow Surrogate Key Transformation to generate sequential key values
 author: kromerm
 ms.author: makromer
-ms.reviewer: douglasl
+ms.reviewer: daperlov
 ms.service: data-factory
 ms.topic: conceptual
 ms.custom: seo-lt-2019
-ms.date: 02/12/2019
+ms.date: 04/08/2020
 ---
 
-# Mapping data flow Surrogate Key Transformation
+# Surrogate key transformation in mapping data flow
 
+Use the surrogate key transformation to add an incrementing key value to each row of data. This is useful when designing dimension tables in a star schema analytical data model. In a star schema, each member in your dimension tables requires a unique key that is a non-business key.
 
-
-Use the Surrogate Key Transformation to add an incrementing non-business arbitrary key value to your data flow rowset. This is useful when designing dimension tables in a star schema analytical data model where each member in your dimension tables needs to have a unique key that is a non-business key, part of the Kimball DW methodology.
+## Configuration
 
 ![Surrogate Key Transform](media/data-flow/surrogate.png "Surrogate Key Transformation")
 
-"Key Column" is the name that you will give to your new surrogate key column.
+**Key column:** The name of the generated surrogate key column.
 
-"Start Value" is the beginning point of the incremental value.
+**Start value:** The lowest key value that will be generated.
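
Reviewer note, not part of the commit: the two settings described above can be pictured with a small Python sketch. It only illustrates the described behavior (one incrementing value per row, beginning at the start value) and is not how the service implements the transformation. The function name `add_surrogate_key` and the sample rows are hypothetical.

```python
from itertools import count


def add_surrogate_key(rows, key_column, start_value=1):
    """Return copies of rows with an incrementing key stored under key_column.

    key_column plays the role of the "Key column" setting and start_value
    the role of "Start value"; both names are assumptions for illustration.
    """
    keys = count(start_value)  # start_value, start_value + 1, ...
    # The generated key is written last so it wins over any same-named field.
    return [{**row, key_column: next(keys)} for row in rows]


rows = [{"name": "a"}, {"name": "b"}, {"name": "c"}]
keyed = add_surrogate_key(rows, "key", start_value=1)
```

With a start value of 1, the three sample rows receive the keys 1, 2 and 3, so the start value is also the lowest key generated.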
 
 ## Increment keys from existing sources
 
-If you'd like to start your sequence from a value that exists in a Source, you can use a Derived Column transformation immediately following your Surrogate Key transformation and add the two values together:
+To start your sequence from a value that exists in a source, use a derived column transformation following your surrogate key transformation to add the two values together:
 
 ![SK add Max](media/data-flow/sk006.png "Surrogate Key Transformation Add Max")
 
-To seed the key value with the previous max, there are two techniques that you can use:
+### Increment from existing maximum value
+
+To seed the key value with the previous max, there are two techniques that you can use based on where your source data is.
 
-### Database sources
+#### Database sources
 
-Use the "Query" option to select MAX() from your source using the Source transformation:
+Use a SQL query option to select MAX() from your source. For example, `Select MAX(<surrogateKeyName>) as maxval from <sourceTable>`.
 
 ![Surrogate Key Query](media/data-flow/sk002.png "Surrogate Key Transformation Query")
 
-### File sources
+#### File sources
 
-If your previous max value is in a file, you can use your Source transformation together with an Aggregate transformation and use the MAX() expression function to get the previous max value:
+If your previous max value is in a file, use the `max()` function in the aggregate transformation to get the previous max value:
 
 ![Surrogate Key File](media/data-flow/sk008.png "Surrogate Key File")
 
-In both cases, you must Join your incoming new data together with your source that contains the previous max value:
+In both cases, you must join your incoming new data together with your source that contains the previous max value.
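
Reviewer note, not part of the commit: the seeding pattern described above (read the previous maximum, join it to the incoming rows, then add it to a key that starts at 1) can be sketched in plain Python. This is an illustration under stated assumptions rather than the data flow itself; `seed_from_max`, its parameters, and the sample values are hypothetical.

```python
def seed_from_max(existing_keys, new_rows, key_column="key"):
    """Assign keys to new_rows that continue after the largest existing key.

    Mirrors the pattern in the text: the previous maximum is computed once,
    attached to every incoming row (the join), and added to a surrogate key
    that starts at 1 (the derived column step).
    """
    # Equivalent of MAX(<surrogateKeyName>); 0 is assumed for an empty source.
    max_val = max(existing_keys, default=0)
    keyed = []
    for surrogate, row in enumerate(new_rows, start=1):
        keyed.append({**row, key_column: max_val + surrogate})
    return keyed


continued = seed_from_max([1, 2, 7], [{"name": "x"}, {"name": "y"}])
```

Here the previous maximum is 7, so the two new rows receive the keys 8 and 9. Treating an empty source as a maximum of 0 is a choice made for this sketch only.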
 
 ![Surrogate Key Join](media/data-flow/sk004.png "Surrogate Key Join")
 
+## Data flow script
+
+### Syntax
+
+```
+<incomingStream>
+keyGenerate(
+output(<surrogateColumnName> as long),
+startAt: <number>L
+) ~> <surrogateKeyTransformationName>
+```
+
+### Example
+
+![Surrogate Key Transform](media/data-flow/surrogate.png "Surrogate Key Transformation")
+
+The data flow script for the above surrogate key configuration is in the code snippet below.
+
+```
+AggregateDayStats
+keyGenerate(
+output(key as long),
+startAt: 1L
+) ~> SurrogateKey1
+```
+
 ## Next steps
 
 These examples use the [Join](data-flow-join.md) and [Derived Column](data-flow-derived-column.md) transformations.