Describe the feature
Introduce an optional, opt-in parameter to dbt_utils.generate_surrogate_key that allows trimming leading and trailing whitespace from input fields before hashing.
The default behavior remains unchanged to preserve backward compatibility. When explicitly enabled, trim() is applied to each field prior to casting and hashing.
dbt_utils.generate_surrogate_key(
['col1', 'col2'],
trim=true
)
Describe alternatives you've considered
- Manually applying trim() to each column when calling the macro
- Creating a project-level wrapper macro
These alternatives work but require repetitive SQL and reduce consistency across models.
Additional context
- Opt-in only; no default behavior change
- Adapter-agnostic and warehouse-independent
- Intended to improve surrogate key stability when source data contains inconsistent whitespace
Who will this benefit?
Teams generating surrogate keys from:
- Source systems with inconsistent leading or trailing whitespace
- External systems where whitespace is not semantically meaningful
- Pipelines that require stable keys across reprocessing
This helps prevent unintended key changes caused solely by formatting differences.
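To make the stability argument concrete, here is a minimal Python sketch of the effect. It is a simplified model, not the macro's implementation: the function name `surrogate_key` is hypothetical, and the `-` separator, the null placeholder, and MD5 hashing are assumptions about how the macro builds its key. It also casts before trimming for simplicity, and strips only spaces to approximate a default SQL trim().

```python
import hashlib

# Assumed placeholder for NULL fields (an assumption about the macro's behavior).
NULL_PLACEHOLDER = "_dbt_utils_surrogate_key_null_"


def surrogate_key(fields, trim=False):
    """Hypothetical model: stringify fields, join with '-', return the MD5 hex digest."""
    parts = []
    for value in fields:
        if value is None:
            parts.append(NULL_PLACEHOLDER)
            continue
        text = str(value)
        if trim:
            # Strip leading/trailing spaces only, roughly like a default SQL trim().
            text = text.strip(" ")
        parts.append(text)
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


clean = ["ACME", "42"]
padded = [" ACME  ", "42 "]

# Without trimming, whitespace alone changes the key.
print(surrogate_key(clean) == surrogate_key(padded))  # False

# With the opt-in flag, both rows hash to the same key.
print(surrogate_key(clean, trim=True) == surrogate_key(padded, trim=True))  # True
```

With `trim=False` the model leaves already-clean input untouched, which mirrors the opt-in, backward-compatible default proposed above.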
Are you interested in contributing this feature?
Yes, I am happy to implement the change, add tests, and update documentation.