Merging dbt source yaml files to preserve existing elements, add new tables, columns #1223
The data transformation tool dbt uses source.yaml files to expose, document and test raw data tables. While there is tooling that will generate fresh yaml that includes any schema changes since it last ran, the original file accumulates additional yaml elements that are typically added manually.

I have been unable to find the right combination of yq parameters to merge the old file (with accumulated nested metadata) and the new file (that has tables and columns that no longer exist). I have tried various combinations, but either the result is the same as the old file, or it is the same as the new file.
For the moment, I am not concerned with changes such as deleted tables or columns (handling those would be a great bonus).

So, for example, the original has two tables with descriptive metadata, and an update yaml file has a new table.

Appreciate any help!

original.yaml
update.yaml
merged.yaml
Replies: 1 comment 1 reply
Ok, I understand what you are trying to do. The yq merge operator, however, is a naive merge: it has no knowledge of your data structure (and does not know, for instance, that the 'name' field uniquely identifies entries). The naive merge just merges the data structures by matching on the yaml key name, and arrays are either appended together or merged together by their index position.

That said, there is a more complex merge that you can use; see the script here: https://mikefarah.gitbook.io/yq/operators/multiply-merge#merge-arrays-of-objects-together-matching-on-a-key
However, this only does a single level of merge, where you actually need two: one on the table name, and then again on the matching column. I think you can still do this in …
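To make the two-level idea concrete, here is a minimal sketch in Python rather than yq. It is not the script from the linked page, only an illustration of the same technique: lists of objects are matched on their 'name' field at every nesting level, existing values are kept, and unseen tables or columns are appended. The function names `merge_preserving` and `is_keyed_list` and the sample data are hypothetical. The sketch assumes the two yaml files have already been loaded into plain dicts and lists (for example with a YAML library, not shown here).

```python
from copy import deepcopy


def is_keyed_list(value, key):
    """True if value is a list whose items are all dicts containing `key`."""
    return isinstance(value, list) and all(
        isinstance(item, dict) and key in item for item in value
    )


def merge_preserving(old, new, key="name"):
    """Merge `new` into `old`, keeping existing values and adding missing ones."""
    if isinstance(old, dict) and isinstance(new, dict):
        merged = deepcopy(old)
        for k, new_value in new.items():
            if k in merged:
                merged[k] = merge_preserving(merged[k], new_value, key)
            else:
                merged[k] = deepcopy(new_value)
        return merged
    if is_keyed_list(old, key) and is_keyed_list(new, key):
        # Match list entries on `key` instead of on index position.
        merged = [deepcopy(item) for item in old]
        index = {item[key]: i for i, item in enumerate(merged)}
        for new_item in new:
            if new_item[key] in index:
                i = index[new_item[key]]
                merged[i] = merge_preserving(merged[i], new_item, key)
            else:
                merged.append(deepcopy(new_item))
        return merged
    # Scalars and unkeyed lists: the existing (hand-edited) value wins.
    return deepcopy(old)


# Hypothetical data shaped like a dbt sources file.
original = {"sources": [{"name": "raw", "tables": [
    {"name": "orders", "description": "Customer orders",
     "columns": [{"name": "id", "description": "Primary key", "tests": ["unique"]}]},
    {"name": "customers", "description": "Customer master data",
     "columns": [{"name": "id"}]},
]}]}
update = {"sources": [{"name": "raw", "tables": [
    {"name": "customers", "columns": [{"name": "id"}, {"name": "email"}]},
    {"name": "orders", "columns": [{"name": "id"}, {"name": "status"}]},
    {"name": "payments", "columns": [{"name": "id"}]},
]}]}

merged = merge_preserving(original, update)
tables = merged["sources"][0]["tables"]
print([t["name"] for t in tables])
```

Because matching is done on 'name' and not on position, the differing table order in the update does not matter: the descriptions and tests from the original survive, and the new table and columns are added at the end. Deleted tables or columns are not handled by this sketch.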