@@ -82,6 +82,110 @@ providers:
8282We offer a [Python provider starter project](https://github.com/apache/beam-starter-python-provider)
8383that serves as a complete example for how to do this.
8484
85+ ## YAML
86+
87+ New, reusable transforms can be defined in YAML as well.
88+ This type of provider is simply a mapping from transform names to their YAML definitions.
89+ The definitions are parameterized by applying Jinja2 templating to their
90+ string representations.
91+
92+ The `config_schema` section of the transform definition specifies which
93+ parameters are required (with their types), and the `body` section gives
94+ the implementation in terms of other YAML transforms.
95+
96+ ```
97+ - type: yaml
98+   transforms:
99+     # Define the first transform of type "RaiseElementToPower"
100+     RaiseElementToPower:
101+       config_schema:
102+         properties:
103+           n: {type: integer}
104+       body:
105+         type: MapToFields
106+         config:
107+           language: python
108+           append: true
109+           fields:
110+             power: "element ** {{n}}"
111+
112+     # Define a second transform that produces consecutive integers.
113+     Range:
114+       config_schema:
115+         properties:
116+           end: {type: integer}
117+       # Setting this parameter lets this transform type be used as a source.
118+       requires_inputs: false
119+       body: |
120+         type: Create
121+         config:
122+           elements:
123+             {% for ix in range(end) %}
124+             - {{ix}}
125+             {% endfor %}
126+ ```
127+
128+ Note that in this second example the `body` of `Range` is defined as a
129+ [block string literal](https://yaml-multiline.info/)
130+ to prevent any attempt by the system to parse the `{%` and `%}` pragmas used
131+ for control statements before a specialization with a concrete value for `end`
132+ is instantiated and the loop is expanded.
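
To make the expansion concrete: assuming a pipeline supplies `end: 3` (a value chosen here purely for illustration), the templated `body` of `Range` would render to roughly the following, ignoring any blank lines the control tags may leave behind.

```yaml
# Sketch of the rendered body of Range for the assumed value end: 3.
type: Create
config:
  elements:
    - 0
    - 1
    - 2
```

Only after this rendering step is the text valid YAML, which is why the unrendered template has to be kept as a string.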
133+
134+ These could then be used in a pipeline as
135+
136+ ```
137+ transforms:
138+   - type: Range
139+     config:
140+       end: 10
141+   - type: RaiseElementToPower
142+     input: Range
143+     config:
144+       n: 3
145+   ...
146+ ```
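
As a sketch of what the parameterization does in this usage, with `n: 3` the `body` of `RaiseElementToPower` would be instantiated roughly as shown here. The substitution of `{{n}}` follows from the definition above; how the provider represents the result internally is an assumption.

```yaml
# Sketch of the body of RaiseElementToPower after substituting n: 3.
type: MapToFields
config:
  language: python
  append: true
  fields:
    power: "element ** 3"
```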
147+
148+ One can define composite transforms as well, e.g. in a provider listing one
149+ could have
150+
151+ ```
152+ - type: yaml
153+   transforms:
154+     ConsecutivePowers:
155+       # This takes two parameters.
156+       config_schema:
157+         properties:
158+           end: {type: integer}
159+           n: {type: integer}
160+
161+       # It can be used as a source transform.
162+       requires_inputs: false
163+
164+       # The body uses the transforms defined above linked together in a chain.
165+       body: |
166+         type: chain
167+         transforms:
168+           - type: Range
169+             config:
170+               end: {{end}}
171+           - type: RaiseElementToPower
172+             config:
173+               n: {{n}}
174+ ```
175+
176+ which allows one to use this whole fragment as
177+
178+ ```
179+ type: ConsecutivePowers
180+ config:
181+   end: 10
182+   n: 3
183+ ```
184+
185+ Note that YAML-defined transforms work better in a listing file than directly
186+ in the `providers` block of a pipeline file. Pipeline files are themselves always
187+ pre-processed with Jinja2, so inline definitions would require double escaping.
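
One way the double escaping could look is sketched here, assuming the standard Jinja2 `{% raw %}` block is used to protect the inner placeholder. This inline form is illustrative only and is not taken from the text above; in a listing file the plain `{{n}}` is enough.

```yaml
# Hypothetical inline definition inside a pipeline file.
providers:
  - type: yaml
    transforms:
      RaiseElementToPower:
        config_schema:
          properties:
            n: {type: integer}
        body:
          type: MapToFields
          config:
            language: python
            append: true
            fields:
              # The raw block keeps {{n}} intact during pipeline-level
              # templating so the provider can substitute it later.
              power: "element ** {% raw %}{{n}}{% endraw %}"
```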
188+
85189## YAML Provider listing files
86190
87191One can reference an external listing of providers in the YAML pipeline file