[YAML] - Jinja % include example#35914
Conversation
|
Run Python PreCommit 3.10 |
|
Run Python PreCommit 3.11 |
1 similar comment
|
Run Python PreCommit 3.11 |
|
assign set of reviewers |
|
Assigning reviewers: R: @liferoad for label python. Note: If you would like to opt out of this review, comment Available commands:
The PR bot will only process comments in the main thread (not review comments). |
|
Love the jinia idea for include. :) |
|
Run Python_Transforms PreCommit 3.10 |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #35914 +/- ##
=========================================
Coverage 56.67% 56.67%
Complexity 3380 3380
=========================================
Files 1219 1219
Lines 184561 184565 +4
Branches 3507 3507
=========================================
+ Hits 104595 104605 +10
+ Misses 76639 76633 -6
Partials 3327 3327
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Run Python PreCommit 3.12 |
damccorm
left a comment
There was a problem hiding this comment.
Thanks for putting this together!
| - [wordCount.yaml](#TODO: pending) | ||
|
|
||
| % include: | ||
| - [wordCount.yaml](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/examples/transforms/jinja/include/wordCount.yaml) |
There was a problem hiding this comment.
Should this be wordCountInclude.yaml?
There was a problem hiding this comment.
It would also be good to give an example of invoking this (and how you would pass in appropriate parameters)
There was a problem hiding this comment.
- I had it here before, but I moved it to the other Readme. Will update name.
- Similar pattern that Charles used on some of his ML work with Readme usage more in the applicable folder and not in the main one since multiple folders will be under Jinja.
There was a problem hiding this comment.
Similar pattern that Charles used on some of his ML work with Readme usage more in the applicable folder and not in the main one since multiple folders will be under Jinja.
I missed that Readme entirely when I did my review. SGTM
There was a problem hiding this comment.
Maybe we could just link to that folder or the Readme here then? I worry that it will be easy to miss if you just follow the link (basically how I reviewed this)
| inputs from the user through `% include`, `% import`, and inheritance | ||
| directives. | ||
|
|
||
| % import: |
There was a problem hiding this comment.
Should these be code blocks? Also looks like some pieces are still missing, we should probably not include them for now
There was a problem hiding this comment.
It was more of a heading and not usable code. Will try to rethink this.
| - name: Read from GCS | ||
| type: ReadFromText | ||
| config: | ||
| {% include 'apache_beam/yaml/examples/transforms/jinja/include/submodules/readFromText.yaml' %} |
There was a problem hiding this comment.
Rather than just including the config, could we maybe map in whole transforms for simple ones like this?
Since this is an example, we should also explain what we're doing in comments
There was a problem hiding this comment.
Practically, I don't think it would be common to include just the config like this, especially if it is very simple (and also parameterized)
There was a problem hiding this comment.
I'd also recommend using more descriptive file names. So for examples:
readFromTextTransform.yaml (assuming we put the whole transform there)
or
mapToFieldsCountConfig.yaml (assuming we keep that one as config since it is more complicated)
There was a problem hiding this comment.
- We generally don't want to put the whole transform here because it limits us from changing names or inputs. Wanted to give an example of how to use. How about a mix? Some transforms, some configs only?
- Ok on better names.
There was a problem hiding this comment.
We generally don't want to put the whole transform here because it limits us from changing names or inputs.
I don't follow - could you give an example of someething we'd want to change but wouldn't be able to?
How about a mix? Some transforms, some configs only?
This SGTM - I think the more complicated configs make sense, the one liners seem less useful
There was a problem hiding this comment.
I don't follow - could you give an example of someething we'd want to change but wouldn't be able to?
Maybe a mute point now with both options, but if we have the whole transform in the include yaml. We can't change the name or input paths etc unless we pass even more stuff through the jinjia variables. So the option I went with was to give an example where a user could control or orient the pipeline graph from the main yaml file and then include what configs they needed from a library. When we push everything to a library, we lose that control, but maybe that is fine.
There was a problem hiding this comment.
Updated, thanks.
|
/gemini review |
|
I am trying to test how good Gemini knows our YAML. The review comments could be skipped. :) |
There was a problem hiding this comment.
Code Review
This pull request introduces an example for using Jinja's % include directive in Beam YAML pipelines. The changes include new example files, corresponding tests, and documentation updates. While the core functionality is a valuable addition, there are several issues that need to be addressed. These include broken commands and incorrect links in the documentation, fragile test logic, and potential blockers in the test data, such as a reference to a missing file. Addressing these points will improve the quality and usability of the new example.
sdks/python/apache_beam/yaml/examples/transforms/jinja/include/README.md
Outdated
Show resolved
Hide resolved
.../python/apache_beam/yaml/examples/transforms/jinja/include/submodules/mapToFields_split.yaml
Outdated
Show resolved
Hide resolved
sdks/python/apache_beam/yaml/examples/transforms/jinja/include/README.md
Outdated
Show resolved
Hide resolved
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new example demonstrating the use of Jinja's % include directive in Beam YAML pipelines. The changes include the new example files, corresponding tests, and updates to documentation. My review focuses on improving code clarity, fixing a bug in the documentation, and enhancing the maintainability of the new test code and YAML examples. Overall, this is a great addition that showcases a powerful feature of Beam YAML.
sdks/python/apache_beam/yaml/examples/transforms/jinja/include/README.md
Show resolved
Hide resolved
sdks/python/apache_beam/yaml/examples/transforms/jinja/include/submodules/combineTransform.yaml
Outdated
Show resolved
Hide resolved
sdks/python/apache_beam/yaml/examples/transforms/jinja/include/wordCountInclude.yaml
Show resolved
Hide resolved
|
Run Prism_Python PreCommit 3.12 |
|
Run Python_Integration PreCommit 3.9 |
|
Run Python_Transforms PreCommit 3.10 |
|
Run Python_ML PreCommit 3.10 |
damccorm
left a comment
There was a problem hiding this comment.
This LGTM. There are failing precommits which look to me like flakes, I'm going to try to run them again and see if they pass before merging
#35909
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.