Fix Jinja template for YAML generation #349

Open
amc-corey-cox wants to merge 2 commits into main from jinja_update

@amc-corey-cox
Collaborator

Summary

Hey Sabrina! Great work getting the Jinja template set up — you were really close. I just cleaned up a few things that are easy to miss when you're first working with Jinja:

Template file (yaml_measobs.j2):

  • Tabs vs spaces in YAML: The {% if %} blocks and some of the lines underneath them were indented with tabs, while the rest of the file used spaces. YAML is very picky about this: tab characters aren't allowed for indentation at all, so those lines break parsing. I re-indented everything with spaces to match the rest of the file.

  • Escaping literal curly braces: The output YAML needs actual { and } characters in expressions like expr: {phv00123} * 365. But Jinja uses {{ }} for its own variable substitution, so a literal brace sitting right next to a placeholder gets read as part of Jinja's own syntax. The fix is to tell Jinja to output a literal brace using {{ '{' }} and {{ '}' }}. Weird-looking but it works!

  • Indentation of unit_conversion and unit/range lines: These were a bit off from where they need to land in the output — I lined them up to match the existing good output files.
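
For reference, here is a minimal sketch of the brace-escaping trick on its own. The one-line template and the variable name phv are stand-ins for illustration, not the actual contents of yaml_measobs.j2:

```python
from jinja2 import Template

# Hypothetical one-line template: {{ '{' }} and {{ '}' }} emit literal braces,
# while {{ phv }} is an ordinary Jinja placeholder.
template = Template("expr: {{ '{' }}{{ phv }}{{ '}' }} * 365")

# Fill in the placeholder; the literal braces end up around the value.
rendered = template.render(phv="phv00123")
print(rendered)
```

Jinja's {% raw %} ... {% endraw %} block is another way to get literal braces, but the string-literal form is handy when the brace has to sit directly against a placeholder.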

Python script (section 2B):

  • Template filename: The get_template() call referenced "yaml_measob.txt" but the file is actually named "yaml_measobs.j2" (with the s, and .j2 extension).

  • trim_blocks and lstrip_blocks: Without these two settings on the Jinja Environment, every {% if %} / {% elif %} / {% endif %} line in the template produces a blank line in the output. Adding trim_blocks=True, lstrip_blocks=True tells Jinja to swallow those control lines silently.

  • Building the context inside the row loop: The context dictionary was defined once, outside the loop, referencing variables (phv, bdchm_entity, etc.) that only existed inside the earlier section 2 loop. The fix is to build it from each row as you iterate. Since your template variables already match the CSV column names, we can just pass each row directly with row.to_dict() — no manual mapping needed!

  • Rendering the template: yaml.write(context) was handing the raw Python dictionary to write() instead of rendered text. The Jinja way is template.render(**row.to_dict()), which fills in all the {{ }} placeholders and returns the finished string, and that string is what gets written to the file.
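
To make the script-side changes concrete, here is a rough sketch of how the pieces fit together. Everything in it is simplified for illustration: the template is a cut-down stand-in loaded from a string via DictLoader rather than from disk, the field names (name, unit_case, target_unit) and row values are made up, and plain dicts stand in for what row.to_dict() would return in the real loop.

```python
from jinja2 import DictLoader, Environment

# Hypothetical, cut-down stand-in for yaml_measobs.j2 (the real file has more fields).
TEMPLATE = (
    "name: {{ name }}\n"
    "  {% if unit_case == 'unit_convert' %}\n"
    "  unit_conversion: {{ target_unit }}\n"
    "  {% else %}\n"
    "  unit: {{ target_unit }}\n"
    "  {% endif %}\n"
)
loader = DictLoader({"yaml_measobs.j2": TEMPLATE})

# trim_blocks drops the newline after a {% ... %} tag; lstrip_blocks drops the
# whitespace in front of it, so control lines leave no trace in the output.
env = Environment(loader=loader, trim_blocks=True, lstrip_blocks=True)
template = env.get_template("yaml_measobs.j2")

# Made-up rows; in the real script each one would come from row.to_dict().
rows = [
    {"name": "bdy_wgt", "unit_case": "unit_convert", "target_unit": "kg"},
    {"name": "albumin_bld", "unit_case": "unit_match", "target_unit": "g/dL"},
]

# Build the context per row, inside the loop, and render it to a string.
rendered = [template.render(**row) for row in rows]

# Same template without the two settings, to show the stray whitespace lines.
plain_env = Environment(loader=loader)
plain = plain_env.get_template("yaml_measobs.j2").render(**rows[0])

for text in rendered:
    print(text)
```

Each string in rendered is what would be written to the output file for that row; the plain variant is only there to compare against.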

I tested the output against the existing ARIC reference files (bdy_wgt for unit_convert, albumin_bld for unit_match) and it matches.

Test plan

  • Run the script end-to-end on the ARIC MeasurementObservation data
  • Diff a few output files against the existing good output to confirm they match
  • Spot-check a variable that uses the unit_expr path (if one exists in the current data)
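
If it helps with the diff step, here is one possible way to compare a generated file against a reference using only the standard library. The helper name diff_against_reference and the sample strings are hypothetical; in practice the two strings would be read from the generated and reference files.

```python
import difflib


def diff_against_reference(generated: str, reference: str) -> list[str]:
    """Return unified-diff lines; an empty list means the texts match."""
    return list(
        difflib.unified_diff(
            reference.splitlines(),
            generated.splitlines(),
            fromfile="reference",
            tofile="generated",
            lineterm="",
        )
    )


# Made-up sample content, just to show both outcomes.
reference_text = "name: example\n  unit: g/dL\n"
matching = diff_against_reference(reference_text, reference_text)
mismatch = diff_against_reference("name: example\n\n  unit: g/dL\n", reference_text)

print("match" if not matching else "\n".join(matching))
print("\n".join(mismatch))
```

A stray blank line from an untrimmed {% if %} shows up as an added line in the diff, which makes this a quick check that trim_blocks and lstrip_blocks are doing their job.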
