Add support for full recipe loading #4661
therazix wants to merge 2 commits into fvagner-conversion-methods from
Code Review
This pull request implements full recipe loading, serializing the complete plan state into a recipe file to enable reproducible runs. A key change involves dynamically deriving the discover-phase attribute from test paths. However, a critical security vulnerability has been identified: the implementation of test path recreation in the discover step is susceptible to Path Traversal. The path attribute from the recipe is used to construct filesystem paths for directory creation without adequate sanitization, potentially allowing an attacker to create directories outside the intended workdir using .. sequences. This requires remediation by validating that the resulting paths do not escape the intended base directory. Additionally, there are a couple of suggestions to improve code robustness and clarity in tmt/recipe.py.
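The containment check this review asks for can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `resolve_inside` and its arguments are hypothetical, and it only shows the general "resolve, then verify the result is still under the base directory" technique.

```python
from pathlib import Path


def resolve_inside(base: Path, untrusted: str) -> Path:
    """Join ``untrusted`` onto ``base`` and refuse results outside ``base``.

    Hypothetical helper for illustration; not part of tmt.
    """
    base = base.resolve()
    # Strip leading slashes so an absolute input cannot replace ``base``.
    candidate = (base / untrusted.lstrip('/')).resolve()
    # ``relative_to`` raises ValueError when ``candidate`` is not under ``base``.
    candidate.relative_to(base)
    return candidate
```

With this sketch, a path such as `/tests/smoke` stays under the base directory, while `../outside` raises `ValueError` before any directory is created.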
tmt/recipe.py
Outdated
def from_serialized(cls, serialized: dict[str, Any], logger: Logger) -> '_RecipeTest':  # type: ignore[override]
    raw_checks = serialized.pop('check', [])
    serialized['check'] = []
    recipe_test = super().from_serialized(serialized)
    recipe_test.check = [Check.from_spec(check, logger) for check in raw_checks]
    return recipe_test
This method modifies the serialized dictionary in-place, which can be an unexpected side effect. To improve encapsulation and prevent potential issues, create a copy of the dictionary to work with.
- def from_serialized(cls, serialized: dict[str, Any], logger: Logger) -> '_RecipeTest':  # type: ignore[override]
-     raw_checks = serialized.pop('check', [])
-     serialized['check'] = []
-     recipe_test = super().from_serialized(serialized)
-     recipe_test.check = [Check.from_spec(check, logger) for check in raw_checks]
-     return recipe_test
+ def from_serialized(cls, serialized: dict[str, Any], logger: Logger) -> '_RecipeTest':  # type: ignore[override]
+     serialized = serialized.copy()
+     raw_checks = serialized.pop('check', [])
+     recipe_test = super().from_serialized(serialized)
+     recipe_test.check = [Check.from_spec(check, logger) for check in raw_checks]
+     return recipe_test
worth documenting if expected
3c63f94 to 18b7353
thrix left a comment
The full recipe loading direction is good. Three findings, one requiring a change:
- Run.environment silently ignores CLI --environment when recipe is loaded: The recipe env unconditionally overrides CLI options. New --environment overrides on replay are silently lost. This should merge recipe env with CLI env, letting CLI take precedence.
- link field type mismatch after unserialization removal: Raw data stored where Links object is expected. Not an active runtime bug but incorrect typing that could break on re-serialization paths.
- Unrelated schema change: display-guest in report/display.yaml should be split out.
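The merge order requested in the first finding can be sketched like this. The function name `merge_environment` is hypothetical and plain dictionaries stand in for tmt's environment type; the sketch only shows the precedence rule, not where it would live in tmt/base/core.py.

```python
def merge_environment(recipe_env: dict[str, str], cli_env: dict[str, str]) -> dict[str, str]:
    """Combine both sources; hypothetical helper, not tmt's API."""
    # Later keys win in a dict merge, so CLI values override recipe values
    # while recipe-only variables are preserved.
    return {**recipe_env, **cli_env}


recipe_env = {'FOO': 'from-recipe', 'KEEP': 'yes'}
cli_env = {'FOO': 'bar'}
merged = merge_environment(recipe_env, cli_env)
```

Here `merged['FOO']` comes from the command line and `merged['KEEP']` from the recipe, which is the behavior the review describes as "letting CLI take precedence".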
Besides the other comments, the code appears to assume the recipe file is well-formed. Would you consider adding some validation for the provided recipe file, say, a schema file?
@therazix please, set the "Size" of this PR.
thrix left a comment
Review: Add support for full recipe loading
Good progress on extending recipe loading from discover-only to all plan steps. The environment simplification and removal of discover_phase are clean.
Issues
Blocking:
- CLI environment silently ignored (tmt/base/core.py): The early return bypasses _environment_from_cli, so --environment FOO=bar is silently dropped when using --recipe. See inline comment for suggested fix.
Latent bug:
- _RecipeTest.link unserialization removed (tmt/recipe.py): Works today because all test links are [] (falsy short-circuit), but would raise AttributeError on any recipe with non-empty test links. See inline comment.
Hygiene:
- Unrelated schema change (tmt/schemas/report/display.yaml): The display-guest addition is not related to recipe loading and should be a separate commit/PR.
- PR checklist: All items are unchecked; docs, spec, schema, version, release note still needed.
Verified non-issues
- Path traversal in discover_from_recipe: The relative_to() + resolve() + parent check is sufficient. The gemini-code-assist security warning is overblown.
- Removed "non-existent plan" test: Correct. With tree.children.clear(), the tree IS the recipe, so the old error case no longer applies.
- Only saving _environment_from_fmf in _RecipePlan: Reasonable. _RecipeRun.environment captures the full merged env, and intrinsics should be regenerated per run.
Generated-by: Claude Code
tmt/recipe.py
Outdated
link: Optional['tmt.base.core.Links'] = field(
    serialize=lambda value: value.to_spec() if value else [],
    unserialize=lambda value: _unserialize_links(value),
)
Latent bug: the unserialize callback was removed, so when loading from YAML via from_serialized(), the raw value (list of dicts) is stored instead of a Links object.
This is fine when link is [] (falsy — the serialize callback short-circuits to []), which is the case for all current test data. However, if a recipe test ever has a non-empty link (e.g., [{"relates": "..."}]), the serialize callback lambda value: value.to_spec() if value else [] would call .to_spec() on a raw list, causing an AttributeError.
Either restore the unserialize callback or handle raw data in the serialize callback.
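The second option, handling raw data in the serialize callback, could look roughly like the sketch below. The `Links` class here is a hypothetical stand-in written only for this example (tmt's real class is not reproduced), and `serialize_link` is an assumed name.

```python
from typing import Union


class Links:
    """Hypothetical stand-in for tmt's Links class, for illustration only."""

    def __init__(self, raw: list[dict[str, str]]) -> None:
        self._raw = list(raw)

    def to_spec(self) -> list[dict[str, str]]:
        return list(self._raw)


def serialize_link(value: Union[Links, list[dict[str, str]], None]) -> list[dict[str, str]]:
    """Serialize either a Links object or raw data loaded from YAML."""
    if not value:
        return []
    if isinstance(value, Links):
        return value.to_spec()
    # Raw list of dicts stored without an unserialize step: pass it through
    # instead of calling .to_spec() on a plain list.
    return list(value)
```

A plain list has no `to_spec()` method, which is why the unguarded lambda would fail on non-empty raw links; the `isinstance` branch avoids that call.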
This has been addressed. serialize/unserialize has been replaced with to_spec/from_spec.
18b7353 to 4201f23
452c14b to 7165403
7165403 to 8360382
8810887 to 27d199a
@therazix would it be possible to provide a solid MR description for the changes, so they are easier to follow? For example, I am looking at this diff, and I would like to understand why it was changed. I would expect the description to mention that this is one of the improvements made to support full recipe loading (or something similar).
    return spec

def to_minimal_spec(self) -> tmt.steps._RawStepData:
I think even functions like these would deserve comments ...
tmt/steps/provision/connect.py
Outdated
'hard-reboot': str(self.systemd_soft_reboot)
    if isinstance(self.systemd_soft_reboot, ShellScript)
    else None,
hard-reboot referencing systemd_soft_reboot does not sound right
tmt/steps/provision/connect.py
Outdated
for key, transform in field_map.items():
    value = getattr(self, key, None)
field_map keys use hyphens, but this is accessing attributes of self, which use underscores?
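One way to reconcile the two naming styles is to translate each key before the attribute lookup. The sketch below is an assumption about how that could be done, not the fix adopted in the PR: `collect_fields` is a hypothetical helper, and `SimpleNamespace` stands in for the real step data class.

```python
from types import SimpleNamespace
from typing import Any, Callable


def collect_fields(obj: object, field_map: dict[str, Callable[[Any], Any]]) -> dict[str, Any]:
    """Build a spec dict from hyphenated keys; hypothetical helper."""
    spec: dict[str, Any] = {}
    for key, transform in field_map.items():
        # Spec keys use hyphens; Python attribute names use underscores.
        value = getattr(obj, key.replace('-', '_'), None)
        if value is not None:
            spec[key] = transform(value)
    return spec


# Stand-in object with one attribute set and one missing.
data = SimpleNamespace(soft_reboot='reboot-command')
spec = collect_fields(data, {'soft-reboot': str, 'hard-reboot': str})
```

Without the `replace('-', '_')` step, `getattr(obj, 'soft-reboot', None)` would always return `None`, so every mapped field would be silently dropped.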
27d199a to 62ddebc
Two observations, and I'm open to discussion:
62ddebc to b430111
b430111 to 207f6c7
I created a standalone PR #4727.
I agree. I tried to get rid of them in 207f6c7.
If I understand it correctly, a step is only enabled when its name is in |
This PR implements a full recipe loading feature. All phases can now be loaded directly from the recipe. Serialization and deserialization were replaced with to_spec/from_spec methods, and the generated recipe will now contain only non-empty values to reduce its size.
Resolves: #4531
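The "only non-empty values" behavior can be pictured with a small filter like the one below. This is a sketch under assumptions: `drop_empty` is a hypothetical name, and which values tmt actually treats as empty is not stated in this PR description, so the rule here (None, empty string, empty list, empty dict) is a guess for illustration.

```python
from typing import Any


def drop_empty(spec: dict[str, Any]) -> dict[str, Any]:
    """Remove keys whose values carry no information; hypothetical helper."""
    # False and 0 are kept on purpose: they are meaningful settings,
    # not missing data.
    return {
        key: value
        for key, value in spec.items()
        if value is not None and value != '' and value != [] and value != {}
    }


full = {'name': '/test/smoke', 'link': [], 'environment': {}, 'summary': None, 'enabled': False}
compact = drop_empty(full)
```

In this example `compact` keeps `name` and `enabled` and omits the three empty fields, which is how a recipe file would shrink.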
Pull Request Checklist