fix: remove nested split directory artifacts on parent merge #2143
Acuspeedster wants to merge 1 commit into oscal-compass:develop
…ompass#2136)

When merging a non-collection element (e.g. 'trestle merge -e catalog.metadata'), the element may have been split further after its initial split, leaving a sibling directory tree alongside the .json file:

    catalog/metadata.json    <- the target file (removed by existing code)
    catalog/metadata/        <- sibling directory with nested splits (NOT removed: the bug)
        roles/00000__role.json
        responsible-parties/...

The merge correctly loaded all nested content via ModelUtils.load_distributed() and wrote it back into the parent (catalog.json), but only removed the .json file. The sibling directory was left behind, creating an inconsistent workspace state that confused follow-up split/merge operations.

Fix: after removing the target .json file, also remove the sibling directory when it exists and differs from target_model_filename (i.e. we are in the non-collection case where the .json file was found). Collection cases (where target_model_filename is the directory itself) are unaffected since target_model_path == target_model_filename.

Also remove the FIXME and TODO comments that acknowledged this gap.

Tests:
- Update test_merge_expanded_metadata_into_catalog to expect the new RemovePathAction for the sibling metadata/ directory in the generated plan.
- Add test_merge_cleans_nested_split_dirs_on_parent_merge, which exercises the exact scenario from oscal-compass#2136 end-to-end and asserts that catalog/metadata/ does not exist after 'trestle merge -e catalog.metadata'.

Closes oscal-compass#2136

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>
Since I opened this issue and am currently planning to work on implementing the fix, it would be better to avoid duplicate efforts. Could you please close this PR for now?
Thanks for opening the issue.
Hi @Jay2006sawant, thanks again for opening the issue and taking the time to investigate the bug. Since the fix is already implemented in #2143, it would be great to collaborate on it rather than duplicate work. If you'd like, you're very welcome to contribute directly to the PR as well: you can push improvements or refinements to the branch, and we can iterate on the implementation together. I'm happy to incorporate your ideas and make sure you're credited on the PR for the work you add. Open source works best when we build things together, so it would be great to work on this together and get the best possible fix in for the maintainers to review. Feel free to push changes, leave suggestions, or review the PR. Looking forward to collaborating! 🙂
Hi @Acuspeedster, thanks for working on the fix. I'll go through the PR and share any feedback if I find something that could improve it.
Description
fix: remove nested split directory artifacts on parent merge (#2136)
closes #2136
- `trestle/core/commands/merge.py`: when merging a non-collection element (e.g. `trestle merge -e catalog.metadata`), also remove the sibling directory (e.g. `catalog/metadata/`) that may contain nested split artifacts from deeper splits of that element. Previously, only the `.json` file was removed, leaving stale files/directories behind and creating an inconsistent workspace state.
- Remove the `# FIXME` and `# TODO` comments in `merge.py` that acknowledged this gap.
- `tests/trestle/core/commands/merge_test.py`: update `test_merge_expanded_metadata_into_catalog` to expect the new `RemovePathAction` for the sibling directory in the generated plan.
- Add `test_merge_cleans_nested_split_dirs_on_parent_merge`, an end-to-end regression test that exercises the exact scenario from #2136 (Parent merge leaves nested split artifacts in catalog metadata) and asserts that `catalog/metadata/` is gone after `trestle merge -e catalog.metadata`.
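
The cleanup rule described in this PR can be illustrated with a small standalone sketch. This is not the code in `merge.py`: `paths_to_remove` is a hypothetical helper written only for illustration, and it returns plain paths rather than building trestle `RemovePathAction` objects. The parameter names `target_model_path` and `target_model_filename` are borrowed from the PR text, on the assumption that the former is the extension-less element path and the latter is the path the merge actually found on disk.

```python
from pathlib import Path
from typing import List


def paths_to_remove(target_model_path: Path, target_model_filename: Path) -> List[Path]:
    """Return the paths a merge should delete.

    Hypothetical helper for illustration only; not part of trestle's API.
    """
    # The file (or, for collections, the directory) that was merged is always removed.
    removals = [target_model_filename]

    # Non-collection case: a .json file was found, so the extension-less path
    # differs from it. If that path exists as a directory, it holds nested
    # split artifacts and has to be removed as well.
    if target_model_path != target_model_filename and target_model_path.is_dir():
        removals.append(target_model_path)

    return removals
```

In the collection case both arguments point at the same directory, so the extra check never fires. In the once-split case the `is_dir()` guard keeps the helper from scheduling a removal for a directory that does not exist.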
Types of changes
develop->main)
Quality assurance (all should be covered).
Summary
Root cause
In `MergeCmd.merge()`, when merging a non-collection element like `catalog.metadata`, the code found and loaded `catalog/metadata.json` (and all its nested split content via `load_distributed`), wrote the merged result back into `catalog.json`, then issued a `RemovePathAction` for `catalog/metadata.json` only.
If the user had run further splits after the initial split (e.g. split `metadata.roles`, producing `catalog/metadata/roles/*.json`), the sibling directory `catalog/metadata/` was left untouched. The `# FIXME` comment in the original code acknowledged this.
Fix
After removing the target `.json` file, check whether a sibling directory with the same stem exists (`target_model_path.is_dir()`) and, if so, add a second `RemovePathAction` for it. The condition `target_model_path != target_model_filename` ensures this only fires in the non-collection case (where the `.json` file was found), leaving collection merges (`metadata.roles` etc.) unaffected, since `target_model_filename` is already the directory in those cases.
Edge cases verified
- `back-matter.json`, no `back-matter/`: the `is_dir()` guard prevents a spurious action
- `metadata.json` + `metadata/`: `metadata/` is removed as well
- `metadata.roles` → directory only: unaffected (`target_model_path == target_model_filename`)
- `catalog.*`
Tests
14 merge tests pass. One pre-existing failure, `test_merge_deletes_folders`, is caused by a space in the local workspace path and is unrelated to this PR: it fails identically on a clean `develop`.
Key links:
Before you merge