Skip to content

fix: remove nested split directory artifacts on parent merge#2143

Open
Acuspeedster wants to merge 1 commit intooscal-compass:developfrom
Acuspeedster:fix/nested-split-cleanup-on-merge-2136
Open

fix: remove nested split directory artifacts on parent merge#2143
Acuspeedster wants to merge 1 commit intooscal-compass:developfrom
Acuspeedster:fix/nested-split-cleanup-on-merge-2136

Conversation

@Acuspeedster
Copy link
Contributor

Description

fix: remove nested split directory artifacts on parent merge (#2136)
closes #2136

  • trestle/core/commands/merge.py: when merging a non-collection element
    (e.g. trestle merge -e catalog.metadata), also remove the sibling directory
    (e.g. catalog/metadata/) that may contain nested split artifacts from deeper
    splits of that element. Previously, only the .json file was removed, leaving
    stale files/directories behind and creating an inconsistent workspace state.
  • Remove the # FIXME and # TODO comments in merge.py that acknowledged this gap.
  • tests/trestle/core/commands/merge_test.py: update test_merge_expanded_metadata_into_catalog
    to expect the new RemovePathAction for the sibling directory in the generated plan.
  • Add test_merge_cleans_nested_split_dirs_on_parent_merge end-to-end regression
    test that exercises the exact scenario from Parent merge leaves nested split artifacts in catalog metadata #2136 and asserts catalog/metadata/ is
    gone after trestle merge -e catalog.metadata.

Closes #2136

Types of changes

  • Hot fix (emergency fix and release)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation (change which affects the documentation site)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Release (develop -> main)

Quality assurance (all should be covered).

  • My code follows the code style of this project.
  • Documentation for my change is up to date?
  • My PR meets testing requirements.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary

Root cause

In MergeCmd.merge(), when merging a non-collection element like catalog.metadata, the code found and loaded catalog/metadata.json (and all its nested split content via load_distributed), wrote the merged result back into catalog.json, then issued a RemovePathAction for catalog/metadata.json only.

If the user had run further splits after the initial split (e.g. split metadata.roles producing catalog/metadata/roles/*.json), the sibling directory catalog/metadata/ was left untouched. The # FIXME comment in the original code acknowledged this:

# FIXME this will delete metadata.json but it will leave metadata/roles/roles.*
# need to clean up all lower dirs

Fix

After removing the target .json file, check whether a sibling directory with the same stem exists (target_model_path.is_dir()) and, if so, add a second RemovePathAction for it. The condition target_model_path != target_model_filename ensures this only fires in the non-collection case (where the .json file was found), leaving collection merges (metadata.roles etc.) unaffected since target_model_filename is already the directory in those cases.

Edge cases verified

Scenario Before After
Non-collection merge with no sibling dir (back-matter.json, no back-matter/) ✅ correct ✅ correct (is_dir() guard prevents spurious action)
Non-collection merge with sibling dir (metadata.json + metadata/) ❌ leaves metadata/ ✅ removes both
Collection merge (metadata.roles → directory only) ✅ correct ✅ unaffected (target_model_path == target_model_filename)
Wildcard merge (catalog.*) ✅ correct ✅ unaffected (returns early in wildcard branch)

Tests

14 merge tests pass (1 pre-existing failure test_merge_deletes_folders is caused by a space in the local workspace path and is unrelated to this PR it fails identically on clean develop).

Key links:

Before you merge

  • Ensure it is a 'squash commit' if not a release.
  • Ensure CI is currently passing
  • Check sonar. If you are working for a fork a maintainer will reach out, if required.

…ompass#2136)

When merging a non-collection element (e.g. 'trestle merge -e catalog.metadata'),
the element may have been split further after its initial split, leaving a sibling
directory tree alongside the .json file:

  catalog/metadata.json      <- the target file (removed by existing code)
  catalog/metadata/          <- sibling directory with nested splits (NOT removed: the bug)
    roles/00000__role.json
    responsible-parties/...

The merge correctly loaded all nested content via ModelUtils.load_distributed()
and wrote it back into the parent (catalog.json), but only removed the .json file.
The sibling directory was left behind, creating an inconsistent workspace state that
confused follow-up split/merge operations.

Fix: after removing the target .json file, also remove the sibling directory when it
     exists and differs from target_model_filename (i.e. we are in the non-collection
     case where the .json file was found). Collection cases (where target_model_filename
     is the directory itself) are unaffected since target_model_path == target_model_filename.

Also remove the FIXME and TODO comments that acknowledged this gap.

Tests:
- Update test_merge_expanded_metadata_into_catalog to expect the new RemovePathAction
  for the sibling metadata/ directory in the generated plan.
- Add test_merge_cleans_nested_split_dirs_on_parent_merge which exercises the exact
  scenario from oscal-compass#2136 end-to-end and asserts that catalog/metadata/ does not exist
  after 'trestle merge -e catalog.metadata'.

Closes oscal-compass#2136

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>
@Acuspeedster Acuspeedster requested a review from a team as a code owner March 11, 2026 17:20
@Acuspeedster
Copy link
Contributor Author

@Jay2006sawant
Copy link

Since I opened this issue and am currently planning to work on implementing the fix, it would be better to avoid duplicate efforts. Could you please close this PR for now?

@Acuspeedster
Copy link
Contributor Author

Acuspeedster commented Mar 11, 2026

Since I opened this issue and am currently planning to work on implementing the fix, it would be better to avoid duplicate efforts. Could you please close this PR for now?

Thanks for opening the issue.
Since the issue didn’t appear to be assigned or claimed and the fix looked relatively small, I went ahead and implemented it and opened this PR for maintainers to review.
Of course, I’m happy to follow whatever technical reviews the maintainers give and I am open to your review as well.
Thanks Again.

@Acuspeedster
Copy link
Contributor Author

Since I opened this issue and am currently planning to work on implementing the fix, it would be better to avoid duplicate efforts. Could you please close this PR for now?

Hi @Jay2006sawant,

Thanks again for opening the issue and taking the time to investigate the bug. Since the fix is already implemented in #2143, it might be great if we collaborate on it rather than duplicate work.

If you'd like, you’re very welcome to contribute directly to the PR as well you can push improvements or refinements to the branch, and we can iterate on the implementation together. I’m happy to incorporate your ideas and make sure you’re credited on the PR for the work you add.

Open source works best when we build things together, so it would be great to co-work on this and get the best possible fix in for the maintainers to review.

Feel free to push changes, leave suggestions, or review the PR looking forward to collaborating! 🙂

@Jay2006sawant
Copy link

Hi @Acuspeedster, tthanks for working on the fix.

I’ll go through the PR and share any feedback if I find something that could improve it.
Happy to collaborate on it .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Parent merge leaves nested split artifacts in catalog metadata

2 participants