Skip to content

Conversation

@sei-vsarvepalli
Copy link
Contributor

Added extension methods to support export of SSVC Selection data into schema supported format.

This pull request introduces enhancements to the serialization logic for selection models in src/ssvc/selection.py. The changes ensure that dumped data is consistently formatted, with empty or invalid values removed and timestamps properly formatted in JSON outputs.

Key improvements to serialization and data cleanup:

  • Added a _post_process method to ensure all Selection.values are lists, remove empty or falsy items, and clean up empty array fields from the top-level dictionary.
  • Overrode the model_dump and model_dump_json methods to apply the post-processing step before returning data, and to format timestamps as UTC RFC3339 strings in JSON output.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds post-processing and serialization enhancements for SSVC selection models to produce cleaner, schema-ready dumps and JSON output with RFC3339 UTC timestamps.

  • Introduces a _post_process step to normalize selection values and strip empty arrays.
  • Overrides model_dump and model_dump_json to apply post-processing and format timestamps.

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@ahouseholder ahouseholder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Neither the issue #996 nor the PR explain what specific problems this PR is intended to solve. The PR description says what changes are here, but not why.
  • There are no tests for the newly added functionality. This would also help show what the expected behavior is (i.e., what's getting eliminated by _post_process()?) → src/test/test_selections.py
  • I have inline comments/questions too

fix_selection(sel) for sel in data["selections"] if sel
]
# Remove empty array fields from the top level
keys_to_delete = [k for k, v in data.items() if isinstance(v, list) and not v]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we eliminate the need for the keys_to_delete loop by using exclude_none=True or exclude_if = lambda: ... in the class definition?

"""
Ensures all Selection.values are lists and removes empty array elements.
"""
def fix_selection(selection):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what this is doing, but I don't understand why we need it? Is there an example of a Selection object that needs this treatment? How would that get created? If so, can we put that into a unit test to demonstrate the need for this method?

model_dump_kwargs = kwargs.copy()
json_kwargs = {}
# List of json.dumps kwargs you want to support
json_kwarg_names = ['indent', 'sort_keys', 'separators', 'ensure_ascii']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to be picky about this instead of just passing **kwargs along? (If we do it this way, then we're breaking the expectation that model_dump_json passes kwargs through.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh will get rid of this for something simpler

return json.dumps(data, **{k: v for k, v in kwargs.items() if k in json.dumps.__code__.co_varnames})

data = self._post_process(data)
# Format timestamp as UTC RFC3339 string
if "timestamp" in data and isinstance(data["timestamp"], datetime):
ts = data["timestamp"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this in a field serializer instead? If so, I think this would let us avoid having to override model_dump_json() entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to standardize UTC conversion. I think it is not necessary. People may want to keep Timezone. I will get rid of it.

@sei-vsarvepalli
Copy link
Contributor Author

  1. The specific need being addressed in this PR - is that for an organization that wants create an SSVC Selection (valid to the schema) that uses any of the readily available Decision Trees or Decision Points. Currently SSVC calculator allows someone to do this, but the python libraries do not have something ready to us.
  2. I can add as a test the example provided in the src/README.md file. In brief without the _post_process - here is the before and after
{
    "timestamp": "2025-10-09T13:59:00.132847-04:00",
    "schemaVersion": "2.0.0",
    "target_ids": [],
    "selections": [
        {
            "namespace": "ssvc",
            "key": "E",
            "version": "1.1.0",
            "name": "Exploitation",
            "values": [
                {
                    "name": "Public PoC",
                    "key": "P"
                }
            ]
        }
    ],
    "decision_point_resources": [],
    "references": []
}

after _post_process

{
    "timestamp": "2025-10-09T17:50:50Z",
    "schemaVersion": "2.0.0",
    "selections": [
        {
            "namespace": "ssvc",
            "key": "E",
            "version": "1.1.0",
            "name": "Exploitation",
            "values": [
                {
                    "name": "Public PoC",
                    "key": "P"
                },
                {
                    "name": "Active",
                    "key": "A"
                }
            ]
        }
    ]
}
  1. I will resolve the inline comments.

@ahouseholder
Copy link
Contributor

Ah, okay, I was totally missing the top-level items that could be empty lists, and was only thinking about {"selections": {...,"values":...}. This makes more sense, thanks for the explanation.

Copy link
Contributor

@ahouseholder ahouseholder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version looks good to me.

@ahouseholder ahouseholder merged commit bf89719 into main Oct 10, 2025
5 checks passed
@ahouseholder ahouseholder deleted the Issue_996 branch October 10, 2025 15:12
@ahouseholder ahouseholder added this to the 2025-12 milestone Oct 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants