refactor(RHINENG-24461): Move system profile JSON construction from DB to Python #3696

Open
rodrigonull wants to merge 2 commits into master from RHINENG-24461

Conversation

rodrigonull commented Mar 4, 2026

Overview

This PR is being created to address RHINENG-24461.

Replaces jsonb_build_object/jsonb_strip_nulls in get_sparse_system_profile with individual column SELECTs and Python-side dict assembly, matching the pattern used by build_system_profile_from_normalized. This offloads CPU-intensive JSON construction from the database to the application tier.

PR Checklist

  • Keep PR title short, ideally under 72 characters
  • Descriptive comments provided in complex code blocks
  • Include raw query examples in the PR description, if adding/modifying SQL query
  • Tests: validate optimal/expected output
  • Tests: validate exceptions and failure scenarios
  • Tests: edge cases
  • Recovers or fails gracefully during potential resource outages (e.g. DB, Kafka)
  • Uses type hinting, if convenient
  • Documentation, if this PR changes the way other services interact with host inventory
  • Links to related PRs

Secure Coding Practices Documentation Reference

You can find documentation on this checklist here.

Secure Coding Checklist

  • Input Validation
  • Output Encoding
  • Authentication and Password Management
  • Session Management
  • Access Control
  • Cryptographic Practices
  • Error Handling and Logging
  • Data Protection
  • Communication Security
  • System Configuration
  • Database Security
  • File Management
  • Memory Management
  • General Coding Practices

Summary by Sourcery

Move sparse system profile construction from database-side JSONB assembly to Python, reusing shared field serializers for normalized system profile data.

Enhancements:

  • Select individual system profile columns from static and dynamic tables and assemble sparse profiles in Python instead of using PostgreSQL jsonb_build_object/jsonb_strip_nulls.
  • Centralize system profile field serializers in a shared SP_FIELD_SERIALIZERS mapping and reuse it in normalized system profile construction.

…B to Python

Replace jsonb_build_object/jsonb_strip_nulls in get_sparse_system_profile
with individual column SELECTs and Python-side dict assembly, matching the
pattern used by build_system_profile_from_normalized. This offloads CPU-
intensive JSON construction from the database to the application tier.

sourcery-ai bot commented Mar 4, 2026

Reviewer's Guide

Refactors sparse system profile retrieval to select individual normalized columns instead of building JSON in PostgreSQL, then assembles and serializes the system profile dict in Python using a serializer mapping shared with the existing normalized-profile builder.

Sequence diagram for sparse system profile retrieval refactor

sequenceDiagram
    actor APIClient
    participant HostQueryDB
    participant SQLAlchemySession
    participant PostgreSQL
    participant Serialization

    APIClient->>HostQueryDB: get_sparse_system_profile(page, per_page, host_id_list, requested_sp_fields, workloads_needed_for_compat, workloads_requested, rbac_filter)
    activate HostQueryDB

    HostQueryDB->>HostQueryDB: determine fields_to_fetch
    HostQueryDB->>HostQueryDB: build sp_columns and sp_field_names

    HostQueryDB->>SQLAlchemySession: construct sp_query with Host.id and sp_columns
    SQLAlchemySession->>PostgreSQL: execute SELECT Host.id, normalized_columns
    PostgreSQL-->>SQLAlchemySession: rows with Host.id and individual columns
    SQLAlchemySession-->>HostQueryDB: query_results

    HostQueryDB->>HostQueryDB: iterate query_results.items
    loop for each row
        HostQueryDB->>Serialization: SP_FIELD_SERIALIZERS.get(field_name)
        Serialization-->>HostQueryDB: serializer or None
        HostQueryDB->>HostQueryDB: apply serializer(value) or keep value
        HostQueryDB->>HostQueryDB: build system_profile dict without None values
    end

    HostQueryDB-->>APIClient: list of {id, system_profile}
    deactivate HostQueryDB

Updated class diagram for sparse system profile and serializers

classDiagram
    class Host {
        +id: UUID
    }

    class HostStaticSystemProfile {
        <<table>>
        +owner_id
        +rhc_client_id
        +rhc_config_state
        +virtual_host_uuid
        +captured_date
        +last_boot_time
        +other_static_fields
    }

    class HostDynamicSystemProfile {
        <<table>>
        +other_dynamic_fields
    }

    class HostQueryDB {
        +get_sparse_system_profile(page, per_page, host_id_list, requested_sp_fields, workloads_needed_for_compat, workloads_requested, rbac_filter)
    }

    class SerializationModule {
        +build_system_profile_from_normalized(host, system_profile_fields)
        +serialize_uuid(u)
        +_serialize_datetime(dt)
    }

    class SP_FIELD_SERIALIZERS {
        <<mapping>>
        +owner_id: serialize_uuid
        +rhc_client_id: serialize_uuid
        +rhc_config_state: serialize_uuid
        +virtual_host_uuid: serialize_uuid
        +captured_date: _serialize_datetime
        +last_boot_time: _serialize_datetime
    }

    HostQueryDB ..> Host : uses
    HostQueryDB ..> HostStaticSystemProfile : selects_columns_from
    HostQueryDB ..> HostDynamicSystemProfile : selects_columns_from
    HostQueryDB ..> SP_FIELD_SERIALIZERS : applies_serializers

    SerializationModule ..> Host : reads_normalized_profiles
    SerializationModule ..> HostStaticSystemProfile : reads_fields
    SerializationModule ..> HostDynamicSystemProfile : reads_fields
    SerializationModule ..> SP_FIELD_SERIALIZERS : applies_serializers

    SP_FIELD_SERIALIZERS ..> SerializationModule : uses_functions_from

File-Level Changes

Change: Move JSON construction for sparse system profiles from a PostgreSQL jsonb_build_object query into Python-side dict assembly.
Details:
  • Replace conditional jsonb_build_object/jsonb_strip_nulls construction with per-field column selection from HostStaticSystemProfile and HostDynamicSystemProfile based on requested or full field lists.
  • Track selected system profile columns and their field names to build the SQL columns list as [Host.id] plus the chosen normalized columns, setting flags to join the static and/or dynamic profile tables only when required.
  • Iterate over query results and rebuild each system_profile dict in Python, skipping None values and applying per-field serializers when present.
Files: api/host_query_db.py

Change: Centralize system profile field serializers for reuse between sparse and full normalized profile builders.
Details:
  • Extract the inline SERIALIZERS mapping from build_system_profile_from_normalized into a module-level SP_FIELD_SERIALIZERS constant.
  • Update build_system_profile_from_normalized to use SP_FIELD_SERIALIZERS when serializing field values.
  • Reuse SP_FIELD_SERIALIZERS in get_sparse_system_profile when constructing system_profile dicts from query results.
Files: app/serialization.py, api/host_query_db.py



github-actions bot commented Mar 4, 2026

SC Environment Impact Assessment

Overall Impact: NONE

No SC Environment-specific impacts detected in this PR.

What was checked

This PR was automatically scanned for:

  • Database migrations
  • ClowdApp configuration changes
  • Kessel integration changes
  • AWS service integrations (S3, RDS, ElastiCache)
  • Kafka topic changes
  • Secrets management changes
  • External dependencies

sourcery-ai bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • In get_sparse_system_profile, consider iterating over zip(sp_field_names, item[1:]) instead of using enumerate and i + 1 to reduce index arithmetic and make the mapping between columns and field names clearer.
  • Now that SP_FIELD_SERIALIZERS is reused across modules, it might be worth centralizing the related field names (or documenting the mapping) to make it easier to keep the serializer list in sync with the schema as fields are added/removed.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `get_sparse_system_profile`, consider iterating over `zip(sp_field_names, item[1:])` instead of using `enumerate` and `i + 1` to reduce index arithmetic and make the mapping between columns and field names clearer.
- Now that `SP_FIELD_SERIALIZERS` is reused across modules, it might be worth centralizing the related field names (or documenting the mapping) to make it easier to keep the serializer list in sync with the schema as fields are added/removed.

## Individual Comments

### Comment 1
<location path="api/host_query_db.py" line_range="789-793" />
<code_context>
-        needs_dynamic_join = True
+        fields_to_fetch = STATIC_FIELDS + DYNAMIC_FIELDS
+
+    sp_columns = []
+    sp_field_names = []
+    for key in fields_to_fetch:
+        if key in STATIC_FIELDS:
+            sp_columns.append(getattr(HostStaticSystemProfile, key))
+            sp_field_names.append(key)
+            needs_static_join = True
+        elif key in DYNAMIC_FIELDS:
+            sp_columns.append(getattr(HostDynamicSystemProfile, key))
+            sp_field_names.append(key)
+            needs_dynamic_join = True
+
+    columns = [Host.id] + sp_columns
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider zipping column values with field names instead of relying on index offsets

The loop assumes `item[i + 1]` always aligns with `sp_field_names[i]` because `columns = [Host.id] + sp_columns` and both lists are built in parallel. This is fragile: any new non-SP column in `columns` could break the alignment silently. Using `zip(sp_field_names, item[1:])` when building `system_profile` would avoid the magic `+ 1` offset and keep the mapping robust to future column changes.

Suggested implementation:

```python
    # Build system profile dicts: skip None values, apply serializers
    result_list: list[dict[str, Any]] = []

```

```python
    # Build system profile dicts: skip None values, apply serializers
    result_list: list[dict[str, Any]] = []
    for item in query_results.items:
        host_id = item[0]
        system_profile: dict[str, Any] = {}

        # item[0] is host_id, remaining values correspond to sp_field_names
        for field_name, value in zip(sp_field_names, item[1:]):
            if value is None:
                continue

            serializer = SP_FIELD_SERIALIZERS.get(field_name)
            if serializer:
                value = serializer(value)

            system_profile[field_name] = value

        result_list.append(
            {
                "id": host_id,
                "system_profile": system_profile,
            }
        )

```

If the existing code around building `result_list` differs (e.g. different variable names, additional non-system-profile columns in `columns`, or a different loop structure), you should:
1. Locate the loop that currently does `for i, field_name in enumerate(sp_field_names):` (or similar) and indexes into `item[i + 1]`.
2. Replace that inner loop with a `for field_name, value in zip(sp_field_names, item[1:]):` loop, preserving the existing serializer application and `None` checks.
3. Ensure that `item[0]` remains the host identifier (or adjust the `item[1:]` slice if the host id is in a different position).
This will keep the system-profile mapping aligned with `sp_field_names` even if extra columns are added to `columns` in the future, as long as those extra columns are added before or after the `sp_columns` block consistently.
</issue_to_address>


Replace enumerate plus an index offset with zip(strict=True) when
building the system profile dict from query result tuples.
rodrigonull requested a review from a team as a code owner on March 4, 2026 at 20:38