refactor(RHINENG-24461): Move system profile JSON construction from DB to Python by rodrigonull · Pull Request #3696 · RedHatInsights/insights-host-inventory

rodrigonull · 2026-03-04T20:31:04Z

Overview

This PR addresses RHINENG-24461.

Replaces jsonb_build_object/jsonb_strip_nulls in get_sparse_system_profile with individual column SELECTs and Python-side dict assembly, matching the pattern used by build_system_profile_from_normalized. This offloads CPU-intensive JSON construction from the database to the application tier.

PR Checklist

Secure Coding Practices Documentation Reference

You can find documentation on this checklist here.

Secure Coding Checklist

Summary by Sourcery

Move sparse system profile construction from database-side JSONB assembly to Python, reusing shared field serializers for normalized system profile data.

Enhancements:

Select individual system profile columns from static and dynamic tables and assemble sparse profiles in Python instead of using PostgreSQL jsonb_build_object/jsonb_strip_nulls.
Centralize system profile field serializers in a shared SP_FIELD_SERIALIZERS mapping and reuse it in normalized system profile construction.
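
As a rough illustration of the shared mapping idea, the sketch below defines a module-level dict from field name to serializer and a small lookup helper. The names `SP_FIELD_SERIALIZERS`, `serialize_uuid`, and `_serialize_datetime` appear in this PR, but the function bodies, the two fields chosen, and the `serialize_field` helper are assumptions made for the example.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Callable


def serialize_uuid(u: uuid.UUID) -> str:
    # Assumed behavior: render the UUID in its canonical string form.
    return str(u)


def _serialize_datetime(dt: datetime) -> str:
    # Assumed behavior: render the datetime as an ISO 8601 string.
    return dt.isoformat()


# Module-level mapping that both profile builders could import.
SP_FIELD_SERIALIZERS: dict[str, Callable[[Any], str]] = {
    "owner_id": serialize_uuid,
    "captured_date": _serialize_datetime,
}


def serialize_field(name: str, value: Any) -> Any:
    """Apply the field's serializer if one is registered, else pass through."""
    serializer = SP_FIELD_SERIALIZERS.get(name)
    return serializer(value) if serializer else value
```

Keeping the mapping in one place means a new UUID or datetime field only needs to be registered once for both code paths.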

…B to Python Replace jsonb_build_object/jsonb_strip_nulls in get_sparse_system_profile with individual column SELECTs and Python-side dict assembly, matching the pattern used by build_system_profile_from_normalized. This offloads CPU-intensive JSON construction from the database to the application tier.

sourcery-ai · 2026-03-04T20:31:11Z

Reviewer's Guide

Refactors sparse system profile retrieval to select individual normalized columns instead of building JSON in PostgreSQL, then assembles and serializes the system profile dict in Python using a serializer mapping shared with the existing normalized-profile builder.

Sequence diagram for sparse system profile retrieval refactor

sequenceDiagram
    actor APIClient
    participant HostQueryDB
    participant SQLAlchemySession
    participant PostgreSQL
    participant Serialization

    APIClient->>HostQueryDB: get_sparse_system_profile(page, per_page, host_id_list, requested_sp_fields, workloads_needed_for_compat, workloads_requested, rbac_filter)
    activate HostQueryDB

    HostQueryDB->>HostQueryDB: determine fields_to_fetch
    HostQueryDB->>HostQueryDB: build sp_columns and sp_field_names

    HostQueryDB->>SQLAlchemySession: construct sp_query with Host.id and sp_columns
    SQLAlchemySession->>PostgreSQL: execute SELECT Host.id, normalized_columns
    PostgreSQL-->>SQLAlchemySession: rows with Host.id and individual columns
    SQLAlchemySession-->>HostQueryDB: query_results

    HostQueryDB->>HostQueryDB: iterate query_results.items
    loop for each row
        HostQueryDB->>Serialization: SP_FIELD_SERIALIZERS.get(field_name)
        Serialization-->>HostQueryDB: serializer or None
        HostQueryDB->>HostQueryDB: apply serializer(value) or keep value
        HostQueryDB->>HostQueryDB: build system_profile dict without None values
    end

    HostQueryDB-->>APIClient: list of {id, system_profile}
    deactivate HostQueryDB

Updated class diagram for sparse system profile and serializers

classDiagram
    class Host {
        +id: UUID
    }

    class HostStaticSystemProfile {
        <<table>>
        +owner_id
        +rhc_client_id
        +rhc_config_state
        +virtual_host_uuid
        +captured_date
        +last_boot_time
        +other_static_fields
    }

    class HostDynamicSystemProfile {
        <<table>>
        +other_dynamic_fields
    }

    class HostQueryDB {
        +get_sparse_system_profile(page, per_page, host_id_list, requested_sp_fields, workloads_needed_for_compat, workloads_requested, rbac_filter)
    }

    class SerializationModule {
        +build_system_profile_from_normalized(host, system_profile_fields)
        +serialize_uuid(u)
        +_serialize_datetime(dt)
    }

    class SP_FIELD_SERIALIZERS {
        <<mapping>>
        +owner_id: serialize_uuid
        +rhc_client_id: serialize_uuid
        +rhc_config_state: serialize_uuid
        +virtual_host_uuid: serialize_uuid
        +captured_date: _serialize_datetime
        +last_boot_time: _serialize_datetime
    }

    HostQueryDB ..> Host : uses
    HostQueryDB ..> HostStaticSystemProfile : selects_columns_from
    HostQueryDB ..> HostDynamicSystemProfile : selects_columns_from
    HostQueryDB ..> SP_FIELD_SERIALIZERS : applies_serializers

    SerializationModule ..> Host : reads_normalized_profiles
    SerializationModule ..> HostStaticSystemProfile : reads_fields
    SerializationModule ..> HostDynamicSystemProfile : reads_fields
    SerializationModule ..> SP_FIELD_SERIALIZERS : applies_serializers

    SP_FIELD_SERIALIZERS ..> SerializationModule : uses_functions_from

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Move JSON construction for sparse system profiles from a PostgreSQL jsonb_build_object query into Python-side dict assembly. | Replace conditional jsonb_build_object/jsonb_strip_nulls construction with per-field column selection from HostStaticSystemProfile and HostDynamicSystemProfile based on requested or full field lists. Track selected system profile columns and their field names to build the SQL columns list as [Host.id] plus the chosen normalized columns, setting flags to join the static and/or dynamic profile tables only when required. Iterate over query results and rebuild each system_profile dict in Python, skipping None values and applying per-field serializers when present. | `api/host_query_db.py` |
| Centralize system profile field serializers for reuse between sparse and full normalized profile builders. | Extract the inline SERIALIZERS mapping from build_system_profile_from_normalized into a module-level SP_FIELD_SERIALIZERS constant. Update build_system_profile_from_normalized to use SP_FIELD_SERIALIZERS when serializing field values. Reuse SP_FIELD_SERIALIZERS in get_sparse_system_profile when constructing system_profile dicts from query results. | `app/serialization.py` `api/host_query_db.py` |

github-actions · 2026-03-04T20:31:22Z

SC Environment Impact Assessment

Overall Impact: ⚪ NONE

No SC Environment-specific impacts detected in this PR.

What was checked

This PR was automatically scanned for:

Database migrations
ClowdApp configuration changes
Kessel integration changes
AWS service integrations (S3, RDS, ElastiCache)
Kafka topic changes
Secrets management changes
External dependencies

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

In get_sparse_system_profile, consider iterating over zip(sp_field_names, item[1:]) instead of using enumerate and i + 1 to reduce index arithmetic and make the mapping between columns and field names clearer.
Now that SP_FIELD_SERIALIZERS is reused across modules, it might be worth centralizing the related field names (or documenting the mapping) to make it easier to keep the serializer list in sync with the schema as fields are added/removed.
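
One possible way to keep the serializer list in sync with the schema, sketched under assumptions: a small check that every key in the serializer mapping is a known field. `STATIC_FIELDS`, `DYNAMIC_FIELDS`, and `SP_FIELD_SERIALIZERS` are names from this PR, but their contents here are placeholders, and `unknown_serializer_fields` is a hypothetical helper.

```python
# Placeholder contents; the real lists live in the application code.
STATIC_FIELDS = ["arch", "owner_id", "rhc_client_id"]
DYNAMIC_FIELDS = ["example_dynamic_field"]
SP_FIELD_SERIALIZERS = {
    "owner_id": str,
    "rhc_client_id": str,
}


def unknown_serializer_fields(serializers, static_fields, dynamic_fields) -> set[str]:
    """Return serializer keys that match no known system profile field."""
    known = set(static_fields) | set(dynamic_fields)
    return set(serializers) - known


# An empty set means every registered serializer targets an existing field.
stale = unknown_serializer_fields(SP_FIELD_SERIALIZERS, STATIC_FIELDS, DYNAMIC_FIELDS)
```

A check like this could run in a unit test so that removing a field from the schema without updating the mapping fails fast.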

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `get_sparse_system_profile`, consider iterating over `zip(sp_field_names, item[1:])` instead of using `enumerate` and `i + 1` to reduce index arithmetic and make the mapping between columns and field names clearer.
- Now that `SP_FIELD_SERIALIZERS` is reused across modules, it might be worth centralizing the related field names (or documenting the mapping) to make it easier to keep the serializer list in sync with the schema as fields are added/removed.

## Individual Comments

### Comment 1
<location path="api/host_query_db.py" line_range="789-793" />
<code_context>
-        needs_dynamic_join = True
+        fields_to_fetch = STATIC_FIELDS + DYNAMIC_FIELDS
+
+    sp_columns = []
+    sp_field_names = []
+    for key in fields_to_fetch:
+        if key in STATIC_FIELDS:
+            sp_columns.append(getattr(HostStaticSystemProfile, key))
+            sp_field_names.append(key)
+            needs_static_join = True
+        elif key in DYNAMIC_FIELDS:
+            sp_columns.append(getattr(HostDynamicSystemProfile, key))
+            sp_field_names.append(key)
+            needs_dynamic_join = True
+
+    columns = [Host.id] + sp_columns
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider zipping column values with field names instead of relying on index offsets

The loop assumes `item[i + 1]` always aligns with `sp_field_names[i]` because `columns = [Host.id] + sp_columns` and both lists are built in parallel. This is fragile: any new non-SP column in `columns` could break the alignment silently. Using `zip(sp_field_names, item[1:])` when building `system_profile` would avoid the magic `+ 1` offset and keep the mapping robust to future column changes.

Suggested implementation:

```python
    # Build system profile dicts: skip None values, apply serializers
    result_list: list[dict[str, Any]] = []

```

```python
    # Build system profile dicts: skip None values, apply serializers
    result_list: list[dict[str, Any]] = []
    for item in query_results.items:
        host_id = item[0]
        system_profile: dict[str, Any] = {}

        # item[0] is host_id, remaining values correspond to sp_field_names
        for field_name, value in zip(sp_field_names, item[1:]):
            if value is None:
                continue

            serializer = SP_FIELD_SERIALIZERS.get(field_name)
            if serializer:
                value = serializer(value)

            system_profile[field_name] = value

        result_list.append(
            {
                "id": host_id,
                "system_profile": system_profile,
            }
        )

```

If the existing code around building `result_list` differs (e.g. different variable names, additional non-system-profile columns in `columns`, or a different loop structure), you should:
1. Locate the loop that currently does `for i, field_name in enumerate(sp_field_names):` (or similar) and indexes into `item[i + 1]`.
2. Replace that inner loop with a `for field_name, value in zip(sp_field_names, item[1:]):` loop, preserving the existing serializer application and `None` checks.
3. Ensure that `item[0]` remains the host identifier (or adjust the `item[1:]` slice if the host id is in a different position).
This keeps the system-profile mapping aligned with `sp_field_names` as long as any extra columns are appended after the `sp_columns` block; a column inserted between `Host.id` and `sp_columns` would shift the values and require adjusting the `item[1:]` slice.
</issue_to_address>


sourcery-ai bot reviewed Mar 4, 2026

refactor(RHINENG-24461): Use zip for system profile field-value pairing

c69f7f5

Replace enumerate with index offset by zip(strict=True) when building the system profile dict from query result tuples.

rodrigonull requested a review from a team as a code owner March 4, 2026 20:38


rodrigonull wants to merge 2 commits into master from RHINENG-24461
