refactor(RHINENG-24461): Move system profile JSON construction from DB to Python#3696
Open
rodrigonull wants to merge 2 commits intomasterfrom
Open
refactor(RHINENG-24461): Move system profile JSON construction from DB to Python#3696rodrigonull wants to merge 2 commits intomasterfrom
rodrigonull wants to merge 2 commits intomasterfrom
Conversation
…B to Python Replace jsonb_build_object/jsonb_strip_nulls in get_sparse_system_profile with individual column SELECTs and Python-side dict assembly, matching the pattern used by build_system_profile_from_normalized. This offloads CPU- intensive JSON construction from the database to the application tier.
Contributor
Reviewer's GuideRefactors sparse system profile retrieval to select individual normalized columns instead of building JSON in PostgreSQL, then assembles and serializes the system profile dict in Python using a shared serializer mapping with the existing normalized-profile builder. Sequence diagram for sparse system profile retrieval refactorsequenceDiagram
actor APIClient
participant HostQueryDB
participant SQLAlchemySession
participant PostgreSQL
participant Serialization
APIClient->>HostQueryDB: get_sparse_system_profile(page, per_page, host_id_list, requested_sp_fields, workloads_needed_for_compat, workloads_requested, rbac_filter)
activate HostQueryDB
HostQueryDB->>HostQueryDB: determine fields_to_fetch
HostQueryDB->>HostQueryDB: build sp_columns and sp_field_names
HostQueryDB->>SQLAlchemySession: construct sp_query with Host.id and sp_columns
SQLAlchemySession->>PostgreSQL: execute SELECT Host.id, normalized_columns
PostgreSQL-->>SQLAlchemySession: rows with Host.id and individual columns
SQLAlchemySession-->>HostQueryDB: query_results
HostQueryDB->>HostQueryDB: iterate query_results.items
loop for each row
HostQueryDB->>Serialization: SP_FIELD_SERIALIZERS.get(field_name)
Serialization-->>HostQueryDB: serializer or None
HostQueryDB->>HostQueryDB: apply serializer(value) or keep value
HostQueryDB->>HostQueryDB: build system_profile dict without None values
end
HostQueryDB-->>APIClient: list of {id, system_profile}
deactivate HostQueryDB
Updated class diagram for sparse system profile and serializersclassDiagram
class Host {
+id: UUID
}
class HostStaticSystemProfile {
<<table>>
+owner_id
+rhc_client_id
+rhc_config_state
+virtual_host_uuid
+captured_date
+last_boot_time
+other_static_fields
}
class HostDynamicSystemProfile {
<<table>>
+other_dynamic_fields
}
class HostQueryDB {
+get_sparse_system_profile(page, per_page, host_id_list, requested_sp_fields, workloads_needed_for_compat, workloads_requested, rbac_filter)
}
class SerializationModule {
+build_system_profile_from_normalized(host, system_profile_fields)
+serialize_uuid(u)
+_serialize_datetime(dt)
}
class SP_FIELD_SERIALIZERS {
<<mapping>>
+owner_id: serialize_uuid
+rhc_client_id: serialize_uuid
+rhc_config_state: serialize_uuid
+virtual_host_uuid: serialize_uuid
+captured_date: _serialize_datetime
+last_boot_time: _serialize_datetime
}
HostQueryDB ..> Host : uses
HostQueryDB ..> HostStaticSystemProfile : selects_columns_from
HostQueryDB ..> HostDynamicSystemProfile : selects_columns_from
HostQueryDB ..> SP_FIELD_SERIALIZERS : applies_serializers
SerializationModule ..> Host : reads_normalized_profiles
SerializationModule ..> HostStaticSystemProfile : reads_fields
SerializationModule ..> HostDynamicSystemProfile : reads_fields
SerializationModule ..> SP_FIELD_SERIALIZERS : applies_serializers
SP_FIELD_SERIALIZERS ..> SerializationModule : uses_functions_from
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Contributor
SC Environment Impact AssessmentOverall Impact: ⚪ NONE No SC Environment-specific impacts detected in this PR. What was checkedThis PR was automatically scanned for:
|
Contributor
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
get_sparse_system_profile, consider iterating overzip(sp_field_names, item[1:])instead of usingenumerateandi + 1to reduce index arithmetic and make the mapping between columns and field names clearer. - Now that
SP_FIELD_SERIALIZERSis reused across modules, it might be worth centralizing the related field names (or documenting the mapping) to make it easier to keep the serializer list in sync with the schema as fields are added/removed.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `get_sparse_system_profile`, consider iterating over `zip(sp_field_names, item[1:])` instead of using `enumerate` and `i + 1` to reduce index arithmetic and make the mapping between columns and field names clearer.
- Now that `SP_FIELD_SERIALIZERS` is reused across modules, it might be worth centralizing the related field names (or documenting the mapping) to make it easier to keep the serializer list in sync with the schema as fields are added/removed.
## Individual Comments
### Comment 1
<location path="api/host_query_db.py" line_range="789-793" />
<code_context>
- needs_dynamic_join = True
+ fields_to_fetch = STATIC_FIELDS + DYNAMIC_FIELDS
+
+ sp_columns = []
+ sp_field_names = []
+ for key in fields_to_fetch:
+ if key in STATIC_FIELDS:
+ sp_columns.append(getattr(HostStaticSystemProfile, key))
+ sp_field_names.append(key)
+ needs_static_join = True
+ elif key in DYNAMIC_FIELDS:
+ sp_columns.append(getattr(HostDynamicSystemProfile, key))
+ sp_field_names.append(key)
+ needs_dynamic_join = True
+
+ columns = [Host.id] + sp_columns
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider zipping column values with field names instead of relying on index offsets
The loop assumes `item[i + 1]` always aligns with `sp_field_names[i]` because `columns = [Host.id] + sp_columns` and both lists are built in parallel. This is fragile: any new non-SP column in `columns` could break the alignment silently. Using `zip(sp_field_names, item[1:])` when building `system_profile` would avoid the magic `+ 1` offset and keep the mapping robust to future column changes.
Suggested implementation:
```python
# Build system profile dicts: skip None values, apply serializers
result_list: list[dict[str, Any]] = []
```
```python
# Build system profile dicts: skip None values, apply serializers
result_list: list[dict[str, Any]] = []
for item in query_results.items:
host_id = item[0]
system_profile: dict[str, Any] = {}
# item[0] is host_id, remaining values correspond to sp_field_names
for field_name, value in zip(sp_field_names, item[1:]):
if value is None:
continue
serializer = SYSTEM_PROFILE_SERIALIZERS.get(field_name)
if serializer:
value = serializer(value)
system_profile[field_name] = value
result_list.append(
{
"id": host_id,
"system_profile": system_profile,
}
)
```
If the existing code around building `result_list` differs (e.g. different variable names, additional non-system-profile columns in `columns`, or a different loop structure), you should:
1. Locate the loop that currently does `for i, field_name in enumerate(sp_field_names):` (or similar) and indexes into `item[i + 1]`.
2. Replace that inner loop with a `for field_name, value in zip(sp_field_names, item[1:]):` loop, preserving the existing serializer application and `None` checks.
3. Ensure that `item[0]` remains the host identifier (or adjust the `item[1:]` slice if the host id is in a different position).
This will keep the system-profile mapping aligned with `sp_field_names` even if extra columns are added to `columns` in the future, as long as those extra columns are added before or after the `sp_columns` block consistently.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Replace enumerate with index offset by zip(strict=True) when building the system profile dict from query result tuples.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
This PR is being created to address RHINENG-24461.
Replaces jsonb_build_object/jsonb_strip_nulls in get_sparse_system_profile with individual column SELECTs and Python-side dict assembly, matching the pattern used by build_system_profile_from_normalized. This offloads CPU- intensive JSON construction from the database to the application tier.
PR Checklist
Secure Coding Practices Documentation Reference
You can find documentation on this checklist here.
Secure Coding Checklist
Summary by Sourcery
Move sparse system profile construction from database-side JSONB assembly to Python, reusing shared field serializers for normalized system profile data.
Enhancements: