chore(RHINENG-24557): Create a command that reproduces ephemeral performance issues #3689
Open
Reviewer's Guide

Adds a new IQE host_inventory CLI command that uploads a batch of archives, exercises GET /host_exists and GET /hosts/<host_ids>, and prints timing statistics to help reproduce and measure ephemeral performance issues.

Sequence diagram for the new measure_api_get_times CLI command

sequenceDiagram
actor User
participant iqe_host_inventory_cli
participant application
participant application_host_inventory
participant upload_api
participant hosts_api
participant stats_logger
User->>iqe_host_inventory_cli: measure_api_get_times(number, user, display_name_prefix, base_archive, archive_repo, sleep_seconds, cache_refresh_wait_time)
iqe_host_inventory_cli->>iqe_host_inventory_cli: generate_display_name(panic_prevention=display_name_prefix)
iqe_host_inventory_cli->>iqe_host_inventory_cli: build_host_archive(...) for each host
iqe_host_inventory_cli->>application: _app_with_maybe_user(primary_application, user)
activate application
application-->>iqe_host_inventory_cli: application context
iqe_host_inventory_cli->>application_host_inventory: get host_inventory from application
iqe_host_inventory_cli->>upload_api: async_upload_archives(archives)
upload_api-->>iqe_host_inventory_cli: archives accepted
iqe_host_inventory_cli->>iqe_host_inventory_cli: sleep(sleep_seconds)
loop for each insights_id
iqe_host_inventory_cli->>hosts_api: get_host_exists(insights_id)
hosts_api-->>iqe_host_inventory_cli: host_exists_response(id)
iqe_host_inventory_cli->>iqe_host_inventory_cli: append host_id
alt cache_refresh_wait_time > 0
iqe_host_inventory_cli->>iqe_host_inventory_cli: sleep(cache_refresh_wait_time)
end
end
alt host_ids not empty
iqe_host_inventory_cli->>hosts_api: get_hosts_by_id(host_ids)
hosts_api-->>iqe_host_inventory_cli: hosts_list
end
iqe_host_inventory_cli->>stats_logger: log_request_statistics()
stats_logger-->>iqe_host_inventory_cli: report
iqe_host_inventory_cli->>User: print performance report
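
The flow in the sequence diagram can be sketched in plain Python. This is a minimal illustration, not the command's actual implementation: `FakeHostsApi` and `measure_get_times` are hypothetical names, the in-memory lookup stands in for the real IQE client, and the upload step is omitted. Only the order of operations (sleep after upload, one `get_host_exists` call per host with an optional pause, then a single `get_hosts_by_id` call) follows the diagram.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakeHostsApi:
    """Hypothetical in-memory stand-in for the hosts API (not the IQE client)."""

    known: dict[str, str]  # maps insights_id -> host_id
    timings: list[float] = field(default_factory=list)

    def get_host_exists(self, insights_id: str) -> str:
        start = time.perf_counter()
        host_id = self.known[insights_id]
        self.timings.append(time.perf_counter() - start)
        return host_id

    def get_hosts_by_id(self, host_ids: list[str]) -> list[str]:
        start = time.perf_counter()
        found = [h for h in host_ids if h in self.known.values()]
        self.timings.append(time.perf_counter() - start)
        return found


def measure_get_times(
    api: FakeHostsApi,
    insights_ids: list[str],
    sleep_seconds: float = 30.0,
    cache_refresh_wait_time: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> dict:
    sleep(sleep_seconds)  # give the uploaded archives time to be processed
    host_ids: list[str] = []
    for insights_id in insights_ids:
        host_ids.append(api.get_host_exists(insights_id))
        if cache_refresh_wait_time > 0:
            sleep(cache_refresh_wait_time)
    # Only query by ID when at least one host was found.
    hosts = api.get_hosts_by_id(host_ids) if host_ids else []
    return {"host_ids": host_ids, "hosts": hosts, "requests": len(api.timings)}
```

Passing `sleep` as a parameter is a choice made for this sketch only; it lets the flow be exercised without real waiting.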
Updated class diagram for the measure_api_get_times command and related components

classDiagram
class MeasureApiGetTimesCommand {
+measure_api_get_times(obj, number, user, display_name_prefix, base_archive, archive_repo, sleep_seconds, cache_refresh_wait_time) void
}
class ApplicationContext {
+primary_application
}
class ApplicationHostInventory {
+upload UploadAPI
+apis HostInventoryApis
}
class UploadAPI {
+async_upload_archives(archives) void
}
class HostInventoryApis {
+hosts HostsApi
+log_request_statistics() string
}
class HostsApi {
+get_host_exists(insights_id) HostExistsResponse
+get_hosts_by_id(host_ids) HostsResponse
}
class HostExistsResponse {
+id string
}
class Archive {
+insights_id string
+display_name string
}
class ArchiveBuilder {
+build_host_archive(display_name, base_archive, archive_repo) Archive
+generate_display_name(panic_prevention) string
}
MeasureApiGetTimesCommand --> ApplicationContext : uses
MeasureApiGetTimesCommand --> ArchiveBuilder : uses
MeasureApiGetTimesCommand --> Archive : creates
MeasureApiGetTimesCommand --> ApplicationHostInventory : uses
ApplicationContext --> ApplicationHostInventory : provides
ApplicationHostInventory --> UploadAPI : has
ApplicationHostInventory --> HostInventoryApis : has
HostInventoryApis --> HostsApi : has
UploadAPI --> Archive : uploads
HostsApi --> HostExistsResponse : returns
SC Environment Impact Assessment

Overall Impact: ⚪ NONE

No SC Environment-specific impacts detected in this PR.

What was checked

This PR was automatically scanned for:
Hey - I've found 2 issues, and left some high level feedback:
- In the loop calling `get_host_exists`, consider guarding against missing or unexpected responses (e.g., `response` without an `id`) so you don't append invalid IDs to `host_ids` and potentially send bad data into `get_hosts_by_id`.
- Sleeping `cache_refresh_wait_time` after every single `get_host_exists` can make the command very slow for higher `-n` values; if you only need to approximate cache-refresh behavior, consider an option to sleep less frequently (e.g., every N hosts) or once after a batch.
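
Both points of feedback above could be combined roughly as follows. This is a sketch under stated assumptions: `collect_host_ids` and its `sleep_every` parameter are hypothetical and do not exist in the PR, and `get_host_exists` is passed in as a plain callable instead of the real `host_inventory.apis.hosts` client.

```python
import time
from typing import Any, Callable, Iterable


def collect_host_ids(
    insights_ids: Iterable[str],
    get_host_exists: Callable[[str], Any],
    cache_refresh_wait_time: float = 0.0,
    sleep_every: int = 1,  # hypothetical option: pause after every N hosts
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Collect host IDs, skipping responses that are missing or lack an `id`."""
    if sleep_every < 1:
        raise ValueError("sleep_every must be >= 1")
    host_ids: list[str] = []
    for index, insights_id in enumerate(insights_ids, start=1):
        response = get_host_exists(insights_id)
        # Guard: a None response or an empty/missing `id` is not appended.
        host_id = getattr(response, "id", None)
        if host_id:
            host_ids.append(host_id)
        # Pause only after every `sleep_every`-th host instead of after each one.
        if cache_refresh_wait_time > 0 and index % sleep_every == 0:
            sleep(cache_refresh_wait_time)
    return host_ids
```

With `sleep_every=1` the pause happens after every host, matching the per-host sleep in the reviewed loop; larger values trade cache-refresh fidelity for shorter runs.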
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the loop calling `get_host_exists`, consider guarding against missing or unexpected responses (e.g., `response` without an `id`) so you don’t append invalid IDs to `host_ids` and potentially send bad data into `get_hosts_by_id`.
- Sleeping `cache_refresh_wait_time` after every single `get_host_exists` can make the command very slow for higher `-n` values; if you only need to approximate cache-refresh behavior, consider an option to sleep less frequently (e.g., every N hosts) or once after a batch.
## Individual Comments
### Comment 1
<location path="iqe-host-inventory-plugin/iqe_host_inventory/grafted_commands.py" line_range="314-315" />
<code_context>
+
+
+@host_inventory_group.command()
+@click.option("-n", "--number", help="Number of hosts to create", default=51)
+@click.option(
+    "--user",
+    help="C.R.C user from iqe-core or local config. (Default: insights_qa)",
+    default="insights_qa",
+)
+@click.option(
+    "--display-name-prefix", help="Display name prefix for hosts", default="rhiqe.example"
+)
+@click.option("--base-archive", help="Name of the base archive", default=None)
+@click.option("--archive-repo", help="IQE plugin with the archive", default=None)
+@click.option(
+    "--sleep",
+    "sleep_seconds",
+    help="Seconds to sleep after upload before querying",
+    default=30,
+    type=float,
+)
+@click.option(
+    "--cache-refresh-wait-time",
+    help="Seconds to sleep between API requests for cache refresh (0 to skip)",
+    default=20,
+    type=float,
+)
+@click.pass_obj
+def measure_api_get_times(
+    obj: Any,
+    number: int,
+    user: str,
+    display_name_prefix: str,
</code_context>
<issue_to_address>
**suggestion:** Consider validating that `number` is positive to avoid unnecessary work or confusing output when it is 0 or negative.
With `number <= 0`, this will build and upload an empty archive list and still run the rest of the flow, which is likely confusing. Consider either enforcing `number > 0` at the CLI level (e.g., `click.IntRange(min=1)` or an early validation check) or short-circuiting with a clear message and exiting before upload/benchmarking when `number <= 0`.
```suggestion
@host_inventory_group.command()
@click.option(
    "-n",
    "--number",
    help="Number of hosts to create (must be >= 1)",
    default=51,
    type=click.IntRange(min=1),
)
```
</issue_to_address>
### Comment 2
<location path="iqe-host-inventory-plugin/iqe_host_inventory/grafted_commands.py" line_range="385-391" />
<code_context>
+    insights_ids: list[str] = [archive.insights_id for archive in archives]
+    host_ids: list[str] = []
+
+    click.echo(f"Calling GET /host_exists for {len(insights_ids)} hosts...")
+    for insights_id in insights_ids:
+        response = host_inventory.apis.hosts.get_host_exists(insights_id=insights_id)
+        host_ids.append(response.id)
+        if cache_refresh_wait_time > 0:
+            click.echo(f"Sleeping {cache_refresh_wait_time}s for cache refresh...")
+            time.sleep(cache_refresh_wait_time)
+
+    click.echo(f"Found {len(host_ids)} / {len(insights_ids)} hosts via GET /host_exists.")
</code_context>
<issue_to_address>
**suggestion (performance):** The per-host sleep in the loop may make runs impractically long for larger `number` values.
Because the sleep is inside the loop, total runtime grows with `number * cache_refresh_wait_time` (plus request latency and `sleep_seconds`), so large `number` values can lead to multi-minute/hour runs. If the intent is to pause between batches rather than per host, consider sleeping once every N hosts, only before the first request, or making this configurable (e.g., batch size or a flag to disable per-request sleep while keeping an initial pause).
Suggested implementation:
```python
# --- Benchmark GET /host_exists ---
insights_ids: list[str] = [archive.insights_id for archive in archives]
host_ids: list[str] = []
if cache_refresh_wait_time > 0:
    click.echo(
        f"Sleeping {cache_refresh_wait_time}s before GET /host_exists calls for cache refresh..."
    )
    time.sleep(cache_refresh_wait_time)
```
```python
click.echo(f"Calling GET /host_exists for {len(insights_ids)} hosts...")
for insights_id in insights_ids:
    response = host_inventory.apis.hosts.get_host_exists(insights_id=insights_id)
    host_ids.append(response.id)
```
</issue_to_address>
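
To make the runtime concern in Comment 2 concrete, the time spent sleeping can be estimated from the option values alone. The helper below is a hypothetical illustration (neither `estimated_sleep_seconds` nor `sleep_every` exists in the PR); it ignores request latency and upload time, so it gives only a lower bound.

```python
def estimated_sleep_seconds(
    number: int,
    sleep_seconds: float = 30.0,
    cache_refresh_wait_time: float = 20.0,
    sleep_every: int = 1,  # hypothetical: pause once per this many hosts
) -> float:
    """Lower bound on run time from sleeps alone; request latency is ignored."""
    if number < 0 or sleep_every < 1:
        raise ValueError("number must be >= 0 and sleep_every must be >= 1")
    pauses = number // sleep_every if cache_refresh_wait_time > 0 else 0
    return sleep_seconds + pauses * cache_refresh_wait_time
```

With the defaults in the reviewed code (51 hosts, `--sleep 30`, `--cache-refresh-wait-time 20`), `estimated_sleep_seconds(51)` gives 1050.0 seconds, about 17.5 minutes of sleeping. A single pause for the whole batch, `estimated_sleep_seconds(51, sleep_every=51)`, gives 50.0 seconds.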
4d0fc70 to fa61172
rodrigonull approved these changes on Mar 4, 2026
jpramos123 approved these changes on Mar 4, 2026
fa61172 to 99d466e
/retest
Overview
This PR addresses RHINENG-24557.
The Devprod team asked for a simple way to reproduce the performance issues we see in ephemeral environments with Kessel Phase 1 enabled (RHCLOUD-45279).
Example usage:
This can be used in any environment.
PR Checklist
Secure Coding Practices Documentation Reference
You can find documentation on this checklist here.
Secure Coding Checklist
Summary by Sourcery
Add a CLI command to upload host archives and measure host inventory GET API performance.
New Features:
Enhancements: