Skip to content

Commit aa65c62

Browse files
RHOAIENG-26264: throttle concurrent skopeo invocations to at most 22 concurrent skopeo instances running (opendatahub-io#1165)
* RHOAIENG-26264: throttle concurrent skopeo invocations Implements throttling for concurrent skopeo invocations in the `scripts/update-commit-latest-env.py` script by using an asyncio.Semaphore. This change addresses issue opendatahub-io#1158. The script now uses a semaphore to limit the number of concurrent `skopeo inspect` operations to 22. This helps to: - Avoid overwhelming container registries with too many requests. - Prevent potential rate limiting issues. - Provide more stable execution under different network conditions. The `get_image_vcs_ref` function was updated to accept a semaphore, and the `inspect` function was updated to create and manage the semaphore, passing it to the image inspection tasks. * Revert: Revert accidental commit of test file This commit reverts the changes made to `manifests/base/params-latest.env` which was accidentally included in the previous commit (feat: Throttle concurrent skopeo invocations). The `params-latest.env` file in that location was created for testing purposes during development and should not be part of the final changeset. This commit restores it to its state prior to that accidental modification. * Refactor: Reduce semaphore scope in get_image_vcs_ref This commit refactors the get_image_vcs_ref function in `scripts/update-commit-latest-env.py`. The scope of the asyncio.Semaphore is now reduced to cover only the `asyncio.create_subprocess_exec` call for `skopeo` and the subsequent `process.communicate()` call. Processing of the skopeo output (JSON parsing, label extraction, and associated logging) has been moved outside of the semaphore block. This change is made to improve code clarity and provide a smaller diff for review, as per your feedback. The performance impact is expected to be negligible for this specific script, but the change aligns with best practices for minimizing resource holding time. * Revert: Correct state of manifests/base/params-latest.env This commit ensures that `manifests/base/params-latest.env` is restored to its state prior to the recent refactoring commit ("Refactor: Reduce semaphore scope in get_image_vcs_ref"). The file `manifests/base/params-latest.env` was unintentionally included in that refactoring commit. This commit reverts that specific file to the version present in the preceding commit ("Revert: Revert accidental commit of test file"), which should represent its correct state before the recent series of changes related to skopeo throttling. * Revert: Ensure correct state of manifests/base/params-latest.env This commit reverts the accidental inclusion of a test version of `manifests/base/params-latest.env` that occurred during the commit for a PEP 8 indentation fix ("Fix: Correct PEP 8 indentation in JSONDecodeError logging"). The file `manifests/base/params-latest.env` has been restored to its state from the commit prior to that indentation fix, which should be its correct, intended version. --------- Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
1 parent 45e85f5 commit aa65c62

File tree

1 file changed

+28
-16
lines changed

1 file changed

+28
-16
lines changed

scripts/update-commit-latest-env.py

Lines changed: 28 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
PROJECT_ROOT = pathlib.Path(__file__).parent.parent
1212

1313

14-
async def get_image_vcs_ref(image_url: str) -> tuple[str, str | None]:
14+
async def get_image_vcs_ref(image_url: str, semaphore: asyncio.Semaphore) -> tuple[str, str | None]:
1515
"""
1616
Asynchronously inspects a container image's configuration using skopeo
1717
and extracts the 'vcs-ref' label.
@@ -32,21 +32,29 @@ async def get_image_vcs_ref(image_url: str) -> tuple[str, str | None]:
3232

3333
logging.info(f"Starting config inspection for: {image_url}")
3434

35+
stdout, stderr, returncode = None, None, None
3536
try:
36-
# Create an asynchronous subprocess
37-
process = await asyncio.create_subprocess_exec(
38-
*command,
39-
stdout=asyncio.subprocess.PIPE,
40-
stderr=asyncio.subprocess.PIPE
41-
)
42-
43-
# Wait for the command to complete and capture output
44-
stdout, stderr = await process.communicate()
45-
46-
# Check for errors
47-
if process.returncode != 0:
48-
logging.error(f"Skopeo command failed for {image_url} with exit code {process.returncode}.")
49-
logging.error(f"Stderr: {stderr.decode().strip()}")
37+
async with semaphore:
38+
logging.info(f"Semaphore acquired, starting skopeo inspect for: {image_url}")
39+
# Create an asynchronous subprocess
40+
process = await asyncio.create_subprocess_exec(
41+
*command,
42+
stdout=asyncio.subprocess.PIPE,
43+
stderr=asyncio.subprocess.PIPE
44+
)
45+
# Wait for the command to complete and capture output
46+
stdout, stderr = await process.communicate()
47+
returncode = process.returncode
48+
49+
# Process the results outside the semaphore block
50+
if returncode != 0:
51+
logging.error(f"Skopeo command failed for {image_url} with exit code {returncode}.")
52+
if stderr:
53+
logging.error(f"Stderr: {stderr.decode().strip()}")
54+
return image_url, None
55+
56+
if not stdout:
57+
logging.error(f"Skopeo command returned success but stdout was empty for {image_url}.")
5058
return image_url, None
5159

5260
# Decode and parse the JSON output from stdout
@@ -67,7 +75,10 @@ async def get_image_vcs_ref(image_url: str) -> tuple[str, str | None]:
6775
logging.error("The 'skopeo' command was not found. Please ensure it is installed and in your PATH.")
6876
return image_url, None
6977
except json.JSONDecodeError:
78+
# This error can now also happen if stdout is None or not valid JSON
7079
logging.error(f"Failed to parse skopeo output as JSON for {image_url}.")
80+
if stdout:
81+
logging.debug(f"Stdout from skopeo for {image_url}: {stdout.decode(errors='replace')}")
7182
return image_url, None
7283
except Exception as e:
7384
logging.error(f"An unexpected error occurred while processing {image_url}: {e}")
@@ -78,7 +89,8 @@ async def inspect(images_to_inspect: typing.Iterable[str]) -> list[tuple[str, st
7889
"""
7990
Main function to orchestrate the concurrent inspection of multiple images.
8091
"""
81-
tasks = [get_image_vcs_ref(image) for image in images_to_inspect]
92+
semaphore = asyncio.Semaphore(22) # Limit concurrent skopeo processes
93+
tasks = [get_image_vcs_ref(image, semaphore) for image in images_to_inspect]
8294
return await asyncio.gather(*tasks)
8395

8496

0 commit comments

Comments
 (0)