[None][infra] Add container scanning to plc nightly pipeline #12549
yuanjingx87 wants to merge 14 commits into NVIDIA:main from
Conversation
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
📝 Walkthrough

Restructures Jenkins pipeline scanning from a monolithic vulnerability reporting script into a modular Python package with separate utilities for Elasticsearch operations, report diffing, and Slack messaging. Introduces container image scanning via a new Bash script for image key retrieval. Adds pipeline validation functions and reorganizes scanning stages to execute source code and container scanning in parallel, followed by a dedicated result processing stage.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 17
🧹 Nitpick comments (18)
jenkins/scripts/pulse_in_pipeline_scanning/utils/slack.py (1)
22-25: Variable naming: dependencyReport should use snake_case. Per coding guidelines, Python local variables should use snake_case.

Proposed fix
```diff
-    dependencyReport = f"New risk found from nightly scanning ({branch} branch)\n{risk_detail}"
-    slack_payload = {"report": dependencyReport, "dashboardUrl": dashboard_link}
-    slack_resp = requests.post(SLACK_WEBHOOK_URL, json=slack_payload, timeout=60)
+    dependency_report = f"New risk found from nightly scanning ({branch} branch)\n{risk_detail}"
+    slack_payload = {"report": dependency_report, "dashboardUrl": dashboard_link}
+    slack_resp = requests.post(slack_webhook_url, json=slack_payload, timeout=60)
     slack_resp.raise_for_status()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/slack.py` around lines 22-25: the local variable dependencyReport should be renamed to snake_case (e.g., dependency_report) and all subsequent uses updated accordingly; update the assignment (currently building the f-string with branch and risk_detail), the payload construction (slack_payload = {"report": dependency_report, "dashboardUrl": dashboard_link}), and any references before calling requests.post(SLACK_WEBHOOK_URL, json=slack_payload, timeout=60) and slack_resp.raise_for_status() so the new variable name is used consistently.

jenkins/scripts/pulse_in_pipeline_scanning/main.py (3)
30-31: Unused constants SEVERITY_RANK and ES_HEADERS. These constants are defined but never used in the code.
Proposed fix
```diff
-SEVERITY_RANK = {"Critical": 4, "High": 3, "Medium": 2, "Low": 1}
-ES_HEADERS = {"Content-Type": "application/json"}
-
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/main.py` around lines 30 - 31, Remove or use the unused constants SEVERITY_RANK and ES_HEADERS in main.py: either delete the SEVERITY_RANK and ES_HEADERS definitions if they are not needed, or replace inline literal maps/headers elsewhere with references to SEVERITY_RANK and ES_HEADERS (e.g., use SEVERITY_RANK to normalize/compare severity strings and ES_HEADERS for HTTP/Elasticsearch requests). Update any functions that construct severity comparisons or HTTP requests to import/reference these symbols instead of hardcoding values so the constants are actually used.
44-44: Function name process_container_result is misleading. This function processes both source code and container scan results, not just container results. Consider renaming to process_scan_results for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/main.py` at line 44, Rename the misleading function process_container_result to process_scan_results and update all references/calls accordingly; locate the function definition named process_container_result and change its name, then update any imports, callers, tests or registrations that reference process_container_result to use process_scan_results so the function name accurately reflects that it handles both source code and container scan results.
14-16: Unused constants SOURCE_CODE_VULNERABILITY and SOURCE_CODE_SBOM. These constants are defined but never used in the code. Remove them or use them.
Proposed fix
```diff
-# this json file will be generated from pulse in pipeline scanning
-SOURCE_CODE_VULNERABILITY = "./nspect_scan_report.json"
-SOURCE_CODE_SBOM = "./sbom_toupload.json"
-
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/main.py` around lines 14-16: the two constants SOURCE_CODE_VULNERABILITY and SOURCE_CODE_SBOM are declared but never used; either remove these unused declarations or replace any hardcoded path usages ("./nspect_scan_report.json" and "./sbom_toupload.json") with these constants. Locate occurrences of the literal strings in this module (or any imported helpers that read/write the scan and SBOM files) and swap them to use SOURCE_CODE_VULNERABILITY and SOURCE_CODE_SBOM, or delete the constant definitions if there are no consumers; ensure imports/exports remain consistent if you change usage.

jenkins/TensorRT_LLM_PLC.groovy (1)
355-361: Commented-out code should be removed or re-enabled. The generateLockFiles call is commented out. If it's no longer needed, remove it. If it's temporarily disabled, add a comment explaining why and when it should be re-enabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/TensorRT_LLM_PLC.groovy` around lines 355-361: the commented-out generateLockFiles call inside the stage("Source Code OSS Scanning") should not be left commented out without context; either remove the commented line entirely or re-enable it. Locate the generateLockFiles(env.LLM_REPO, env.BRANCH_NAME) call in the same block that invokes pulseScanSourceCode(env.LLM_REPO, env.BRANCH_NAME) and either delete the commented line if it's obsolete, or uncomment it and add a short explanatory comment (e.g., why it was disabled and when to re-enable) adjacent to the generateLockFiles call so future readers understand its intent.

jenkins/scripts/pulse_in_pipeline_scanning/utils/es.py (2)
54-111: Add return type hint and consider clearing scroll context. The function should have a return type hint. Also, the scroll context is not explicitly cleared after use, which could leak resources on the Elasticsearch server.
Proposed fix
```diff
-def get_last_scan_results(report_type: str, branch: str):
+def get_last_scan_results(report_type: str, branch: str) -> dict[tuple[str, str], int]:
     data = ES_CLIENT.search(
         ...
     )
     ...
     while len(hits) > 0:
         resp = ES_CLIENT.scroll(scroll_id=scroll_id, scroll="2m")
         scroll_id = resp["_scroll_id"]
         hits = resp["hits"]["hits"]
         docs.extend(hits)
+    # Clear scroll context
+    ES_CLIENT.clear_scroll(scroll_id=scroll_id)
+
     detected_dependencies = {}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/es.py` around lines 54 - 111, In get_last_scan_results add an explicit return type hint (e.g., -> Dict[Tuple[str, str], int]) and ensure the Elasticsearch scroll context is cleared after use to avoid resource leaks: wrap the scrolling logic in a try/finally (or ensure cleanup after the while loop) and call ES_CLIENT.clear_scroll(scroll_id=scroll_id) only if scroll_id is present; also import any required typing names (Dict, Tuple) at the top. This keeps the existing behavior (returning {} when no aggregations) but documents the return type and guarantees the scroll is cleared.
26-51: Add type hints to the es_post function. Per coding guidelines, Python functions should be annotated with type hints.
Proposed fix
```diff
-def es_post(url, documents):
+def es_post(url: str, documents: list[dict]) -> tuple[int, bool]:
     """POST a list of documents to an Elasticsearch index and return (indexed, errors)."""
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/es.py` around lines 26-51: add static type annotations to the es_post function: annotate the parameters and return type (for example: def es_post(url: str, documents: Sequence[Dict[str, Any]]) -> Tuple[int, bool]:) and add the necessary typing imports (e.g., from typing import Sequence, Dict, Any, Tuple) at the top of the module; ensure the annotated types match current usage (documents is an iterable of mapping/dict) and leave the function body logic unchanged.

jenkins/scripts/pulse_in_pipeline_scanning/utils/common.py (1)
6-14: Add type hints to functions. Per coding guidelines, Python functions should be annotated with type hints.
Proposed fix
```diff
-def load_json(path):
+def load_json(path: str) -> dict:
     with open(path) as f:
         return json.load(f)

-def is_non_permissive(licenses):
+def is_non_permissive(licenses: list[str]) -> bool:
     return len(licenses) == 0 or any(
         lic.startswith(NON_PERMISSIVE_LICENSE_PREFIXES) for lic in licenses
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/common.py` around lines 6-14: add missing type annotations to comply with coding guidelines: annotate load_json to accept a path parameter as str and return a deserialized object (e.g., Any or dict[str, Any]) and annotate is_non_permissive to accept licenses as an iterable/sequence of strings (e.g., Sequence[str]) and return bool; update function signatures for load_json and is_non_permissive and import typing names (Any, Sequence) as needed, keeping references to NON_PERMISSIVE_LICENSE_PREFIXES unchanged.

jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py (3)
1-3: Inconsistent import style: mixing absolute and relative imports. Line 1 uses an absolute import (from utils.es import ...) while Line 3 uses a relative import (from .common import ...). Consider using a consistent import style.

Proposed fix (use relative imports consistently)
```diff
-from utils.es import get_latest_license_preapproved_container_deps
+from .es import get_latest_license_preapproved_container_deps
 from .common import is_non_permissive, load_json
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py` around lines 1 - 3, The imports in report.py mix absolute and relative styles; change the absolute import of get_latest_license_preapproved_container_deps to a relative import to match the existing relative imports (e.g., replace the current from utils.es import get_latest_license_preapproved_container_deps with a relative form like from .es import get_latest_license_preapproved_container_deps) so all imports (get_latest_license_preapproved_container_deps, is_non_permissive, load_json) use the same module resolution style.
35-41: Function diff_vulns doesn't actually compute a diff. The function name suggests it computes a diff between release and base, but it only filters release vulnerabilities by severity. If this is intentional (report all high-severity vulns), consider renaming to filter_high_severity_vulns for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py` around lines 35 - 41, The function diff_vulns currently only filters release vulnerabilities by HIGH_SEVERITY and doesn't compute a diff; either rename it to filter_high_severity_vulns and update all callers, or implement a true diff: change diff_vulns signature to accept a base path (e.g., diff_vulns(release_path, base_path)), load both JSONs via load_json, dedup both with dedup_vulns, then compute introduced_vulns as items present in release_vulns but not in base_vulns (compare by unique keys/IDs) and finally filter that result by HIGH_SEVERITY; keep references to load_json, dedup_vulns, diff_vulns and HIGH_SEVERITY when updating callers and tests.
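If the true-diff route is taken, the logic can be sketched as follows. This is a minimal sketch, not the repository's code: the key fields mirror the dedup key named in the review, and the HIGH_SEVERITY labels are assumptions.

```python
HIGH_SEVERITY = frozenset({"Critical", "High"})  # assumed severity labels

def diff_vulns(release_vulns: list[dict], base_vulns: list[dict]) -> list[dict]:
    """Return high-severity vulns present in the release scan but absent from the base scan."""
    def key(v: dict) -> tuple:
        # Assumed unique key: (vuln, package_name, package_version)
        return (v.get("vuln"), v.get("package_name"), v.get("package_version"))

    base_keys = {key(v) for v in base_vulns}
    return [
        v for v in release_vulns
        if key(v) not in base_keys and v.get("severity") in HIGH_SEVERITY
    ]
```

Comparing a scan against itself yields an empty list, which is the property a true diff should have.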
8-41: Add type hints to functions. Per coding guidelines, Python functions should be annotated with type hints.
Proposed fix
```diff
-def dedup_vulns(vulns):
+def dedup_vulns(vulns: list[dict]) -> dict[tuple, dict]:
     """Deduplicate by (vuln, package_name, package_version), merging package_paths."""
     ...

-def dedup_licenses(contents):
+def dedup_licenses(contents: list[dict]) -> dict[str, dict]:
     """Deduplicate by (package, version, type), merging license lists."""
     ...

-def diff_vulns(release_path):
+def diff_vulns(release_path: str) -> list[dict]:
     ...

-def diff_licenses(release_path, base_path):
+def diff_licenses(release_path: str, base_path: str) -> list[dict]:
     ...
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py` around lines 8-41: add Python type hints and necessary typing imports for the three functions: annotate dedup_vulns(vulns: List[Dict[str, Any]]) -> Dict[Tuple[str, str, str], Dict[str, Any]]; dedup_licenses(contents: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]; and diff_vulns(release_path: str) -> List[Dict[str, Any]]. Import required symbols from typing (List, Dict, Any, Tuple) at the top of the module and update the function signatures (dedup_vulns, dedup_licenses, diff_vulns) accordingly; keep existing runtime logic unchanged.

jenkins/scripts/get_image_key_to_tag.sh (2)
50-60: Fixed /tmp path may cause issues with concurrent executions. Using a fixed path /tmp/imageKeyToTag.json can cause race conditions if multiple instances of this script run simultaneously. Consider using mktemp.

Proposed fix
```diff
+TEMP_FILE=$(mktemp)
+trap "rm -f ${TEMP_FILE}" EXIT
+
 while [[ "${BUILD_NUMBER}" -gt 0 ]]; do
     ARTIFACT_URL="${ARTIFACTORY_BASE}/${BUILD_NUMBER}/imageKeyToTag.json"
     echo "Fetching: ${ARTIFACT_URL}" >&2
-    HTTP_STATUS=$(curl -sL -o /tmp/imageKeyToTag.json -w "%{http_code}" "${ARTIFACT_URL}")
+    HTTP_STATUS=$(curl -sL -o "${TEMP_FILE}" -w "%{http_code}" "${ARTIFACT_URL}")
     if [[ "${HTTP_STATUS}" == "200" ]]; then
-        cat /tmp/imageKeyToTag.json
+        cat "${TEMP_FILE}"
         exit 0
     fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/get_image_key_to_tag.sh` around lines 50 - 60, The script currently writes to a fixed path /tmp/imageKeyToTag.json causing races; change the curl output to a unique temp file created via mktemp (e.g., TMPFILE=$(mktemp)) and replace all references to /tmp/imageKeyToTag.json with that variable (used in the curl -o and the subsequent cat). Ensure you set a trap to remove the temp file on exit or failure (trap 'rm -f "$TMPFILE"' EXIT) and handle failure to create the temp file before entering the loop; update references to HTTP_STATUS, ARTIFACT_URL and BUILD_NUMBER logic to use the temporary file.
35-41: Comment says "last successful build" but queries lastBuild endpoint. The Line 35 comment mentions "last successful build" but the code queries lastBuild, which returns the most recent build regardless of status. If you want only successful builds, use lastSuccessfulBuild/api/json instead.

Proposed fix if a successful build is intended
```diff
 # Query Jenkins API for the last successful build number
-BUILD_NUMBER=$(curl -fsSL "${JENKINS_BASE}/lastBuild/api/json" | python3 -c "import json,sys; print(json.load(sys.stdin)['number'])" 2>/dev/null)
+BUILD_NUMBER=$(curl -fsSL "${JENKINS_BASE}/lastSuccessfulBuild/api/json" | python3 -c "import json,sys; print(json.load(sys.stdin)['number'])" 2>/dev/null)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/get_image_key_to_tag.sh` around lines 35-41: the script is claiming to fetch the "last successful build" but uses the lastBuild endpoint; update the first curl that sets BUILD_NUMBER (which currently calls "${JENKINS_BASE}/lastBuild/api/json") to use "${JENKINS_BASE}/lastSuccessfulBuild/api/json" so BUILD_NUMBER reflects a successful build, and keep the existing fallback to lastCompletedBuild/api/json for cases where a successful build is not available; ensure the variable BUILD_NUMBER and the curl/JENKINS_BASE usages are the only changes.

jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py (5)
88-90: Add return type annotation.

🔧 Proposed fix
```diff
 def submit_source_code_licenses(
     input_file: str, last_scan_result: dict, build_metadata: BuildMetadata, start_datetime: datetime
-):
+) -> list[dict]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py` around lines 88 - 90, The function submit_source_code_licenses lacks an explicit return type annotation; update its signature to include the appropriate return type (for example -> None if it does not return a value, or -> dict/list/Optional[...] if it returns data) by modifying the def submit_source_code_licenses(...): line to def submit_source_code_licenses(...) -> <ReturnType>: and ensure any internal return statements match that type.
28-29: Add type hints to the safe function. Per coding guidelines, always annotate functions with type hints. Also note that this implementation treats empty strings, 0, and False as falsy and will return default for these values.

🔧 Proposed fix with type hints
```diff
-def safe(value, default=None):
-    return value if value else default
+from typing import TypeVar
+
+T = TypeVar("T")
+
+def safe(value: T | None, default: T | None = None) -> T | None:
+    return value if value is not None else default
```

Note: If empty strings should also fall back to default, keep the original if value check but document the behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py` around lines 28 - 29, Add type hints to the safe function: declare a generic TypeVar T and annotate the parameter as Optional[T] (or Any if you prefer) and the return type as Optional[T] (or T | None) so callers have static typing; update the function signature for safe and import typing symbols accordingly, and explicitly document its current falsy semantics (it returns default for values like "", 0, False) or, if you want to treat only None as missing, change the runtime check from "if value" to "if value is not None" and keep the same typed signature; reference the function name safe when locating the change.
193-200: Add return type annotation.

🔧 Proposed fix
```diff
 def submit_container_licenses(
     input_file: str,
     base_input_file: str,
     platform: str,
     last_scan_result: dict,
     build_metadata: BuildMetadata,
     start_datetime: datetime,
-):
+) -> list[dict]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py` around lines 193 - 200, The function submit_container_licenses is missing a return type annotation; add an explicit return type (e.g., -> None if it does not return a value or the appropriate type if it does) to the function signature so callers and type checkers know the expected return; update the def submit_container_licenses(...) signature to include the correct return type annotation.
32-34: Add return type annotation. Per coding guidelines, always annotate functions with type hints including return types.
🔧 Proposed fix
```diff
 def submit_source_code_vulns(
     input_file: str, last_scan_result: dict, build_metadata: BuildMetadata, start_datetime: datetime
-):
+) -> list[dict]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py` around lines 32 - 34, The function submit_source_code_vulns is missing an explicit return type annotation—update its signature to include a return type (for example -> None if it has no return value, or the specific type it returns such as -> dict or -> List[Vulnerability] according to its implementation) so the signature reads submit_source_code_vulns(input_file: str, last_scan_result: dict, build_metadata: BuildMetadata, start_datetime: datetime) -> <ReturnType>; inspect the function body to choose the correct type and add the annotation.
140-147: Add return type annotation.

🔧 Proposed fix
```diff
 def submit_container_vulns(
     input_file: str,
     base_input_file: str,
     platform: str,
     last_scan_result: dict,
     build_metadata: BuildMetadata,
     start_datetime: datetime,
-):
+) -> list[dict]:
```

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py` around lines 140 - 147, The function submit_container_vulns is missing a return type annotation; update its signature to include an explicit return type (most likely "-> None") so the signature becomes submit_container_vulns(... ) -> None:, and ensure any early returns or final return statements are consistent with that type; locate the function definition by the name submit_container_vulns and add the annotation without changing the function body.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@jenkins/scripts/pulse_in_pipeline_scanning/main.py`:
- Line 86: There's a typo in the variable name last_container_licneses; rename
it to last_container_licenses wherever it's defined and referenced (the
assignment calling get_last_scan_results("container_license", args.branch) and
any later uses) to fix the misspelling and avoid NameError/clarity issues.
- Around line 1-12: Add the required NVIDIA copyright header to the top of the
file (before any imports) in the jenkins pulse_in_pipeline_scanning main module so
the new Python file includes the standard NVIDIA header block with the correct
year and owner; ensure it precedes the import statements and retains existing
imports/logic (imports like argparse, os, datetime, and the submit_* and utils.*
imports should remain unchanged).
In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_preapproved.py`:
- Around line 75-79: The default CSV path in the parser.add_argument call for
"--csv" is a developer-local absolute path; update the parser.add_argument for
"--csv" in submit_preapproved.py to avoid that hardcoded default by either
making the flag required (set required=True and remove default) or set a
repository-relative default using Path(__file__).resolve().parents[...] joined
with the relative path to pre_approved.csv; update the help text accordingly to
reflect that the CSV must be provided or that a repo-relative file will be used.
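The repo-relative-default option can be sketched as below. This is a hedged sketch: the filename and the assumption that pre_approved.csv sits next to the script are illustrative, not the actual repository layout.

```python
import argparse
from pathlib import Path

# Hypothetical default: pre_approved.csv kept next to this script rather than
# at a developer-local absolute path.
DEFAULT_CSV = Path(__file__).resolve().parent / "pre_approved.csv"

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--csv",
        type=Path,
        default=DEFAULT_CSV,
        help=f"Path to the pre-approved dependency CSV (default: {DEFAULT_CSV})",
    )
    return parser
```

The default is absolute (resolved from __file__), so it no longer depends on where the script is invoked from.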
- Around line 1-9: This file is missing the required NVIDIA copyright header;
add the standard NVIDIA copyright and license header block at the very top of
submit_preapproved.py (before any imports) so it precedes the existing import
statements and module-level code; ensure the header matches the project's
canonical NVIDIA header text used in other Python files in the repo.
- Around line 16-41: Remove the duplicated es_post implementation and instead
import and call the shared function from utils.es; locate the local es_post in
submit_preapproved.py and replace its body with "from utils.es import es_post"
(i.e. remove the function definition and add the import), and if utils.es
triggers environment-dependent failures at import time, refactor utils/es.py to
defer env reads or make its top-level safe to import (or add a thin wrapper in
submit_preapproved.py that initializes required configuration before importing
es_post) so importing utils.es does not raise during module import.
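Deferring the env read can be sketched with a lazily cached accessor, so the module body itself never touches the environment at import time. The ES_URL variable name here is a hypothetical stand-in, not the repository's actual configuration key.

```python
import os
from functools import lru_cache

@lru_cache(maxsize=1)
def get_es_url() -> str:
    """Read Elasticsearch config on first use, so importing the module never raises."""
    url = os.environ.get("ES_URL")  # hypothetical env var name
    if not url:
        raise EnvironmentError("ES_URL is not set")
    return url
```

With this pattern, `from utils.es import es_post` succeeds even in environments where the scanner's credentials are absent; only actually calling the function requires the configuration.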
In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py`:
- Around line 129-133: The code unconditionally calls es_post with
sbom_documents which can be empty; add a guard to only call es_post when
sbom_documents (or a derived sbom_count) is non-empty and skip posting
otherwise, e.g., check "if sbom_documents:" before calling es_post and only
raise RuntimeError if sbom_errors are returned after a real post; update the
block referencing es_post, sbom_documents, and sbom_errors accordingly.
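A minimal sketch of that guard, with es_post passed in as a callable so the shape is self-contained (the (indexed, errors) return tuple matches the signature discussed elsewhere in this review):

```python
def post_documents(es_post, url: str, documents: list) -> int:
    """Call es_post only for non-empty batches; an empty batch is a no-op."""
    if not documents:
        return 0
    indexed, errors = es_post(url, documents)
    if errors:
        raise RuntimeError(f"Elasticsearch reported errors while indexing to {url}")
    return indexed
```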
- Around line 8-11: The current sys.path.append(os.path.abspath("..")) is
fragile because it uses the process CWD; change it to compute the parent dir
relative to this script using __file__ and add that absolute path to sys.path
(preferably via sys.path.insert(0, ...)) so imports like load_json, es_post,
diff_licenses and diff_vulns resolve reliably regardless of where
submit_report.py is invoked from; update the sys.path manipulation line
accordingly to use os.path.dirname(__file__) and os.path.join to get the parent
folder.
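The __file__-based path manipulation described above can be sketched as:

```python
import os
import sys

# Parent directory of this file, independent of the process working directory.
PARENT_DIR = os.path.abspath(
    os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir)
)
if PARENT_DIR not in sys.path:
    sys.path.insert(0, PARENT_DIR)
```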
- Around line 1-6: Add the required NVIDIA copyright header (with the current
year) to the top of this new file so it conforms to project guidelines; insert
the header above the existing imports (before the first line containing "import
json") so the file-level metadata is present for submit_report.py.
- Around line 209-225: The dedupe key uses the wrong field names: in the loop
over trtllm_deps you build result_key from v.get("package_name") and
v.get("package_version") but diff_licenses() yields "package" and "version";
update the result_key construction in the loop (and any related checks using
result_key) to use v.get("package") and v.get("version") so the is_new check
against last_scan_result works correctly (refer to the loop over trtllm_deps,
the result_key variable, and the diff_licenses() output).
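The corrected key construction can be sketched like this; the helper name is hypothetical, but the field names match what the review says diff_licenses() emits:

```python
def is_new_dependency(dep: dict, last_scan_result: dict) -> bool:
    """Key on the 'package'/'version' fields that diff_licenses() emits,
    not 'package_name'/'package_version'."""
    result_key = (dep.get("package"), dep.get("version"))
    return result_key not in last_scan_result
```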
- Line 6: The import uses NotRequired from typing which isn't available on
Python 3.10; update the import to pull NotRequired from typing_extensions (e.g.,
replace "from typing import NotRequired, TypedDict" with importing NotRequired
from typing_extensions while keeping TypedDict from typing) so NotRequired is
compatible with Python 3.10 as used elsewhere in the codebase.
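One common way to express that compatibility, sketched under the assumption that typing_extensions is available on the 3.10 interpreters in question:

```python
from typing import TypedDict

try:
    from typing import NotRequired  # available on Python 3.11+
except ImportError:  # Python 3.10 fallback
    from typing_extensions import NotRequired

class BuildMetadata(TypedDict):
    branch: str
    image_tag: NotRequired[str]  # optional key
```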
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/common.py`:
- Around line 1-4: This file is missing the required NVIDIA copyright header;
update the top of jenkins/scripts/pulse_in_pipeline_scanning/utils/common.py to
add the standard NVIDIA Python file header comment block (including copyright
year(s) and "NVIDIA CORPORATION" owner text) before any imports/definitions so
the module (containing NON_PERMISSIVE_LICENSE_PREFIXES) includes the legal
header.
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/es.py`:
- Around line 114-122: get_latest_license_preapproved_container_deps currently
assumes ES_CLIENT.search returned at least one hit and directly indexes
data["hits"]["hits"][0], which can raise IndexError when no documents exist;
update the function to check that data.get("hits", {}).get("hits") is a
non-empty list before accessing [0], handle missing "_source" or missing
"preapproved_deps" by returning an empty list, and keep references to ES_CLIENT
and ES_INDEX_PREAPPROVED_BASE intact when implementing the guard.
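The guarded extraction can be sketched on the response dict alone (the function name is hypothetical; the hits/_source shape follows Elasticsearch's search response):

```python
def extract_preapproved_deps(data: dict) -> list:
    """Safely pull preapproved_deps from an Elasticsearch search response."""
    hits = data.get("hits", {}).get("hits") or []
    if not hits:
        return []
    return hits[0].get("_source", {}).get("preapproved_deps", [])
```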
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py`:
- Around line 1-5: Add the required NVIDIA copyright header at the very top of
this new Python module (before any imports) — modify report.py to prepend the
standard NVIDIA header block above the existing imports (which include
get_latest_license_preapproved_container_deps, is_non_permissive, load_json) and
constants such as HIGH_SEVERITY so the file complies with the repository's
header policy.
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/slack.py`:
- Around line 8-12: In post_slack_msg, rename the local constant to snake_case
(e.g., slack_webhook_url), read the correct env var key ("TRTLLM_PLC_WEBHOOK")
with os.environ.get("TRTLLM_PLC_WEBHOOK"), and update the raised
EnvironmentError message to reference that same key ("TRTLLM_PLC_WEBHOOK") so
the variable name and error text match; ensure all subsequent uses in that
function use the new slack_webhook_url identifier.
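A minimal sketch of that read, keeping the variable name, the env key, and the error message in agreement:

```python
import os

def get_slack_webhook_url() -> str:
    """Read the webhook URL; the error message names the same key that is read."""
    slack_webhook_url = os.environ.get("TRTLLM_PLC_WEBHOOK")
    if not slack_webhook_url:
        raise EnvironmentError("TRTLLM_PLC_WEBHOOK is not set")
    return slack_webhook_url
```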
- Around line 1-6: This file is missing the required NVIDIA copyright header at
the top; add the standard NVIDIA copyright header block (including the year of
latest meaningful modification) as the first lines of the module in
jenkins/scripts/pulse_in_pipeline_scanning/utils/slack.py so the header precedes
the imports (i.e., before the existing imports like datetime, timezone, quote,
requests) and matches the project's canonical header format used in other .py
files.
In `@jenkins/TensorRT_LLM_PLC.groovy`:
- Around line 228-232: The call that sets `output` currently hardcodes "main"
when invoking `./jenkins/scripts/get_image_key_to_tag.sh`; restore the
parameterized invocation by using the branch parameter (e.g.,
`params.branchName` or the pipeline `branchName` variable) so the script
receives the actual branch, and ensure the script argument is passed via the
`script:` string interpolation used previously; update the `sh` invocation that
assigns `output` (the `get_image_key_to_tag.sh` call) to use the parameter
rather than the literal "main".
- Around line 302-304: Remove the unsafe global environment dump in the shell
step that currently runs printenv (the sh """ ... printenv ...""" block);
instead either delete the printenv invocation entirely or replace it with
explicit, non-sensitive variable echoes (e.g., only echo known-safe names) or a
whitelist-based print routine so secrets like tokens/webhooks are never emitted
to build logs; update the shell block that creates the virtualenv (the sh block
running "python3 -m venv venv") accordingly to avoid printing all environment
variables.
---
Nitpick comments:
In `@jenkins/scripts/get_image_key_to_tag.sh`:
- Around line 50-60: The script currently writes to a fixed path
/tmp/imageKeyToTag.json causing races; change the curl output to a unique temp
file created via mktemp (e.g., TMPFILE=$(mktemp)) and replace all references to
/tmp/imageKeyToTag.json with that variable (used in the curl -o and the
subsequent cat). Ensure you set a trap to remove the temp file on exit or
failure (trap 'rm -f "$TMPFILE"' EXIT) and handle failure to create the temp
file before entering the loop; update references to HTTP_STATUS, ARTIFACT_URL
and BUILD_NUMBER logic to use the temporary file.
- Around line 35-41: The script is claiming to fetch the "last successful build"
but uses the lastBuild endpoint; update the first curl that sets BUILD_NUMBER
(which currently calls "${JENKINS_BASE}/lastBuild/api/json") to use
"${JENKINS_BASE}/lastSuccessfulBuild/api/json" so BUILD_NUMBER reflects a
successful build, and keep the existing fallback to lastCompletedBuild/api/json
for cases where a successful build is not available; ensure the variable
BUILD_NUMBER and the curl/JENKINS_BASE usages are the only changes.
In `@jenkins/scripts/pulse_in_pipeline_scanning/main.py`:
- Around line 30-31: Remove or use the unused constants SEVERITY_RANK and
ES_HEADERS in main.py: either delete the SEVERITY_RANK and ES_HEADERS
definitions if they are not needed, or replace inline literal maps/headers
elsewhere with references to SEVERITY_RANK and ES_HEADERS (e.g., use
SEVERITY_RANK to normalize/compare severity strings and ES_HEADERS for
HTTP/Elasticsearch requests). Update any functions that construct severity
comparisons or HTTP requests to import/reference these symbols instead of
hardcoding values so the constants are actually used.
- Line 44: Rename the misleading function process_container_result to
process_scan_results and update all references/calls accordingly; locate the
function definition named process_container_result and change its name, then
update any imports, callers, tests or registrations that reference
process_container_result to use process_scan_results so the function name
accurately reflects that it handles both source code and container scan results.
- Around line 14-16: The two constants SOURCE_CODE_VULNERABILITY and
SOURCE_CODE_SBOM are declared but never used; either remove these unused
declarations or replace any hardcoded path usages ("./nspect_scan_report.json"
and "./sbom_toupload.json") with these constants. Locate occurrences of the
literal strings in this module (or any imported helpers that read/write the scan
and SBOM files) and swap them to use SOURCE_CODE_VULNERABILITY and
SOURCE_CODE_SBOM, or delete the constant definitions if there are no consumers;
ensure imports/exports remain consistent if you change usage.
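If the constants are kept rather than deleted, the fix is simply to consume them. A sketch of what that might look like — the rank values and sorting helper are hypothetical; only the constant names and file paths come from the prompts:

```python
# Constants flagged as unused; wiring them into helpers makes the
# flagged definitions load-bearing.
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}  # assumed ranks
SOURCE_CODE_VULNERABILITY = "./nspect_scan_report.json"
SOURCE_CODE_SBOM = "./sbom_toupload.json"


def sort_by_severity(vulns):
    """Order vulnerabilities via SEVERITY_RANK instead of an inline literal map."""
    return sorted(vulns, key=lambda v: SEVERITY_RANK.get(v["severity"], len(SEVERITY_RANK)))


vulns = [{"severity": "LOW"}, {"severity": "CRITICAL"}, {"severity": "HIGH"}]
print([v["severity"] for v in sort_by_severity(vulns)])
```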
In `@jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py`:
- Around line 88-90: The function submit_source_code_licenses lacks an explicit
return type annotation; update its signature to include the appropriate return
type (for example -> None if it does not return a value, or ->
dict/list/Optional[...] if it returns data) by modifying the def
submit_source_code_licenses(...): line to def submit_source_code_licenses(...)
-> <ReturnType>: and ensure any internal return statements match that type.
- Around line 28-29: Add type hints to the safe function: declare a generic
TypeVar T and annotate the parameter as Optional[T] (or Any if you prefer) and
the return type as Optional[T] (or T | None) so callers have static typing;
update the function signature for safe and import typing symbols accordingly,
and explicitly document its current falsy semantics (it returns default for
values like "", 0, False) or, if you want to treat only None as missing, change
the runtime check from "if value" to "if value is not None" and keep the same
typed signature; reference the function name safe when locating the change.
- Around line 193-200: The function submit_container_licenses is missing a
return type annotation; add an explicit return type (e.g., -> None if it does
not return a value or the appropriate type if it does) to the function signature
so callers and type checkers know the expected return; update the def
submit_container_licenses(...) signature to include the correct return type
annotation.
- Around line 32-34: The function submit_source_code_vulns is missing an
explicit return type annotation—update its signature to include a return type
(for example -> None if it has no return value, or the specific type it returns
such as -> dict or -> List[Vulnerability] according to its implementation) so
the signature reads submit_source_code_vulns(input_file: str, last_scan_result:
dict, build_metadata: BuildMetadata, start_datetime: datetime) -> <ReturnType>;
inspect the function body to choose the correct type and add the annotation.
- Around line 140-147: The function submit_container_vulns is missing a return
type annotation; update its signature to include an explicit return type (most
likely "-> None") so the signature becomes submit_container_vulns(... ) ->
None:, and ensure any early returns or final return statements are consistent
with that type; locate the function definition by the name
submit_container_vulns and add the annotation without changing the function
body.
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/common.py`:
- Around line 6-14: Add missing type annotations to comply with coding
guidelines: annotate load_json to accept a path parameter as str and return a
deserialized object (e.g., Any or dict[str, Any]) and annotate is_non_permissive
to accept licenses as an iterable/sequence of strings (e.g., Sequence[str]) and
return bool; update function signatures for load_json and is_non_permissive and
import typing names (Any, Sequence) as needed, keeping references to
NON_PERMISSIVE_LICENSE_PREFIXES unchanged.
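A typed version of the two helpers, assuming `load_json` returns arbitrary parsed JSON and `is_non_permissive` does prefix matching; the prefix tuple here is illustrative, not the project's real list:

```python
import json
from typing import Any, Sequence

NON_PERMISSIVE_LICENSE_PREFIXES = ("GPL", "AGPL", "LGPL")  # illustrative values


def load_json(path: str) -> Any:
    """Deserialize a JSON file; the payload shape varies, hence Any."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def is_non_permissive(licenses: Sequence[str]) -> bool:
    """True if any license matches a non-permissive prefix."""
    return any(
        lic.startswith(prefix)
        for lic in licenses
        for prefix in NON_PERMISSIVE_LICENSE_PREFIXES
    )


print(is_non_permissive(["MIT", "GPL-3.0"]))
print(is_non_permissive(["MIT", "Apache-2.0"]))
```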
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/es.py`:
- Around line 54-111: In get_last_scan_results add an explicit return type hint
(e.g., -> Dict[Tuple[str, str], int]) and ensure the Elasticsearch scroll
context is cleared after use to avoid resource leaks: wrap the scrolling logic
in a try/finally (or ensure cleanup after the while loop) and call
ES_CLIENT.clear_scroll(scroll_id=scroll_id) only if scroll_id is present; also
import any required typing names (Dict, Tuple) at the top. This keeps the
existing behavior (returning {} when no aggregations) but documents the return
type and guarantees the scroll is cleared.
- Around line 26-51: Add static type annotations to the es_post function:
annotate the parameters and return type (for example: def es_post(url: str,
documents: Sequence[Dict[str, Any]]) -> Tuple[int, bool]:) and add the necessary
typing imports (e.g., from typing import Sequence, Dict, Any, Tuple) at the top
of the module; ensure the annotated types match current usage (documents is an
iterable of mapping/dict) and leave the function body logic unchanged.
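The scroll-cleanup fix can be sketched against an injectable client. The `FakeES` stub stands in for `ES_CLIENT` so the control flow is checkable offline; the real `get_last_scan_results` also aggregates results, which is omitted here:

```python
from typing import Any, Dict, List


def scroll_all_hits(es_client: Any, index: str, query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect every hit from a scrolled search, clearing the scroll
    context in the finally block so server resources are always released."""
    hits: List[Dict[str, Any]] = []
    scroll_id = None
    try:
        resp = es_client.search(index=index, body=query, scroll="2m")
        scroll_id = resp.get("_scroll_id")
        while True:
            batch = resp["hits"]["hits"]
            if not batch:
                break
            hits.extend(batch)
            resp = es_client.scroll(scroll_id=scroll_id, scroll="2m")
            scroll_id = resp.get("_scroll_id")
    finally:
        if scroll_id:
            es_client.clear_scroll(scroll_id=scroll_id)
    return hits


class FakeES:
    """Two-page stub standing in for the Elasticsearch client."""

    def __init__(self):
        self.cleared = False
        self.pages = [[{"_id": "a"}, {"_id": "b"}], [{"_id": "c"}], []]

    def search(self, **kwargs):
        return {"_scroll_id": "s1", "hits": {"hits": self.pages[0]}}

    def scroll(self, **kwargs):
        self.pages.pop(0)
        return {"_scroll_id": "s1", "hits": {"hits": self.pages[0]}}

    def clear_scroll(self, **kwargs):
        self.cleared = True


fake = FakeES()
print(len(scroll_all_hits(fake, "vulns", {"query": {"match_all": {}}})))
print(fake.cleared)
```

Because cleanup sits in `finally`, the scroll is cleared even if a mid-scroll request raises.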
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py`:
- Around line 1-3: The imports in report.py mix absolute and relative styles;
change the absolute import of get_latest_license_preapproved_container_deps to a
relative import to match the existing relative imports (e.g., replace the
current from utils.es import get_latest_license_preapproved_container_deps with
a relative form like from .es import
get_latest_license_preapproved_container_deps) so all imports
(get_latest_license_preapproved_container_deps, is_non_permissive, load_json)
use the same module resolution style.
- Around line 35-41: The function diff_vulns currently only filters release
vulnerabilities by HIGH_SEVERITY and doesn't compute a diff; either rename it to
filter_high_severity_vulns and update all callers, or implement a true diff:
change diff_vulns signature to accept a base path (e.g.,
diff_vulns(release_path, base_path)), load both JSONs via load_json, dedup both
with dedup_vulns, then compute introduced_vulns as items present in
release_vulns but not in base_vulns (compare by unique keys/IDs) and finally
filter that result by HIGH_SEVERITY; keep references to load_json, dedup_vulns,
diff_vulns and HIGH_SEVERITY when updating callers and tests.
- Around line 8-41: Add Python type hints and necessary typing imports for the
three functions: annotate dedup_vulns(vulns: List[Dict[str, Any]]) ->
Dict[Tuple[str, str, str], Dict[str, Any]]; dedup_licenses(contents:
List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]; and diff_vulns(release_path:
str) -> List[Dict[str, Any]]. Import required symbols from typing (List, Dict,
Any, Tuple) at the top of the module and update the function signatures
(dedup_vulns, dedup_licenses, diff_vulns) accordingly; keep existing runtime
logic unchanged.
In `@jenkins/scripts/pulse_in_pipeline_scanning/utils/slack.py`:
- Around line 22-25: The local variable dependencyReport should be renamed to
snake_case (e.g., dependency_report) and all subsequent uses updated
accordingly; update the assignment (currently building the f-string with branch
and risk_detail), the payload construction (slack_payload = {"report":
dependency_report, "dashboardUrl": dashboard_link}), and any references before
calling requests.post(SLACK_WEBHOOK_URL, json=slack_payload, timeout=60) and
slack_resp.raise_for_status() so the new variable name is used consistently.
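The rename itself is mechanical; factoring the payload construction into a pure helper (a hypothetical refactor, not the file's current structure) also makes it testable without hitting the webhook:

```python
from typing import Dict


def build_slack_payload(branch: str, risk_detail: str, dashboard_link: str) -> Dict[str, str]:
    """Build the webhook payload using snake_case locals throughout."""
    dependency_report = (
        f"New risk found from nightly scanning ({branch} branch)\n{risk_detail}"
    )
    # "dashboardUrl" stays camelCase: it is a wire-format key, not a local.
    return {"report": dependency_report, "dashboardUrl": dashboard_link}


payload = build_slack_payload("main", "CVE-2 in openssl", "https://dash.example.com")
print(payload["report"].splitlines()[0])
```

The caller would then pass this dict to `requests.post(..., json=payload, timeout=60)` as before.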
In `@jenkins/TensorRT_LLM_PLC.groovy`:
- Around line 355-361: The commented-out generateLockFiles call inside the
stage("Source Code OSS Scanning") should not be left commented out without
context; either remove the commented line entirely or re-enable it. Locate the
generateLockFiles(env.LLM_REPO, env.BRANCH_NAME) call in the same block that
invokes pulseScanSourceCode(env.LLM_REPO, env.BRANCH_NAME) and either delete the
commented line if it's obsolete, or uncomment it and add a short explanatory
comment (e.g., why it was disabled and when to re-enable) adjacent to the
generateLockFiles call so future readers understand its intent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0c974fd6-5145-45b8-ba95-dffb96824b8e
📒 Files selected for processing (10)
- jenkins/TensorRT_LLM_PLC.groovy
- jenkins/scripts/get_image_key_to_tag.sh
- jenkins/scripts/pulse_in_pipeline_scanning/main.py
- jenkins/scripts/pulse_in_pipeline_scanning/submit_preapproved.py
- jenkins/scripts/pulse_in_pipeline_scanning/submit_report.py
- jenkins/scripts/pulse_in_pipeline_scanning/utils/common.py
- jenkins/scripts/pulse_in_pipeline_scanning/utils/es.py
- jenkins/scripts/pulse_in_pipeline_scanning/utils/report.py
- jenkins/scripts/pulse_in_pipeline_scanning/utils/slack.py
- jenkins/scripts/submit_vulnerability_report.py
💤 Files with no reviewable changes (1)
- jenkins/scripts/submit_vulnerability_report.py
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
f7d662a to
204ab8f
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
@@ -0,0 +1,45 @@
#!/bin/bash
Any particular reason to make this a Bash script if all the other scripts are Python?
Summary by CodeRabbit
Release Notes
New Features
Refactor
Description
Add a stage that fetches the latest post-merge trtllm release container and runs in-pipeline scanning on those containers.
Some refactoring.
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.