-
Notifications
You must be signed in to change notification settings - Fork 105
RHOAIENG-12554, RHOAIENG-12195: add TrustyAI README.md
with TMaRCo detoxification details and required dependencies justification
#1572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…detoxification details and required dependencies justification
WalkthroughA new README file has been added to the Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes Possibly related PRs
Suggested labels
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
jupyter/trustyai/README.md (3)
1-1
: Add a proper Markdown heading.Starting the file with plain text lowers readability and breaks common README conventions. Consider prefixing with a level-1 heading:
-Supports the use of TrustyAI +# TrustyAI + +This directory provides notebook examples and assets that demonstrate how to use TrustyAI.
5-5
: Fix Markdown link syntax for Jira tickets.Square brackets inside parentheses render literally instead of as links. Use standard
[text](url)
form:-* TMaRCo detoxification from the notebooks ([https://issues.redhat.com/browse/RHOAIENG-12554], [https://issues.redhat.com/browse/RHOAIENG-12195]) +* TMaRCo detoxification from the notebooks + ([RHOAIENG-12554](https://issues.redhat.com/browse/RHOAIENG-12554), + [RHOAIENG-12195](https://issues.redhat.com/browse/RHOAIENG-12195))
6-8
: Tighten wording and split dependencies section.The current phrasing is repetitive (“had to be added to the dependencies”). Suggest clarifying and moving details into a dedicated “Dependencies” subsection:
- * This uses an "expert"/"anti-expert" architecture, which requires running these models locally. - This meant that these dependencies (Pytorch, transformers and datasets) had to be added to the dependencies. - * The use-case benefits from GPU acceleration, so the CUDA-enabled version of Pytorch was chosen. + * Implements an “expert / anti-expert” architecture that must run the models locally. + +### Dependencies +* `torch` (CUDA build), `transformers`, `datasets` – required for local model execution. + CUDA acceleration is recommended to speed up detoxification inference.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
jupyter/trustyai/README.md
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb:71-88
Timestamp: 2025-07-04T06:05:30.580Z
Learning: jiridanek requested GitHub issue creation for TrustyAI test notebook URL configurability and network error handling improvements during PR #1306 review. Issue #1323 was created with ⚠️ emoji in title for visibility, comprehensive problem description covering incorrect hardcoded URLs (pointing to Python 3.11 instead of 3.12), missing network error handling, maintenance burden, multiple solution options with code examples, phased acceptance criteria, implementation guidance, testing approach, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml:5-15
Timestamp: 2025-07-02T18:59:15.788Z
Learning: jiridanek creates targeted GitHub issues for specific test quality improvements identified during PR reviews in opendatahub-io/notebooks. Issue #1268 demonstrates this by converting a review comment about insufficient tf2onnx conversion test validation into a comprehensive improvement plan with clear acceptance criteria, code examples, and ROCm-specific context.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb:102-102
Timestamp: 2025-07-03T16:26:18.718Z
Learning: jiridanek requested GitHub issue creation for improving hardcoded threshold constants in TrustyAI test notebook during PR #1306 review. Issue #1317 was created with comprehensive problem description covering both bias detection threshold (-0.15670061634672994) and fairness threshold (0.0036255104824703954), multiple solution options with code examples, clear acceptance criteria, implementation guidance, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/kustomize/base/kustomization.yaml:8-12
Timestamp: 2025-07-08T19:09:48.746Z
Learning: jiridanek requested GitHub issue creation for misleading CUDA prefix in TrustyAI image tags during PR #1306 review, affecting both Python 3.11 and 3.12 versions. Issue #1338 was created with comprehensive problem description covering both affected images, repository pattern analysis comparing correct vs incorrect naming conventions, clear solution with code examples, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/kustomize/base/kustomization.yaml:8-12
Timestamp: 2025-07-08T19:09:48.746Z
Learning: jiridanek requested GitHub issue creation for misleading CUDA prefix in TrustyAI image tags during PR #1306 review. Issue was created with comprehensive problem description covering both Python 3.11 and 3.12 versions, repository pattern analysis showing correct vs incorrect naming, clear solution with code examples, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb:71-76
Timestamp: 2025-07-04T06:04:43.085Z
Learning: jiridanek requested GitHub issue creation for duplicate CSV loading and validation problem in jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb during PR #1306 review. Issue #1322 was created with comprehensive problem description covering code redundancy, runtime failure risks, network inefficiency, and test reliability concerns, along with detailed solution including duplicate line removal, data validation implementation, repository-wide audit, acceptance criteria, implementation guidance, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-11T11:16:05.131Z
Learning: jiridanek requested GitHub issue creation for RStudio py311 Tekton push pipelines during PR #1379 review. Issue #1384 was successfully created covering two RStudio variants (CPU and CUDA) found in manifests/base/params-latest.env, with comprehensive problem description, implementation requirements following the same pattern as other workbench pipelines, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-11T11:16:05.131Z
Learning: jiridanek requested GitHub issue creation for adding RStudio py311 Tekton push pipelines during PR #1379 review, referencing existing registry entries in manifests/base/params-latest.env but missing corresponding .tekton pipeline files. A comprehensive issue was created with detailed problem description, implementation requirements following the same pattern as other workbench pipelines, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb:22-29
Timestamp: 2025-07-02T18:27:51.097Z
Learning: jiridanek consistently creates comprehensive follow-up GitHub issues from PR review comments in opendatahub-io/notebooks, turning specific code quality concerns into systematic improvements tracked with proper context, acceptance criteria, and cross-references. Issue #1266 demonstrates this pattern by expanding a specific error handling concern in load_expected_versions() into a repository-wide improvement initiative.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1320
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:65-66
Timestamp: 2025-07-09T12:31:02.033Z
Learning: jiridanek requested GitHub issue creation for MSSQL repo file hardcoding problem during PR #1320 review. Issue #1363 was created and updated with comprehensive problem description covering hardcoded x86_64 MSSQL repo files breaking multi-architecture builds across 10 affected Dockerfiles (including datascience, CUDA, ROCm, and TrustyAI variants), detailed root cause analysis, three solution options with code examples, clear acceptance criteria for all image types, implementation guidance following established multi-architecture patterns, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: TrustyAI 0.6.1 (latest version as of June 2025) has a hard dependency constraint on jupyter-bokeh~=3.0.5, preventing upgrades to jupyter-bokeh 4.x in notebook images that include TrustyAI. This requires either waiting for TrustyAI to update their dependency or excluding TrustyAI from jupyter-bokeh upgrades.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: TrustyAI explicitly declares jupyter-bokeh~=3.0.5 as a hard dependency in both their requirements.txt and pyproject.toml files, with no open issues tracking jupyter-bokeh 4.x compatibility. This creates an unresolvable pip dependency conflict when trying to upgrade jupyter-bokeh to 4.x in notebook images that include TrustyAI.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1127
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:20-20
Timestamp: 2025-06-27T07:49:01.198Z
Learning: Transformers v4.50.0 contains only non-breaking changes including documentation redesign, repository maintenance, performance enhancements, and bug fixes, with no API changes that would affect TrustyAI integration.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: The jupyter-bokeh package was previously pinned to version 3.0.5 in the TrustyAI notebook image due to compatibility requirements with TrustyAI components, as indicated by the comment "Should be pinned down to this version in order to be compatible with trustyai" that was removed in this update.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: The jupyter-bokeh pinning to 3.0.5 in TrustyAI notebook image was not due to TrustyAI code compatibility issues, but because the trustyai package itself explicitly declares jupyter-bokeh~=3.0.5 as a hard dependency, causing pip dependency resolution conflicts when trying to upgrade to jupyter-bokeh 4.x.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: TrustyAI's jupyter-bokeh was pinned to 3.0.5 due to compatibility requirements with TrustyAI's visualization components, but the actual deployed version in requirements.txt shows 3.0.7, indicating incremental testing. The upgrade to 4.0.5 in this PR represents the completion of a gradual migration strategy from the 3.x series after confirming compatibility with Bokeh 3.7.3.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1418
File: tests/browser/README.md:14-18
Timestamp: 2025-07-21T12:58:43.996Z
Learning: In the opendatahub-io/notebooks repository, upstream project testing framework references in README files serve as informational context about the broader ecosystem, not as recommendations for the current project. These references help contributors understand what other related projects use and should be retained even when the current project doesn't use those frameworks.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1269
File: codeserver/ubi9-python-3.12/nginx/root/usr/share/container-scripts/nginx/common.sh:1-3
Timestamp: 2025-07-03T12:07:19.365Z
Learning: jiridanek consistently requests GitHub issue creation for technical improvements identified during code reviews in opendatahub-io/notebooks, ensuring systematic tracking of code quality enhancements like shell script portability issues with comprehensive descriptions, solution options, and acceptance criteria.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1269
File: codeserver/ubi9-python-3.12/utils/process.sh:17-19
Timestamp: 2025-07-03T14:00:00.909Z
Learning: jiridanek efficiently identifies when CodeRabbit review suggestions are already covered by existing comprehensive issues, demonstrating excellent issue management and avoiding duplicate tracking of the same improvements across multiple locations.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb:31-34
Timestamp: 2025-07-01T07:03:05.385Z
Learning: jiridanek demonstrates excellent pattern recognition for identifying duplicated code issues across the opendatahub-io/notebooks repository. When spotting a potential problem in test notebooks, he correctly assesses that such patterns are likely replicated across multiple similar files rather than being isolated incidents, leading to more effective systematic solutions.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm:56-66
Timestamp: 2025-07-02T18:19:49.397Z
Learning: jiridanek consistently creates comprehensive follow-up GitHub issues for security concerns raised during PR reviews in opendatahub-io/notebooks, ensuring systematic tracking and resolution of supply-chain security improvements like GPG signature verification for package repositories.
jupyter/trustyai/README.md (15)
Learnt from: jiridanek
PR: #1306
File: jupyter/trustyai/ubi9-python-3.12/kustomize/base/kustomization.yaml:8-12
Timestamp: 2025-07-08T19:09:48.746Z
Learning: jiridanek requested GitHub issue creation for misleading CUDA prefix in TrustyAI image tags during PR #1306 review. Issue was created with comprehensive problem description covering both Python 3.11 and 3.12 versions, repository pattern analysis showing correct vs incorrect naming, clear solution with code examples, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: #1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: TrustyAI 0.6.1 (latest version as of June 2025) has a hard dependency constraint on jupyter-bokeh~=3.0.5, preventing upgrades to jupyter-bokeh 4.x in notebook images that include TrustyAI. This requires either waiting for TrustyAI to update their dependency or excluding TrustyAI from jupyter-bokeh upgrades.
Learnt from: jiridanek
PR: #1306
File: jupyter/trustyai/ubi9-python-3.12/kustomize/base/kustomization.yaml:8-12
Timestamp: 2025-07-08T19:09:48.746Z
Learning: jiridanek requested GitHub issue creation for misleading CUDA prefix in TrustyAI image tags during PR #1306 review, affecting both Python 3.11 and 3.12 versions. Issue #1338 was created with comprehensive problem description covering both affected images, repository pattern analysis comparing correct vs incorrect naming conventions, clear solution with code examples, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: #1127
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:20-20
Timestamp: 2025-06-27T07:49:01.198Z
Learning: jiridanek reviewed the transformers v4.50.0 release notes and assessed that the changes are minimal and unlikely to cause TrustyAI integration problems, indicating the actual changelog contained mostly bug fixes and minor additions rather than breaking changes.
Learnt from: jiridanek
PR: #1306
File: jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb:71-88
Timestamp: 2025-07-04T06:05:30.580Z
Learning: jiridanek requested GitHub issue creation for TrustyAI test notebook URL configurability and network error handling improvements during PR #1306 review. Issue #1323 was created with
Learnt from: jiridanek
PR: #1127
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:20-20
Timestamp: 2025-06-27T07:49:01.198Z
Learning: Transformers v4.50.0 contains only non-breaking changes including documentation redesign, repository maintenance, performance enhancements, and bug fixes, with no API changes that would affect TrustyAI integration.
Learnt from: jiridanek
PR: #1306
File: jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb:102-102
Timestamp: 2025-07-03T16:26:18.718Z
Learning: jiridanek requested GitHub issue creation for improving hardcoded threshold constants in TrustyAI test notebook during PR #1306 review. Issue #1317 was created with comprehensive problem description covering both bias detection threshold (-0.15670061634672994) and fairness threshold (0.0036255104824703954), multiple solution options with code examples, clear acceptance criteria, implementation guidance, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: #1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: TrustyAI explicitly declares jupyter-bokeh~=3.0.5 as a hard dependency in both their requirements.txt and pyproject.toml files, with no open issues tracking jupyter-bokeh 4.x compatibility. This creates an unresolvable pip dependency conflict when trying to upgrade jupyter-bokeh to 4.x in notebook images that include TrustyAI.
Learnt from: jiridanek
PR: #1306
File: jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb:71-76
Timestamp: 2025-07-04T06:04:43.085Z
Learning: jiridanek requested GitHub issue creation for duplicate CSV loading and validation problem in jupyter/trustyai/ubi9-python-3.12/test/test_notebook.ipynb during PR #1306 review. Issue #1322 was created with comprehensive problem description covering code redundancy, runtime failure risks, network inefficiency, and test reliability concerns, along with detailed solution including duplicate line removal, data validation implementation, repository-wide audit, acceptance criteria, implementation guidance, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: #1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: The jupyter-bokeh package was previously pinned to version 3.0.5 in the TrustyAI notebook image due to compatibility requirements with TrustyAI components, as indicated by the comment "Should be pinned down to this version in order to be compatible with trustyai" that was removed in this update.
Learnt from: grdryn
PR: #1320
File: rstudio/rhel9-python-3.11/Dockerfile.cuda:34-35
Timestamp: 2025-07-04T10:41:13.061Z
Learning: In the opendatahub-io/notebooks repository, when adapting NVIDIA CUDA Dockerfiles, the project intentionally maintains consistency with upstream NVIDIA patterns even when it might involve potential risks like empty variable expansions in package installation commands. This is considered acceptable because the containers only run on RHEL 9 with known yum/dnf behavior, and upstream consistency is prioritized over defensive coding practices.
Learnt from: jiridanek
PR: #1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: TrustyAI's jupyter-bokeh was pinned to 3.0.5 due to compatibility requirements with TrustyAI's visualization components, but the actual deployed version in requirements.txt shows 3.0.7, indicating incremental testing. The upgrade to 4.0.5 in this PR represents the completion of a gradual migration strategy from the 3.x series after confirming compatibility with Bokeh 3.7.3.
Learnt from: jiridanek
PR: #1333
File: runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm:50-50
Timestamp: 2025-07-08T19:30:01.738Z
Learning: jiridanek requested GitHub issue creation for multi-architecture support in ROCm TensorFlow image during PR #1333 review. Issue #1346 was created with comprehensive problem description covering hardcoded x86_64 architecture breaking multi-arch support, detailed impact analysis, three solution options (runtime detection, BuildKit TARGETARCH integration, hybrid approach) with pros/cons analysis, comprehensive acceptance criteria covering core requirements and testing, phased implementation guidance, related files identification, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: #1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml:5-15
Timestamp: 2025-07-02T18:59:15.788Z
Learning: jiridanek creates targeted GitHub issues for specific test quality improvements identified during PR reviews in opendatahub-io/notebooks. Issue #1268 demonstrates this by converting a review comment about insufficient tf2onnx conversion test validation into a comprehensive improvement plan with clear acceptance criteria, code examples, and ROCm-specific context.
Learnt from: jiridanek
PR: #1333
File: runtimes/tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-08T19:29:32.006Z
Learning: jiridanek requested GitHub issue creation for investigating TensorFlow "and-cuda" extras usage patterns during PR #1333 review. Issue #1340 was created with comprehensive investigation framework covering platform-specific analysis, deployment scenarios, TensorFlow version compatibility, clear acceptance criteria, testing approach, and implementation timeline, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Thanks @jiridanek for adding details for TrustyAI packages, really handy. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: atheo89 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/override ci/prow/images |
@atheo89: Overrode contexts on behalf of atheo89: ci/prow/images In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Description
captures details from slack thread https://redhat-internal.slack.com/archives/C05TTTYG599/p1753457179425529?thread_ts=1753283266.857419&cid=C05TTTYG599, as explained by @ruivieira
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit