Adding link checker script + GitHub CI #886

Open
wants to merge 1 commit into
base: main

@aireilly aireilly commented Jul 30, 2025

Description

Adding link checker script + GitHub CI yml

Example CI failure: https://github.com/aireilly/opendatahub-documentation/actions/runs/16655212911/job/47138351717

Summary by CodeRabbit

  • New Features
    • Introduced automated link checking for AsciiDoc files, ensuring URLs in modified documentation are validated for correctness.
    • Added scripts to identify and check only those documentation files that have been changed, improving efficiency.
    • Implemented an ignore list to exclude specific domains or URLs from link validation.
  • Chores
    • Integrated link-checking workflow into automated CI for improved documentation quality.

coderabbitai bot commented Jul 30, 2025

Warning

Rate limit exceeded

@aireilly has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 51 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between e612b25 and 007019d.

📒 Files selected for processing (4)
  • .github/workflows/check-links.yml (1 hunks)
  • scripts/check-links.sh (1 hunks)
  • scripts/check-modified.sh (1 hunks)
  • scripts/links.ignore (1 hunks)

Walkthrough

This change introduces automated link-checking for AsciiDoc documentation files. It adds a GitHub Actions workflow and supporting Bash scripts to check for broken links in modified .adoc files during push and pull request events. Supporting files include ignore patterns for certain URLs and minor documentation updates to trigger link checks.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| GitHub Actions Workflow: .github/workflows/check-links.yml | Adds a workflow to run link checks on modified AsciiDoc files using custom scripts on push and pull request events. |
| Link Checking Scripts: scripts/check-links.sh, scripts/check-modified.sh | Introduces Bash scripts for extracting and checking links in AsciiDoc files, and for determining which files to check based on recent changes. |
| Ignore Patterns: scripts/links.ignore | Adds a file containing regular expressions to specify URLs that should be ignored during link checking. |
| Documentation Update (Trigger): assemblies/upgrading-odh-v2.adoc, customizing-models-with-lab-tuning.adoc | Adds a test URL and a trailing newline to documentation files, likely to test the link-checking workflow. |

Sequence Diagram(s)

sequenceDiagram
    participant GitHub
    participant Workflow
    participant check-modified.sh
    participant check-links.sh

    GitHub->>Workflow: Push or Pull Request event
    Workflow->>check-modified.sh: Run to detect modified .adoc files
    check-modified.sh->>check-links.sh: For each updated book, run link check
    check-links.sh->>External URLs: Validate each URL (parallel curl)
    check-links.sh-->>check-modified.sh: Report broken links (if any)
    check-modified.sh-->>Workflow: Exit with status (success/failure)
    Workflow-->>GitHub: Post check result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (6)
scripts/check-links.sh (4)

15-21: Argument parsing blocks legitimate use-cases

The script only accepts either the --global flag or exactly one file argument.
It becomes impossible to check several specific files at once, or to pass additional flags (for example a future --verbose). This limitation will quickly surface in CI when multiple .adoc files change in the same commit outside assemblies/.

-elif [ "$#" -eq 1 ]; then
-    INPUT_FILE="$1"
+elif [ "$#" -ge 1 ]; then
+    INPUT_FILE=("$@")   # allow an arbitrary list
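
A minimal sketch of the relaxed argument handling, assuming the script only needs to distinguish `--global` from a list of files. The function name `handle_args` and its printed output are illustrative and not taken from the PR; the real script would call its link checker where this sketch prints a line.

```bash
# Hypothetical front end: accepts --global or any number of file arguments.
# It only prints what it would check, one line per file.
handle_args() {
    if [ "$#" -eq 0 ]; then
        echo "usage: check-links.sh --global | FILE..." >&2
        return 2
    fi
    if [ "$1" = "--global" ]; then
        echo "mode=global"
        return 0
    fi
    # Quoting "$@" keeps file names with spaces intact.
    for file in "$@"; do
        echo "file=$file"
    done
}
```

Called as `handle_args a.adoc b.adoc`, the loop visits both files, which is the case the single-argument check rejects.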

28-37: Ignore list is hard-wired & will drift

All patterns are baked into the script; updating them requires committing code instead of tweaking CI variables.
Consider letting callers extend / override the list via an environment variable, e.g. LINKCHECK_IGNORE_REGEX, and splitting on |. That keeps the script generic and reusable across repos.
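
One way this could look, as a sketch only: the variable name `LINKCHECK_IGNORE_REGEX` comes from the comment above, while `DEFAULT_IGNORE`, its contents, and the `is_ignored` helper are assumptions made for illustration. Because `|` is already the alternation operator in extended regular expressions, the sketch appends the caller's value to the built-in alternation instead of splitting it.

```bash
# Assumed built-in patterns; the real list lives in the script.
DEFAULT_IGNORE='localhost|127\.0\.0\.1'

# Succeeds (exit 0) when the URL matches a built-in or caller-supplied pattern.
is_ignored() {
    patterns="$DEFAULT_IGNORE"
    if [ -n "${LINKCHECK_IGNORE_REGEX:-}" ]; then
        patterns="$patterns|$LINKCHECK_IGNORE_REGEX"
    fi
    printf '%s\n' "$1" | grep -Eq "$patterns"
}
```

A CI job could then export `LINKCHECK_IGNORE_REGEX='example\.com|internal\.test'` without touching the script.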


45-47: Only the last trailing punctuation character is stripped

${URL%[.,;:?!\]\)]} removes one character.
Strings such as https://foo.com). will be truncated to https://foo.com) and still fail.

-URL=${URL%[.,;:?!\]\)]}
+while [[ "$URL" =~ [.,;:?!\]\)]$ ]]; do
+  URL=${URL%?}
+done
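
The loop above can also be written with a `case` pattern, which avoids escaping inside the bracket expression. This is a sketch under the assumption that the set of characters to strip is the one from the original line; `strip_trailing_punct` is a hypothetical helper name.

```bash
# Remove every trailing punctuation character, not just the last one.
strip_trailing_punct() {
    url=$1
    # "]" is listed first so that it is taken literally inside the brackets.
    punct='[].,;:?!)]'
    while :; do
        case $url in
            *$punct) url=${url%?} ;;   # drop one trailing character and retry
            *) break ;;
        esac
    done
    printf '%s\n' "$url"
}
```

With this helper, `https://foo.com).` is reduced step by step to `https://foo.com`.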

83-86: Temporary file never cleaned up

mktemp creates a real file that stays in /tmp.
Trap EXIT to delete it:

+trap 'rm -f "$TMP_FILE"' EXIT
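
A small sketch of the cleanup pattern, wrapped in a subshell function so that the effect of the trap can be observed from outside. The function name `run_with_cleanup` and the variable `tmp_file` are illustrative only.

```bash
# The function body runs in a subshell; its EXIT trap removes the
# temporary file when that subshell finishes.
run_with_cleanup() (
    tmp_file=$(mktemp)
    trap 'rm -f "$tmp_file"' EXIT
    echo "0" > "$tmp_file"     # stand-in for the script's status flag
    printf '%s\n' "$tmp_file"  # report the path so a caller can inspect it
)
```

After `path=$(run_with_cleanup)`, the file at `$path` should no longer exist.
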
.github/workflows/check-links.yml (2)

8-19: Global job lacks cache & matrix – runtime will grow

Running the checker over all assemblies on every PR can take several minutes as the doc set grows.
Consider:

  • Caching Asciidoctor’s gem downloads (actions/cache)
  • Using a matrix to shard the file list and run checks in parallel

40-41: Pattern excludes moved or renamed files

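The regex ^assemblies/[^/]+\.adoc$ matches only files directly inside assemblies/, so files moved to sub-directories or renamed within nested paths are skipped.
If that was unintentional, replace [^/]+ with .* (for example ^assemblies/.*\.adoc$) so that nested paths also match.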
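
The difference between the two patterns can be checked with a short sketch. It uses `grep -E` on a list of paths instead of the workflow's `[[ =~ ]]` test so that it can read from a pipe; `filter_assemblies` and the sample paths are hypothetical.

```bash
# Keep .adoc files anywhere under assemblies/, including nested directories.
filter_assemblies() {
    grep -E '^assemblies/.*\.adoc$'
}

# The stricter pattern from the workflow, kept for comparison.
filter_top_level_only() {
    grep -E '^assemblies/[^/]+\.adoc$'
}
```

Feeding both filters the paths `assemblies/a.adoc` and `assemblies/ai/b.adoc` shows that only the first filter keeps the nested file.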

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a915636 and 0766a32.

📒 Files selected for processing (2)
  • .github/workflows/check-links.yml (1 hunks)
  • scripts/check-links.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#860
File: modules/using-llm-as-a-judge-metrics-with-lmeval.adoc:62-65
Timestamp: 2025-07-16T09:05:50.422Z
Learning: In the opendatahub-documentation project, bredamc prefers comprehensive error detection in code reviews. Continue to point out potential errors whether they are related to the code changes in the PR or not, but clearly distinguish between errors introduced in the PR versus errors in existing code.
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#842
File: modules/enabling-the-multi-model-serving-platform.adoc:11-17
Timestamp: 2025-07-07T19:36:45.326Z
Learning: In the opendatahub-documentation project, AsciiDoc anchors like `ref-dashboard-configuration-options_dashboard` are automatically generated when module files are included in assembly files, rather than being manually defined with [[anchor]] syntax. Links to these anchors will not 404 because the anchor is created through the inclusion mechanism.
scripts/check-links.sh (1)

Learnt from: bredamc
PR: #842
File: modules/enabling-the-multi-model-serving-platform.adoc:11-17
Timestamp: 2025-07-07T19:36:45.326Z
Learning: In the opendatahub-documentation project, AsciiDoc anchors like ref-dashboard-configuration-options_dashboard are automatically generated when module files are included in assembly files, rather than being manually defined with [[anchor]] syntax. Links to these anchors will not 404 because the anchor is created through the inclusion mechanism.

🔇 Additional comments (2)
scripts/check-links.sh (1)

70-71: Regex misses many valid URLs

[a-zA-Z0-9./?=%_-]* excludes &, #, :, ~, etc.
Consider using a more permissive pattern or letting pandoc/asciidoctor emit links in XML/JSON and parse that.
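
As a rough illustration of a more permissive pattern, the sketch below matches everything after the scheme up to the next whitespace, quote, or angle bracket. The helper name `extract_urls` is an assumption, and the pattern is one possible choice, not the one used in the PR. One known limitation: in Asciidoctor's HTML output `&` appears as `&amp;`, which this sketch does not decode.

```bash
# Read HTML on stdin and print each distinct http(s) URL once.
extract_urls() {
    grep -Eo "https?://[^[:space:]\"'<>]+" | sort -u
}
```

For an input such as `<a href="https://example.com/a?x=1&y=2#frag">text</a>`, the query string and fragment are kept, whereas the original character class stops at `&`.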

.github/workflows/check-links.yml (1)

36-45: origin/${{ github.base_ref }} may not exist in shallow clones

Even with fetch-depth: 0, the remote ref is named refs/remotes/origin/<branch>.
Inside containers, Git sometimes flattens remotes ⇒ the diff can be empty and the job silently passes.

A more robust variant:

BASE=${{ github.event.pull_request.base.sha }}
git diff --name-only $BASE...HEAD

Comment on lines 62 to 70
# Export variables and functions for xargs subshells
export TMP_FILE
export -f check_url

run_url_checks() {
    local FILE="$1"
    echo -e "\033[32mChecking: $FILE\033[0m"
    asciidoctor "$FILE" -o - | \
        grep -Eo '(http|https)://[a-zA-Z0-9./?=%_-]*' | \
        sort -u | \
        xargs -P 10 -n 1 bash -c "$PATTERNS_DECL; check_url \"\$0\""
}

🛠️ Refactor suggestion

Race-condition & brittle $0 usage inside xargs

  1. Multiple parallel subshells write to the same TMP_FILE, which can interleave.
    Writing 1 once is fine, but concurrent writes could still truncate the file at the wrong moment.
  2. Using $0 to pass the URL relies on Bash’s special handling of arg0 and is hard to read.
    xargs -I{} is clearer.
-        xargs -P 10 -n 1 bash -c "$PATTERNS_DECL; check_url \"\$0\""
+        xargs -P 10 -n 1 -I{} bash -c "$PATTERNS_DECL; check_url \"{}\""

You can also switch to mktemp -u + touch and replace echo >| with printf >> to eliminate the truncation race.
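
A third option, sketched here as an assumption and not part of the review, is to pass each URL as a positional parameter: a literal `_` fills `$0` and the URL arrives as `$1`. Unlike splicing `{}` into the command string, this keeps characters such as `$` or quotes in a URL from being re-interpreted by the child shell. The `printf` body is a stand-in; in the real script it would be the `$PATTERNS_DECL; check_url "$1"` call.

```bash
# Read URLs on stdin (one per line) and hand each to a child shell as "$1".
check_all() {
    xargs -P 4 -n 1 sh -c 'printf "checked: %s\n" "$1"' _
}
```

Because the URL is never part of the command text, an input like `https://b.test/$HOME` is printed unchanged.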

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 # Export variables and functions for xargs subshells
 export TMP_FILE
 export -f check_url

 run_url_checks() {
     local FILE="$1"
     echo -e "\033[32mChecking: $FILE\033[0m"
     asciidoctor "$FILE" -o - | \
         grep -Eo '(http|https)://[a-zA-Z0-9./?=%_-]*' | \
         sort -u | \
-        xargs -P 10 -n 1 bash -c "$PATTERNS_DECL; check_url \"\$0\""
+        xargs -P 10 -n 1 -I{} bash -c "$PATTERNS_DECL; check_url \"{}\""
 }
🤖 Prompt for AI Agents
In scripts/check-links.sh around lines 62 to 73, fix the race condition caused
by multiple parallel subshells writing to the same TMP_FILE by switching to a
unique temporary file per process using mktemp -u combined with touch, and
replace echo >| with printf >> to avoid truncation issues. Also, replace the use
of $0 in the xargs bash -c call with xargs -I{} to pass the URL argument more
clearly and reliably.

Comment on lines 75 to 79
if [ "$GLOBAL_MODE" -eq 1 ]; then
    while IFS= read -r file; do
        run_url_checks "$file"
    done < <(find assemblies -maxdepth 1 -type f -name "*.adoc")
else

⚠️ Potential issue

find … -maxdepth 1 ignores sub-assemblies

assemblies already contains nested directories (e.g. ai/), so deeper .adoc files are skipped in global mode.
Dropping -maxdepth 1 fixes it:

-    done < <(find assemblies -maxdepth 1 -type f -name "*.adoc")
+    done < <(find assemblies -type f -name "*.adoc")
🤖 Prompt for AI Agents
In scripts/check-links.sh around lines 75 to 79, the find command uses -maxdepth
1 which limits the search to only the top-level files in the assemblies
directory, ignoring nested subdirectories and their .adoc files. To fix this,
remove the -maxdepth 1 option from the find command so it recursively finds all
.adoc files in assemblies and its subdirectories when GLOBAL_MODE is 1.

@aireilly aireilly force-pushed the add-link-checker branch 16 times, most recently from abb49f9 to 5ee707f on July 31, 2025 14:18

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
scripts/check-links.sh (2)

70-72: Past feedback still unaddressed – $0 in xargs & race

Prior review pointed out that relying on $0 inside bash -c is brittle and that xargs can pass the argument explicitly. The issue remains.

-        xargs -P 10 -n 1 bash -c "$PATTERNS_DECL; check_url \"\$0\""
+        xargs -P 10 -n 1 -I{} bash -c "$PATTERNS_DECL; check_url \"{}\""

This also sidesteps $0 quirks and makes the call self-documenting.


25-29: Single shared temp file + echo > is racy under parallel curl

Parallel subshells write to the same TMP_FILE; simultaneous truncation (> redirection) can clobber the flag and hide failures.

-echo "0" > "$TMP_FILE"
+printf '0' >"$TMP_FILE"

And inside check_url (Line 58)

-        echo "1" > "$TMP_FILE"
+        printf '1' >>"$TMP_FILE"

For complete safety allocate one file per process (mktemp --tmpdir check_links.XXXXXX) and test existence rather than contents.
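
A sketch of the one-file-per-failure idea, under stated assumptions: `FAIL_DIR`, `check_all`, and `has_failures` are hypothetical names, and the "check" is a stand-in that treats any URL containing `broken` as failed. Each worker creates its own marker file, so no two processes ever write to the same file, and the parent only tests whether any marker exists.

```bash
# Private directory that collects one marker file per failed URL.
FAIL_DIR=$(mktemp -d)
export FAIL_DIR

# Read URLs on stdin and check them in parallel.
check_all() {
    xargs -P 4 -n 1 sh -c '
        case $1 in
            *broken*) mktemp "$FAIL_DIR/fail.XXXXXX" >/dev/null ;;
        esac
    ' _
}

# Succeeds when at least one marker file exists.
has_failures() {
    [ -n "$(ls -A "$FAIL_DIR")" ]
}
```

The calling script could end with `has_failures && exit 1` and remove `FAIL_DIR` in an `EXIT` trap.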

🧹 Nitpick comments (2)
.github/workflows/check-links.yml (1)

13-17: Apt installs on every run – consider pre-built Asciidoctor action

apt-get update && apt-get install adds ~30 s to every CI run. Switching to the maintained asciidoctor/docker-action or a pre-built container cuts that to near-zero and avoids future package-availability failures.

scripts/check-links.sh (1)

3-4: Header comment is now misleading

The script no longer checks only for 404 links: it reports every status outside 2xx/3xx as broken (except 000 and 403). Please update the comment to avoid confusion.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a21d915 and 5ee707f.

📒 Files selected for processing (3)
  • .github/workflows/check-links.yml (1 hunks)
  • scripts/check-links.sh (1 hunks)
  • scripts/links.ignore (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • scripts/links.ignore
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: in asciidoc assembly files, the conditional directive syntax `ifdef::parent-context[:context: {paren...
Learnt from: jiridanek
PR: opendatahub-io/opendatahub-documentation#801
File: assemblies/managing-basic-workbenches.adoc:30-31
Timestamp: 2025-06-17T10:51:51.206Z
Learning: In AsciiDoc assembly files, the conditional directive syntax `ifdef::parent-context[:context: {parent-context}]` and `ifndef::parent-context[:!context:]` is valid and commonly used for managing document context when files are included in different assemblies. This pattern should not be flagged as malformed syntax.

Applied to files:

  • scripts/check-links.sh
🔇 Additional comments (1)
scripts/check-links.sh (1)

54-57: Why are 403 & 000 silently accepted?

Treating all non-2xx/3xx statuses as broken is safer; many sites respond with 403 when public access is blocked (broken link for readers). Please document the rationale or remove the exception.

-if [[ "$STATUS" != "000" && "$STATUS" != "403" && ! "$STATUS" =~ ^(2|3)[0-9]{2}$ ]]; then
+if [[ ! "$STATUS" =~ ^(2|3)[0-9]{2}$ ]]; then
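
The stricter rule can be expressed as a small predicate. This is a sketch; `is_broken_status` is an illustrative name, and the status is assumed to be the three-digit code that curl reports (curl prints `000` when it received no HTTP response at all).

```bash
# Succeeds (exit 0) when the status should count as a broken link.
is_broken_status() {
    case $1 in
        2[0-9][0-9]|3[0-9][0-9]) return 1 ;;  # success or redirect
        *) return 0 ;;                        # everything else, including 000 and 403
    esac
}
```

If 403 and 000 must stay exempt, adding them as an explicit `case` branch with a comment would document the rationale in the code itself.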

Comment on lines 27 to 32
MODIFIED_FILES=$(git diff --name-only origin/main...HEAD --diff-filter=d -- "*.adoc")
CHECK_FILES=()

while IFS= read -r file; do
    [[ "$file" =~ ^assemblies/[^/]+\.adoc$ ]] && CHECK_FILES+=("$file")
done <<< "$MODIFIED_FILES"

💡 Verification agent

❓ Verification inconclusive

Regex rejects nested assemblies – link-check silently skips many changed files

The path filter ^assemblies/[^/]+\.adoc$ matches only files that are directly inside assemblies/ (one level). All .adoc files already living in deeper sub-directories (assemblies/ai/…, assemblies/model_serving/…, etc.) will never be checked even when modified, defeating the purpose of this workflow.

-            [[ "$file" =~ ^assemblies/[^/]+\.adoc$ ]] && CHECK_FILES+=("$file")
+            [[ "$file" =~ ^assemblies/.*\.adoc$ ]] && CHECK_FILES+=("$file")

Run the workflow on a branch that touches assemblies/ai/example.adoc to confirm the fix.


Include nested assembly .adoc files in link-check workflow

The current regex only matches .adoc files directly under assemblies/, so any documents in subfolders (e.g. assemblies/ai/example.adoc) are never checked. Update the pattern to capture all .adoc files under assemblies/.

• File: .github/workflows/check-links.yml
Lines ~27–33 (regex filter)
• Change:

-            [[ "$file" =~ ^assemblies/[^/]+\.adoc$ ]] && CHECK_FILES+=("$file")
+            [[ "$file" =~ ^assemblies/.*\.adoc$ ]] && CHECK_FILES+=("$file")

Suggested change
 MODIFIED_FILES=$(git diff --name-only origin/main...HEAD --diff-filter=d -- "*.adoc")
 CHECK_FILES=()
 while IFS= read -r file; do
-    [[ "$file" =~ ^assemblies/[^/]+\.adoc$ ]] && CHECK_FILES+=("$file")
+    [[ "$file" =~ ^assemblies/.*\.adoc$ ]] && CHECK_FILES+=("$file")
 done <<< "$MODIFIED_FILES"
🤖 Prompt for AI Agents
In .github/workflows/check-links.yml around lines 27 to 33, the regex only
matches .adoc files directly under assemblies/ and misses nested files in
subfolders. Update the regex pattern to match all .adoc files under assemblies/,
including those in any subdirectories, by allowing for additional path segments
after assemblies/ before the filename.

# Strip trailing punctuation
URL=${URL%[.,;:?!\]\)]}

⚠️ Potential issue

Bad glob: script aborts with “bad pattern” on Bash 5.1

${URL%[.,;:?!\]\)]} closes the bracket set at the first ], making the back-slash sequence invalid. Bash exits with bad pattern the first time a URL is processed.

-    URL=${URL%[.,;:?!\]\)]}
+    # Trim a single trailing punctuation char if present
+    URL=${URL%%[[:punct:]]}

Suggested change
-    # Strip trailing punctuation
-    URL=${URL%[.,;:?!\]\)]}
+    # Trim a single trailing punctuation char if present
+    URL=${URL%%[[:punct:]]}
🤖 Prompt for AI Agents
In scripts/check-links.sh around lines 45 to 47, the pattern used to strip
trailing punctuation from the URL is invalid because the closing bracket is
interpreted too early, causing a "bad pattern" error in Bash 5.1. To fix this,
escape the closing square bracket inside the bracket expression properly or
reorder the characters so that the closing bracket is the first character
after the opening bracket (where it is taken literally), ensuring the glob
pattern is valid and does not cause the script to abort.

@aireilly aireilly force-pushed the add-link-checker branch 7 times, most recently from 77015ea to 95f7e2b on July 31, 2025 15:58

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
.github/workflows/check-links.yml (1)

20-24: Minor: avoid chmod when scripts are tracked as executable

check-links.sh and check-modified.sh already have the shebang and can be committed with the executable bit (git add --chmod=+x).
Dropping this step reduces noise.

scripts/check-modified.sh (1)

44-47: Deduplicate list to avoid redundant link checks

When a module appears in multiple books, the current loop checks each book repeatedly.

-# Check links in the compiled list of books
-for f in "${UPDATED_BOOKS[@]}"; do
-  ./scripts/check-links.sh "$f"
-done
+# Check links once per unique book
+for f in $(printf '%s\n' "${UPDATED_BOOKS[@]}" | sort -u); do
+  ./scripts/check-links.sh "$f"
+done
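
A variant of the deduplication, sketched under the assumption that book paths may contain spaces. `unique_books` is a hypothetical helper; it prints one path per line so that the caller can read the list line by line instead of relying on word splitting.

```bash
# Print each argument once, in sorted order, one per line.
unique_books() {
    printf '%s\n' "$@" | sort -u
}
```

The output can be consumed with `unique_books "${UPDATED_BOOKS[@]}" | while IFS= read -r f; do ./scripts/check-links.sh "$f"; done`, which, unlike `for f in $(...)`, does not split names that contain spaces.
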
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd6ea05 and 3cef4f1.

📒 Files selected for processing (6)
  • .github/workflows/check-links.yml (1 hunks)
  • assemblies/upgrading-odh-v2.adoc (1 hunks)
  • customizing-models-with-lab-tuning.adoc (1 hunks)
  • scripts/check-links.sh (1 hunks)
  • scripts/check-modified.sh (1 hunks)
  • scripts/links.ignore (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • customizing-models-with-lab-tuning.adoc
🚧 Files skipped from review as they are similar to previous changes (2)
  • scripts/links.ignore
  • scripts/check-links.sh
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: in asciidoc assembly files, the conditional directive syntax `ifdef::parent-context[:context: {paren...
Learnt from: jiridanek
PR: opendatahub-io/opendatahub-documentation#801
File: assemblies/managing-basic-workbenches.adoc:30-31
Timestamp: 2025-06-17T10:51:51.206Z
Learning: In AsciiDoc assembly files, the conditional directive syntax `ifdef::parent-context[:context: {parent-context}]` and `ifndef::parent-context[:!context:]` is valid and commonly used for managing document context when files are included in different assemblies. This pattern should not be flagged as malformed syntax.

Applied to files:

  • .github/workflows/check-links.yml
  • assemblies/upgrading-odh-v2.adoc
📚 Learning: in the opendatahub-documentation project, asciidoc anchors like `ref-dashboard-configuration-options...
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#842
File: modules/enabling-the-multi-model-serving-platform.adoc:11-17
Timestamp: 2025-07-07T19:36:45.326Z
Learning: In the opendatahub-documentation project, AsciiDoc anchors like `ref-dashboard-configuration-options_dashboard` are automatically generated when module files are included in assembly files, rather than being manually defined with [[anchor]] syntax. Links to these anchors will not 404 because the anchor is created through the inclusion mechanism.

Applied to files:

  • assemblies/upgrading-odh-v2.adoc
  • scripts/check-modified.sh
📚 Learning: in asciidoc files for opendatahub-documentation, when adding explanatory text after list items, avoi...
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#816
File: modules/deploying-models-on-the-single-model-serving-platform.adoc:0-0
Timestamp: 2025-06-16T11:11:20.681Z
Learning: In AsciiDoc files for opendatahub-documentation, when adding explanatory text after list items, avoid using block continuation markers ('+') if the text should appear in the same paragraph as the list item. The block continuation marker would move the text to a new paragraph within the same step, which may not be the desired rendering.

Applied to files:

  • assemblies/upgrading-odh-v2.adoc
📚 Learning: in the opendatahub-documentation project, avoid suggesting unnecessary admonition blocks (note, warn...
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#860
File: modules/using-llm-as-a-judge-metrics-with-lmeval.adoc:62-65
Timestamp: 2025-07-16T09:10:15.185Z
Learning: In the opendatahub-documentation project, avoid suggesting unnecessary admonition blocks (NOTE, WARNING, IMPORTANT, etc.) for explanatory text that can be effectively communicated inline. The documentation team prefers cleaner, less cluttered documentation with inline text unless the information is critical enough to warrant special formatting emphasis.

Applied to files:

  • assemblies/upgrading-odh-v2.adoc
📚 Learning: in asciidoc documentation for opendatahub-documentation, when adding explanatory text after a list i...
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#816
File: modules/deploying-models-on-the-single-model-serving-platform.adoc:0-0
Timestamp: 2025-06-16T11:11:27.675Z
Learning: In AsciiDoc documentation for opendatahub-documentation, when adding explanatory text after a list item, the preferred rendering is to keep the text in the same paragraph as the list item (no `+` continuation character) rather than creating a separate paragraph within the same step using the `+` block continuation character.

Applied to files:

  • assemblies/upgrading-odh-v2.adoc
📚 Learning: in the opendatahub-documentation repository, some lines using asciidoc comment syntax (`//`) are del...
Learnt from: bredamc
PR: opendatahub-io/opendatahub-documentation#816
File: modules/deploying-teacher-and-judge-models.adoc:0-0
Timestamp: 2025-06-16T11:06:25.386Z
Learning: In the opendatahub-documentation repository, some lines using AsciiDoc comment syntax (`//`) are deliberately commented out and should not be converted to visible NOTE blocks or other visible content. The author intentionally keeps certain guidance as non-rendered comments.

Applied to files:

  • assemblies/upgrading-odh-v2.adoc

Comment on lines +25 to +27
      - name: Fetch base branch
        run: git fetch origin main

🛠️ Refactor suggestion

Fetching origin/main breaks on PRs from forks – use base repo ref instead

For PRs raised from a fork, origin points to the fork, not the upstream repo, so git fetch origin main often fails or fetches an outdated branch.
Use the base-ref provided by GitHub Actions or add an explicit upstream remote.

-      - name: Fetch base branch
-        run: git fetch origin main
+      - name: Fetch base branch
+        if: ${{ github.event_name == 'pull_request' }}
+        run: |
+          git fetch \
+            "https://github.com/${{ github.repository_owner }}/opendatahub-documentation.git" \
+            "${{ github.event.pull_request.base.ref }}:refs/remotes/upstream/${{ github.event.pull_request.base.ref }}"

Suggested change
- name: Fetch base branch
run: git fetch origin main
- name: Fetch base branch
if: ${{ github.event_name == 'pull_request' }}
run: |
git fetch \
"https://github.com/${{ github.repository_owner }}/opendatahub-documentation.git" \
"${{ github.event.pull_request.base.ref }}:refs/remotes/upstream/${{ github.event.pull_request.base.ref }}"
🤖 Prompt for AI Agents
In .github/workflows/check-links.yml around lines 25 to 27, the command 'git
fetch origin main' fails on PRs from forks because 'origin' points to the fork
repository. To fix this, replace 'origin' with the upstream repository reference
or use the GitHub Actions base-ref variable to fetch the correct base branch
from the upstream repo. Alternatively, add an explicit remote for the upstream
repo and fetch from it instead of 'origin'.
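
The "explicit upstream" option described above could look roughly like the following shell sketch. The helper names `base_refspec` and `fetch_base_branch`, the `upstream` ref namespace, and the `main` fallback are assumptions made for illustration; they are not part of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names and the "upstream" ref namespace are hypothetical.

# Build a refspec that maps <branch> on the remote to refs/remotes/upstream/<branch>.
# Falls back to "main" when no branch name is given.
base_refspec() {
  local base_ref="${1:-main}"
  printf '+refs/heads/%s:refs/remotes/upstream/%s\n' "$base_ref" "$base_ref"
}

# Fetch the base branch from an explicit repository URL instead of "origin".
#   $1: URL of the upstream repository
#   $2: base branch name (optional, defaults to "main")
fetch_base_branch() {
  local upstream_url="$1"
  git fetch --no-tags "$upstream_url" "$(base_refspec "${2:-main}")"
}
```

A workflow step might call `fetch_base_branch "https://github.com/<owner>/<repo>.git" "$GITHUB_BASE_REF"`, after which the fetched branch is available locally as `upstream/<branch>`.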

Comment on lines 28 to 30

https://docs.redhat.com/derp

💡 Verification agent

❓ Verification inconclusive

Placeholder URL will fail link-check and confuses readers

https://docs.redhat.com/derp looks like a stub/typo.
It renders as a naked paragraph with no context and will inevitably 404, causing the very link-checker introduced in this PR to fail.

-ifndef::parent-context[:!context:]
-
-https://docs.redhat.com/derp
+ifndef::parent-context[:!context:]
+
+// TODO: replace with the correct RH docs link once available.
+// For now, comment it out to avoid CI failures.
+// https://docs.redhat.com/<correct-path>

Replace the placeholder docs URL to prevent CI link-check failures

The stub URL https://docs.redhat.com/derp will always 404 and break our new link-checker. Please remove or comment it out until the correct Red Hat documentation path is known.

Location: assemblies/upgrading-odh-v2.adoc, lines 28–30
Action: replace the naked link with a TODO comment (or remove it) to keep the build green

Suggested diff:

-ifndef::parent-context[:!context:]
-
-https://docs.redhat.com/derp
+ifndef::parent-context[:!context:]
+
+// TODO: insert the correct Red Hat docs URL here once available.
+// For now, comment out the stub to avoid CI failures.
+// https://docs.redhat.com/<correct-path>
🤖 Prompt for AI Agents
In assemblies/upgrading-odh-v2.adoc around lines 28 to 30, the placeholder URL
https://docs.redhat.com/derp causes CI link-check failures because it 404s.
Remove this URL or replace it with a TODO comment indicating the correct Red Hat
documentation path is pending, to prevent build breaks.

Comment on lines 1 to 6
#!/bin/bash
#
# Checks for 404 links in modified books

🛠️ Refactor suggestion

Harden script: fail fast & safer globbing

Add standard Bash safety switches (set -Eeuo pipefail) and IFS to avoid subtle bugs when filenames contain spaces.

-#!/bin/bash
+#!/usr/bin/env bash
+set -Eeuo pipefail
+IFS=$'\n\t'
🤖 Prompt for AI Agents
In scripts/check-modified.sh at lines 1 to 4, the script lacks safety measures
for robust execution. Add the Bash safety options by including `set -Eeuo
pipefail` near the top of the script to enable fail-fast behavior and better
error handling. Also, set `IFS` to a safe value like newline and tab to handle
filenames with spaces correctly and prevent globbing issues.
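
To illustrate what the suggested settings change, here is a minimal sketch. The `count_items` helper is hypothetical and exists only to show how `IFS=$'\n\t'` affects word splitting of an unquoted variable; the script in this PR does not contain it.

```shell
#!/usr/bin/env bash
# -E: ERR traps are inherited by functions; -e: exit on error;
# -u: unset variables are errors; pipefail: a pipeline fails if any stage fails.
set -Eeuo pipefail
# Split unquoted expansions on newlines and tabs only, not on spaces.
IFS=$'\n\t'

# Hypothetical helper: count the entries in a newline-separated list.
# "$list" is expanded unquoted on purpose to show the splitting behaviour.
# Note that IFS does not disable pathname expansion, so entries containing
# glob characters would still be expanded.
count_items() {
  local list="$1" n=0 item
  for item in $list; do
    n=$((n + 1))
  done
  printf '%s\n' "$n"
}
```

With these settings a name such as `my file.adoc` stays a single entry, whereas with the default `IFS` it would be split at the space.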

#
# Checks for 404 links in modified books

FILES=$(git diff --name-only origin/main...HEAD --diff-filter=d -- "*.adoc")

🛠️ Refactor suggestion

Diff against wrong remote on forked PRs

git diff origin/main...HEAD has the same fork issue as the workflow.
Derive the base branch dynamically:

-FILES=$(git diff --name-only origin/main...HEAD --diff-filter=d -- "*.adoc")
+BASE=${GITHUB_BASE_REF:-main}
+FILES=$(git diff --name-only "origin/${BASE}"...HEAD --diff-filter=d -- "*.adoc")
🤖 Prompt for AI Agents
In scripts/check-modified.sh at line 5, the git diff command uses a hardcoded
remote branch origin/main which causes issues on forked PRs. Modify the script
to dynamically determine the base branch or remote reference instead of using
origin/main directly. This can be done by deriving the base branch from the
current git context or environment variables to ensure the diff compares against
the correct upstream branch in forked PR scenarios.
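
A minimal sketch of deriving the diff base from the environment follows. `GITHUB_BASE_REF` is set by GitHub Actions for pull request events; the function names `diff_base` and `changed_adoc_files`, and the assumption that the base branch has already been fetched into `origin/<branch>`, are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical.

# Resolve the ref to diff against. Falls back to "main" when GITHUB_BASE_REF
# is unset or empty, for example when the script runs outside a pull request.
diff_base() {
  printf 'origin/%s\n' "${GITHUB_BASE_REF:-main}"
}

# List added or modified .adoc files relative to the merge base with the
# base branch. --diff-filter=d excludes deleted files.
changed_adoc_files() {
  git diff --name-only --diff-filter=d "$(diff_base)...HEAD" -- '*.adoc'
}
```

This removes only the hardcoded branch name. If `origin` is not the upstream repository, the remote part of the ref needs the same treatment as the fetch step in the workflow.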

2 participants