Fix/array of objects and cloud docs check #166

Merged
JakeSCahill merged 13 commits into main from fix/array-of-objects-and-cloud-docs-check
Jan 20, 2026

@JakeSCahill JakeSCahill commented Jan 19, 2026

This pull request centralizes and standardizes the use of the GitHub Octokit client across the codebase by introducing a shared singleton instance, and improves connector documentation checks for Redpanda Cloud. The main themes are infrastructure simplification for GitHub API access and enhancements to connector documentation validation logic.

Centralized GitHub Octokit client:

  • Added a new octokit-client.js module in cli-utils to provide a shared, singleton Octokit client with consistent authentication and retry logic, reducing redundant initialization and improving rate limit tracking.
  • Refactored all scripts and tools (generate-rp-connect-info.js, fetch-from-github.js, get-console-version.js, get-redpanda-version.js, connector-binary-analyzer.js, generate-cloud-regions.js) to use the new shared Octokit client instead of creating their own instances.

Connector documentation validation improvements:

  • Enhanced the Redpanda Connect documentation handler to:
    • Track cloudOnly and requiresCgo flags for connectors.
    • Build a set of cloud-supported connectors using binary analysis or fallback logic.
    • Check for missing connector documentation in both the local repo and the cloud-docs repository using the shared Octokit client, with error handling for API failures and a fallback to raw HTTP requests.

Connector YAML rendering improvements:

  • Updated YAML rendering logic to properly handle array-of-object fields (e.g., client_certs[]) so they are rendered as empty arrays instead of expanded object structures in both buildConfigYaml.js and renderObjectField.js.

Dependency and version updates:

  • Bumped the package version to 4.13.5 and added dotenv as a dependency in package.json.

Array-of-objects fields like client_certs[] were incorrectly rendered as
flat objects with all child properties expanded, creating invalid
configurations that mixed mutually exclusive options.

Changes:
- buildConfigYaml.js: Check field.kind === 'array' to detect array-of-objects
- renderObjectField.js: Same check for nested array-of-objects fields
- Now renders 'client_certs: []' instead of expanded object structure

This fixes 81 occurrences of client_certs plus other array-of-objects
fields like tools[], roles[], sasl[], etc.
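The check described above can be sketched as follows. This is a minimal illustration, not the code in buildConfigYaml.js: the function name `renderField` and the simplified field shape (`name`, `kind`, `type`, `default`, `children`) are assumptions, as is the extra `type === 'object'` test alongside the `field.kind === 'array'` check named in the commit.

```javascript
// Hypothetical, simplified field shape: { name, kind, type, default, children }.
// The real schema used by buildConfigYaml.js may differ.
function renderField(field, indent = 0) {
  const pad = ' '.repeat(indent);

  // Array-of-objects: emit an empty array instead of expanding children,
  // so mutually exclusive child options are never shown side by side.
  if (field.kind === 'array' && field.type === 'object') {
    return [`${pad}${field.name}: []`];
  }

  // Plain object: recurse into children.
  if (field.type === 'object' && Array.isArray(field.children)) {
    const lines = [`${pad}${field.name}:`];
    for (const child of field.children) {
      lines.push(...renderField(child, indent + 2));
    }
    return lines;
  }

  // Scalar leaf: render the default value, or an empty string if none.
  const value = field.default !== undefined ? JSON.stringify(field.default) : '""';
  return [`${pad}${field.name}: ${value}`];
}

// Illustrative input only; not taken from a real connector schema.
const tls = {
  name: 'tls',
  type: 'object',
  children: [
    { name: 'enabled', type: 'bool', default: false },
    {
      name: 'client_certs',
      kind: 'array',
      type: 'object',
      children: [
        { name: 'cert', type: 'string' },
        { name: 'cert_file', type: 'string' },
      ],
    },
  ],
};

console.log(renderField(tls).join('\n'));
```

With this input, the `client_certs` entry is rendered as a single `client_certs: []` line, so `cert` and `cert_file` never appear together in the generated example.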

Impact: 105 files changed in generated docs, removing 966 lines of
incorrect expanded structures.

Resolves issue where examples showed both inline (cert/key) and
file-based (cert_file/key_file) options together, violating the
documented 'either/or' constraint.

Added logic to verify that all cloud-supported connectors (inCloud + cloudOnly)
have corresponding documentation pages in the cloud-docs repository.

Changes:
- Build set of cloud-supported connectors from binary analysis
- Use GitHub API to check if connector pages exist in cloud-docs
- Report missing connectors with their paths
- Runs automatically during draft-missing workflow (can be disabled)
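Building the set of cloud-supported connectors might look roughly like the sketch below. The function name, the `analysis` shape (`inCloud` and `cloudOnly` arrays of `type/name` keys), and the `status` field are all assumptions made for illustration. The fallback branch reflects the behavior described elsewhere in this PR: when binary data is unavailable, all non-deprecated connectors are checked.

```javascript
// Hypothetical shapes: `analysis` is { inCloud: string[], cloudOnly: string[] }
// or null; `connectors` is [{ type, name, status }]. Keys are "type/name".
function buildCloudSupportedSet(analysis, connectors) {
  const supported = new Set();

  if (analysis) {
    // Cloud-supported = available in Cloud and OSS, plus Cloud-only.
    for (const key of analysis.inCloud || []) supported.add(key);
    for (const key of analysis.cloudOnly || []) supported.add(key);
    return supported;
  }

  // Fallback when binary analysis is unavailable:
  // treat every non-deprecated connector as a candidate.
  for (const c of connectors) {
    if (c.status !== 'deprecated') supported.add(`${c.type}/${c.name}`);
  }
  return supported;
}

// Illustrative data only.
const sample = buildCloudSupportedSet(
  { inCloud: ['inputs/example_a'], cloudOnly: ['outputs/example_b'] },
  []
);
console.log([...sample]);
```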

The check uses GitHub API (via Octokit) to avoid cloning the cloud-docs repo.
Respects VBOT_GITHUB_API_TOKEN or GITHUB_TOKEN environment variables.

Also improved cloud-only connector detection:
- Cloud-only connectors now only check the cloud-only directory
- Regular connectors check pages and partials (not cloud-only)
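The directory selection above can be sketched like this. The `roots.cloudOnly`, `roots.pages`, and `roots.partials` names come from the review comment later in this thread; the helper names, the `<root>/<type>/<name>.adoc` layout, and the assumption that `connector.type` is already a plural directory name are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Returns the doc paths where a connector page is expected to live.
// Cloud-only connectors are looked up only in the cloud-only root;
// regular connectors are looked up in pages and partials.
function candidatePaths(connector, roots) {
  const rel = path.posix.join(connector.type, `${connector.name}.adoc`);
  const dirs = connector.cloudOnly
    ? [roots.cloudOnly]
    : [roots.pages, roots.partials];
  return dirs.map((dir) => path.posix.join(dir, rel));
}

// A connector is missing when none of its candidate paths exist.
// `exists` is injectable so the logic can be exercised without a real repo.
function isMissing(connector, roots, exists = fs.existsSync) {
  return !candidatePaths(connector, roots).some((p) => exists(p));
}
```

Keeping the two lookups separate means a page that exists only under pages or partials does not mask a missing cloud-only page.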

This helps identify connectors that are available in Cloud but missing
documentation pages in the cloud-docs repository.
netlify bot commented Jan 19, 2026

Deploy Preview for docs-extensions-and-macros ready!

Name Link
🔨 Latest commit 4ebb6dd
🔍 Latest deploy log https://app.netlify.com/projects/docs-extensions-and-macros/deploys/696f9e9d4c5def00084303b1
😎 Deploy Preview https://deploy-preview-166--docs-extensions-and-macros.netlify.app

coderabbitai bot commented Jan 19, 2026

Walkthrough

This pull request includes a version bump to 4.13.5 and introduces enhancements to handle array-of-object fields in YAML configuration generation. Two helper functions are updated to render arrays of objects as YAML leaf arrays (e.g., client_certs: []) instead of expanding them recursively. Additionally, the connector docs handler gains cloud-docs awareness to track cloud-supported connectors and validate their documentation status via GitHub API integration.

Sequence Diagram

sequenceDiagram
    participant Handler as rpcn-connector-docs-handler
    participant CloudSet as cloudSupportedSet<br/>(Tracking)
    participant GitHub as GitHub API
    participant CloudDocs as Cloud-docs<br/>Repository
    participant Logger as Log Output

    Handler->>CloudSet: Track connectors in OSS and Cloud
    Note over CloudSet: inCloud, cloudOnly sets
    
    Handler->>Handler: Check cloud-only connectors<br/>in cloud-only directory
    Handler->>Logger: Log results for cloud-only
    
    Handler->>GitHub: Query cloud-docs repo<br/>for missing cloud-supported<br/>connectors
    GitHub->>CloudDocs: Retrieve cloud-supported<br/>connector docs
    CloudDocs-->>GitHub: Return docs status
    GitHub-->>Handler: Response with missing entries
    
    Handler->>Logger: Emit cloud-docs<br/>check results
    Handler->>Logger: Handle errors gracefully

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • paulohtb6
  • micheleRP
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Fix/array of objects and cloud docs check' directly references the two main changes in the PR: fixing array-of-objects rendering and adding cloud docs validation checks. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The pull request description comprehensively addresses all major changes in the changeset, including centralized Octokit client, connector documentation validation, YAML rendering improvements, and version updates. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@tools/redpanda-connect/rpcn-connector-docs-handler.js`:
- Around line 1135-1148: The cloud-only detection is wrong because
validConnectors entries built from dataObj lose the
connector.cloudOnly/requiresCgo flags; update the code that maps/creates
validConnectors (the place where dataObj is converted into connector objects) to
copy over connector.cloudOnly and connector.requiresCgo (or derive them from
dataObj) so that later logic in allMissing (which checks cloudOnly and uses
roots.cloudOnly vs roots.pages/roots.partials) correctly tests cloud-only
connectors; ensure the property names match what allMissing expects
(cloudOnly/requiresCgo) so cloud-only connectors are only checked in
roots.cloudOnly and treated as cloud-only by downstream drafting logic.
- Around line 1152-1193: The cloud-docs check currently only treats 404 as
"missing" and silently ignores any other errors, causing a false success; update
the loop that calls octokit.repos.getContent to record non-404 failures (e.g.,
push to a new array like cloudDocsErrors or set a flag) and include the error
details (status/message) when recording them, continue to treat 404 as missing
by adding to missingFromCloudDocs, and after the loop, if cloudDocsErrors is
non-empty print an "inconclusive" or error summary (with counts/details) instead
of the success message so non-404 API errors (401/403/rate-limit/network) are
surfaced and the check does not incorrectly report all-clear.

Update cloud-docs validation to record and report non-404 API errors
(auth, rate-limit, network) instead of silently ignoring them.

- Add cloudDocsErrors array to track non-404 failures
- Capture error status and message for each failure
- Report inconclusive check with error details when failures occur
- Only show success message when check completes without errors
- Provide troubleshooting guidance for common error causes
- Preserve cloudOnly and requiresCgo flags when building validConnectors
  from dataObj so cloud-only connectors are properly detected
- Move Octokit initialization outside the loop to reuse connection
  instead of creating a new instance for each connector check

When GitHub API fails with auth/rate-limit errors, fall back to
checking raw.githubusercontent.com URLs via HTTP HEAD request.

Flow:
1. Try official API with authentication
2. If 404 -> file missing (expected)
3. If 403/401/rate-limit -> try raw URL fallback
   - 200 -> file exists (success)
   - 404 -> file missing (confirmed)
   - Other -> record error with both API and fallback details

This allows validation to work even without GITHUB_TOKEN or when
rate-limited, while still preferring the official API when available.
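The flow above can be sketched as follows. This is an illustration under stated assumptions, not the handler's actual code: `checkPage` and `checkAll` are hypothetical names, `getContent` stands in for an authenticated call such as `octokit.repos.getContent`, and `headRaw` stands in for an HTTP HEAD request against the raw file URL that resolves to a status code. Both are injected so the sketch runs without network access. For simplicity it falls back on any non-404 API error rather than only 401/403/rate-limit.

```javascript
// Classifies one cloud-docs page as 'exists', 'missing', or 'error'.
async function checkPage(pagePath, { getContent, headRaw }) {
  try {
    await getContent(pagePath);
    return { path: pagePath, result: 'exists', via: 'api' };
  } catch (apiErr) {
    // 404 from the API is the expected "file missing" signal.
    if (apiErr.status === 404) {
      return { path: pagePath, result: 'missing', via: 'api' };
    }
    // Auth, rate-limit, or network failure: try the raw URL before giving up.
    try {
      const status = await headRaw(pagePath);
      if (status === 200) return { path: pagePath, result: 'exists', via: 'raw' };
      if (status === 404) return { path: pagePath, result: 'missing', via: 'raw' };
      return { path: pagePath, result: 'error', apiStatus: apiErr.status, rawStatus: status };
    } catch (rawErr) {
      return { path: pagePath, result: 'error', apiStatus: apiErr.status, rawError: rawErr.message };
    }
  }
}

// Runs the check for every path and separates missing pages from failures.
async function checkAll(paths, deps) {
  const missing = [];
  const errors = [];
  for (const p of paths) {
    const outcome = await checkPage(p, deps);
    if (outcome.result === 'missing') missing.push(p);
    if (outcome.result === 'error') errors.push(outcome);
  }
  // Only a run with no recorded errors is conclusive; otherwise the
  // caller should report the check as inconclusive rather than all-clear.
  return { missing, errors, conclusive: errors.length === 0 };
}
```

Returning `conclusive` separately from `missing` keeps an empty `missing` list from being read as success when some lookups failed on both the API and the fallback.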
@JakeSCahill JakeSCahill requested a review from paulohtb6 January 19, 2026 16:48
JakeSCahill and others added 5 commits January 20, 2026 08:54
- Fix cloud-docs path from modules/components/pages to
  modules/develop/pages/connect/components
- Fix plural type handling (inputs not inputss)
- Enable cloud-docs validation without binary analysis by checking
  all non-deprecated connectors when binary data unavailable
- Reduces false positives from 282 to 64 actual missing connectors
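A minimal sketch of the path construction these fixes imply. The base path is the corrected one from this commit; the helper names, the `.adoc` extension, the `<type>/<name>` layout, and the naive ends-with-`s` pluralization rule are assumptions.

```javascript
// Corrected base path in the cloud-docs repository (from the commit message).
const CLOUD_DOCS_BASE = 'modules/develop/pages/connect/components';

// Pluralize a connector type exactly once:
// 'input' -> 'inputs', while 'inputs' stays 'inputs' (not 'inputss').
function pluralType(type) {
  return type.endsWith('s') ? type : `${type}s`;
}

// Builds the expected page path for a connector.
// The .adoc extension and directory layout are assumptions.
function cloudDocsPath(type, name) {
  return `${CLOUD_DOCS_BASE}/${pluralType(type)}/${name}.adoc`;
}

console.log(cloudDocsPath('input', 'example'));
console.log(cloudDocsPath('inputs', 'example'));
```

Both calls produce the same path, which is the point of the plural-handling fix: a type that is already plural must not gain a second `s`.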

Changes:
- Add dotenv package for automatic .env file loading
- Load .env files at startup in both CLI entry points
- Fix cloud-docs validation to only check actual connector types

Cloud-docs validation improvements:
- Filter out config types (config/*) - internal schemas only
- Filter out Bloblang functions/methods - documented on reference pages
- Filter out rate-limits - documented differently
- Only check: inputs, outputs, processors, caches, buffers, scanners, metrics, tracers
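The filtering rules above amount to an allowlist of connector types. A minimal sketch, assuming each connector record carries a plural `type` and a `status` field (both field names and the function name are hypothetical); the deprecated check reflects the non-deprecated rule mentioned earlier in this PR.

```javascript
// Types that have per-connector pages in cloud-docs (from the list above).
const CONNECTOR_TYPES = new Set([
  'inputs', 'outputs', 'processors', 'caches',
  'buffers', 'scanners', 'metrics', 'tracers',
]);

// Returns true only for real, non-deprecated connector types.
// Config schemas, Bloblang functions/methods, and rate limits fall
// outside the allowlist and are skipped.
function shouldCheck(connector) {
  return CONNECTOR_TYPES.has(connector.type) && connector.status !== 'deprecated';
}

// Illustrative records only.
const records = [
  { type: 'inputs', name: 'example_input', status: 'stable' },
  { type: 'config', name: 'example_schema', status: 'stable' },
  { type: 'rate_limits', name: 'example_limit', status: 'stable' },
  { type: 'outputs', name: 'example_old', status: 'deprecated' },
];
console.log(records.filter(shouldCheck).map((c) => c.name));
```

An allowlist is used here rather than a blocklist so that any new non-connector category is excluded by default.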

Result: Validation now correctly reports ~7 missing connectors instead of 252

Benefits:
- No more manual export of GITHUB_TOKEN needed
- Binary analysis works automatically with .env file
- Cloud-docs validation is accurate for connector coverage

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Deprecated connectors shouldn't be flagged as missing from cloud-docs.
This filter reduces false positives (e.g., pg_stream which is deprecated).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
JakeSCahill and others added 2 commits January 20, 2026 14:32
dotenv 17.x includes promotional stderr output that clutters CLI output.
Downgraded to 16.x which provides the same functionality without
the promotional messages.
@JakeSCahill JakeSCahill requested a review from paulohtb6 January 20, 2026 14:35
Creates a centralized Octokit client singleton in cli-utils/octokit-client.js
that is shared across all doc-tools modules. This eliminates redundant
initialization, shares rate limit tracking, and works without authentication
when no GitHub token is set.

Changes:
- Add cli-utils/octokit-client.js as shared singleton
- Update 7 modules to use shared client:
  - tools/get-redpanda-version.js
  - tools/get-console-version.js
  - tools/fetch-from-github.js
  - tools/cloud-regions/generate-cloud-regions.js
  - extensions/generate-rp-connect-info.js
  - tools/redpanda-connect/connector-binary-analyzer.js
  - tools/redpanda-connect/rpcn-connector-docs-handler.js
- Configure auth only when token is available
- Simplify code by removing 38 lines of duplicate initialization

Benefits:
- Single initialization point for all GitHub API access
- Shared rate limit pool across modules
- Works without authentication (60 req/hr) or with token (5000 req/hr)
- Easier to maintain and update configuration
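The shared-singleton idea can be sketched as a lazily initialized getter. This is not the contents of cli-utils/octokit-client.js: `createClientGetter` is a hypothetical name, the token precedence (VBOT_GITHUB_API_TOKEN before GITHUB_TOKEN) is an assumption, and retry configuration is omitted. The client constructor is injected so the sketch runs without installing Octokit; in real use it would be something like `createClientGetter(require('@octokit/rest').Octokit)`.

```javascript
// Returns a getter that builds the client on first use and then
// hands back the same instance on every later call.
function createClientGetter(ClientCtor, env = process.env) {
  let instance = null;

  return function getClient() {
    if (!instance) {
      // Assumed precedence; the PR only says both variables are respected.
      const token = env.VBOT_GITHUB_API_TOKEN || env.GITHUB_TOKEN;
      // Configure auth only when a token is available, so the client
      // still works unauthenticated at the lower rate limit.
      instance = new ClientCtor(token ? { auth: token } : {});
    }
    return instance;
  };
}

// Stand-in for the real client class, for demonstration only.
class StubClient {
  constructor(options) {
    this.options = options;
  }
}

const getOctokit = createClientGetter(StubClient, { GITHUB_TOKEN: 'example-token' });
console.log(getOctokit() === getOctokit());
```

Because every module calls the same getter, they share one client instance and therefore one view of the rate limit.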
@JakeSCahill JakeSCahill merged commit 8e286a7 into main Jan 20, 2026
21 checks passed
@JakeSCahill JakeSCahill deleted the fix/array-of-objects-and-cloud-docs-check branch January 20, 2026 19:24