Fix/array of objects and cloud docs check#166
Array-of-objects fields like client_certs[] were incorrectly rendered as flat objects with all child properties expanded, creating invalid configurations that mixed mutually exclusive options.

Changes:
- buildConfigYaml.js: Check field.kind === 'array' to detect array-of-objects
- renderObjectField.js: Same check for nested array-of-objects fields
- Now renders 'client_certs: []' instead of expanded object structure

This fixes 81 occurrences of client_certs plus other array-of-objects fields like tools[], roles[], sasl[], etc.

Impact: 105 files changed in generated docs, removing 966 lines of incorrect expanded structures.

Resolves issue where examples showed both inline (cert/key) and file-based (cert_file/key_file) options together, violating the documented 'either/or' constraint.
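The detection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in buildConfigYaml.js or renderObjectField.js: the `renderField` helper and the field shape (`name`, `type`, `children`, `default`) are assumptions, and only the `field.kind === 'array'` check is taken from the commit message.

```javascript
// Hypothetical sketch: render one schema field as YAML lines.
// Only the `kind === 'array'` check comes from the commit message;
// the field shape ({ name, type, kind, children, default }) is assumed.
function renderField(field, indent = 0) {
  const pad = ' '.repeat(indent);

  // Array-of-objects: emit an empty leaf array instead of expanding children,
  // so mutually exclusive child options never appear side by side.
  if (field.type === 'object' && field.kind === 'array') {
    return [`${pad}${field.name}: []`];
  }

  // Plain object: expand each child one level deeper.
  if (field.type === 'object' && Array.isArray(field.children)) {
    const lines = [`${pad}${field.name}:`];
    for (const child of field.children) {
      lines.push(...renderField(child, indent + 2));
    }
    return lines;
  }

  // Scalar: emit the default value, or an empty string if none is given.
  const value = field.default !== undefined ? field.default : '""';
  return [`${pad}${field.name}: ${value}`];
}
```

With a `tls` object that contains a `client_certs` array-of-objects child, this sketch emits `client_certs: []` under `tls:` and never lists `cert`, `key`, `cert_file`, or `key_file` together.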
Added logic to verify that all cloud-supported connectors (inCloud + cloudOnly) have corresponding documentation pages in the cloud-docs repository.

Changes:
- Build set of cloud-supported connectors from binary analysis
- Use GitHub API to check if connector pages exist in cloud-docs
- Report missing connectors with their paths
- Runs automatically during draft-missing workflow (can be disabled)

The check uses GitHub API (via Octokit) to avoid cloning the cloud-docs repo. Respects VBOT_GITHUB_API_TOKEN or GITHUB_TOKEN environment variables.

Also improved cloud-only connector detection:
- Cloud-only connectors now only check the cloud-only directory
- Regular connectors check pages and partials (not cloud-only)

This helps identify connectors that are available in Cloud but missing documentation pages in the cloud-docs repository.
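The two rules above (which connectors count as cloud-supported, and which directories each connector is checked against) might look something like the sketch below. The helper names, the connector shape, and the `type:name` key format are assumptions made for illustration; the `roots.cloudOnly`, `roots.pages`, and `roots.partials` names follow the review comment on this PR.

```javascript
// Hypothetical sketch of the lookup rules from the commit message.
// Connector objects are assumed to look like { type, name, cloudOnly }.

// Union of connectors available in both OSS and Cloud (inCloud)
// and connectors available only in Cloud (cloudOnly).
function buildCloudSupportedSet(inCloud, cloudOnly) {
  const key = (c) => `${c.type}:${c.name}`; // assumed key format
  return new Set([...inCloud, ...cloudOnly].map(key));
}

// Pick the documentation roots a connector should be looked up in.
function rootsToCheck(connector, roots) {
  // Cloud-only connectors are checked only in the cloud-only directory.
  if (connector.cloudOnly) {
    return [roots.cloudOnly];
  }
  // Regular connectors are checked in pages and partials, not cloud-only.
  return [roots.pages, roots.partials];
}
```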
📝 Walkthrough
This pull request includes a version bump to 4.13.5 and introduces enhancements to handle array-of-object fields in YAML configuration generation. Two helper functions are updated to render arrays of objects as YAML leaf arrays (e.g., …)

Sequence Diagram
sequenceDiagram
participant Handler as rpcn-connector-docs-handler
participant CloudSet as cloudSupportedSet<br/>(Tracking)
participant GitHub as GitHub API
participant CloudDocs as Cloud-docs<br/>Repository
participant Logger as Log Output
Handler->>CloudSet: Track connectors in OSS and Cloud
Note over CloudSet: inCloud, cloudOnly sets
Handler->>Handler: Check cloud-only connectors<br/>in cloud-only directory
Handler->>Logger: Log results for cloud-only
Handler->>GitHub: Query cloud-docs repo<br/>for missing cloud-supported<br/>connectors
GitHub->>CloudDocs: Retrieve cloud-supported<br/>connector docs
CloudDocs-->>GitHub: Return docs status
GitHub-->>Handler: Response with missing entries
Handler->>Logger: Emit cloud-docs<br/>check results
Handler->>Logger: Handle errors gracefully
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@tools/redpanda-connect/rpcn-connector-docs-handler.js`:
- Around line 1135-1148: The cloud-only detection is wrong because
validConnectors entries built from dataObj lose the
connector.cloudOnly/requiresCgo flags; update the code that maps/creates
validConnectors (the place where dataObj is converted into connector objects) to
copy over connector.cloudOnly and connector.requiresCgo (or derive them from
dataObj) so that later logic in allMissing (which checks cloudOnly and uses
roots.cloudOnly vs roots.pages/roots.partials) correctly tests cloud-only
connectors; ensure the property names match what allMissing expects
(cloudOnly/requiresCgo) so cloud-only connectors are only checked in
roots.cloudOnly and treated as cloud-only by downstream drafting logic.
- Around line 1152-1193: The cloud-docs check currently only treats 404 as
"missing" and silently ignores any other errors, causing a false success; update
the loop that calls octokit.repos.getContent to record non-404 failures (e.g.,
push to a new array like cloudDocsErrors or set a flag) and include the error
details (status/message) when recording them, continue to treat 404 as missing
by adding to missingFromCloudDocs, and after the loop, if cloudDocsErrors is
non-empty print an "inconclusive" or error summary (with counts/details) instead
of the success message so non-404 API errors (401/403/rate-limit/network) are
surfaced and the check does not incorrectly report all-clear.
Update cloud-docs validation to record and report non-404 API errors (auth, rate-limit, network) instead of silently ignoring them.

- Add cloudDocsErrors array to track non-404 failures
- Capture error status and message for each failure
- Report inconclusive check with error details when failures occur
- Only show success message when check completes without errors
- Provide troubleshooting guidance for common error causes
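A minimal sketch of the error bookkeeping this commit describes, assuming the API call is injected as a `getContent(path)` function that rejects with an error carrying a numeric `status` (as Octokit request errors do). The `checkCloudDocs` name and the `conclusive` flag are hypothetical; `missingFromCloudDocs` and `cloudDocsErrors` follow the names used in the review.

```javascript
// Hypothetical sketch: classify each lookup as found, missing (404),
// or failed (anything else), and only report success when nothing failed.
async function checkCloudDocs(paths, getContent) {
  const missingFromCloudDocs = [];
  const cloudDocsErrors = [];

  for (const path of paths) {
    try {
      await getContent(path); // resolves when the page exists
    } catch (err) {
      if (err.status === 404) {
        // A 404 is the expected signal for a missing page.
        missingFromCloudDocs.push(path);
      } else {
        // Auth, rate-limit, and network failures make the result inconclusive.
        cloudDocsErrors.push({ path, status: err.status, message: err.message });
      }
    }
  }

  // The check is conclusive only if every lookup either succeeded or 404'd.
  const conclusive = cloudDocsErrors.length === 0;
  return { missingFromCloudDocs, cloudDocsErrors, conclusive };
}
```

A caller would print the success message only when `conclusive` is true and otherwise summarize `cloudDocsErrors`.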
- Preserve cloudOnly and requiresCgo flags when building validConnectors from dataObj so cloud-only connectors are properly detected
- Move Octokit initialization outside the loop to reuse connection instead of creating a new instance for each connector check
When GitHub API fails with auth/rate-limit errors, fall back to checking raw.githubusercontent.com URLs via HTTP HEAD request.

Flow:
1. Try official API with authentication
2. If 404 -> file missing (expected)
3. If 403/401/rate-limit -> try raw URL fallback
   - 200 -> file exists (success)
   - 404 -> file missing (confirmed)
   - Other -> record error with both API and fallback details

This allows validation to work even without GITHUB_TOKEN or when rate-limited, while still preferring the official API when available.
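The flow above can be sketched as a small decision function. Both network calls are injected so the logic stays testable: `apiCheck(path)` is assumed to resolve when the file exists and reject with an error carrying `status`, and `headRaw(path)` is assumed to resolve to the HTTP status code of a HEAD request against the raw URL. The function name and return shape are hypothetical, and treating 429 as a rate-limit status alongside 401/403 is an assumption.

```javascript
// Hypothetical sketch of the API-first, raw-URL-fallback flow.
async function pageStatus(path, apiCheck, headRaw) {
  try {
    // 1. Prefer the official API.
    await apiCheck(path);
    return { state: 'exists', via: 'api' };
  } catch (apiErr) {
    // 2. A 404 from the API means the file is missing.
    if (apiErr.status === 404) {
      return { state: 'missing', via: 'api' };
    }
    // Errors outside auth/rate-limit are recorded without a fallback attempt.
    if (![401, 403, 429].includes(apiErr.status)) {
      return { state: 'error', api: apiErr.status };
    }
    // 3. Auth or rate-limit failure: try the raw URL instead.
    const rawStatus = await headRaw(path);
    if (rawStatus === 200) return { state: 'exists', via: 'raw' };
    if (rawStatus === 404) return { state: 'missing', via: 'raw' };
    // Neither source gave a clear answer: keep both statuses for reporting.
    return { state: 'error', api: apiErr.status, fallback: rawStatus };
  }
}
```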
- Fix cloud-docs path from modules/components/pages to modules/develop/pages/connect/components
- Fix plural type handling (inputs not inputss)
- Enable cloud-docs validation without binary analysis by checking all non-deprecated connectors when binary data unavailable
- Reduces false positives from 282 to 64 actual missing connectors
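To illustrate the path and pluralization fixes, here is a small sketch. The base path is the corrected one from this commit; the `<type>/<name>.adoc` layout beneath it, the `.adoc` extension, and both helper names are assumptions rather than confirmed details of the cloud-docs repository.

```javascript
// Corrected base path, taken from the commit message.
const CLOUD_DOCS_BASE = 'modules/develop/pages/connect/components';

// Avoid double pluralization: "inputs" stays "inputs", "input" becomes "inputs".
function pluralizeType(type) {
  return type.endsWith('s') ? type : `${type}s`;
}

// Assumed layout: <base>/<plural type>/<connector name>.adoc
function cloudDocsPath(type, name) {
  return `${CLOUD_DOCS_BASE}/${pluralizeType(type)}/${name}.adoc`;
}
```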
Changes:
- Add dotenv package for automatic .env file loading
- Load .env files at startup in both CLI entry points
- Fix cloud-docs validation to only check actual connector types

Cloud-docs validation improvements:
- Filter out config types (config/*) - internal schemas only
- Filter out Bloblang functions/methods - documented on reference pages
- Filter out rate-limits - documented differently
- Only check: inputs, outputs, processors, caches, buffers, scanners, metrics, tracers

Result: Validation now correctly reports ~7 missing connectors instead of 252

Benefits:
- No more manual export of GITHUB_TOKEN needed
- Binary analysis works automatically with .env file
- Cloud-docs validation is accurate for connector coverage

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Deprecated connectors shouldn't be flagged as missing from cloud-docs. This filter reduces false positives (e.g., pg_stream which is deprecated).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
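Taken together, the type filter and the deprecation filter from the last two commits amount to a predicate along these lines. The list of checked types comes from the commit message; the connector shape (`type` in plural form, `status === 'deprecated'`) and the function name are assumptions.

```javascript
// Connector types that are expected to have cloud-docs pages (from the commit).
const CHECKED_TYPES = new Set([
  'inputs', 'outputs', 'processors', 'caches',
  'buffers', 'scanners', 'metrics', 'tracers',
]);

// Hypothetical predicate: should this connector be looked up in cloud-docs?
function shouldCheckInCloudDocs(connector) {
  // Deprecated connectors are never flagged as missing.
  if (connector.status === 'deprecated') return false;
  // Config schemas, Bloblang functions/methods, and rate limits fall outside
  // the checked set and are skipped.
  return CHECKED_TYPES.has(connector.type);
}
```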
dotenv 17.x includes promotional stderr output that clutters CLI output. Downgraded to 16.x which provides the same functionality without the promotional messages.
Creates a centralized Octokit client singleton in cli-utils/octokit-client.js that is shared across all doc-tools modules. This eliminates redundant initialization, shares rate limit tracking, and works without authentication when no GitHub token is set.

Changes:
- Add cli-utils/octokit-client.js as shared singleton
- Update 7 modules to use shared client:
  - tools/get-redpanda-version.js
  - tools/get-console-version.js
  - tools/fetch-from-github.js
  - tools/cloud-regions/generate-cloud-regions.js
  - extensions/generate-rp-connect-info.js
  - tools/redpanda-connect/connector-binary-analyzer.js
  - tools/redpanda-connect/rpcn-connector-docs-handler.js
- Configure auth only when token is available
- Simplify code by removing 38 lines of duplicate initialization

Benefits:
- Single initialization point for all GitHub API access
- Shared rate limit pool across modules
- Works without authentication (60 req/hr) or with token (5000 req/hr)
- Easier to maintain and update configuration
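The singleton pattern described here could be sketched as below. This is not the contents of cli-utils/octokit-client.js: the `createSharedClient` factory is hypothetical, `ClientClass` stands in for the Octokit constructor so the sketch has no dependencies, and the precedence of `VBOT_GITHUB_API_TOKEN` over `GITHUB_TOKEN` is an assumption.

```javascript
// Hypothetical sketch of a lazily created, shared API client.
// `ClientClass` stands in for Octokit; `env` stands in for process.env.
function createSharedClient(ClientClass, env) {
  let instance = null;

  return function getClient() {
    if (instance === null) {
      // Assumed precedence: the bot token first, then the generic token.
      const token = env.VBOT_GITHUB_API_TOKEN || env.GITHUB_TOKEN;
      // Configure auth only when a token is available; otherwise the client
      // runs unauthenticated at the lower rate limit.
      const options = token ? { auth: token } : {};
      instance = new ClientClass(options);
    }
    // Every caller gets the same instance, so rate-limit state is shared.
    return instance;
  };
}
```

In a real module, the getter would be created once at module scope and exported, so that every importing tool reuses the same client.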
This pull request centralizes and standardizes the use of the GitHub Octokit client across the codebase by introducing a shared singleton instance, and improves connector documentation checks for Redpanda Cloud. The main themes are infrastructure simplification for GitHub API access and enhancements to connector documentation validation logic.
Centralized GitHub Octokit client:
- Added an `octokit-client.js` module in `cli-utils` to provide a shared, singleton Octokit client with consistent authentication and retry logic, reducing redundant initialization and improving rate limit tracking.
- Updated modules (`generate-rp-connect-info.js`, `fetch-from-github.js`, `get-console-version.js`, `get-redpanda-version.js`, `connector-binary-analyzer.js`, `generate-cloud-regions.js`) to use the new shared Octokit client instead of creating their own instances.

Connector documentation validation improvements:
- Preserved the `cloudOnly` and `requiresCgo` flags for connectors.
- Added a check for connector pages in the `cloud-docs` repository using the shared Octokit client, with robust error handling and fallback to raw HTTP requests.

Connector YAML rendering improvements:
- Fixed array-of-objects fields (e.g., `client_certs[]`) so they are rendered as empty arrays instead of expanded object structures in both `buildConfigYaml.js` and `renderObjectField.js`.

Dependency and version updates:
- Bumped the version to `4.13.5` and added `dotenv` as a dependency in `package.json`.