Skip to content

docs: initial developer guide for adding services#278

Open
rshoemaker wants to merge 6 commits intomainfrom
docs/PLAT-445/support_services
Open

docs: initial developer guide for adding services#278
rshoemaker wants to merge 6 commits intomainfrom
docs/PLAT-445/support_services

Conversation

@rshoemaker
Copy link
Contributor

docs: add developer guide for adding supported services

Note: this document is a snapshot overview based on the incomplete implementation of the MCP supported service. It will need to be updated as we gain more experience adding additional services.

  • Adds docs/development/supported-services.md, a pattern-focused guide for pgEdge engineers explaining how to add a new supported service type to the control plane
  • Uses MCP as the reference implementation throughout, walking through every touch point from API enum to E2E tests
  • Includes a checklist summarizing the minimal set of changes needed for a new service type

What's covered

  1. API layer — extending the service_type enum, regenerating code
  2. Validation — two-layer validation pattern (API + workflow-time), per-type config validators
  3. Image registry — registering images in NewServiceVersions(), version constraints
  4. Resource lifecycle — the 5-resource model (network, user role, spec, instance, monitor), dependency chain, what each resource does
  5. Container spec — ServiceContainerSpec() options, health checks, port publication
  6. Workflow integration — how PlanUpdate and EndState handle services generically
  7. Domain model — ServiceSpec, ServiceInstance, ID generation, ServiceResources
  8. Testing — unit tests, golden plan tests (stub pattern), E2E test patterns
  9. Design constraint — services depend on the database only, no inter-service dependencies
  10. Future work — read/write service users, primary-aware routing, persistent bind mounts
  • No code changes — documentation only

PLAT-445

@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a new comprehensive developer doc docs/development/supported-services.md explaining how to add a new supported service type to the control plane, and updates .gitignore to ignore .DS_Store. The change is documentation-only and introduces no code or API modifications.

Changes

Cohort / File(s) Summary
Service Type Documentation
docs/development/supported-services.md
Introduces a detailed guide for adding a new supported service type: architecture and constraints, API layer (ServiceSpec schema & validation), image registry, resource lifecycle (Network, ServiceUserRole, ServiceInstanceSpec/Instance/Monitor), container spec, workflow integration, domain model and ID generation, testing strategy (unit, golden plan, E2E), implementation checklist, reference snippets, and future work.
Repository Ignore Rules
​.gitignore
Adds .DS_Store to ignore macOS system metadata files.

Poem

🐇 I nibble on docs beneath the moon,

Schemas sing softly, plans in tune,
From spec to instance I hop and plot,
Checklists lined up, no detail is forgot,
A rabbit cheers for builders on the dot.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'docs: initial developer guide for adding services' clearly and concisely describes the main change—a new developer documentation guide for adding supported services to the control plane.
Description check ✅ Passed The PR description provides a detailed summary covering what was added, is well-organized with sections explaining coverage areas, but is missing explicit Testing and Checklist sections required by the template.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch docs/PLAT-445/support_services

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
docs/development/supported-services.md (2)

55-69: Add language specification to the fenced code block.

The data flow diagram should specify a language for proper rendering. Use text if it's ASCII art, or consider using mermaid for a proper diagram.

📝 Proposed fix
-```
+```text
 API Request (spec.services)
   → Validation (API layer)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/development/supported-services.md` around lines 55 - 69, The fenced code
block containing the ASCII data flow diagram (starting with "API Request
(spec.services)" and the arrows) lacks a language specifier; update the opening
fence to include a language (e.g., use ```text for ASCII art or ```mermaid if
converting to a mermaid diagram) so the block renders correctly, and leave the
diagram contents unchanged except for the fence language token.

241-247: Add language specification to the dependency chain diagram.

Similar to the data flow diagram, this code block should specify a language for proper rendering.

📝 Proposed fix
-```
+```text
 Phase 1: Network (swarm.network)              — no dependencies
          ServiceUserRole (swarm.service_user_role) — no dependencies
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/development/supported-services.md` around lines 241 - 247, The fenced
code block containing the dependency chain (lines showing Phase 1: Network,
ServiceUserRole, ServiceInstanceSpec, ServiceInstance, ServiceInstanceMonitor)
is missing a language tag and should be marked for text rendering; update the
opening triple-backtick to include "text" so the block starts with ```text to
ensure proper rendering of the diagram (refer to the dependency entries: Network
(swarm.network), ServiceUserRole (swarm.service_user_role), ServiceInstanceSpec
(swarm.service_instance_spec), ServiceInstance (swarm.service_instance),
ServiceInstanceMonitor (monitor.service_instance)).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@docs/development/supported-services.md`:
- Around line 55-69: The fenced code block containing the ASCII data flow
diagram (starting with "API Request (spec.services)" and the arrows) lacks a
language specifier; update the opening fence to include a language (e.g., use
```text for ASCII art or ```mermaid if converting to a mermaid diagram) so the
block renders correctly, and leave the diagram contents unchanged except for the
fence language token.
- Around line 241-247: The fenced code block containing the dependency chain
(lines showing Phase 1: Network, ServiceUserRole, ServiceInstanceSpec,
ServiceInstance, ServiceInstanceMonitor) is missing a language tag and should be
marked for text rendering; update the opening triple-backtick to include "text"
so the block starts with ```text to ensure proper rendering of the diagram
(refer to the dependency entries: Network (swarm.network), ServiceUserRole
(swarm.service_user_role), ServiceInstanceSpec (swarm.service_instance_spec),
ServiceInstance (swarm.service_instance), ServiceInstanceMonitor
(monitor.service_instance)).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a7e8a31 and 13e7bb6.

📒 Files selected for processing (1)
  • docs/development/supported-services.md

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/development/supported-services.md`:
- Around line 55-69: The fenced flow-diagram code block is unlabeled and
triggers MD040; update the opening triple-backticks for that block to include
the language hint "text" (i.e., change ``` to ```text) so the block is
recognized as plain text and the MD040 lint warning is resolved.
- Around line 241-247: The fenced code block that lists the dependency phases
(starting with "Phase 1: Network (swarm.network)" and including "Phase 4:
ServiceInstanceMonitor (monitor.service_instance)") lacks a language hint;
update the opening fence from ``` to ```text so the block is marked as plain
text to resolve the MD040 lint warning.
- Line 1169: The cross-reference target for A.2 used in the A.7 step is
malformed (shows `apiapiapiv1validatego`); update the fragment portion of the
link that references validateMCPServiceConfig()/A.2 so it matches the actual
heading slug for A.2 (the validation function entry), replacing the incorrect
`apiapiapiv1validatego` fragment with the correct fragment that corresponds to
the A.2 heading (so the `validateMCPServiceConfig()` cross-link resolves
correctly).
- Around line 90-92: The docs currently list the fields `cpus` and `memory` with
inconsistent types across sections (overview table uses Float64/UInt64 while the
API DSL appendix marks them as String with patterns); pick the canonical type
(either numeric types Float64/UInt64 or patterned String) and make all
references consistent: update the overview table, the appendix API DSL entries,
any example payloads, and schema/pattern descriptions so `cpus` and `memory` use
the same type and validation wording throughout (ensure you update the `cpus`
and `memory` entries and any nearby examples that assume one representation).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13e7bb6 and 9e16bcd.

📒 Files selected for processing (1)
  • docs/development/supported-services.md

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
docs/development/supported-services.md (1)

1169-1169: ⚠️ Potential issue | 🟡 Minor

Fix the broken cross-reference fragment for A.2.

The fragment #a2-validation-serverinternalapiapiapiv1validatego contains a duplicate api segment. Based on the actual heading at line 697, the correct fragment should be #a2-validation-serverinternalapiapiv1validatego.

📝 Proposed fix
-   `validateMCPServiceConfig()` (see [A.2](`#a2-validation-serverinternalapiapiapiv1validatego`)).
+   `validateMCPServiceConfig()` (see [A.2](`#a2-validation-serverinternalapiapiv1validatego`)).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/development/supported-services.md` at line 1169, Update the broken
cross-reference fragment that follows `validateMCPServiceConfig()` by replacing
the duplicate segment `#a2-validation-serverinternalapiapiapiv1validatego` with
the correct fragment `#a2-validation-serverinternalapiapiv1validatego`; locate
the occurrence near the `validateMCPServiceConfig()` reference and change only
the fragment string so it matches the actual heading anchor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/development/supported-services.md`:
- Line 1169: Update the broken cross-reference fragment that follows
`validateMCPServiceConfig()` by replacing the duplicate segment
`#a2-validation-serverinternalapiapiapiv1validatego` with the correct fragment
`#a2-validation-serverinternalapiapiv1validatego`; locate the occurrence near
the `validateMCPServiceConfig()` reference and change only the fragment string
so it matches the actual heading anchor.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e16bcd and 22e384f.

📒 Files selected for processing (1)
  • docs/development/supported-services.md

Copy link
Contributor

@moizpgedge moizpgedge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs looks good to me

Copy link
Contributor

@tsivaprasad tsivaprasad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.
I think the document would benefit from adding a small “API Usage & Verification” section with concrete examples.

  • Example POST /v1/databases request payload with a service
  • Example validation error response (HTTP 400)
  • Example GET /v1/databases/{id} response showing service_instances
  • Example update flow (add/remove service from existing DB)
  • Brief clarification on how service health/failure is reflected in API state

Add concrete request/response examples for service CRUD operations,
validation errors, and health/failure state transitions. Covers
create, read, update (add/remove), error responses, sensitive key
stripping, and the service instance monitor lifecycle.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/development/supported-services.md`:
- Around line 153-156: Add one clear sentence in the discussed docs sections
clarifying that failures during provisioning/workflow (e.g., errors returned by
GetServiceImage when a service_type/version is not registered in the image
registry) can cause the workflow task to fail and set the database state to
"failed", whereas runtime/post-provision monitoring or health-check failures do
not automatically transition the database to "failed" and must be handled
separately by the monitoring/recovery logic; place this sentence near the
existing text that mentions service errors and DB state so readers see the
distinction (also mirror the same sentence in the other section that currently
reads inconsistently).
- Around line 257-259: The docs conflict: one section says the runtime network
name is "{databaseID}-database" while the GenerateDatabaseNetworkID(dbID)
documentation says "{dbID}"; update documentation so both use a single canonical
format or explicitly document the transformation. Edit the text describing
GenerateDatabaseNetworkID(dbID) to show the full runtime name (e.g.,
"{databaseID}-database") or add a short sentence in the earlier section
explaining that GenerateDatabaseNetworkID(dbID) returns dbID which is then
converted to the runtime name by appending "-database". Ensure both the
paragraph referencing the runtime name and the GenerateDatabaseNetworkID(dbID)
doc are consistent.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22e384f and b3b9d65.

📒 Files selected for processing (1)
  • docs/development/supported-services.md

@rshoemaker
Copy link
Contributor Author

Thanks @tsivaprasad, I added the content you suggested.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
docs/development/supported-services.md (1)

257-262: ⚠️ Potential issue | 🟡 Minor

Clarify identifier vs runtime network name in this paragraph.

Line 261 currently says deduplication happens because both sides generate the same "{databaseID}-database" name, but Lines 448-451 explain matching is done via the resource identifier (GenerateDatabaseNetworkID(dbID)), with runtime Docker name set separately. Please align this paragraph to avoid implementation/test confusion.

Suggested wording fix
- naturally via identifier matching (both generate the same
- `"{databaseID}-database"` name). Uses the IPAM service to allocate subnets.
+ naturally via identifier matching (both use
+ `GenerateDatabaseNetworkID(dbID)` as the same resource ID). The runtime Docker
+ network name is `"{databaseID}-database"` (set separately on `Network.Name`).
+ Uses the IPAM service to allocate subnets.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/development/supported-services.md` around lines 257 - 262, The paragraph
about Network deduplication is ambiguous about "name" vs the internal resource
identifier; update the text to state that deduplication is done by matching the
resource identifier (GenerateDatabaseNetworkID(dbID)) and that the runtime
Docker network name is set separately (e.g., "{databaseID}-database"), so tests
and implementation aren't confused—mention both the identifier function
GenerateDatabaseNetworkID and the runtime name format to make the distinction
explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/development/supported-services.md`:
- Around line 257-262: The paragraph about Network deduplication is ambiguous
about "name" vs the internal resource identifier; update the text to state that
deduplication is done by matching the resource identifier
(GenerateDatabaseNetworkID(dbID)) and that the runtime Docker network name is
set separately (e.g., "{databaseID}-database"), so tests and implementation
aren't confused—mention both the identifier function GenerateDatabaseNetworkID
and the runtime name format to make the distinction explicit.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3b9d65 and 80b0ec1.

📒 Files selected for processing (2)
  • .gitignore
  • docs/development/supported-services.md
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants