This document defines the first credible trust-control model for Viaduct as an early product. It is intentionally narrow. The goal is to make tenant-scoped evaluation and supervised pilot use feel trustworthy and accountable without pretending Viaduct already has enterprise identity, compliance, or policy infrastructure.
- `internal/api/middleware.go` already supports three credential paths:
  - `X-Admin-Key` for platform-level admin routes
  - `X-API-Key` for tenant credentials
  - `X-Service-Account-Key` for tenant-scoped service accounts
- `internal/models/tenant.go` already defines three tenant roles and six permissions:
  - roles: `viewer`, `operator`, `admin`
  - permissions: `inventory.read`, `reports.read`, `lifecycle.read`, `migration.manage`, `tenant.read`, `tenant.manage`
- `internal/api/server.go` already enforces role and permission boundaries per route instead of relying on frontend-only assumptions.
- `GET /api/v1/tenants/current` already exposes effective role, permissions, auth method, and service-account identity for the active caller.
- `internal/models/audit.go` and the store layer already persist tenant-scoped audit events with `id`, `tenant_id`, `actor`, `request_id`, `category`, `action`, `resource`, `outcome`, `message`, `details`, and `created_at`.
- `internal/api/reports.go` already exposes tenant-scoped audit history at `GET /api/v1/audit` and `GET /api/v1/reports/audit`.
- `internal/api/observability.go` already generates or forwards `X-Request-ID`, returns it in API responses, and logs request-scoped metadata.
- The dashboard already exposes some trust context:
  - `web/src/features/settings/SettingsPage.tsx` shows auth method, role, service-account name, and effective permissions.
  - `web/src/features/reports/ReportsPage.tsx` allows report and audit export through the current credential context.
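
The "generate or forward `X-Request-ID`" behavior can be pictured with a minimal sketch. This is not the code in `internal/api/observability.go`; the function names `resolveRequestID` and `withRequestID` and the 16-byte hex ID format are assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// resolveRequestID forwards a caller-supplied ID or generates a new one.
// The 16-byte hex format is an assumption, not Viaduct's actual format.
func resolveRequestID(incoming string) string {
	if incoming != "" {
		return incoming
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "req-unknown"
	}
	return hex.EncodeToString(buf)
}

// withRequestID (hypothetical name) makes the ID visible to downstream
// handlers and echoes it on the response.
func withRequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := resolveRequestID(r.Header.Get("X-Request-ID"))
		r.Header.Set("X-Request-ID", id)
		w.Header().Set("X-Request-ID", id)
		next.ServeHTTP(w, r)
	})
}

func main() {
	h := withRequestID(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodGet, "/api/v1/inventory", nil)
	req.Header.Set("X-Request-ID", "req-123")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	fmt.Println(rec.Header().Get("X-Request-ID")) // the forwarded ID
}
```
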
- Phase 3 appears to have introduced multi-tenancy, tenant-scoped persistence, service accounts, quotas, and role-aware routing.
- Phase 4 appears to have made migration execution resumable, approval-aware, and checkpoint-driven, which makes action attribution and audit history materially more important.
- Phase 5 appears to have hardened API contracts with structured errors, request IDs, and a more explicit operator contract in `docs/reference/openapi.yaml`.
- Default-tenant fallback still exists for local/lab convenience. That is useful for demos but not a credible trust posture for real evaluation or pilot use.
- A tenant API key currently authenticates as tenant `admin`. That is acceptable for bootstrap and break-glass use, but weak for day-to-day operator attribution.
- Human attribution is only as good as the credential in use. If multiple humans share one tenant key, audit history collapses to `tenant:<tenant-id>`.
- Platform-admin attribution is also weak today. `AdminAuthMiddleware` authenticates a shared `X-Admin-Key`, and admin audit events currently record only `Actor: "admin"`.
- Audit coverage is incomplete for trust-sensitive actions. Today it is strongest for migration commands and service-account changes, but not yet consistent for report exports or authenticated authorization denials.
- The audit schema is intentionally small, but details are not yet normalized into a stable convention for migration commands, approval context, or export activity.
- The role model has an important implementation nuance that is easy to misread: explicit service-account permissions narrow access inside the service account's role ceiling. They do not override `RequireTenantRole` checks in `internal/api/middleware.go`.
- The dashboard exposes current identity context in Settings, but not yet the action attribution an operator expects in migration history and execution detail views.
- `GET /api/v1/migrations` and `GET /api/v1/migrations/{id}` are currently `migration.manage` routes. That means the `viewer` role cannot inspect migration history today, even though it can read audit history and reports.
- Audit durability depends on the configured store. The in-memory store is useful for development but not a credible source of truth for pilot-grade audit history.
- The current API-key-based early-product posture.
- The three-role model already present in code.
- Explicit service-account permissions as a narrowing mechanism inside the existing role boundary, not a separate custom RBAC system.
- Tenant-scoped routing, storage, reporting, and request-correlation behavior.
- The existing dashboard Settings surface as the place where current caller context is explained.
- Freeze one early-product trust model around the primitives Viaduct already has:
  - separate platform admin and tenant credentials
  - three tenant roles
  - service accounts as the preferred non-break-glass identity
  - audit events for all trust-sensitive mutations and exports
  - visible attribution in existing dashboard surfaces
Viaduct should adopt the following early-product trust stance:
- Keep authentication simple and explicit.
- Treat the tenant boundary as the primary security boundary.
- Use service accounts for named access whenever possible.
- Keep the role model small enough to reason about during pilots.
- Make state-changing actions and audit/report exports traceable through one tenant-scoped audit trail.
- Show operators who performed important actions, through which credential path, and with which request ID.
This is not an SSO-first or enterprise IAM design. It is the minimum model that makes Viaduct feel like serious infrastructure software during evaluation and supervised pilot work.
| Credential | Current mechanism | Intended use in early product | Should be used for |
|---|---|---|---|
| Platform admin key | `X-Admin-Key` | Platform bootstrap and tenant administration | Creating and deleting tenants |
| Tenant key | `X-API-Key` | Tenant bootstrap and break-glass admin access | Initial setup, emergency admin actions, short-lived direct admin work |
| Service-account key | `X-Service-Account-Key` | Normal named access for operators and automation | Dashboard access, scripted operations, export jobs, migration workflows |
- Viaduct v1 does not need SSO, OIDC, browser sessions, or SCIM.
- Viaduct does need explicit, named credentials with a clear intended use.
- Tenant API keys should exist, but they should not be the recommended default for routine operator activity.
- Dashboard and automation guidance should prefer service accounts over a shared tenant key.
- In the current web client, that means `VITE_VIADUCT_SERVICE_ACCOUNT_KEY` should be the default documented credential path, while `VITE_VIADUCT_API_KEY` should be documented as bootstrap or break-glass usage.
- Default-tenant fallback should remain available for local lab usage but should be treated as non-production behavior in docs, demos, and pilot guidance.
- Pilots should run behind TLS termination or a trusted reverse proxy. Viaduct's early trust model assumes transport security is provided by the deployment environment.
- When Viaduct does not yet have human user accounts, the credible substitute is a named service account per operator or per automation context.
- Service-account metadata should carry an owner or purpose label so audit trails remain understandable.
- Shared credentials should be treated as temporary exceptions, not the default operating pattern.
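
As a rough illustration of how the three credential headers map to an auth method and to the "routine versus break-glass" guidance, consider the sketch below. The precedence order, the `platform-admin-key` label, and the names `credentialPath`, `classifyCredential`, and `fromHeaders` are assumptions; the real middleware also scopes the admin key to admin routes, which this sketch ignores.

```go
package main

import (
	"fmt"
	"net/http"
)

// credentialPath describes which documented credential a request carries.
type credentialPath struct {
	AuthMethod string // label that would appear in audit details
	Routine    bool   // true only for the recommended day-to-day path
}

// classifyCredential picks a credential path from the three header values.
// The precedence (admin, then service account, then tenant key) is assumed.
func classifyCredential(adminKey, serviceAccountKey, tenantKey string) (credentialPath, bool) {
	switch {
	case adminKey != "":
		return credentialPath{"platform-admin-key", false}, true
	case serviceAccountKey != "":
		return credentialPath{"service-account", true}, true
	case tenantKey != "":
		return credentialPath{"tenant-api-key", false}, true
	}
	return credentialPath{}, false
}

// fromHeaders reads the documented header names from a request.
func fromHeaders(h http.Header) (credentialPath, bool) {
	return classifyCredential(
		h.Get("X-Admin-Key"),
		h.Get("X-Service-Account-Key"),
		h.Get("X-API-Key"),
	)
}

func main() {
	h := http.Header{}
	h.Set("X-API-Key", "example-tenant-key")
	path, ok := fromHeaders(h)
	fmt.Println(path.AuthMethod, path.Routine, ok)
}
```
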
Viaduct should keep the current three-role model.
| Role | Current repo status | Intended user | Trust boundary |
|---|---|---|---|
| `viewer` | Implemented | Read-only evaluator, stakeholder, support reviewer | Can inspect tenant state but cannot change migration or tenant state |
| `operator` | Implemented | Migration operator or platform engineer running planned work | Can plan, validate, execute, resume, roll back, and simulate |
| `admin` | Implemented | Tenant owner or platform lead | Can do all operator work plus tenant/service-account administration |
- It matches the code already shipping in `internal/models/tenant.go`.
- It is easy to explain in a pilot.
- It avoids a custom-role UI or policy editor before the product has validated its workflow boundaries.
- It still allows narrower automation through explicit service-account permissions without changing the route model.
- Custom roles
- Per-resource ACLs
- Team/group mapping
- Approval-policy authoring by persona
- Fine-grained field-level or object-level permissions
- Explicit service-account permissions are not a privilege-escalation mechanism. They only matter after the route's role gate has already passed.
- Viaduct does not need a new `migration.read` permission in this step. The current route boundary remains intact for v1.
- Because the current route boundary remains intact, read-only migration oversight for `viewer` is limited to summaries, reports, and audit history, not `GET /api/v1/migrations`.
- Admin-key actions remain only weakly attributed in v1. Do not claim stronger proof of admin identity than the current shared-key model actually provides.
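
The "narrowing, not escalation" rule is easy to misread, so a small sketch may help. It assumes the role gate compares ordered ranks and that an empty explicit permission list means "use the full role ceiling"; neither detail is confirmed by this document, and `narrow` and `authorize` are hypothetical names rather than functions in `internal/api/middleware.go`.

```go
package main

import "fmt"

// roleRank orders the three tenant roles. The numeric ranks are illustrative.
var roleRank = map[string]int{"viewer": 1, "operator": 2, "admin": 3}

// narrow returns the permissions present in both the role ceiling and the
// explicit list. An empty explicit list keeps the full ceiling (assumption).
func narrow(ceiling, explicit []string) map[string]bool {
	allowed := make(map[string]bool, len(ceiling))
	for _, p := range ceiling {
		allowed[p] = true
	}
	if len(explicit) == 0 {
		return allowed
	}
	narrowed := make(map[string]bool)
	for _, p := range explicit {
		if allowed[p] {
			narrowed[p] = true
		}
	}
	return narrowed
}

// authorize applies the role gate first, then the narrowed permission check.
func authorize(role, minRole, permission string, ceiling, explicit []string) bool {
	if roleRank[role] < roleRank[minRole] {
		return false // explicit permissions never bypass the role gate
	}
	return narrow(ceiling, explicit)[permission]
}

func main() {
	viewer := []string{"inventory.read", "reports.read", "lifecycle.read", "tenant.read"}
	operator := append(append([]string{}, viewer...), "migration.manage")

	// A viewer service account listing migration.manage is still blocked.
	fmt.Println(authorize("viewer", "operator", "migration.manage", viewer, []string{"migration.manage"}))
	// An operator narrowed to inventory.read loses migration.manage.
	fmt.Println(authorize("operator", "operator", "migration.manage", operator, []string{"inventory.read"}))
	// An operator with no explicit list keeps the full role ceiling.
	fmt.Println(authorize("operator", "operator", "migration.manage", operator, nil))
}
```
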
| Action | Route or surface | Permission | Viewer | Operator | Admin | Notes |
|---|---|---|---|---|---|---|
| Read inventory | `GET /api/v1/inventory` | `inventory.read` | Yes | Yes | Yes | Includes graph and snapshots |
| Read graph and snapshots | `GET /api/v1/graph`, `GET /api/v1/snapshots` | `inventory.read` | Yes | Yes | Yes | Read-only analysis path |
| Run preflight | `POST /api/v1/preflight` | `migration.manage` | No | Yes | Yes | Operator path starts here |
| Create migration plan | `POST /api/v1/migrations` | `migration.manage` | No | Yes | Yes | Creates persisted migration state |
| Inspect migration state | `GET /api/v1/migrations`, `GET /api/v1/migrations/{id}` | `migration.manage` | No | Yes | Yes | Current route is operator-scoped |
| Execute migration | `POST /api/v1/migrations/{id}/execute` | `migration.manage` | No | Yes | Yes | High-trust action, always auditable |
| Resume migration | `POST /api/v1/migrations/{id}/resume` | `migration.manage` | No | Yes | Yes | High-trust action, always auditable |
| Roll back migration | `POST /api/v1/migrations/{id}/rollback` | `migration.manage` | No | Yes | Yes | High-trust action, always auditable |
| Read cost/policy/drift/summary | `GET /api/v1/costs`, `GET /api/v1/policies`, `GET /api/v1/drift`, `GET /api/v1/summary` | `lifecycle.read` | Yes | Yes | Yes | Read-only lifecycle analysis |
| Read audit history | `GET /api/v1/audit` | `reports.read` | Yes | Yes | Yes | Viewer access is acceptable for tenant-local transparency |
| Export reports | `GET /api/v1/reports/*` | `reports.read` | Yes | Yes | Yes | Export action itself should be auditable |
| Read current tenant context | `GET /api/v1/tenants/current` | `tenant.read` | Yes | Yes | Yes | Powers Settings trust context |
| List service accounts | `GET /api/v1/service-accounts` | `tenant.manage` | No | No | Yes | Admin-only tenant administration |
| Create service account | `POST /api/v1/service-accounts` | `tenant.manage` | No | No | Yes | Admin-only |
| Rotate service-account key | `POST /api/v1/service-accounts/{id}/rotate` | `tenant.manage` | No | No | Yes | Admin-only |
| Create or delete tenant | `/api/v1/admin/tenants*` | Admin key only | No | No | No | Outside tenant role model |
- `operator` is the person who can move workloads.
- `admin` is the person who can change who is allowed to move workloads.
- The same human may hold both in a small pilot, but the actions are still different and should remain separately modeled.
- Tenant administration should not be required for routine migration operations.
- Migration execution should not imply service-account management authority.
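
The permission matrix can be expressed as two small lookup tables, which makes it straightforward to check in a table-driven test. The role-to-permission sets below are inferred from the Yes/No columns of the matrix rather than copied from `internal/models/tenant.go`, only a subset of routes is listed, and `canCall` is a hypothetical helper.

```go
package main

import "fmt"

// rolePermissions is inferred from the matrix columns (assumption).
var rolePermissions = map[string]map[string]bool{
	"viewer": {"inventory.read": true, "reports.read": true, "lifecycle.read": true, "tenant.read": true},
	"operator": {"inventory.read": true, "reports.read": true, "lifecycle.read": true, "tenant.read": true,
		"migration.manage": true},
	"admin": {"inventory.read": true, "reports.read": true, "lifecycle.read": true, "tenant.read": true,
		"migration.manage": true, "tenant.manage": true},
}

// routes lists a subset of the matrix in a fixed order for stable output.
var routes = []struct{ Route, Permission string }{
	{"GET /api/v1/inventory", "inventory.read"},
	{"GET /api/v1/migrations", "migration.manage"},
	{"POST /api/v1/migrations/{id}/execute", "migration.manage"},
	{"GET /api/v1/audit", "reports.read"},
	{"POST /api/v1/service-accounts", "tenant.manage"},
}

// canCall reports whether a role holds the permission a route requires.
// Unknown routes and unknown roles are denied in this sketch.
func canCall(role, route string) bool {
	for _, r := range routes {
		if r.Route == route {
			return rolePermissions[role][r.Permission]
		}
	}
	return false
}

func main() {
	for _, r := range routes {
		fmt.Printf("%-40s viewer=%-5t operator=%-5t admin=%t\n", r.Route,
			canCall("viewer", r.Route), canCall("operator", r.Route), canCall("admin", r.Route))
	}
}
```
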
The early-product bar is not "audit everything." The bar is "audit every trust-sensitive change and every trust-sensitive export."
| Action class | Current status | v1 expectation |
|---|---|---|
| Tenant create/delete | Already audited | Keep |
| Service-account create | Already audited | Keep |
| Service-account key rotate | Already audited | Keep |
| Migration plan creation | Already audited | Keep |
| Migration execute | Already audited for success and some failures | Normalize details and keep |
| Migration resume | Already audited for success and some failures | Normalize details and keep |
| Migration rollback | Already audited for success and failure | Keep |
| CSV or JSON export from `GET /api/v1/reports/*` (`summary`, `migrations`, `audit`) | Missing today | Add server-side audit events |
| Authenticated permission denial (`403`) on trust-sensitive routes | Not consistently audited | Add when principal is known |
- Every inventory read
- Every dashboard page view
- Unauthenticated invalid credential attempts as tenant audit events
- Full audit search, retention, and filtering UI
- Immutable/WORM audit retention
- It captures who changed state or extracted reportable data.
- It avoids flooding the audit log with low-value read noise.
- It keeps the implementation centered on the existing `AuditEvent` model.

The model only becomes executable if audit action names and ownership points are fixed. Viaduct should use the following route-to-action mapping for v1.
| Route or handler | Current implementation point | Category | Action | Notes |
|---|---|---|---|---|
| `POST /api/v1/admin/tenants` | `handleAdminTenants` in `internal/api/server.go` | `admin` | `create-tenant` | Already emitted |
| `DELETE /api/v1/admin/tenants/{id}` | `handleAdminTenantByID` in `internal/api/server.go` | `admin` | `delete-tenant` | Already emitted |
| `POST /api/v1/service-accounts` | `handleServiceAccounts` in `internal/api/tenant_admin.go` | `tenant` | `create-service-account` | Already emitted |
| `POST /api/v1/service-accounts/{id}/rotate` | `handleServiceAccountByID` in `internal/api/tenant_admin.go` | `tenant` | `rotate-service-account-key` | Already emitted |
| `POST /api/v1/migrations` | `handleMigrations` in `internal/api/server.go` | `migration` | `plan` | Already emitted |
| `POST /api/v1/migrations/{id}/execute` | `handleMigrationByID` in `internal/api/server.go` | `migration` | `execute` | Already emitted; details need normalization |
| `POST /api/v1/migrations/{id}/resume` | `handleMigrationByID` in `internal/api/server.go` | `migration` | `resume` | Already emitted; details need normalization |
| `POST /api/v1/migrations/{id}/rollback` | `handleMigrationByID` in `internal/api/server.go` | `migration` | `rollback` | Already emitted |
| `GET /api/v1/reports/summary` | `writeSummaryReport` in `internal/api/reports.go` | `report` | `export-summary-report` | Missing today |
| `GET /api/v1/reports/migrations` | `writeMigrationsReport` in `internal/api/reports.go` | `report` | `export-migrations-report` | Missing today |
| `GET /api/v1/reports/audit` | `writeAuditReport` in `internal/api/reports.go` | `report` | `export-audit-report` | Missing today |
| Authenticated `403` on tenant route | `RequireTenantRole` and `RequireTenantPermission` in `internal/api/middleware.go` | `authz` | `deny-role` or `deny-permission` | Missing today; requires backend refactor or dependency injection |

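
One way to keep the route-to-action mapping fixed is to hold it in a single lookup table that handlers and tests share. The sketch below is an assumption about shape only: `auditAction`, `routeActions`, and `actionFor` are hypothetical names, and the `authz` denial row is left out because it is not keyed by one route.

```go
package main

import "fmt"

// auditAction pairs a stable category with a stable action verb.
type auditAction struct{ Category, Action string }

// Name renders the "category:action" form used when naming expected events.
func (a auditAction) Name() string { return a.Category + ":" + a.Action }

// routeActions mirrors the route-to-action table for v1.
var routeActions = map[string]auditAction{
	"POST /api/v1/admin/tenants":                {"admin", "create-tenant"},
	"DELETE /api/v1/admin/tenants/{id}":         {"admin", "delete-tenant"},
	"POST /api/v1/service-accounts":             {"tenant", "create-service-account"},
	"POST /api/v1/service-accounts/{id}/rotate": {"tenant", "rotate-service-account-key"},
	"POST /api/v1/migrations":                   {"migration", "plan"},
	"POST /api/v1/migrations/{id}/execute":      {"migration", "execute"},
	"POST /api/v1/migrations/{id}/resume":       {"migration", "resume"},
	"POST /api/v1/migrations/{id}/rollback":     {"migration", "rollback"},
	"GET /api/v1/reports/summary":               {"report", "export-summary-report"},
	"GET /api/v1/reports/migrations":            {"report", "export-migrations-report"},
	"GET /api/v1/reports/audit":                 {"report", "export-audit-report"},
}

// actionFor looks up the canonical action for a method and route pattern.
func actionFor(method, pattern string) (auditAction, bool) {
	a, ok := routeActions[method+" "+pattern]
	return a, ok
}

func main() {
	if a, ok := actionFor("POST", "/api/v1/migrations/{id}/execute"); ok {
		fmt.Println(a.Name())
	}
	_, ok := actionFor("GET", "/api/v1/inventory")
	fmt.Println("inventory read audited:", ok)
}
```
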
Viaduct should keep the current `internal/models/audit.go` schema as the base contract and standardize how it is used.

```json
{
  "id": "evt-123",
  "tenant_id": "tenant-a",
  "actor": "service-account:ops-dashboard",
  "request_id": "req-123",
  "category": "migration",
  "action": "execute",
  "resource": "migration-42",
  "outcome": "success",
  "message": "migration execution started",
  "details": {
    "auth_method": "service-account",
    "role": "operator",
    "spec_name": "wave-1",
    "approved_by": "alice",
    "ticket": "CHG-1234"
  },
  "created_at": "2026-04-08T14:30:00Z"
}
```

- `id`: unique event ID
- `tenant_id`: tenant security boundary
- `actor`: the credential-level identity Viaduct can currently prove
- `request_id`: correlation handle for support and troubleshooting
- `category`: stable event family such as `admin`, `tenant`, `migration`, or `report`
- `action`: stable verb such as `create-tenant`, `execute`, or `export-audit-report`
- `resource`: affected resource identifier when one exists
- `outcome`: `success` or `failure`
- `message`: concise human-readable summary
- `details`: small machine-readable context map
- `created_at`: UTC event timestamp
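
For readers who want to see the contract as a type, here is a hedged Go sketch of a struct that serializes to the JSON shape above, plus a check for the fields every event is expected to carry. It is not the struct in `internal/models/audit.go`; the string-valued `Details` map and the `validate` and `sampleEvent` helpers are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// AuditEvent is a sketch of the documented JSON contract, not the real model.
type AuditEvent struct {
	ID        string            `json:"id"`
	TenantID  string            `json:"tenant_id"`
	Actor     string            `json:"actor"`
	RequestID string            `json:"request_id"`
	Category  string            `json:"category"`
	Action    string            `json:"action"`
	Resource  string            `json:"resource,omitempty"`
	Outcome   string            `json:"outcome"`
	Message   string            `json:"message"`
	Details   map[string]string `json:"details,omitempty"`
	CreatedAt time.Time         `json:"created_at"`
}

// validate checks the fields every trust-sensitive event should carry.
func validate(e AuditEvent) error {
	required := map[string]string{
		"tenant_id": e.TenantID, "actor": e.Actor, "request_id": e.RequestID,
		"category": e.Category, "action": e.Action,
	}
	for name, value := range required {
		if value == "" {
			return fmt.Errorf("audit event missing %s", name)
		}
	}
	if e.Outcome != "success" && e.Outcome != "failure" {
		return fmt.Errorf("invalid outcome %q", e.Outcome)
	}
	if e.CreatedAt.IsZero() {
		return fmt.Errorf("audit event missing created_at")
	}
	return nil
}

// sampleEvent rebuilds the example event from the document.
func sampleEvent() AuditEvent {
	return AuditEvent{
		ID: "evt-123", TenantID: "tenant-a", Actor: "service-account:ops-dashboard",
		RequestID: "req-123", Category: "migration", Action: "execute",
		Resource: "migration-42", Outcome: "success", Message: "migration execution started",
		Details:   map[string]string{"auth_method": "service-account", "role": "operator"},
		CreatedAt: time.Date(2026, 4, 8, 14, 30, 0, 0, time.UTC),
	}
}

func main() {
	e := sampleEvent()
	if err := validate(e); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	out, _ := json.MarshalIndent(e, "", "  ")
	fmt.Println(string(out))
}
```
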
- Use `admin` for platform-admin-key actions.
- Use `tenant:<tenant-id>` for tenant-key actions.
- Use `service-account:<service-account-id>` for service-account actions.
- When human user accounts do not yet exist, do not fake a richer identity in `actor`. Use `details` and service-account metadata for owner labels.
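
The actor convention above is small enough to sketch as one function. The auth-method labels `tenant-api-key` and `service-account` appear in the example events in this document; `platform-admin-key` and the function name `actorFor` are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// actorFor builds the actor string for an audit event from the credential
// path in use. It never invents a richer identity than the credential proves.
func actorFor(authMethod, tenantID, serviceAccountID string) (string, error) {
	switch authMethod {
	case "platform-admin-key": // hypothetical label for X-Admin-Key
		return "admin", nil
	case "tenant-api-key":
		if tenantID == "" {
			return "", errors.New("tenant key without a tenant ID")
		}
		return "tenant:" + tenantID, nil
	case "service-account":
		if serviceAccountID == "" {
			return "", errors.New("service account without an ID")
		}
		return "service-account:" + serviceAccountID, nil
	default:
		return "", fmt.Errorf("unknown auth method %q", authMethod)
	}
}

func main() {
	for _, method := range []string{"platform-admin-key", "tenant-api-key", "service-account"} {
		actor, err := actorFor(method, "tenant-a", "ops-dashboard")
		fmt.Println(actor, err)
	}
}
```
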
- `auth_method`
- `role`
- `service_account_name`
- `spec_name`
- `approved_by`
- `ticket`
- `report_name`
- `report_format`
- `route`
- `required_permission`
- `error_code`
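
Normalizing `details` could be as simple as filtering every handler's map through one allow-list of keys. The sketch below assumes string values and exact key matching; `normalizeDetails` is a hypothetical helper, not an existing function in the repository.

```go
package main

import "fmt"

// canonicalDetailKeys is the allow-list of detail keys for v1 audit events.
var canonicalDetailKeys = map[string]bool{
	"auth_method": true, "role": true, "service_account_name": true,
	"spec_name": true, "approved_by": true, "ticket": true,
	"report_name": true, "report_format": true, "route": true,
	"required_permission": true, "error_code": true,
}

// normalizeDetails keeps only canonical keys with non-empty values, so every
// handler emits the same small, predictable detail map.
func normalizeDetails(raw map[string]string) map[string]string {
	out := make(map[string]string)
	for key, value := range raw {
		if canonicalDetailKeys[key] && value != "" {
			out[key] = value
		}
	}
	return out
}

func main() {
	details := normalizeDetails(map[string]string{
		"role":        "operator",
		"approved_by": "alice",
		"ticket":      "",            // dropped: empty value
		"debug_blob":  "not-allowed", // dropped: not a canonical key
	})
	fmt.Println(details)
}
```
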

```json
{
  "category": "tenant",
  "action": "create-service-account",
  "resource": "sa-ops-dashboard",
  "outcome": "success",
  "message": "service account created",
  "details": {
    "role": "operator",
    "service_account_name": "ops-dashboard",
    "auth_method": "tenant-api-key"
  }
}
```

```json
{
  "category": "report",
  "action": "export-audit-report",
  "resource": "audit",
  "outcome": "success",
  "message": "audit report exported",
  "details": {
    "report_name": "audit",
    "report_format": "json",
    "auth_method": "service-account"
  }
}
```

```json
{
  "category": "authz",
  "action": "deny-permission",
  "resource": "service-accounts",
  "outcome": "failure",
  "message": "tenant principal cannot access \"tenant.manage\"",
  "details": {
    "route": "POST /api/v1/service-accounts",
    "required_permission": "tenant.manage",
    "auth_method": "service-account",
    "role": "operator"
  }
}
```

Viaduct does not need a large security console for v1. It does need attribution in the places operators already use.

| Surface | Current status | v1 expectation |
|---|---|---|
| Settings page | Already shows auth method, role, service-account name, and permissions | Keep as the canonical "who am I authenticated as" surface |
| Migration detail and history | Does not yet expose enough attribution | Do not invent new migration API fields first; source initial attribution from `GET /api/v1/audit` matched by `resource == migration_id` |
| Reports page | Can export audit data but does not show much attribution context | Add recent audit history from `GET /api/v1/audit` before creating any separate audit page |
| Error states | API errors already carry request ID | Keep request ID visible in operator-facing failures |
- action
- actor
- auth method
- timestamp
- request ID
- `approved_by` when supplied
- ticket/change reference when supplied
- Prefer attribution summaries attached to the existing migration timeline or history rows.
- Do not invent human identities the backend does not know.
- If the actor is a service account, show the service-account name when available and keep the stable ID available for drill-down.
- If the action came from a tenant key, label it clearly as tenant credential usage so it is visible as weaker attribution.
- Do not create client-side synthetic audit records. The UI must render server-returned audit events and request IDs.
- In the current repo, the first viable attribution implementation should extend `web/src/types.ts` and `web/src/api.ts` for audit events, then render those events in `web/src/features/reports/ReportsPage.tsx` and `web/src/components/MigrationHistory.tsx`.
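
The join the dashboard needs is small: filter server-returned audit events by `resource == migration_id`, order them, and label the credential path. It is sketched here in Go for illustration; the dashboard would do the equivalent in TypeScript. The trimmed `AuditEvent` shape, the label wording, and the names `attributionFor` and `credentialLabel` are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// AuditEvent holds only the fields the attribution view needs.
type AuditEvent struct {
	Actor, RequestID, Category, Action, Resource string
	CreatedAt                                    string // RFC 3339 in UTC, so string order is time order
}

// attributionFor returns the migration events for one migration, oldest first.
func attributionFor(migrationID string, events []AuditEvent) []AuditEvent {
	var matched []AuditEvent
	for _, e := range events {
		if e.Category == "migration" && e.Resource == migrationID {
			matched = append(matched, e)
		}
	}
	sort.SliceStable(matched, func(i, j int) bool {
		return matched[i].CreatedAt < matched[j].CreatedAt
	})
	return matched
}

// credentialLabel makes weaker attribution visible instead of hiding it.
func credentialLabel(actor string) string {
	switch {
	case strings.HasPrefix(actor, "service-account:"):
		return "service account"
	case strings.HasPrefix(actor, "tenant:"):
		return "tenant credential (shared, weaker attribution)"
	case actor == "admin":
		return "platform admin key (shared)"
	}
	return "unknown credential"
}

func main() {
	events := []AuditEvent{
		{"service-account:ops-dashboard", "req-2", "migration", "execute", "migration-42", "2026-04-08T14:30:00Z"},
		{"tenant:tenant-a", "req-1", "migration", "plan", "migration-42", "2026-04-08T14:00:00Z"},
		{"service-account:ops-dashboard", "req-3", "report", "export-audit-report", "audit", "2026-04-08T15:00:00Z"},
	}
	for _, e := range attributionFor("migration-42", events) {
		fmt.Printf("%s %s by %s [%s] request=%s\n", e.CreatedAt, e.Action, e.Actor, credentialLabel(e.Actor), e.RequestID)
	}
}
```
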
- Viaduct is deployed behind TLS termination or a trusted reverse proxy.
- API keys are provisioned and rotated out of band.
- Pilot-grade environments use PostgreSQL, not only the in-memory store.
- Tenant isolation is the main security boundary.
- Operators understand that service-account ownership is the current unit of identity, not a full human user directory.
- No SSO or OIDC yet
- No MFA yet
- No session-based browser auth yet
- No custom roles or group mapping
- No immutable audit retention guarantee
- No dedicated secrets manager integration
- No strong human identity proof beyond named credentials
- No strong platform-admin attribution beyond request IDs and deployment logs
- This model is good enough for early product evaluation and supervised pilot work.
- It is not the final identity and compliance architecture.
- The UI and docs should state that clearly instead of implying a fuller security posture than the product actually has.
- Separate platform admin and tenant credentials
- Service accounts as first-class named credentials
- The current `viewer`/`operator`/`admin` role model
- Explicit service-account permissions as a narrowing control
- Tenant-scoped audit persistence in PostgreSQL-backed environments
- Request IDs on API errors and in audit events
- Audit coverage for all state-changing tenant/admin actions
- Audit coverage for report and audit exports
- Visible current-caller context in the dashboard
- Visible attribution for migration commands in the dashboard, sourced from audit events rather than frontend-only inference
- OIDC or SSO
- SCIM or group sync
- MFA
- Custom roles
- Per-object permissions
- A new `migration.read` permission or wider read-only migration visibility
- Full audit search/filtering UI
- Immutable retention or external SIEM streaming
- Approval policies tied to organizational identity providers
- Keep the current auth headers and three-role model.
- Update operator-facing docs so tenant keys are bootstrap or break-glass credentials, while dashboard and automation guidance prefer service accounts.
- Document `VITE_VIADUCT_SERVICE_ACCOUNT_KEY` as the normal dashboard credential path.
- Treat default-tenant fallback as local-lab-only behavior in product messaging.
- Normalize migration audit details in `internal/api/server.go` so `plan`, `execute`, `resume`, and `rollback` use the same detail keys.
- Add server-side export audit events in `internal/api/reports.go`. Do not rely on the dashboard client for attribution.
- Add authenticated `403` audit events for tenant routes. Because `RequireTenantRole` and `RequireTenantPermission` are currently package-level middleware helpers without direct store access, do not bolt on ad hoc logging. Either:
  - refactor those checks into server-bound wrappers with access to `recordAuditEvent`, or
  - inject an audit recorder dependency explicitly.
- Keep unauthenticated `401` failures in request logs and metrics unless a tenant can be safely identified.
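
The dependency-injection option for `403` auditing might look roughly like the sketch below. Everything here is hypothetical: the `Principal`, `DenialEvent`, and `AuditRecorder` types, the `principalFrom` lookup, and the `requirePermission` signature are illustrative assumptions, not the shape of `RequireTenantPermission` or `recordAuditEvent` in the repository.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Principal is a hypothetical authenticated caller.
type Principal struct {
	TenantID, Actor string
	Permissions     map[string]bool
}

// DenialEvent carries the fields an authz audit event would need.
type DenialEvent struct {
	TenantID, Actor, RequestID, Action, Route, RequiredPermission string
}

// AuditRecorder is the injected dependency; a store-backed type would implement it.
type AuditRecorder interface{ RecordDenial(DenialEvent) }

// requirePermission audits authenticated denials only. Unauthenticated
// requests get 401 and no tenant audit event, since no tenant is known.
func requirePermission(rec AuditRecorder, principalFrom func(*http.Request) (Principal, bool), permission string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			p, ok := principalFrom(r)
			if !ok {
				http.Error(w, "unauthenticated", http.StatusUnauthorized)
				return
			}
			if !p.Permissions[permission] {
				rec.RecordDenial(DenialEvent{
					TenantID: p.TenantID, Actor: p.Actor,
					RequestID: r.Header.Get("X-Request-ID"), Action: "deny-permission",
					Route: r.Method + " " + r.URL.Path, RequiredPermission: permission,
				})
				http.Error(w, "forbidden", http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// memoryRecorder is a demo-only recorder; it is not safe for concurrent use.
type memoryRecorder struct{ events []DenialEvent }

func (m *memoryRecorder) RecordDenial(e DenialEvent) { m.events = append(m.events, e) }

// demo sends one operator request to an admin-only route.
func demo() (int, []DenialEvent) {
	rec := &memoryRecorder{}
	operator := func(*http.Request) (Principal, bool) {
		return Principal{"tenant-a", "service-account:ops-dashboard", map[string]bool{"migration.manage": true}}, true
	}
	created := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(http.StatusCreated) })
	h := requirePermission(rec, operator, "tenant.manage")(created)
	req := httptest.NewRequest(http.MethodPost, "/api/v1/service-accounts", nil)
	req.Header.Set("X-Request-ID", "req-123")
	w := httptest.NewRecorder()
	h.ServeHTTP(w, req)
	return w.Code, rec.events
}

func main() {
	status, events := demo()
	fmt.Println(status, len(events), events[0].Action, events[0].Route)
}
```

Passing the recorder in explicitly keeps the middleware testable with an in-memory recorder, which is the main argument for injection over ad hoc logging.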
- Keep `web/src/features/settings/SettingsPage.tsx` as the canonical current-caller context view.
- Add an `AuditEvent` type and API call in `web/src/types.ts` and `web/src/api.ts`.
- Add recent trust-sensitive audit history to `web/src/features/reports/ReportsPage.tsx`.
- Add migration attribution in `web/src/components/MigrationHistory.tsx` by joining server-returned audit events on migration ID, not by inventing client-side attribution.
- Continue showing request IDs in structured API error states.
- Validate that audit events survive restart with PostgreSQL.
- Validate that named service accounts produce understandable attribution in the UI and exported audit trail.
- Validate that shared tenant-key usage is visible as weaker attribution.
- Validate that admin-key actions remain clearly documented as weakly attributed.
- Every trust-sensitive tenant or admin command emits an immediate server-side audit event with `tenant_id`, `actor`, `request_id`, `category`, `action`, `outcome`, and `created_at`.
- Every report export from `/api/v1/reports/*` emits one server-side audit event.
- Authenticated permission denials on trust-sensitive tenant routes emit one server-side audit event.
- `GET /api/v1/tenants/current` remains the source of truth for UI identity context.
- The dashboard shows current auth method, role, and effective permissions.
- Migration execution history shows action attribution without frontend-only inference.
- The first UI attribution pass reads from audit events, not from new ad hoc migration-history fields.
- Default-tenant fallback is documented as lab behavior, not pilot guidance.
- Auth middleware tests:
  - tenant key authenticates as tenant admin
  - service-account key authenticates with effective permissions
  - inactive or expired service account is rejected
  - default-tenant fallback only applies when no active custom tenant is configured
  - explicit service-account permissions do not bypass role-gated routes
- Authorization tests:
  - viewer cannot call migration-manage routes
  - operator cannot create or rotate service accounts
  - admin can perform tenant-manage routes
- Audit tests:
  - `handleMigrations` emits `migration:plan`
  - `handleMigrationByID` execute/resume/rollback emit canonical actions with request ID
  - `writeSummaryReport`, `writeMigrationsReport`, and `writeAuditReport` emit canonical export actions
  - authenticated `403` emits `authz:deny-role` or `authz:deny-permission`
  - audit events remain tenant-scoped in both memory and PostgreSQL-backed stores
- UI tests:
  - Settings page renders role, auth method, and service-account name from `GET /api/v1/tenants/current`
  - reports surface renders recent trust-sensitive audit events from `GET /api/v1/audit`
  - migration attribution renders actor, action, timestamp, and request ID using audit-event joins
- Create a tenant with the admin key.
- Create two service accounts:
  - one `operator`
  - one `viewer`
- Verify `GET /api/v1/tenants/current` reflects the correct role, permissions, and auth method for each credential.
- Run a migration plan with the operator credential and confirm one `migration:plan` audit event appears.
- Execute the migration with `approved_by` and `ticket`, then confirm:
  - the command is accepted
  - the audit event includes request ID and approval context
- Attempt a service-account creation with the operator credential and confirm:
  - the API returns `403`
  - an authenticated authorization-denial audit event is recorded once that gap is implemented
- Export the audit report and confirm one export audit event appears.
- Load the dashboard with the operator service-account credential and confirm:
  - Settings shows service-account auth, role, and permissions
  - reports surface shows recent trust-sensitive audit events
  - migration history/detail shows attribution sourced from audit events
  - request IDs are visible on operator-facing failures

This model is intentionally conservative.
- It does not reset the architecture.
- It does not require replacing the current auth mechanism.
- It does not invent human-user identity before the rest of the product is ready.
- It does not pretend the current shared admin key has strong attribution.
- It does not pretend the current migration APIs already carry action history.
- It does make a clear promise: Viaduct must be able to show who did the important thing, through which credential path, and under which request ID.
That is the first trust bar the product needs to clear.