Event Schema

Adrian Burlacu edited this page Feb 12, 2026 · 5 revisions

This document describes the unified event system used throughout Stark Orchestrator for debugging, UI timelines, audits, and real-time communication.

Overview

Events are first-class data in Stark Orchestrator. Unlike simple logs, events are:

  • Structured — Consistent schema for all event types
  • Queryable — Filter by category, severity, resource, namespace, time range
  • Persistent — Stored in database for audit trails and debugging
  • Real-time — Streamed via WebSocket for live updates
  • Actionable — Used for UI timelines, alerting, and automation

Key Event Types

Category    Example Events
Pod         PodScheduled, PodFailed, PodRestarted, PodEvicted
Node        NodeLost, NodeRecovered, NodeRegistered, NodeDraining
Service     ServiceScaled, ServiceRollback, ServiceFailed
Pack        PackPublished, PackVersionDeprecated
System      ClusterStarted, ConfigChanged
Auth        UserLoggedIn, PermissionDenied
Secret      SecretCreated, SecretUpdated, SecretDeleted, SecretInjected
Scheduler   SchedulingCycleCompleted, NoNodesAvailable
Ephemeral   PodJoinedGroup, PodLeftGroup, PodGroupCreated, PodGroupDissolved

Event Structure

All events follow a consistent structure stored in the events table:

interface StarkEvent {
  // Identification
  id: string;                    // Unique event ID
  eventType: string;             // e.g., 'PodScheduled', 'NodeLost'
  category: EventCategory;       // 'pod' | 'node' | 'pack' | 'service' | 'system' | 'auth' | 'secret' | 'scheduler'
  severity: EventSeverity;       // 'info' | 'warning' | 'error' | 'critical'
  
  // Resource (polymorphic)
  resourceId?: string;           // Primary resource ID
  resourceType?: string;         // 'pod', 'node', 'pack', etc.
  resourceName?: string;         // Human-readable name
  namespace?: string;            // Namespace context
  
  // Actor
  actorId?: string;              // Who caused the event
  actorType: EventActorType;     // 'user' | 'system' | 'scheduler' | 'node'
  
  // Details
  reason?: string;               // Short reason code
  message?: string;              // Human-readable description
  previousState?: object;        // State before event
  newState?: object;             // State after event
  
  // Related resource
  relatedResourceId?: string;    // e.g., node for pod events
  relatedResourceType?: string;
  relatedResourceName?: string;
  
  // Metadata
  metadata: Record<string, unknown>;
  source: EventSource;           // 'server' | 'node' | 'client' | 'scheduler'
  correlationId?: string;        // For tracing
  timestamp: Date;
}
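
Because the resource, state, and related-resource fields are optional, code that renders events has to handle their absence. The following is a minimal sketch of a timeline formatter over a subset of the fields above. `TimelineFields`, `formatTimelineRow`, and the row layout are hypothetical and not part of Stark Orchestrator, and the sketch assumes that state objects carry a `status` key, as they do in the examples on this page.

```typescript
// Subset of the StarkEvent fields above that a timeline row needs.
interface TimelineFields {
  eventType: string;
  severity: 'info' | 'warning' | 'error' | 'critical';
  resourceType?: string;
  resourceName?: string;
  namespace?: string;
  message?: string;
  previousState?: { status?: string }; // assumption: states carry a status key
  newState?: { status?: string };
  relatedResourceType?: string;
  relatedResourceName?: string;
}

// Hypothetical formatter: the row layout is illustrative only.
function formatTimelineRow(e: TimelineFields): string {
  const resource =
    e.resourceType && e.resourceName ? `${e.resourceType}/${e.resourceName}` : 'cluster';
  const ns = e.namespace ? ` (${e.namespace})` : '';
  // Show a status transition only when the event carries a new state.
  const to = e.newState?.status;
  const transition = to ? ` [${e.previousState?.status ?? 'none'} -> ${to}]` : '';
  const related =
    e.relatedResourceType && e.relatedResourceName
      ? ` on ${e.relatedResourceType}/${e.relatedResourceName}`
      : '';
  return `${e.severity.toUpperCase()} ${e.eventType} ${resource}${ns}${transition}${related}: ${e.message ?? ''}`;
}

const scheduledRow = formatTimelineRow({
  eventType: 'PodScheduled',
  severity: 'info',
  resourceType: 'pod',
  resourceName: 'my-app-abc123',
  namespace: 'default',
  message: 'Pod scheduled to node worker-1',
  previousState: { status: 'pending' },
  newState: { status: 'scheduled' },
  relatedResourceType: 'node',
  relatedResourceName: 'worker-1',
});
console.log(scheduledRow);
```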

Severity Levels

Level      Description                        Examples
info       Normal operations                  PodCreated, NodeRegistered, PodStarted
warning    Degraded state, attention needed   PodEvicted, NodeDraining, NodeLost
error      Failure occurred                   PodFailed, PodScheduleFailed, ServiceFailed
critical   Immediate attention required       Cluster-wide failures, data loss risks
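
Client code often needs "this level and above" rather than an explicit list of levels. The sketch below ranks the four levels in the order the table lists them and filters on a minimum. The ordering is inferred from the table, and `severityRank` and `filterBySeverity` are hypothetical helpers, not SDK functions.

```typescript
type EventSeverity = 'info' | 'warning' | 'error' | 'critical';

// Ascending order, following the table above (assumed to be the intended ranking).
const SEVERITY_ORDER: readonly EventSeverity[] = ['info', 'warning', 'error', 'critical'];

function severityRank(severity: EventSeverity): number {
  return SEVERITY_ORDER.indexOf(severity);
}

// Keep only events at or above a minimum severity.
function filterBySeverity<T extends { severity: EventSeverity }>(
  events: T[],
  min: EventSeverity,
): T[] {
  const floor = severityRank(min);
  return events.filter((e) => severityRank(e.severity) >= floor);
}

const sampleEvents: { eventType: string; severity: EventSeverity }[] = [
  { eventType: 'PodCreated', severity: 'info' },
  { eventType: 'NodeLost', severity: 'warning' },
  { eventType: 'PodFailed', severity: 'error' },
];

const alertable = filterBySeverity(sampleEvents, 'warning');
console.log(alertable.map((e) => e.eventType)); // [ 'NodeLost', 'PodFailed' ]
```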

Pod Events

PodCreated

Emitted when a new pod is created.

{
  "eventType": "PodCreated",
  "category": "pod",
  "severity": "info",
  "resourceId": "pod-uuid",
  "resourceType": "pod",
  "resourceName": "my-app-abc123",
  "namespace": "default",
  "reason": "PodCreated",
  "message": "Pod created",
  "newState": { "status": "pending" },
  "metadata": {
    "packName": "my-app",
    "packVersion": "1.0.0"
  }
}

PodScheduled

Emitted when a pod is assigned to a node.

{
  "eventType": "PodScheduled",
  "category": "pod",
  "severity": "info",
  "resourceId": "pod-uuid",
  "resourceType": "pod",
  "resourceName": "my-app-abc123",
  "namespace": "default",
  "reason": "Scheduled",
  "message": "Pod scheduled to node worker-1",
  "previousState": { "status": "pending" },
  "newState": { "status": "scheduled" },
  "relatedResourceId": "node-uuid",
  "relatedResourceType": "node",
  "relatedResourceName": "worker-1"
}

PodFailed

Emitted when a pod fails with an error.

{
  "eventType": "PodFailed",
  "category": "pod",
  "severity": "error",
  "resourceId": "pod-uuid",
  "resourceType": "pod",
  "resourceName": "my-app-abc123",
  "namespace": "default",
  "reason": "OOMKilled",
  "message": "Container killed due to memory limit",
  "previousState": { "status": "running" },
  "newState": { "status": "failed" },
  "relatedResourceId": "node-uuid",
  "relatedResourceType": "node",
  "metadata": {
    "exitCode": 137,
    "memoryLimit": 256
  }
}

PodRestarted

Emitted when a pod is restarted.

{
  "eventType": "PodRestarted",
  "category": "pod",
  "severity": "info",
  "resourceId": "pod-uuid",
  "resourceType": "pod",
  "resourceName": "my-app-abc123",
  "namespace": "default",
  "reason": "Restarted",
  "message": "Pod restarted",
  "metadata": {
    "restartCount": 3
  }
}

PodEvicted

Emitted when a pod is evicted from a node.

{
  "eventType": "PodEvicted",
  "category": "pod",
  "severity": "warning",
  "resourceId": "pod-uuid",
  "resourceType": "pod",
  "resourceName": "my-app-abc123",
  "namespace": "default",
  "reason": "NodeDrain",
  "message": "Pod evicted due to node drain",
  "previousState": { "status": "running" },
  "newState": { "status": "evicted" },
  "relatedResourceId": "node-uuid",
  "relatedResourceType": "node"
}

Other Pod Events

  • PodStarting — Pod is starting up
  • PodRunning — Pod is now running
  • PodStopped — Pod stopped normally
  • PodRolledBack — Pod rolled back to previous version
  • PodScaled — Pod replicas changed
  • PodUpdated — Pod configuration updated
  • PodDeleted — Pod was deleted
  • PodScheduleFailed — Scheduling failed (no suitable node)

Node Events

NodeRegistered

Emitted when a node registers with the orchestrator.

{
  "eventType": "NodeRegistered",
  "category": "node",
  "severity": "info",
  "resourceId": "node-uuid",
  "resourceType": "node",
  "resourceName": "worker-1",
  "reason": "NodeRegistered",
  "message": "Node registered with orchestrator",
  "newState": { "status": "ready" },
  "metadata": {
    "runtimeType": "node",
    "labels": { "env": "production" },
    "allocatable": { "cpu": 2000, "memory": 4096, "pods": 20 }
  }
}

NodeLost

Emitted when a node stops responding (heartbeat timeout or disconnect).

{
  "eventType": "NodeLost",
  "category": "node",
  "severity": "warning",
  "resourceId": "node-uuid",
  "resourceType": "node",
  "resourceName": "worker-1",
  "reason": "HeartbeatTimeout",
  "message": "Node lost: worker-1",
  "previousState": { "status": "ready" },
  "newState": { "status": "offline" },
  "metadata": {
    "lastHeartbeat": "2026-02-03T10:00:15.000Z",
    "runtimeType": "node"
  }
}

NodeRecovered

Emitted when a previously offline node reconnects.

{
  "eventType": "NodeRecovered",
  "category": "node",
  "severity": "info",
  "resourceId": "node-uuid",
  "resourceType": "node",
  "resourceName": "worker-1",
  "reason": "HeartbeatRestored",
  "message": "Node recovered: worker-1",
  "previousState": { "status": "offline" },
  "newState": { "status": "ready" }
}
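
NodeLost and NodeRecovered form natural pairs, so an outage duration can be derived from the event stream alone. The following sketch pairs each NodeLost with the next NodeRecovered for the same `resourceId`. `NodeTimelineEvent` and `totalDowntimeMs` are hypothetical names, and an outage that has not yet recovered is simply not counted.

```typescript
// Minimal fields needed from a StarkEvent to measure outages.
interface NodeTimelineEvent {
  eventType: string;
  resourceId: string;
  timestamp: Date;
}

// Sum the time between each NodeLost and the following NodeRecovered, per node.
function totalDowntimeMs(events: NodeTimelineEvent[]): Map<string, number> {
  const lostAt = new Map<string, number>();
  const downtime = new Map<string, number>();
  const ordered = events
    .slice()
    .sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());

  for (const e of ordered) {
    const t = e.timestamp.getTime();
    if (e.eventType === 'NodeLost') {
      // Ignore repeated NodeLost events while the node is already down.
      if (!lostAt.has(e.resourceId)) lostAt.set(e.resourceId, t);
    } else if (e.eventType === 'NodeRecovered') {
      const start = lostAt.get(e.resourceId);
      if (start !== undefined) {
        downtime.set(e.resourceId, (downtime.get(e.resourceId) ?? 0) + (t - start));
        lostAt.delete(e.resourceId);
      }
    }
  }
  return downtime;
}

const outage = totalDowntimeMs([
  { eventType: 'NodeRecovered', resourceId: 'node-uuid', timestamp: new Date('2026-02-03T10:02:45Z') },
  { eventType: 'NodeLost', resourceId: 'node-uuid', timestamp: new Date('2026-02-03T10:00:45Z') },
]);
console.log(outage.get('node-uuid')); // 120000
```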

Other Node Events

  • NodeReady — Node is ready to accept pods
  • NodeDraining — Node is being drained
  • NodeDrained — Node drain completed
  • NodeCordoned — Node marked unschedulable
  • NodeUncordoned — Node marked schedulable again
  • NodeDeleted — Node was removed
  • NodeResourcePressure — Node has resource pressure

Service Events

ServiceCreated

{
  "eventType": "ServiceCreated",
  "category": "service",
  "severity": "info",
  "resourceId": "service-uuid",
  "resourceType": "service",
  "resourceName": "my-service",
  "namespace": "default",
  "reason": "Created",
  "message": "Service created with 3 replicas"
}

ServiceRollback

{
  "eventType": "ServiceRollback",
  "category": "service",
  "severity": "warning",
  "resourceId": "service-uuid",
  "resourceType": "service",
  "resourceName": "my-service",
  "namespace": "default",
  "reason": "ConsecutiveFailures",
  "message": "Service rolled back from v1.2.0 to v1.1.0",
  "previousState": { "version": "1.2.0" },
  "newState": { "version": "1.1.0" },
  "metadata": {
    "failureCount": 3
  }
}

Secret Events

SecretCreated

Emitted when a new secret is created. Secret values are never included in event data.

{
  "eventType": "SecretCreated",
  "category": "secret",
  "severity": "info",
  "resourceId": "secret-uuid",
  "resourceType": "secret",
  "resourceName": "db-creds",
  "namespace": "default",
  "reason": "Created",
  "message": "Secret created (opaque, 2 keys)",
  "metadata": {
    "secretType": "opaque",
    "keyCount": 2,
    "injectionMode": "env"
  }
}

SecretUpdated

{
  "eventType": "SecretUpdated",
  "category": "secret",
  "severity": "info",
  "resourceId": "secret-uuid",
  "resourceType": "secret",
  "resourceName": "db-creds",
  "namespace": "default",
  "reason": "Updated",
  "message": "Secret updated (2 keys changed)",
  "metadata": {
    "keyCount": 2
  }
}

SecretDeleted

{
  "eventType": "SecretDeleted",
  "category": "secret",
  "severity": "info",
  "resourceId": "secret-uuid",
  "resourceType": "secret",
  "resourceName": "db-creds",
  "namespace": "default",
  "reason": "Deleted",
  "message": "Secret deleted"
}

SecretInjected

Emitted when secrets are resolved and injected into a pod. Lists secret names but never values.

{
  "eventType": "SecretInjected",
  "category": "secret",
  "severity": "info",
  "resourceId": "pod-uuid",
  "resourceType": "pod",
  "resourceName": "my-app-abc123",
  "namespace": "default",
  "reason": "SecretsInjected",
  "message": "2 secrets injected into pod",
  "metadata": {
    "secretNames": ["db-creds", "api-cert"],
    "envVarCount": 2,
    "volumeMountCount": 1
  }
}
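
One way to keep values out of secret events is to build the event metadata from an allow-list of derived fields instead of copying the secret object. The sketch below produces metadata shaped like the SecretCreated example. `SecretRecord` and `secretCreatedMetadata` are hypothetical; the page does not describe how the orchestrator stores secrets internally.

```typescript
// Hypothetical in-memory shape of a secret; not defined on this page.
interface SecretRecord {
  name: string;
  type: string;                 // e.g. 'opaque'
  injectionMode: string;        // e.g. 'env'
  data: Record<string, string>; // key -> secret value; must never reach an event
}

// Derive event metadata from the secret without touching its values.
function secretCreatedMetadata(secret: SecretRecord): {
  secretType: string;
  keyCount: number;
  injectionMode: string;
} {
  return {
    secretType: secret.type,
    keyCount: Object.keys(secret.data).length,
    injectionMode: secret.injectionMode,
  };
}

const dbCredsMetadata = secretCreatedMetadata({
  name: 'db-creds',
  type: 'opaque',
  injectionMode: 'env',
  data: { username: 'app', password: 's3cr3t-value' },
});
console.log(dbCredsMetadata);
```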

Querying Events

REST API

# Get events for a pod
GET /api/events?resourceType=pod&resourceId=<pod-id>

# Get error and critical events since a given time (for example, one hour ago)
GET /api/events?severity=error,critical&since=2026-02-03T09:00:00Z

# Get namespace timeline
GET /api/events?namespace=production&limit=100

# Get node events
GET /api/events?category=node&limit=50
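
The query strings above can be assembled programmatically. This is a minimal sketch using the standard `URLSearchParams` class and only the parameters that appear in the examples. `EventFilters` and `buildEventsPath` are hypothetical, and the sketch assumes the server decodes percent-encoded commas and colons in the usual way.

```typescript
// Filters limited to the query parameters shown in the REST examples above.
interface EventFilters {
  category?: string;
  severity?: string[];
  resourceType?: string;
  resourceId?: string;
  namespace?: string;
  since?: Date;
  limit?: number;
}

function buildEventsPath(filters: EventFilters): string {
  const params = new URLSearchParams();
  if (filters.category) params.set('category', filters.category);
  // Multiple severities are sent as one comma-separated value.
  if (filters.severity && filters.severity.length > 0) {
    params.set('severity', filters.severity.join(','));
  }
  if (filters.resourceType) params.set('resourceType', filters.resourceType);
  if (filters.resourceId) params.set('resourceId', filters.resourceId);
  if (filters.namespace) params.set('namespace', filters.namespace);
  if (filters.since) params.set('since', filters.since.toISOString());
  if (filters.limit !== undefined) params.set('limit', String(filters.limit));

  const query = params.toString();
  return query ? `/api/events?${query}` : '/api/events';
}

const recentErrorsPath = buildEventsPath({
  severity: ['error', 'critical'],
  since: new Date('2026-02-03T09:00:00Z'),
  limit: 100,
});
console.log(recentErrorsPath);
```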

TypeScript SDK

import { queryEvents, getPodEvents, getCriticalEvents } from '@stark-o/server';

// Query with filters
const result = await queryEvents({
  category: 'pod',
  namespace: 'production',
  severity: ['error', 'warning'],
  since: new Date(Date.now() - 60 * 60 * 1000), // Last hour
  limit: 100,
});

// Get pod timeline
const { events } = await getPodEvents('pod-uuid');

// Get critical events for alerting
const { events: critical } = await getCriticalEvents(
  new Date(Date.now() - 60 * 60 * 1000) // Since 1 hour ago
);

Emitting Events

Events are automatically emitted by database triggers when resources change state. You can also emit events programmatically:

import { emitPodEvent, emitNodeEvent, emitEvent } from '@stark-o/server';

// Emit a pod event
await emitPodEvent({
  eventType: 'PodFailed',
  podId: 'pod-uuid',
  podName: 'my-app-abc123',
  namespace: 'default',
  severity: 'error',
  reason: 'OOMKilled',
  message: 'Container killed due to memory limit',
  previousStatus: 'running',
  newStatus: 'failed',
  nodeId: 'node-uuid',
  nodeName: 'worker-1',
  metadata: { exitCode: 137 },
});

// Emit a node event
await emitNodeEvent({
  eventType: 'NodeLost',
  nodeId: 'node-uuid',
  nodeName: 'worker-1',
  severity: 'warning',
  reason: 'HeartbeatTimeout',
  message: 'Node stopped responding',
});

// Emit a generic event
await emitEvent({
  eventType: 'ConfigChanged',
  category: 'system',
  severity: 'info',
  reason: 'ConfigUpdate',
  message: 'Cluster configuration updated',
  actorId: 'user-uuid',
});

Event Reasons

Common reason codes for categorizing events:

Pod Reasons

Reason                  Description
Scheduled               Pod assigned to node
ScheduleFailed          No suitable node found
NoNodesAvailable        No nodes available
InsufficientResources   Not enough resources
TaintNotTolerated       Node taint not tolerated
Started                 Pod started running
Failed                  Pod execution failed
OOMKilled               Out of memory
CrashLoopBackOff        Repeated crashes
Evicted                 Pod evicted
NodeDrain               Evicted due to node drain
Preempted               Preempted by higher priority pod

Node Reasons

Reason              Description
Registered          Node registered
HeartbeatTimeout    Heartbeat not received
HeartbeatRestored   Heartbeat resumed
Ready               Node ready
Cordoned            Marked unschedulable
Draining            Being drained
MemoryPressure      Low memory
DiskPressure        Low disk space
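
Since `reason` is a free-form string on the event, a client may want to check it against the known codes before grouping. The sketch below copies the pod reason codes from the table above into a literal union and tallies events per reason. `POD_REASONS`, `isPodReason`, and `countByReason` are hypothetical names. The tables list common codes only, so anything else (for example the `Restarted` reason used in the PodRestarted example) is counted under `other`.

```typescript
// Reason codes copied from the Pod Reasons table above.
const POD_REASONS = [
  'Scheduled', 'ScheduleFailed', 'NoNodesAvailable', 'InsufficientResources',
  'TaintNotTolerated', 'Started', 'Failed', 'OOMKilled', 'CrashLoopBackOff',
  'Evicted', 'NodeDrain', 'Preempted',
] as const;

type PodReason = (typeof POD_REASONS)[number];

function isPodReason(value: string): value is PodReason {
  return (POD_REASONS as readonly string[]).indexOf(value) !== -1;
}

// Tally events per known reason; unknown or missing reasons go under 'other'.
function countByReason(events: { reason?: string }[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    const key = e.reason !== undefined && isPodReason(e.reason) ? e.reason : 'other';
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

const reasonCounts = countByReason([
  { reason: 'OOMKilled' },
  { reason: 'OOMKilled' },
  { reason: 'NodeDrain' },
  { reason: 'Restarted' },
  {},
]);
console.log(reasonCounts.get('OOMKilled'), reasonCounts.get('other')); // 2 2
```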

Migration from pod_history

The unified events table supersedes the previous pod_history table. A view, pod_history_compat, remains available for backward compatibility:

SELECT * FROM pod_history_compat WHERE pod_id = 'pod-uuid';

Ephemeral Events (PodGroups)

The ephemeral data plane emits its own in-memory event stream, separate from the persistent StarkEvent system above. These events are not stored in the database; they are delivered through audit hooks registered on the PodGroupStore or EphemeralDataPlane.

Ephemeral Event Structure

interface EphemeralEvent {
  type: EphemeralEventType;
  timestamp: string;          // ISO 8601
  groupId?: string;
  podId?: string;
  queryId?: string;
  message?: string;
  metadata?: Record<string, unknown>;
}

Ephemeral Event Types

Event                       Severity   Emitted When
PodJoinedGroup              info       A pod joins a group for the first time
PodLeftGroup                info       A pod explicitly leaves a group
PodGroupCreated             info       A group is lazily created (first member joins)
PodGroupDissolved           info       A group is garbage-collected (last member leaves or expires)
PodMembershipExpired        info       A membership is reaped due to TTL expiration
PodMembershipRefreshed      info       A pod refreshes its existing membership
EphemeralQueryIssued        info       An ephemeral fan-out query is sent
EphemeralResponseReceived   info       An ephemeral response is received from a pod

PodJoinedGroup

{
  "type": "PodJoinedGroup",
  "timestamp": "2026-02-12T10:00:00.000Z",
  "groupId": "demo:podgroup-chat",
  "podId": "pod-abc-123",
  "message": "Pod 'pod-abc-123' joined group 'demo:podgroup-chat'",
  "metadata": { "role": "echo" }
}

PodGroupCreated

{
  "type": "PodGroupCreated",
  "timestamp": "2026-02-12T10:00:00.000Z",
  "groupId": "demo:podgroup-chat",
  "message": "Group 'demo:podgroup-chat' created"
}

PodGroupDissolved

{
  "type": "PodGroupDissolved",
  "timestamp": "2026-02-12T10:05:00.000Z",
  "groupId": "demo:podgroup-chat",
  "message": "Group 'demo:podgroup-chat' dissolved (no remaining members)"
}

PodMembershipExpired

{
  "type": "PodMembershipExpired",
  "timestamp": "2026-02-12T10:02:00.000Z",
  "groupId": "demo:podgroup-chat",
  "podId": "pod-abc-123",
  "message": "Membership expired: pod 'pod-abc-123' in group 'demo:podgroup-chat'"
}
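
Because ephemeral events are not persisted, a consumer that wants the current group membership has to derive it from the stream. The following sketch replays an ordered list of events into a map of group IDs to member pod IDs, following the "Emitted When" descriptions above. `MembershipEvent` and `replayMembership` are hypothetical, and the real PodGroupStore may track membership differently.

```typescript
// Minimal subset of EphemeralEvent needed to track membership.
interface MembershipEvent {
  type: string;
  groupId?: string;
  podId?: string;
}

// Rebuild groupId -> set of podIds from an ordered ephemeral event stream.
function replayMembership(events: MembershipEvent[]): Map<string, Set<string>> {
  const groups = new Map<string, Set<string>>();
  for (const e of events) {
    if (!e.groupId) continue; // query/response events carry no membership change
    switch (e.type) {
      case 'PodGroupCreated':
        if (!groups.has(e.groupId)) groups.set(e.groupId, new Set<string>());
        break;
      case 'PodJoinedGroup': {
        if (!e.podId) break;
        const members = groups.get(e.groupId) ?? new Set<string>();
        members.add(e.podId);
        groups.set(e.groupId, members);
        break;
      }
      case 'PodLeftGroup':
      case 'PodMembershipExpired':
        if (e.podId) groups.get(e.groupId)?.delete(e.podId);
        break;
      case 'PodGroupDissolved':
        groups.delete(e.groupId);
        break;
    }
  }
  return groups;
}

const chatGroup = 'demo:podgroup-chat';
const membership = replayMembership([
  { type: 'PodGroupCreated', groupId: chatGroup },
  { type: 'PodJoinedGroup', groupId: chatGroup, podId: 'pod-abc-123' },
  { type: 'PodJoinedGroup', groupId: chatGroup, podId: 'pod-def-456' },
  { type: 'PodMembershipExpired', groupId: chatGroup, podId: 'pod-abc-123' },
]);
console.log(Array.from(membership.get(chatGroup) ?? [])); // [ 'pod-def-456' ]
```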

Registering Audit Hooks

// On PodGroupStore (server-side)
const disposeStoreHook = store.onEvent((event) => {
  if (event.type === 'PodGroupDissolved') {
    console.log(`Group ${event.groupId} dissolved`);
  }
});

// On EphemeralDataPlane (pack-side, local mode only)
const disposePlaneHook = plane.onEvent((event) => {
  console.log(`[audit] ${event.type}: ${event.message}`);
});

// Unregister later
disposeStoreHook();
disposePlaneHook();
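
The register-then-dispose pattern used by `onEvent` can be illustrated with a small stand-in. The sketch below is not the PodGroupStore or EphemeralDataPlane implementation, which this page does not show. `AuditHookRegistry` and its `emit` method are hypothetical and exist only to demonstrate that a hook stops receiving events once its disposer has been called.

```typescript
type Listener<E> = (event: E) => void;

// Hypothetical stand-in for a hook registry with the same onEvent contract.
class AuditHookRegistry<E> {
  private listeners = new Set<Listener<E>>();

  // Register a hook and return a function that unregisters it.
  onEvent(listener: Listener<E>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(event: E): void {
    // Iterate over a copy so a hook can dispose itself while being called.
    for (const listener of Array.from(this.listeners)) listener(event);
  }
}

const hookRegistry = new AuditHookRegistry<{ type: string; groupId?: string }>();
const seenTypes: string[] = [];

const disposeHook = hookRegistry.onEvent((event) => {
  seenTypes.push(event.type);
});

hookRegistry.emit({ type: 'PodGroupCreated', groupId: 'demo:podgroup-chat' });
disposeHook();
hookRegistry.emit({ type: 'PodGroupDissolved', groupId: 'demo:podgroup-chat' });

console.log(seenTypes); // [ 'PodGroupCreated' ]
```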

Full reference: PodGroups & Ephemeral Data Plane
