This repository was archived by the owner on Sep 11, 2025. It is now read-only.


@mattjohnsonpint mattjohnsonpint commented Jun 13, 2025

This enables the Modus Runtime to operate in a cluster for scale-out and high-availability. It leverages the cluster mode and remoting capabilities of GoAkt. Implemented discovery providers are NATS and Kubernetes.

  • NATS discovery is easiest to test with locally, but you'll need to install a NATS Server.

  • Kubernetes discovery is intended for production usage, and requires additional configuration.

Detailed cluster configuration docs will be written at a later date. For now, refer to the environment variables used in the implementation of /runtime/actors/cluster.go.

Several related changes are incorporated into this PR that allow Modus agents to work reliably in cluster mode. However, note that cluster mode currently requires a shared Postgres database that all nodes can access. It will not work in conjunction with the embedded ModusGraph database at this time.

There is no usage difference from an end-user perspective.

linear bot commented Jun 13, 2025

@mattjohnsonpint mattjohnsonpint added the do-not-merge DO NOT MERGE label Jun 13, 2025
@mattjohnsonpint mattjohnsonpint force-pushed the mjp/hyp-3484-actor-cluster-mode branch 2 times, most recently from f677243 to d8b980e Compare June 18, 2025 04:31
@mattjohnsonpint mattjohnsonpint force-pushed the mjp/hyp-3484-actor-cluster-mode branch from 0001191 to 9515605 Compare June 18, 2025 05:18
@mattjohnsonpint mattjohnsonpint marked this pull request as ready for review June 18, 2025 05:47
@mattjohnsonpint mattjohnsonpint requested review from a team and Copilot June 18, 2025 05:47
@mattjohnsonpint mattjohnsonpint removed the do-not-merge DO NOT MERGE label Jun 18, 2025
Copilot AI left a comment

Pull Request Overview

This PR enables cluster mode in the Modus Runtime using GoAkt’s cluster and remoting features (NATS or Kubernetes), updates protobuf messages and actor lifecycles to support discovery, and adds helper APIs (tell/ask) and logging extensions.

  • Introduce clusterOptions and cluster initialization in actors/cluster.go
  • Rename and extend protobuf messages (messages.proto → add AgentInfoRequest, RestartAgent, etc.)
  • Refactor actor spawning, lifecycle messages, and discovery in runtime/actors

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| sdk/go/pkg/agents/imports_mock.go | Adjust mock AgentInfo.Status to use string conversion |
| sdk/go/pkg/agents/agents.go | Change AgentStatus/agentEventAction from type alias to defined type |
| runtime/utils/sentry.go | Only add non-default namespace as Sentry extra |
| runtime/protos/messages.proto | Rename message types, add new cluster control messages |
| runtime/pluginmanager/pluginregistry.go | Add GetPluginByName helper |
| runtime/logger/logger.go | Add *f logging helpers and cluster fields in structured logs |
| runtime/db/db.go | Rename env var MODUS_DB_USE_MODUSDB → MODUS_USE_MODUSDB |
| runtime/app/app.go | Add KubernetesNamespace() helper |
| runtime/actors/wasmagent.go | Major refactor to handle Actor messages, dependencies, and state |
| runtime/actors/subscriber.go | Refactor subscribe/unsubscribe flow |
| runtime/actors/misc.go | Add tell and ask wrappers |
| runtime/actors/cluster.go | Full cluster-mode implementation with NATS/Kubernetes discovery |
| runtime/actors/agents.go | Refactor agent APIs to use tell/ask and database fallback |
| runtime/actors/actorsystem.go | Integrate cluster options, wasm extension, coordinated shutdown |
| go.mod / go.work | Bump Go version and dependencies (GoAkt v3.6.1, NATS libs, redis) |
| .vscode/launch.json | Add “Debug Modus Runtime (cluster mode)” configuration |
| CHANGELOG.md | Document “feat: cluster mode” |
| .trunk/configs/cspell.json | Add Errf to cspell whitelist |
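
The alias-to-defined-type change listed for sdk/go/pkg/agents/agents.go explains why the mock needed a string conversion. A minimal sketch of that Go language rule follows; the AgentInfo field layout, the toInfo helper, and the "running" value are assumptions made for illustration, not the SDK's actual definitions.

```go
package main

import "fmt"

// AgentStatus is a defined type. With an alias (type AgentStatus = string)
// the two types would be identical and no conversion would be needed.
type AgentStatus string

// StatusRunning is a hypothetical status value used only for this sketch.
const StatusRunning AgentStatus = "running"

// AgentInfo is a hypothetical struct whose Status field is a plain string.
type AgentInfo struct {
	Status string
}

// toInfo shows the explicit conversion a defined type requires.
func toInfo(s AgentStatus) AgentInfo {
	// Assigning s directly to a string field would not compile.
	return AgentInfo{Status: string(s)}
}

func main() {
	fmt.Println(toInfo(StatusRunning).Status)
}
```

The benefit of the defined type is that the compiler rejects arbitrary strings where an AgentStatus variable is expected, at the cost of explicit conversions at the boundaries.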
Comments suppressed due to low confidence (5)

runtime/db/db.go:113

  • The environment variable for enabling ModusDB was renamed. Please update corresponding documentation, examples, and README to reflect MODUS_USE_MODUSDB instead of the old MODUS_DB_USE_MODUSDB.
		s := os.Getenv("MODUS_USE_MODUSDB")
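
One way to soften a rename like this during migration is to read the new name first and fall back to the old one. The sketch below is only an illustration of that idea: the fallback is not part of this PR, and the useModusDB helper and boolean parsing are assumptions rather than the runtime's actual logic.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// useModusDB is a hypothetical helper. It checks the new variable name first,
// then the old one, and reports whether a valid boolean value was found.
func useModusDB(getenv func(string) string) (enabled bool, found bool) {
	for _, name := range []string{"MODUS_USE_MODUSDB", "MODUS_DB_USE_MODUSDB"} {
		s := getenv(name)
		if s == "" {
			continue
		}
		// Unparseable values are skipped rather than treated as false.
		if v, err := strconv.ParseBool(s); err == nil {
			return v, true
		}
	}
	return false, false
}

func main() {
	enabled, found := useModusDB(os.Getenv)
	fmt.Println("enabled:", enabled, "found:", found)
}
```

Passing the lookup function as a parameter keeps the helper testable without touching the process environment.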

runtime/protos/messages.proto:10

  • [nitpick] Removing the Message suffix from protobuf types (AgentRequestMessage → AgentRequest) is a breaking change for existing clients; consider adding aliases or bumping the API version.
message AgentRequest {

runtime/actors/subscriber.go:75

  • context.WithoutCancel was added to the standard library in Go 1.21, so this line compiles only if the module targets Go 1.21 or later. On older toolchains, consider using context.Background() or passing a fresh parent context instead.
		ctx, cancel := context.WithTimeout(context.WithoutCancel(ctx), time.Second)

runtime/actors/cluster.go:212

  • strings.SplitSeq was added to the standard library in Go 1.24 and returns an iterator, so this loop compiles only if the module targets Go 1.24 or later. On older toolchains, use strings.Split and iterate over the resulting slice with for _, label := range strings.Split(labels, ",").
		for label := range strings.SplitSeq(labels, ",") {
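
A minimal sketch of parsing a comma-separated label list, written with strings.Split so that it builds on older toolchains. On Go 1.24 or later, strings.SplitSeq yields the same substrings as an iterator without allocating the intermediate slice. The parseLabels helper, the key=value label format, and the trimming of whitespace and empty entries are assumptions, not the behavior of cluster.go.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLabels is a hypothetical helper that splits a comma-separated list
// into trimmed, non-empty entries.
func parseLabels(labels string) []string {
	var out []string
	for _, label := range strings.Split(labels, ",") {
		// Drop surrounding whitespace and skip empty segments.
		if label = strings.TrimSpace(label); label != "" {
			out = append(out, label)
		}
	}
	return out
}

func main() {
	fmt.Println(parseLabels("app=modus, tier=runtime,,"))
}
```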

runtime/actors/agents.go:104

  • The file uses errors.Is but does not import the errors package. Add import "errors" to avoid a compile error.
		if !errors.Is(err, goakt.ErrActorNotFound) {

@mattjohnsonpint mattjohnsonpint enabled auto-merge (squash) June 18, 2025 15:55
@mattjohnsonpint mattjohnsonpint merged commit 87f0887 into main Jun 18, 2025
51 checks passed
@mattjohnsonpint mattjohnsonpint deleted the mjp/hyp-3484-actor-cluster-mode branch June 18, 2025 15:57