Merged
163 changes: 163 additions & 0 deletions .claude/CLAUDE.md
@@ -0,0 +1,163 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

RoadRunner Temporal Plugin - enables workflow and activity processing for PHP processes using Temporal. The plugin acts as a bridge between RoadRunner (Go) and the Temporal SDK for PHP, handling communication via a protobuf codec over the goridge protocol.

## Build & Test Commands

### Go Tests
```bash
# Run all tests with race detection and coverage
go test -timeout 20m -v -race -cover -tags=debug -failfast ./...

# Run specific test suites
go test -timeout 20m -v -race -cover -tags=debug -failfast ./tests/general
go test -timeout 20m -v -race -cover -tags=debug -failfast ./canceller
go test -timeout 20m -v -race -cover -tags=debug -failfast ./dataconverter
go test -timeout 20m -v -race -cover -tags=debug -failfast ./queue

# Run with coverage profile
go test -timeout 20m -v -race -cover -tags=debug -failfast -coverpkg=$(cat pkgs.txt) -coverprofile=coverage.out -covermode=atomic ./general
```

### PHP Tests Setup
```bash
cd tests/php_test_files
composer install
```

### Linting
```bash
# Run golangci-lint with project config
golangci-lint run --timeout=10m --build-tags=safe
```

## Architecture

### Core Components

**Plugin Structure** (`plugin.go`):
- `Plugin` - main plugin struct, manages lifecycle and pools
- Implements RoadRunner plugin interface: `Init()`, `Serve()`, `Stop()`, `RPC()`
- Manages two worker pools: workflow pool (single worker) and activity pool (multiple workers)
- Uses mutex-protected `temporal` struct to hold client, workers, and definitions

**Worker Pools**:
- **Workflow Pool** (`wfP`): Single PHP worker dedicated to workflow execution
- **Activity Pool** (`actP`): Multiple PHP workers for concurrent activity execution
- Configured via `pool.Config` in config.go
- Uses `static_pool.Pool` from roadrunner-server/pool

**Communication Flow**:
1. Temporal SDK (Go) ↔ Plugin (`aggregatedpool/`) ↔ PHP Workers via codec
2. Protocol messages defined in `internal/protocol.go`
3. Codec implementation in `internal/codec/proto/`
4. Uses goridge for Go↔PHP communication

### Key Abstractions

**Workflow Definition** (`aggregatedpool/workflow.go`):
- Implements Temporal's `WorkflowDefinition` interface
- Handles workflow execution, local activities, queries, signals, updates
- Uses message queue for command/response exchange with PHP worker
- Maintains ID registry, cancellation context, and callback management

**Activity Definition** (`aggregatedpool/activity.go`):
- Implements Temporal's activity execution interface
- Routes activity invocations to PHP activity pool
- Handles activity context, headers, and heartbeats

**Protocol** (`internal/protocol.go`):
- Defines command constants (e.g., `invokeActivityCommand`, `startWorkflowCommand`)
- Message structure for Go↔PHP communication
- Context includes task queue, replay flag, history info
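
As a rough illustration of the protocol bullets above, the sketch below models a message envelope and its context. It is not the code in `internal/protocol.go`: the constant values, the `Context` and `Message` field names, and the rule that an empty command marks a response are all assumptions made for this example.

```go
package main

import "fmt"

// Command names follow the examples in the text; their string values are
// assumed here and may differ from the real constants.
const (
	invokeActivityCommand = "InvokeActivity" // assumed value
	startWorkflowCommand  = "StartWorkflow"  // assumed value
)

// Context mirrors the items the text lists: task queue, replay flag and
// history info. Field names and types are hypothetical.
type Context struct {
	TaskQueue     string
	Replay        bool
	HistoryLength int
}

// Message is a hypothetical envelope exchanged between Go and a PHP worker.
// In this sketch an empty Command marks a response to an earlier command
// that carried the same ID.
type Message struct {
	ID      uint64
	Command string
	Payload []byte
}

// IsCommand reports whether the message carries a command rather than a response.
func (m Message) IsCommand() bool { return m.Command != "" }

func main() {
	ctx := Context{TaskQueue: "default", Replay: false, HistoryLength: 3}
	req := Message{ID: 1, Command: startWorkflowCommand}
	resp := Message{ID: 1, Payload: []byte("ok")}
	fmt.Println(ctx.TaskQueue, req.IsCommand(), resp.IsCommand())
}
```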

### Worker Lifecycle

1. **Initialization** (`internal.go:initPool()`):
- Creates activity and workflow pools
- Initializes codec and definitions
- Retrieves worker info from PHP via protobuf
- Creates Temporal client with interceptors
- Starts Temporal workers

2. **Reset Flow** (`plugin.go:Reset()`, `ResetAP()`):
- Triggered by worker stop events
- Stops Temporal workers, resets pools
- Purges sticky workflow cache
- Re-initializes workers with fresh PHP processes
- Workflow worker PID tracked for targeted resets

3. **Event Handling**:
- Subscribes to `EventWorkerStopped` events
- Checks PID in event message to determine WF vs Activity worker
- Executes full reset for WF worker, activity-only reset for activity workers
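
The PID check in the event-handling step can be pictured as a small decision function. This is a minimal sketch under stated assumptions, not the plugin's implementation: `resetKind`, `decideReset`, and the idea that the PID has already been parsed from the event message into an `int` are hypothetical.

```go
package main

import "fmt"

// resetKind is a hypothetical label for the two reset paths described above.
type resetKind string

const (
	fullReset     resetKind = "full"          // the workflow worker stopped
	activityReset resetKind = "activity-only" // an activity worker stopped
)

// decideReset compares the PID carried by a worker-stopped event with the
// tracked workflow worker PID and picks the matching reset path.
func decideReset(stoppedPID, workflowPID int) resetKind {
	if stoppedPID == workflowPID {
		return fullReset
	}
	return activityReset
}

func main() {
	const workflowPID = 4242 // hypothetical PID of the single workflow worker
	fmt.Println(decideReset(4242, workflowPID))
	fmt.Println(decideReset(5001, workflowPID))
}
```

Keeping the decision separate from the reset itself makes the PID matching easy to test without starting any PHP workers.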

### Configuration

**Structure** (`config.go`):
- `Address`: Temporal server address
- `Namespace`: Temporal namespace
- `CacheSize`: Sticky workflow cache size
- `Activities`: Pool configuration for activity workers
- `Metrics`: Optional Prometheus or StatsD metrics
- `TLS`: Optional TLS configuration
- `DisableActivityWorkers`: Flag to disable activity pool

**Environment Variables**:
- `RR_MODE=temporal` - set for PHP workers
- `RR_CODEC=protobuf` - codec identifier
- `NO_PROXY` - respected for gRPC connections
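
One way to picture how the first two variables reach a PHP worker is a helper that merges them into the worker environment. The helper name `workerEnv` and the choice to copy a user-supplied map are assumptions for illustration; only the variable names and values come from the list above.

```go
package main

import (
	"fmt"
	"sort"
)

// workerEnv returns a copy of base with RR_MODE and RR_CODEC set.
// The input map is left untouched so callers can reuse it for other pools.
func workerEnv(base map[string]string) map[string]string {
	env := make(map[string]string, len(base)+2)
	for k, v := range base {
		env[k] = v
	}
	env["RR_MODE"] = "temporal"
	env["RR_CODEC"] = "protobuf"
	return env
}

func main() {
	env := workerEnv(map[string]string{"APP_ENV": "dev"}) // APP_ENV is a made-up user variable

	// Print in sorted key order so the output is stable.
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, env[k])
	}
}
```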

### Interceptors & Extensions

The plugin supports interceptors via `api.Interceptor` interface:
- Collected via Endure's dependency injection (`Collects()`)
- Applied to Temporal workers during initialization
- Stored in `temporal.interceptors` map

### Metrics

Two driver options:
- **Prometheus**: Exposed on configured address
- **StatsD**: Sent to statsd server with configurable prefix/tags

Metrics integrated via Temporal's `MetricsHandler` interface using uber-go/tally.

## Important Patterns

### Codec Usage
- All PHP communication uses protobuf codec (`internal/codec/proto/`)
- Wraps Temporal's data converter for payload serialization
- PHP SDK version extracted from worker info to set gRPC headers

### Worker Restart Strategy
- Workflow worker PID stored in `p.wwPID` for tracking
- Worker stop events include PID in message
- Targeted restarts based on PID matching prevent unnecessary full resets

### Client Headers
- gRPC interceptor rewrites client-name to "temporal-php-2"
- client-version set to PHP SDK version from worker info
- API key dynamically loaded from atomic pointer
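
The header handling above can be sketched with `sync/atomic`. This is an illustration only: `keyStore`, `clientHeaders`, and the `authorization: Bearer ...` header format are assumptions, and the real plugin applies these values inside a gRPC interceptor rather than building a plain map.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// keyStore wraps an atomic pointer so the API key can be rotated while
// requests are in flight, without taking a lock.
type keyStore struct {
	p atomic.Pointer[string]
}

// Set publishes a new key.
func (s *keyStore) Set(k string) { s.p.Store(&k) }

// Get returns the current key, or "" if none has been set.
func (s *keyStore) Get() string {
	if k := s.p.Load(); k != nil {
		return *k
	}
	return ""
}

// clientHeaders builds the per-call metadata: a fixed client name, the PHP SDK
// version reported by the worker, and the API key if one is present.
func clientHeaders(sdkVersion string, keys *keyStore) map[string]string {
	h := map[string]string{
		"client-name":    "temporal-php-2",
		"client-version": sdkVersion,
	}
	if k := keys.Get(); k != "" {
		h["authorization"] = "Bearer " + k // header format is an assumption
	}
	return h
}

func main() {
	keys := &keyStore{}
	fmt.Println(clientHeaders("2.0.0", keys)) // "2.0.0" is a placeholder version

	keys.Set("example-key")
	fmt.Println(clientHeaders("2.0.0", keys)["authorization"])
}
```

Reading the key on every call, rather than capturing it once at client creation, is what lets a rotated key take effect without rebuilding the client.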

### Pool Allocation
- Activity pool: configurable workers via `Activities.NumWorkers`
- Workflow pool: always 1 worker with 240h allocate timeout
- Both use same command/env but different pool configs
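
The difference between the two pool configurations might look like the sketch below. `poolConfig` is a hypothetical, trimmed-down stand-in for the real `pool.Config`, and both constructor names are invented for this example; only the single worker and the 240h allocate timeout come from the bullets above.

```go
package main

import (
	"fmt"
	"time"
)

// poolConfig is a hypothetical subset of a worker pool configuration.
type poolConfig struct {
	NumWorkers      uint64
	AllocateTimeout time.Duration
}

// workflowPoolConfig returns the fixed workflow pool settings:
// exactly one worker and a 240h allocate timeout.
func workflowPoolConfig() poolConfig {
	return poolConfig{NumWorkers: 1, AllocateTimeout: 240 * time.Hour}
}

// activityPoolConfig takes its worker count and timeout from user configuration.
func activityPoolConfig(numWorkers uint64, allocateTimeout time.Duration) poolConfig {
	return poolConfig{NumWorkers: numWorkers, AllocateTimeout: allocateTimeout}
}

func main() {
	fmt.Printf("workflow: %+v\n", workflowPoolConfig())
	fmt.Printf("activity: %+v\n", activityPoolConfig(8, time.Minute))
}
```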

## File Organization

- `plugin.go`, `internal.go` - plugin lifecycle and initialization
- `config.go`, `tls.go`, `metrics.go` - configuration and setup
- `rpc.go` - RPC methods for external control
- `info.go`, `status.go` - worker and status information
- `aggregatedpool/` - workflow/activity definitions bridging Go↔Temporal
- `internal/` - protocol definitions and codec implementation
- `api/` - interfaces and context utilities
- `canceller/`, `queue/`, `registry/` - workflow execution utilities
- `dataconverter/` - Temporal data converter wrapper
- `tests/` - integration tests with PHP workers
6 changes: 3 additions & 3 deletions .github/workflows/linux.yml
@@ -17,7 +17,7 @@ jobs:
timeout-minutes: 60
strategy:
matrix:
php: [ "8.3" ]
php: [ "8.4" ]
go: [ stable ]
os: [ "ubuntu-latest" ]
steps:
@@ -86,7 +86,7 @@ jobs:
timeout-minutes: 60
strategy:
matrix:
php: [ "8.3" ]
php: [ "8.4" ]
go: [ stable ]
os: [ "ubuntu-latest" ]
steps:
@@ -179,7 +179,7 @@ jobs:
timeout-minutes: 60
strategy:
matrix:
php: [ "8.3" ]
php: [ "8.4" ]
go: [ stable ]
os: [ "ubuntu-latest" ]
steps:
5 changes: 5 additions & 0 deletions .golangci.yml
@@ -54,6 +54,11 @@ linters:
for-loops: true
wsl:
allow-assign-and-anything: true
revive:
rules:
- name: var-naming
disabled: true

exclusions:
generated: lax
presets:
40 changes: 0 additions & 40 deletions AGENTS.md

This file was deleted.

38 changes: 27 additions & 11 deletions aggregatedpool/handler.go
@@ -82,6 +82,7 @@ func (wp *Workflow) handleUpdate(name string, id string, input *commonpb.Payload
},
input,
header,
wp.getWorkflowWorkerPid(),
)
}

@@ -94,6 +95,7 @@ func (wp *Workflow) handleCancel() {
internal.CancelWorkflow{RunID: wp.env.WorkflowInfo().WorkflowExecution.RunID},
nil,
wp.header,
wp.getWorkflowWorkerPid(),
)
}

@@ -107,6 +109,7 @@ func (wp *Workflow) handleSignal(name string, input *commonpb.Payloads, header *
},
input,
header,
wp.getWorkflowWorkerPid(),
)

return nil
@@ -223,7 +226,7 @@ func (wp *Workflow) handleMessage(msg *internal.Message) error {
return errors.E(op, err)
}

wp.mq.PushResponse(msg.ID, result)
wp.mq.PushResponse(msg.ID, result, wp.getWorkflowWorkerPid())
err = wp.flushQueue()
if err != nil {
return errors.E(op, err)
@@ -279,7 +282,7 @@ func (wp *Workflow) handleMessage(msg *internal.Message) error {
case *internal.CompleteWorkflow:
wp.log.Debug("complete workflow request", zap.Uint64("ID", msg.ID))
result, _ := wp.env.GetDataConverter().ToPayloads(completed)
wp.mq.PushResponse(msg.ID, result)
wp.mq.PushResponse(msg.ID, result, wp.getWorkflowWorkerPid())

if msg.Failure == nil {
wp.env.Complete(msg.Payloads, nil)
@@ -291,7 +294,7 @@ func (wp *Workflow) handleMessage(msg *internal.Message) error {
case *internal.ContinueAsNew:
wp.log.Debug("continue-as-new request", zap.Uint64("ID", msg.ID), zap.String("name", command.Name))
result, _ := wp.env.GetDataConverter().ToPayloads(completed)
wp.mq.PushResponse(msg.ID, result)
wp.mq.PushResponse(msg.ID, result, wp.getWorkflowWorkerPid())

wp.env.Complete(nil, &workflow.ContinueAsNewError{
WorkflowType: &bindings.WorkflowType{
@@ -503,7 +506,7 @@ func (wp *Workflow) handleMessage(msg *internal.Message) error {
}

result, _ := wp.env.GetDataConverter().ToPayloads(completed)
wp.mq.PushResponse(msg.ID, result)
wp.mq.PushResponse(msg.ID, result, wp.getWorkflowWorkerPid())

err = wp.flushQueue()
if err != nil {
@@ -540,12 +543,12 @@ func (wp *Workflow) createLocalActivityCallback(id uint64) bindings.LocalActivit

if lar.Err != nil {
wp.log.Debug("error", zap.Error(lar.Err), zap.Int32("attempt", lar.Attempt), zap.Duration("backoff", lar.Backoff))
wp.mq.PushError(id, temporal.GetDefaultFailureConverter().ErrorToFailure(lar.Err))
wp.mq.PushError(id, temporal.GetDefaultFailureConverter().ErrorToFailure(lar.Err), wp.getWorkflowWorkerPid())
return
}

wp.log.Debug("pushing local activity response", zap.Uint64("ID", id))
wp.mq.PushResponse(id, lar.Result)
wp.mq.PushResponse(id, lar.Result, wp.getWorkflowWorkerPid())
}

return func(lar *bindings.LocalActivityResultWrapper) {
@@ -571,13 +574,13 @@ func (wp *Workflow) createCallback(id uint64, t string) bindings.ResultHandler {

if err != nil {
wp.log.Debug("error", zap.Error(err), zap.String("type", t))
wp.mq.PushError(id, temporal.GetDefaultFailureConverter().ErrorToFailure(err))
wp.mq.PushError(id, temporal.GetDefaultFailureConverter().ErrorToFailure(err), wp.getWorkflowWorkerPid())
return
}

wp.log.Debug("pushing response", zap.Uint64("ID", id), zap.String("type", t))
// fetch original payload
wp.mq.PushResponse(id, result)
wp.mq.PushResponse(id, result, wp.getWorkflowWorkerPid())
}

return func(result *commonpb.Payloads, err error) {
@@ -603,11 +606,11 @@ func (wp *Workflow) createContinuableCallback(id uint64, t string) bindings.Resu
wp.canceller.Discard(id)

if err != nil {
wp.mq.PushError(id, temporal.GetDefaultFailureConverter().ErrorToFailure(err))
wp.mq.PushError(id, temporal.GetDefaultFailureConverter().ErrorToFailure(err), wp.getWorkflowWorkerPid())
return
}

wp.mq.PushResponse(id, result)
wp.mq.PushResponse(id, result, wp.getWorkflowWorkerPid())
err = wp.flushQueue()
if err != nil {
panic(err)
@@ -678,7 +681,9 @@ func (wp *Workflow) flushQueue() error {
func (wp *Workflow) runCommand(cmd any, payloads *commonpb.Payloads, header *commonpb.Header) (*internal.Message, error) {
const op = errors.Op("workflow_process_runcommand")
msg := &internal.Message{}
wp.mq.AllocateMessage(cmd, payloads, header, msg)
// attempt to prevent sending the response from the dead worker

wp.mq.AllocateMessage(cmd, payloads, header, msg, wp.getWorkflowWorkerPid())

if wp.mh != nil {
wp.mh.Gauge(RrMetricName).Update(float64(wp.pool.QueueSize()))
@@ -734,6 +739,17 @@ func (wp *Workflow) runCommand(cmd any, payloads *commonpb.Payloads, header *com
return msgs[0], nil
}

func (wp *Workflow) getWorkflowWorkerPid() int {
wp.log.Debug("fetching workflow worker pid")
wfw := wp.pool.Workers()
if len(wfw) > 0 {
wp.log.Debug("workflow worker pid found", zap.Int("pid", int(wfw[0].Pid())))
return int(wfw[0].Pid())
}
wp.log.Debug("workflow worker pid not found")
return 0
}

func (wp *Workflow) getPld() *payload.Payload {
return wp.pldPool.Get().(*payload.Payload)
}
2 changes: 1 addition & 1 deletion aggregatedpool/workers.go
@@ -36,7 +36,7 @@ func TemporalWorkers(wDef *Workflow, actDef *Activity, wi []*internal.WorkerInfo
)
}

// interceptor used here to the headers
// interceptor used here to the headers
wi[i].Options.Interceptors = append(wi[i].Options.Interceptors, NewWorkerInterceptor())
for _, interceptor := range interceptors {
wi[i].Options.Interceptors = append(wi[i].Options.Interceptors, interceptor.WorkerInterceptor())