@jog1t
Contributor

@jog1t jog1t commented Jan 2, 2026

TL;DR

Added error handling and display for Actors in the frontend UI.

What changed?

  • Added error field to Actor data structure and query options
  • Created new actorErrorQueryOptions for retrieving Actor errors
  • Moved RunnerPoolError component from runner-config-table to actor-status-label for reuse
  • Added ActorError component to display different types of Actor errors
  • Updated GuardConnectableInspector to show appropriate error messages based on Actor status
  • Changed "Stopped" status label to "Destroyed" for better clarity
  • Enhanced Actor status determination to consider error state as "crashed"

How to test?

  1. Create an Actor that encounters an error (e.g., by configuring an invalid serverless endpoint)
  2. Observe the error message displayed in the Actor UI
  3. Verify different error types are properly formatted and displayed
  4. Check that the Actor status correctly shows as "Crashed" when an error is present

Why make this change?

This change improves the developer experience by providing clearer feedback when Actors fail to start or encounter errors. Previously, error information was limited, making it difficult to diagnose issues. Now, specific error messages are displayed based on the error type, helping users understand and resolve problems more efficiently.

@vercel

vercel bot commented Jan 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| rivet-cloud | Error | | Jan 9, 2026 9:38pm |
| rivet-inspector | Error | | Jan 9, 2026 9:38pm |
| rivetkit-serverless | Error | | Jan 9, 2026 9:38pm |

1 Skipped Deployment

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| rivet-site | Ignored | Preview | Jan 9, 2026 9:38pm |

Contributor Author

jog1t commented Jan 2, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude

claude bot commented Jan 3, 2026

PR Review: feat(dash): surface actor errors in ui

Summary

This PR enhances the frontend UI to display Actor errors, improving the developer experience when debugging Actor failures. The implementation adds error handling and display logic across several components.

Code Quality & Best Practices

✅ Strengths

  • Good code reuse: The RunnerPoolError component was successfully extracted from runner-config-table.tsx and moved to actor-status-label.tsx for reuse
  • Type safety: Proper use of TypeScript types with Rivet.ActorError and Rivet.RunnerPoolError
  • Pattern matching: Excellent use of ts-pattern for handling different error types with exhaustive matching
  • Consistent status logic: Error state is now properly integrated into the getActorStatus function
  • Better UX: Changing "Stopped" to "Destroyed" improves clarity (line 13 in actor-status-label.tsx)

🔍 Observations

1. Status Determination Priority (queries/index.ts:128-130)
The error check happens first in getActorStatus, meaning an Actor with an error will always show as "crashed" even if it has other timestamps set. This seems correct, but worth noting:

if (error) {
    return "crashed";
}

Consider: Should an Actor with both error and destroyedAt show as "crashed" or "stopped"? The current logic prioritizes "crashed", which seems reasonable for debugging.

2. Query Structure (default-data-provider.tsx:199-204)
The new actorErrorQueryOptions is well-structured:

actorErrorQueryOptions(actorId: ActorId) {
    return queryOptions({
        ...this.actorQueryOptions(actorId),
        select: (data) => data.error,
    });
}

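Because the options are spread from actorQueryOptions, the query key is presumably shared, so this reads from the same cache entry instead of issuing a second request; only the select projection differs. Each useQuery call still registers its own observer, so if a parent component already holds the Actor data, consider passing the error down instead of subscribing again.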

3. Exhaustive Pattern Matching (actor-status-label.tsx:76-97)
The ActorError component uses .exhaustive(), which is excellent for type safety:

return match(error)
    .with(P.string, (errMsg) =>
        match(errMsg)
            .with("no_capacity", () => (...))
            .exhaustive(),
    )
    .with(P.shape({ runnerPoolError: P.any }), ...)
    .with(P.shape({ runnerNoResponse: P.any }), ...)
    .exhaustive();

Potential Issue: The inner match on P.string only handles "no_capacity". If other string error types exist in the backend, this will throw a runtime error. Consider adding .otherwise() instead of .exhaustive() for the inner match:

.with(P.string, (errMsg) =>
    match(errMsg)
        .with("no_capacity", () => (...))
        .otherwise(() => <span>Unknown error: {errMsg}</span>),
)
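
The fallback idea does not depend on ts-pattern. Below is a minimal dependency-free sketch, assuming the string error codes arrive as plain strings and that "no_capacity" is the only code the UI knows about. The helper name `describeStringError` is hypothetical and not part of the PR:

```typescript
// Hypothetical helper: maps a string error code to a display message.
// Only "no_capacity" is named in this review; any other code falls
// through to a generic message instead of throwing.
function describeStringError(code: string): string {
    switch (code) {
        case "no_capacity":
            return "No capacity available to start Actor";
        default:
            // Plays the role of ts-pattern's .otherwise(): always returns a value.
            return `Unknown error: ${code}`;
    }
}
```

With this shape, a new backend error code degrades to a readable message in the UI until a dedicated case is added.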

Potential Bugs & Issues

⚠️ Medium Priority

1. Error Display in UnavailableInfo (guard-connectable-inspector.tsx:68-74)
The error display wraps QueriedActorError in a <p> tag:

<Info>
    <p>Actor is unavailable.</p>
    <p>
        <QueriedActorError actorId={actorId} />
    </p>
</Info>

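If QueriedActorError returns null (when isError || !error), this creates an empty paragraph. Guarding on the element in the parent (e.g. {el && <p>{el}</p>}) would not help, because a JSX element is always truthy regardless of what the component renders. Consider moving the <p> wrapper into QueriedActorError so that nothing is rendered when there is no error:

.with("crashed", () => (
    <Info>
        <p>Actor is unavailable.</p>
        {/* QueriedActorError renders its own <p>, or null when there is no error */}
        <QueriedActorError actorId={actorId} />
    </Info>
))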

Or rely on CSS to handle empty paragraphs if that's acceptable.

2. RunnerPoolError Return Type (actor-status-label.tsx:112-153)
The RunnerPoolError component returns different types:

  • null for nullish errors
  • String messages for all other cases
  • Uses .exhaustive() which is good

The function signature should be explicit about the return type:

export function RunnerPoolError({
    error,
}: {
    error: Rivet.RunnerPoolError | undefined;
}): string | null {
    // ...
}

However, looking at the usage context (line 88), it's used inside JSX <span>, so it's being rendered. The current implementation is fine, but the return type inconsistency (React component returning strings/null) is unusual.

Performance Considerations

⚡ Minor Concerns

1. Polling Interval (guard-connectable-inspector.tsx:42)

const { data: status } = useQuery({
    ...useDataProvider().actorStatusQueryOptions(actorId),
    refetchInterval: 1000,
});

1-second polling is reasonable for status updates, but ensure this doesn't cause performance issues with many Actors on screen. Consider:

  • Using a longer interval (2-3 seconds) for non-critical status checks
  • Implementing WebSocket updates for real-time status changes (if not already in place)
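
One way to act on the first suggestion is to derive the interval from the current status. This is a rough sketch only: the status names follow those mentioned in this thread, while the helper name `pollingIntervalMs`, the interval values, and the assumption that a destroyed ("stopped") Actor never changes state again are illustrative and not taken from the PR:

```typescript
// Status names as they appear in this thread.
type PolledStatus =
    | "running"
    | "pending"
    | "crashed"
    | "crash-loop"
    | "sleeping"
    | "stopped";

// Hypothetical helper: returns a polling interval in milliseconds,
// or false to stop polling. All values are illustrative assumptions.
function pollingIntervalMs(status: PolledStatus | undefined): number | false {
    switch (status) {
        case "stopped":
            // Assumed terminal state: nothing left to observe.
            return false;
        case "running":
        case "sleeping":
            // Stable states: poll less often.
            return 3000;
        default:
            // Unknown, pending, or failing states: keep the 1-second poll.
            return 1000;
    }
}
```

TanStack Query's refetchInterval also accepts a function, so a helper like this could be called from that callback; the callback's arguments differ between major versions, so check the installed version before wiring it in.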

2. Multiple Query Subscriptions
The actorErrorQueryOptions, actorStatusAdditionalInfoQueryOptions, and actorQueryOptions all query the same underlying data. React Query's cache will deduplicate these, but be aware that:

  • Multiple components subscribing to different slices of the same data is efficient with React Query
  • The select function creates derived state, which is re-computed on each update

Security Concerns

✅ No Critical Issues Found

1. Error Message Display
Error messages from the backend are displayed directly to users:

  • serverlessHttpError.body (line 138)
  • serverlessConnectionError.message (line 141)
  • serverlessInvalidPayload.message (line 147)

Question: Are these messages sanitized on the backend? They may originate from user-controlled serverless endpoints. React escapes string children by default, so rendering them as text does not by itself allow XSS, but verify that:

  1. Error messages from external sources are sanitized
  2. No dangerouslySetInnerHTML is used elsewhere with this data

Test Coverage

📝 Suggestions

The PR doesn't include tests. Consider adding:

1. Unit tests for getActorStatus

describe('getActorStatus', () => {
    it('should return crashed when error is present', () => {
        const actor = {
            createdAt: '2024-01-01',
            startedAt: '2024-01-01',
            destroyedAt: null,
            sleepingAt: null,
            pendingAllocationAt: null,
            rescheduleAt: null,
            error: 'no_capacity',
        };
        expect(getActorStatus(actor)).toBe('crashed');
    });
    
    // Test that error takes precedence over other states
    it('should return crashed even when started', () => {
        const actor = {
            createdAt: '2024-01-01',
            startedAt: '2024-01-01',
            destroyedAt: null,
            sleepingAt: null,
            pendingAllocationAt: null,
            rescheduleAt: null,
            error: { runnerPoolError: 'internal_error' },
        };
        expect(getActorStatus(actor)).toBe('crashed');
    });
});

2. Component tests for error display

  • Test that ActorError renders correctly for each error type
  • Test that RunnerPoolError handles all error variants
  • Test that UnavailableInfo displays appropriate messages for each status

Additional Recommendations

1. Error Logging
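Consider adding client-side error logging when an Actor error is encountered. Log from an effect rather than from the render body, so the message is emitted when the error changes instead of on every render:

import { useEffect } from "react";

export function ActorError({ error }: { error: Rivet.ActorError }) {
    // Log for debugging/monitoring whenever the error reference changes
    useEffect(() => {
        console.error('[Actor Error]', error);
    }, [error]);

    return match(error)
        // ... existing code
}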

2. Documentation
Add JSDoc comments to the new components:

/**
 * Displays Actor error messages based on error type.
 * Handles no_capacity, runnerPoolError, and runnerNoResponse errors.
 */
export function ActorError({ error }: { error: Rivet.ActorError }) {
    // ...
}

3. Accessibility
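Error messages should be announced to screen readers. Consider wrapping errors in an ARIA live region. Note that role="alert" already implies aria-live="assertive", so pairing it with aria-live="polite" is contradictory. Use role="alert" alone for urgent errors, or role="status" (implicitly polite) for non-urgent updates:

<div role="alert">
    <ActorError error={error} />
</div>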

Conclusion

Overall, this is a solid PR that improves the developer experience significantly. The code quality is good with appropriate use of TypeScript and pattern matching. The main concerns are:

  1. Action Required: Fix the inner .exhaustive() in ActorError to handle unknown string error types
  2. Consider: Add unit tests for the new status logic
  3. Consider: Verify backend sanitization of error messages displayed to users
  4. Nice to have: Add accessibility improvements for error announcements

Great work on improving error visibility in the UI! 🎉

@claude

claude bot commented Jan 3, 2026

PR Review: feat(dashboard): surface actor errors in ui

Summary

This PR successfully adds error handling and display functionality for Rivet Actors in the frontend UI. The implementation is well-structured and follows good TypeScript patterns using ts-pattern for type-safe error handling.


✅ Strengths

  1. Code Reuse: Excellent refactoring of RunnerPoolError component from runner-config-table.tsx to actor-status-label.tsx, eliminating code duplication.

  2. Type Safety: Proper use of ts-pattern with .exhaustive() ensures all error cases are handled at compile time.

  3. User Experience: Clear, actionable error messages that help developers diagnose issues:

    • "No capacity available to start Actor"
    • "Runner Pool Error: [specific details]"
    • "Runner (ID) was allocated but Actor did not respond"
  4. Status Terminology: Changed "Stopped" to "Destroyed" for better clarity (line 13 in actor-status-label.tsx).

  5. Separation of Concerns: New actorErrorQueryOptions provides focused data fetching for error information.


🔍 Code Quality Observations

Good Patterns:

  • Consistent use of query options pattern
  • Proper null/undefined handling with P.nullish
  • Clean component decomposition (ActorError, QueriedActorError, RunnerPoolError)

Minor Issues:

  1. Blank Lines in runner-config-table.tsx (lines 259-260)

    • Two unnecessary blank lines were introduced after removing the RunnerPoolError component
    • Should be removed for consistency
  2. Unused Import Warning

    • formatISO is imported in actor-status-label.tsx (line 3) and used only within the reschedule logic, not in the new error handling code. Since it is still referenced, it is not a true unused import and needs no change.

🎯 Functionality Review

Error Flow:

  1. Backend provides error field on Actor object ✅
  2. Frontend queries error via actorErrorQueryOptions ✅
  3. Status determination checks error state first (queries/index.ts:128-130) ✅
  4. UI displays appropriate error message based on error type ✅

Status Priority (correct order):

  1. Error → "crashed"
  2. Reschedule → "crash-loop"
  3. Pending allocation → "pending"
  4. Destroyed → "stopped"
  5. Sleeping → "sleeping"
  6. Started → "running"
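
The priority order above can be sketched as a chain of early returns. This is a minimal illustration, not the PR's implementation: the field names follow the test example earlier in this thread, while the `ActorSnapshot` type, the function name `getActorStatusSketch`, and the "unknown" fallback are assumptions.

```typescript
// Hypothetical shape; the real Actor type in the PR may differ.
interface ActorSnapshot {
    createdAt?: string | null;
    startedAt?: string | null;
    destroyedAt?: string | null;
    sleepingAt?: string | null;
    pendingAllocationAt?: string | null;
    rescheduleAt?: string | null;
    error?: unknown;
}

type SketchStatus =
    | "crashed"
    | "crash-loop"
    | "pending"
    | "stopped"
    | "sleeping"
    | "running"
    | "unknown";

// Checks run from highest to lowest priority; the first match wins.
function getActorStatusSketch(actor: ActorSnapshot): SketchStatus {
    if (actor.error) return "crashed";
    if (actor.rescheduleAt) return "crash-loop";
    if (actor.pendingAllocationAt) return "pending";
    if (actor.destroyedAt) return "stopped";
    if (actor.sleepingAt) return "sleeping";
    if (actor.startedAt) return "running";
    // Assumed fallback; the thread does not say what happens here.
    return "unknown";
}
```

Because the error check comes first, an Actor that has both an error and a destroyedAt timestamp reports "crashed", which is the behavior the first review flagged for confirmation.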

🔒 Security Considerations

No security concerns identified:

  • Error messages do not expose sensitive system information
  • No user-controlled input being rendered unsafely
  • Error details from backend are properly typed and validated

⚡ Performance Considerations

Minor optimization opportunities:

  1. Redundant Queries: In guard-connectable-inspector.tsx:81, QueriedActorError performs a separate query for the error. This could be optimized by:

    • Passing the error directly from the parent existing query (line 45-48 already fetches status, which includes error checking)
    • However, the current approach is acceptable due to React Query caching
  2. Polling Frequency: refetchInterval: 1000 (line 47) might be aggressive for error states. Consider:

    • Using exponential backoff for crashed states
    • But this is a minor concern and current approach is acceptable for UX
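
For reference, exponential backoff for crashed states could look roughly like the sketch below. The helper name `backoffIntervalMs`, the 1-second base, and the 30-second cap are illustrative assumptions; how the count of consecutive crashed polls is tracked is left to the caller.

```typescript
// Hypothetical helper: doubles the delay for each consecutive poll that
// still reports a crashed state, capped at maxMs.
function backoffIntervalMs(
    consecutiveCrashedPolls: number,
    baseMs = 1000,
    maxMs = 30000,
): number {
    // Treat negative counts as zero so the delay never drops below baseMs.
    const exponent = Math.max(0, consecutiveCrashedPolls);
    return Math.min(maxMs, baseMs * 2 ** exponent);
}
```

Resetting the counter to zero as soon as the status leaves the crashed state restores the original 1-second polling.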

🧪 Test Coverage

Missing test coverage concerns:

  • No visible test files modified in this PR
  • Should verify:
    • Error message rendering for all error types
    • Status determination logic with various error states
    • Component rendering when error is null/undefined

Recommendation: Add unit tests for:

  • ActorError component with each error variant
  • getActorStatus function with error field populated
  • RunnerPoolError component with all error shapes

🐛 Potential Bugs

None identified. The implementation appears solid with proper type safety and null handling.


💡 Suggestions for Improvement

  1. Error Recovery Actions: Consider adding action buttons for common errors:

    • "No capacity" → "Scale runners" button
    • "Invalid serverless endpoint" → "Edit configuration" button
  2. Error Logging: Consider client-side error tracking for analytics on common Actor failures.

  3. Documentation: Add JSDoc comments to exported components like ActorError explaining expected error shapes.


✍️ Style & Conventions

Adherence to CLAUDE.md:

  • ✅ Uses proper TypeScript patterns
  • ✅ Follows existing file structure conventions
  • ✅ Component naming is consistent
  • ⚠️ Minor: Extra blank lines should be removed

📋 Checklist

  • ✅ Code quality: Good with minor whitespace issue
  • ✅ Best practices: Follows React/TypeScript patterns
  • ✅ Potential bugs: None identified
  • ✅ Performance: Acceptable, minor optimization opportunities
  • ✅ Security: No concerns
  • ⚠️ Test coverage: No tests added/modified
  • ✅ Documentation: Clear through type system and component names

🎯 Verdict: Approve with Minor Suggestions

This is a solid implementation that meaningfully improves the developer experience. The only required change is removing the blank lines. Test coverage would be beneficial but not blocking.

Required Changes:

  • Remove blank lines at runner-config-table.tsx:259-260

Recommended Enhancements:

  • Add unit tests for error handling logic
  • Consider adding error recovery actions in future iteration

Great work on improving error visibility! 🚀

@jog1t jog1t marked this pull request as ready for review January 5, 2026 16:26
@graphite-app
Contributor

graphite-app bot commented Jan 5, 2026

Graphite Automations

"Test" took an action on this PR • (01/05/26)

1 assignee was added to this PR based on Kacper Wojciechowski's automation.

@jog1t jog1t force-pushed the 01-03-fix_dash_use_ineligible_icon_for_runners branch from a4972a8 to d8be1a7 Compare January 9, 2026 00:04
@jog1t jog1t force-pushed the 01-03-feat_dash_surface_actor_errors_in_ui branch from d9ea998 to d93db97 Compare January 9, 2026 00:04
@pkg-pr-new

pkg-pr-new bot commented Jan 9, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3743

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3743

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3743

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3743

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3743

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3743

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3743

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3743

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3743

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3743

commit: d93db97

Member

@NathanFlurry NathanFlurry left a comment

  • crash policy needs to be sleep
  • serverless on undefined error

@vercel vercel bot temporarily deployed to Preview – rivetkit-serverless January 9, 2026 20:12 Inactive
@jog1t jog1t changed the title feat(dash): surface actor errors in ui feat(dashboard): surface actor errors in ui Jan 9, 2026
@graphite-app
Contributor

graphite-app bot commented Jan 9, 2026

Merge activity

  • Jan 9, 9:41 PM UTC: jog1t added this pull request to the Graphite merge queue.
  • Jan 9, 9:42 PM UTC: CI is running for this pull request on a draft pull request (#3805) due to your merge queue CI optimization settings.
  • Jan 9, 9:42 PM UTC: Merged by the Graphite merge queue via draft PR: #3805.

graphite-app bot pushed a commit that referenced this pull request Jan 9, 2026
@graphite-app graphite-app bot closed this Jan 9, 2026
@graphite-app graphite-app bot deleted the 01-03-feat_dash_surface_actor_errors_in_ui branch January 9, 2026 21:42