refactor: Enhance error handling, logging, and remote error capture a…#332

Open
dasein108 wants to merge 1 commit into main from fix-prevent-exit-enhance-log

Conversation

dasein108 (Collaborator) commented Feb 8, 2026

User description

This pull request improves error handling and debugging for the RemoteChannel class in src/remote-device/remote-channel.ts. The main focus is on adding detailed error logging, capturing remote errors for monitoring, and providing more informative debug output throughout device and call management operations.

Most important changes:

Error handling and logging improvements:

  • Added comprehensive error logging and remote error capturing to methods handling session setting, user retrieval, device lookup, device creation, device updates, call status updates, and heartbeat updates. This ensures that failures are logged with context and reported for monitoring. [1] [2] [3] [4] [5]

Debug output enhancements:

  • Improved debug messages across the class to provide clearer information about successful operations and failure points, making troubleshooting easier. [1] [2] [3] [4] [5] [6] [7]

Device and channel management robustness:

  • Updated the channel creation process in registerDevice to handle failures gracefully by logging and allowing for retry after socket reconnection, instead of failing outright.
  • Removed the unused subscribe method, consolidating channel subscription logic and reducing code duplication.

Minor improvements:

  • Clarified error messages for channel subscription timeouts and status updates to better indicate the system's state and actions. [1] [2]

CodeAnt-AI Description

Prevent initialization exit on channel creation failure and add clearer remote error reporting

What Changed

  • During session setup, failures to set the session or fetch the user are logged, captured for monitoring, and return/throw explicit errors; missing user responses are treated as errors.
  • Device operations (find, create, update) and call/heartbeat/status updates now log failures and send remote error captures; successful updates log confirmation to aid observability.
  • Channel creation failure during device registration is no longer allowed to crash initialization: creation errors are caught and will retry after socket reconnects; the standalone subscribe method was removed.
  • Heartbeat and device status updates report errors and stop silently failing; call execution/result updates now surface failures and log outcomes.

Impact

✅ Clearer remote error reporting during session and device operations
✅ Fewer initialization exits when channel creation fails
✅ Easier diagnosis of failed call, heartbeat, and status updates


Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Enhanced error handling and logging for session setup, device operations, and channel management
    • Improved reconnection logic following timeout events and offline status tracking
    • Strengthened error handling in call lifecycle updates and heartbeat monitoring
  • Refactor

    • Consolidated channel creation and subscription flows for improved consistency

…cross remote channel operations, and remove the subscribe method.
codeant-ai bot (Contributor) commented Feb 8, 2026

CodeAnt AI is reviewing your PR.



coderabbitai bot (Contributor) commented Feb 8, 2026

📝 Walkthrough

This change adds comprehensive error handling, logging, and reliability improvements throughout the remote device channel lifecycle, including session setup, user retrieval, device management, channel subscription, and call tracking, with enhanced reconnect logic during device initialization.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Remote Channel Improvements: src/remote-device/remote-channel.ts | Added extensive error handling and logging across session establishment, user retrieval, device lookup/creation/updates, channel creation and subscription lifecycle, call execution tracking, heartbeat management, and status updates. Introduced guarded reconnect logic on channel initialization failure and detailed offline update script invocation logging. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Remote Client
    participant Session as Session Manager
    participant Device as Device Registry
    participant Channel as Channel
    participant Call as Call Tracker
    participant Heartbeat as Heartbeat Monitor

    Client->>Session: setSession()
    alt Session Setup Fails
        Session-->>Client: Log error, capture event, return error
    else Session Setup Success
        Session->>Device: User retrieval
        alt User Not Found
            Device-->>Session: Log failure, capture event, throw
        else User Found
            Session->>Device: findDevice()
            alt Device Not Found
                Device-->>Session: Log, capture event, proceed to create
            else Device Found
                Device->>Device: Update status/last_seen
                Device->>Channel: Create channel with reconnect guard
                alt Channel Init Fails
                    Channel-->>Device: Catch & retry on reconnect (silent)
                else Channel Init Success
                    Channel->>Client: Subscribe & log health events
                    Client->>Call: markCallExecuting()
                    Call->>Heartbeat: updateHeartbeat()
                    Heartbeat->>Device: setOnlineStatus()
                end
            end
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Fix reconnection add verbose #316: Directly modifies the same file with similar enhancements to debug logging, error capture, and channel/session lifecycle management including health checks and reconnect handling.

Suggested labels

size:L

Poem

🐰 Through channels deep and sessions bright,
Error logs now shine with light—
Reconnect on failure's call,
Heartbeats steady, logging all!
Resilience hops from tier to tier. 💫

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main changes in the changeset: enhanced error handling, logging, and remote error capture throughout the RemoteChannel class. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@codeant-ai codeant-ai bot added the size:M This PR changes 30-99 lines, ignoring generated files label Feb 8, 2026
codeant-ai bot (Contributor) commented Feb 8, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Sensitive information exposure
    The code spawns a child Node process and passes secrets (Supabase key, access token, refresh token) as command-line arguments to spawnSync. These values can be exposed in OS-level process listings and logs. Consider using environment variables or another secure IPC method to avoid exposing credentials.

  • Raw error objects sent to telemetry
    Many places call captureRemote(...) with raw Error or driver error objects. Those objects can contain sensitive data, circular references, or complex structures that telemetry or logging backends may store inadvertently. Serialize/sanitize errors (message, code, truncated stack) before sending.

  • Inconsistent error handling
    Some methods (e.g., findDevice) throw on database errors while others (updateDevice) log and return {data, error} but continue execution. In registerDevice the code proceeds even if updateDevice fails, which may leave the device in an inconsistent state. Consider standardizing handling (throw vs return) and reacting to failures where necessary.

  • Silent/ignored subscription failures
    registerDevice swallows createChannel failures during initialization (createChannel().catch(...)) — this may leave the device unsubscribed without surfacing a recoverable error. Consider retry/backoff or a clearer lifecycle/state so callers can react instead of silently continuing.
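
The second review area above suggests serializing errors before they reach telemetry. A minimal sketch of such a helper follows, assuming it would wrap the value passed to captureRemote; `sanitizeError`, `SafeError`, and the 500-character stack limit are hypothetical names and values chosen for illustration, not part of this PR.

```typescript
// Hypothetical shape of what is sent to telemetry instead of a raw error.
interface SafeError {
    name: string;
    message: string;
    code?: string;
    stack?: string;
}

// Assumption: 500 characters of stack trace is enough context for triage.
const MAX_STACK_LENGTH = 500;

// Reduce an unknown thrown or returned value to a small, JSON-safe object.
// Only whitelisted fields are copied, so extra properties (tokens, circular
// references, nested driver state) never leave the process.
function sanitizeError(err: unknown): SafeError {
    if (err instanceof Error) {
        const code = (err as Error & { code?: unknown }).code;
        return {
            name: err.name,
            message: err.message,
            ...(typeof code === 'string' ? { code } : {}),
            ...(err.stack ? { stack: err.stack.slice(0, MAX_STACK_LENGTH) } : {}),
        };
    }
    // Driver errors are often plain objects carrying a message and a code.
    if (typeof err === 'object' && err !== null) {
        const { message, code } = err as Record<string, unknown>;
        return {
            name: 'NonErrorObject',
            message: typeof message === 'string' ? message : 'Unknown error',
            ...(typeof code === 'string' ? { code } : {}),
        };
    }
    return { name: 'NonError', message: String(err) };
}
```

A call site would then pass `{ error: sanitizeError(error) }` rather than `{ error }`, which keeps the payload shape predictable regardless of what the underlying client returns.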

codeant-ai bot (Contributor) commented Feb 8, 2026

CodeAnt AI finished reviewing your PR.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/remote-device/remote-channel.ts (2)

220-230: ⚠️ Potential issue | 🟠 Major

Fire-and-forget setOnlineStatus calls risk unhandled promise rejections.

Lines 222 and 227 call this.setOnlineStatus(...) without await or .catch(). If the underlying Supabase call or captureRemote rejects, the result is an unhandled promise rejection, which can crash the Node.js process depending on the --unhandled-rejections mode (since Node.js 15 the default mode, throw, terminates the process when no unhandledRejection handler is installed).

Add a .catch() to match the pattern already used on lines 215–217.

🐛 Proposed fix
                     } else if (status === 'CHANNEL_ERROR') {
                         // console.error('❌ Channel subscription failed:', err);
-                        this.setOnlineStatus(this.deviceId!, 'offline');
+                        this.setOnlineStatus(this.deviceId!, 'offline').catch(e => {
+                            console.error('Failed to set offline status:', e.message);
+                        });
                         captureRemote('remote_channel_subscription_error', { error: err || 'Channel error' }).catch(() => { });
                         reject(err || new Error('Failed to initialize tool call channel subscription'));
                     } else if (status === 'TIMED_OUT') {
                         console.error('⏱️ Channel subscription timed out, Reconnecting...');
-                        this.setOnlineStatus(this.deviceId!, 'offline');
+                        this.setOnlineStatus(this.deviceId!, 'offline').catch(e => {
+                            console.error('Failed to set offline status:', e.message);
+                        });
                         captureRemote('remote_channel_subscription_timeout', {}).catch(() => { });
                         reject(new Error('Tool call channel subscription timed out'));
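
The repeated .catch() blocks in the proposed fix could also be factored into one helper so that every intentionally un-awaited call gets a rejection handler. The following is a minimal sketch under that assumption; `fireAndForget` and `setOnlineStatusStub` are hypothetical names used only for illustration and do not exist in the PR.

```typescript
// Hypothetical helper: attach a rejection handler to a promise that is
// deliberately not awaited, so a failure is logged instead of surfacing
// as an unhandled promise rejection.
function fireAndForget(promise: Promise<unknown>, label: string): void {
    promise.catch((e: unknown) => {
        const message = e instanceof Error ? e.message : String(e);
        console.error(`${label} failed: ${message}`);
    });
}

// Stand-in for this.setOnlineStatus(...); it always rejects so the
// handler above is exercised.
async function setOnlineStatusStub(deviceId: string, status: 'online' | 'offline'): Promise<void> {
    throw new Error(`cannot set ${deviceId} to ${status}`);
}

// The caller stays synchronous and the rejection is still handled.
fireAndForget(setOnlineStatusStub('device-1', 'offline'), 'setOnlineStatus');
```

Narrowing `e` with `instanceof Error` avoids assuming that every rejection value carries a `message` property.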

253-258: ⚠️ Potential issue | 🟠 Major

Un-awaited captureRemote in synchronous checkConnectionHealth: floating promise.

Line 254 calls captureRemote(...) (which returns a Promise) without await or .catch(). Since checkConnectionHealth is not async, the rejection is unhandled. Add .catch(() => {}) to match the pattern used elsewhere (e.g., lines 223, 228).

🐛 Proposed fix
         if (state !== 'joined') {
-            captureRemote('remote_channel_state_health', { state });
+            captureRemote('remote_channel_state_health', { state }).catch(() => { });
 
             console.debug(`[DEBUG] ⚠️ Channel in unhealthy state '${state}' - recreating...`);
🤖 Fix all issues with AI agents
In `@src/remote-device/remote-channel.ts`:
- Around line 167-170: Fix the typo in the inline comment near the channel
initialization retry: change "Inialization" to "Initialization" in the comment
that precedes the await this.createChannel().catch(...) call in
remote-channel.ts so codespell passes; ensure the comment reads something like
"// ! Ignore silently in Initialization to reconnect after" and keep the rest of
the catch block and logging unchanged.
- Around line 316-326: The update call using
this.client.from('mcp_remote_calls').update(updateData).eq('id', callId) does
not chain .select(), so Supabase always returns data as null; either chain
.select() on that query to return the updated row(s) and log the returned data
(matching the pattern used in updateDevice), or stop destructuring data and only
check error (e.g., remove data from the const and change the success log to a
simple "Call result updated successfully" without printing data); update the
code around the update call and the subsequent console.debug to reflect the
chosen approach.
🧹 Nitpick comments (3)
src/remote-device/remote-channel.ts (3)

53-58: console.error with [DEBUG] prefix is semantically misleading.

Throughout these changes, console.error('[DEBUG] Failed to ...') mixes error-level severity with a debug tag. This confuses log aggregation tools that key on severity. For operational clarity, use console.debug for diagnostic context and reserve console.error for actionable errors without the [DEBUG] prefix.

This pattern recurs at lines 55, 63, 70, 96, 112, 129, 298, 322, 338, 387.
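
One way to keep the tag and the severity from contradicting each other is to route messages through a small wrapper that derives the prefix from the level. The sketch below is illustrative only; `Level`, `formatLog`, and `log` are hypothetical names, not functions from the PR.

```typescript
type Level = 'debug' | 'error';

// Build the prefix from the level so the tag always matches the severity.
function formatLog(level: Level, message: string): string {
    return `[${level.toUpperCase()}] ${message}`;
}

// Route each level to the matching console method.
function log(level: Level, message: string, ...details: unknown[]): void {
    const line = formatLog(level, message);
    if (level === 'error') {
        console.error(line, ...details);
    } else {
        console.debug(line, ...details);
    }
}

log('debug', 'Device found', { id: 'device-1' });
log('error', 'Failed to update device', 'timeout');
```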


76-76: User email logged in debug output — potential PII concern.

user.email is logged to stdout. Even at debug level, this could persist in log aggregation systems. Consider whether this aligns with your data-handling policy, or redact it (e.g., mask all but the domain).
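
The redaction mentioned above (masking everything but the domain) can be sketched as follows. `maskEmail` is a hypothetical helper, and the choice to return a fixed placeholder for malformed input is an assumption.

```typescript
// Hypothetical helper: keep the domain for debugging, hide the local part.
function maskEmail(email: string): string {
    const at = email.lastIndexOf('@');
    // No '@', or nothing before it: reveal nothing.
    if (at <= 0) return '***';
    return `***${email.slice(at)}`;
}

// Logs "[DEBUG] User: ***@example.com"
console.debug(`[DEBUG] User: ${maskEmail('jane.doe@example.com')}`);
```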


290-303: Inconsistent throw-vs-swallow pattern across methods.

markCallExecuting swallows errors (logs + captures but doesn't throw), while findDevice and createDevice throw. This inconsistency makes it harder for callers to reason about failure modes. Consider documenting the intended contract (throw on critical failures, swallow on best-effort updates) or normalizing the pattern.
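
If the intended contract is "throw on critical failures, swallow on best-effort updates", it could be made explicit with two wrappers, so the choice is visible at each call site. This is a sketch under that assumption; `critical` and `bestEffort` are hypothetical names, and it assumes the wrapped operation signals failure by throwing (a Supabase `{ data, error }` result would first need to be converted into a throw).

```typescript
// Critical operations: report the failure, then rethrow so the caller
// has to handle it.
async function critical<T>(label: string, op: () => Promise<T>): Promise<T> {
    try {
        return await op();
    } catch (e) {
        console.error(`${label} failed`, e);
        throw e;
    }
}

// Best-effort operations: report the failure, then resolve to undefined
// so the surrounding flow continues.
async function bestEffort<T>(label: string, op: () => Promise<T>): Promise<T | undefined> {
    try {
        return await op();
    } catch (e) {
        console.error(`${label} failed (continuing)`, e);
        return undefined;
    }
}
```

Under this split, lookups such as findDevice would go through `critical`, while status-style updates such as markCallExecuting would go through `bestEffort`.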

Comment on lines +167 to +170
// ! Ignore silently in Inialization to reconnect after
await this.createChannel().catch((error) => {
    console.debug('[DEBUG] Failed to create channel, will retry after socket reconnect', error);
});

⚠️ Potential issue | 🟡 Minor

Fix typo: "Inialization" → "Initialization" (pipeline failure).

The codespell check is failing on this line.

✏️ Proposed fix
-            // ! Ignore silently in Inialization to reconnect after
+            // ! Ignore silently in Initialization to reconnect after
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- // ! Ignore silently in Inialization to reconnect after
- await this.createChannel().catch((error) => {
-     console.debug('[DEBUG] Failed to create channel, will retry after socket reconnect', error);
- });
+ // ! Ignore silently in Initialization to reconnect after
+ await this.createChannel().catch((error) => {
+     console.debug('[DEBUG] Failed to create channel, will retry after socket reconnect', error);
+ });
🧰 Tools
🪛 GitHub Actions: Codespell

[error] 167-167: Misspelling detected by codespell: 'Inialization' should be 'Initialization'.

🪛 GitHub Check: Check for spelling errors

[failure] 167-167:
Inialization ==> Initialization

🤖 Prompt for AI Agents
In `@src/remote-device/remote-channel.ts` around lines 167 - 170, Fix the typo in
the inline comment near the channel initialization retry: change "Inialization"
to "Initialization" in the comment that precedes the await
this.createChannel().catch(...) call in remote-channel.ts so codespell passes;
ensure the comment reads something like "// ! Ignore silently in Initialization
to reconnect after" and keep the rest of the catch block and logging unchanged.

Comment on lines +316 to +326
const { data, error } = await this.client
    .from('mcp_remote_calls')
    .update(updateData)
    .eq('id', callId);

if (error) {
    console.error('[DEBUG] Failed to update call result:', error.message);
    await captureRemote('remote_channel_update_call_result_error', { error });
} else {
    console.debug('[DEBUG] Call result updated successfully:', data);
}

⚠️ Potential issue | 🟡 Minor

data will always be null: .select() is not chained on this update.

Unlike updateDevice (line 109), this query doesn't chain .select(), so Supabase returns null for data. The debug log on line 325 will always print null, which is misleading.

Either chain .select() if you need the returned row, or remove the data capture and simplify the log.

✏️ Proposed fix (option A: add .select())
         const { data, error } = await this.client
             .from('mcp_remote_calls')
             .update(updateData)
-            .eq('id', callId);
+            .eq('id', callId)
+            .select();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- const { data, error } = await this.client
-     .from('mcp_remote_calls')
-     .update(updateData)
-     .eq('id', callId);
- if (error) {
-     console.error('[DEBUG] Failed to update call result:', error.message);
-     await captureRemote('remote_channel_update_call_result_error', { error });
- } else {
-     console.debug('[DEBUG] Call result updated successfully:', data);
- }
+ const { data, error } = await this.client
+     .from('mcp_remote_calls')
+     .update(updateData)
+     .eq('id', callId)
+     .select();
+ if (error) {
+     console.error('[DEBUG] Failed to update call result:', error.message);
+     await captureRemote('remote_channel_update_call_result_error', { error });
+ } else {
+     console.debug('[DEBUG] Call result updated successfully:', data);
+ }
🤖 Prompt for AI Agents
In `@src/remote-device/remote-channel.ts` around lines 316 - 326, The update call
using this.client.from('mcp_remote_calls').update(updateData).eq('id', callId)
does not chain .select(), so Supabase always returns data as null; either chain
.select() on that query to return the updated row(s) and log the returned data
(matching the pattern used in updateDevice), or stop destructuring data and only
check error (e.g., remove data from the const and change the success log to a
simple "Call result updated successfully" without printing data); update the
code around the update call and the subsequent console.debug to reflect the
chosen approach.

Labels

size:M This PR changes 30-99 lines, ignoring generated files


1 participant