diff --git a/docs/telemetry-retry-queue.md b/docs/telemetry-retry-queue.md
new file mode 100644
index 0000000000..bcc543b010
--- /dev/null
+++ b/docs/telemetry-retry-queue.md
@@ -0,0 +1,364 @@
+# Telemetry Retry Queue
+
+This document describes the persistent retry queue system for failed telemetry events in Roo Code.
+
+## Overview
+
+The telemetry retry queue prevents telemetry events from being lost to temporary network issues, server downtime, or other connectivity problems (within the configured retry and queue-size limits). It provides a robust delivery system with the following features:
+
+- **Persistent Storage**: Events are stored locally using VSCode's globalState API and survive extension restarts
+- **Exponential Backoff**: Failed events are retried with increasing delays to avoid overwhelming the server
+- **Priority Handling**: Critical events (errors, crashes) are prioritized over routine analytics
+- **Connection Monitoring**: Tracks connection status and provides user feedback
+- **Configurable Behavior**: Users can control retry behavior through VSCode settings
+
+## Architecture
+
+### Components
+
+1. **TelemetryRetryQueue**: Core queue management with persistent storage
+2. **ResilientTelemetryClient**: Wrapper that adds retry functionality to any TelemetryClient
+3. **Configuration Settings**: VSCode settings for user control
+4. **Status Monitoring**: Visual feedback through status bar and notifications
+
+### Flow
+
+```
+Telemetry Event → Immediate Send Attempt → Success? → Done
+                           ↓ Failure
+                    Add to Retry Queue
+                           ↓
+                 Periodic Retry Processing
+                           ↓
+                    Exponential Backoff
+                           ↓
+                  Success or Max Retries
+```
+
+## Configuration
+
+### VSCode Settings
+
+Users can configure the retry behavior through the following settings:
+
+- `roo-cline.telemetryRetryEnabled` (boolean, default: true)
+    - Enable/disable the retry queue system
+- `roo-cline.telemetryRetryMaxRetries` (number, default: 5, range: 0-10)
+    - Maximum number of retry attempts per event
+- `roo-cline.telemetryRetryBaseDelay` (number, default: 1000ms, range: 100-10000ms)
+    - Base delay between retry attempts (exponential backoff)
+- `roo-cline.telemetryRetryMaxDelay` (number, default: 300000ms, range: 1000-600000ms)
+    - Maximum delay between retry attempts (5 minutes default)
+- `roo-cline.telemetryRetryQueueSize` (number, default: 1000, range: 10-10000)
+    - Maximum number of events to queue for retry
+- `roo-cline.telemetryRetryNotifications` (boolean, default: true)
+    - Show notifications when connection issues are detected
+
+### Programmatic Configuration
+
+```typescript
+import { TelemetryRetryQueue, RetryQueueConfig } from "@roo-code/telemetry"
+
+const config: Partial<RetryQueueConfig> = {
+	maxRetries: 3,
+	baseDelayMs: 2000,
+	maxDelayMs: 60000,
+	maxQueueSize: 500,
+	batchSize: 5,
+	enableNotifications: false,
+}
+
+const retryQueue = new TelemetryRetryQueue(context, config)
+```
+
+## Usage
+
+### Basic Usage
+
+The retry queue is automatically integrated into the telemetry system.
No additional code is required for basic functionality: + +```typescript +// This automatically uses the retry queue if the send fails +TelemetryService.instance.captureTaskCreated("task-123") +``` + +### Advanced Usage + +For custom telemetry clients, wrap them with `ResilientTelemetryClient`: + +```typescript +import { ResilientTelemetryClient } from "@roo-code/telemetry" + +const originalClient = new MyTelemetryClient() +const resilientClient = new ResilientTelemetryClient(originalClient, context) + +// Register with telemetry service +TelemetryService.instance.register(resilientClient) +``` + +### Manual Queue Management + +```typescript +// Get queue status +const status = await resilientClient.getQueueStatus() +console.log(`Queue size: ${status.queueSize}`) +console.log(`Connected: ${status.connectionStatus.isConnected}`) + +// Manually trigger retry +await resilientClient.retryNow() + +// Clear queue +await resilientClient.clearQueue() + +// Update configuration +resilientClient.updateRetryConfig({ maxRetries: 10 }) +``` + +## Priority System + +Events are automatically prioritized based on their importance: + +### High Priority Events + +- `SCHEMA_VALIDATION_ERROR` +- `DIFF_APPLICATION_ERROR` +- `SHELL_INTEGRATION_ERROR` +- `CONSECUTIVE_MISTAKE_ERROR` + +### Normal Priority Events + +- All other telemetry events (task creation, completion, etc.) + +High priority events are: + +- Processed before normal priority events +- Retained longer when queue size limits are reached +- Given preference during batch processing + +## Storage + +### Persistence + +Events are stored in VSCode's `globalState` under the key `telemetryRetryQueue`. This ensures: + +- Data survives extension restarts +- Data survives VSCode crashes +- Data is automatically cleaned up when the extension is uninstalled + +### Storage Format + +```typescript +interface QueuedTelemetryEvent { + id: string // Unique identifier + event: TelemetryEvent // Original event data + timestamp: number // When event was first queued + retryCount: number // Number of retry attempts + nextRetryAt: number // When to retry next + priority: "high" | "normal" // Event priority +} +``` + +### Size Management + +- Queue size is limited by `maxQueueSize` setting +- When limit is reached, oldest normal priority events are removed first +- High priority events are preserved longer +- Automatic cleanup of successfully sent events + +## Retry Logic + +### Exponential Backoff + +Retry delays follow an exponential backoff pattern: + +``` +delay = min(baseDelayMs * 2^retryCount, maxDelayMs) +``` + +Example with default settings (baseDelayMs=1000ms, maxDelayMs=300000ms): + +- Retry 1: 1 second +- Retry 2: 2 seconds +- Retry 3: 4 seconds +- Retry 4: 8 seconds +- Retry 5: 16 seconds +- Further retries: 5 minutes (maxDelayMs) + +### Batch Processing + +- Events are processed in batches to improve efficiency +- Default batch size: 10 events +- Batches are processed every 30 seconds +- Failed events in a batch are individually rescheduled + +### Failure Handling + +- Temporary failures (network errors): Event is rescheduled for retry +- Permanent failures (authentication errors): Event may be dropped +- Max retries exceeded: Event is removed from queue +- Invalid events: Event is dropped immediately + +## User Interface + +### Status Bar + +When events are queued, a status bar item appears showing: + +- Queue size +- Connection status (connected/disconnected) +- Click to view queue details + +### Notifications + +When enabled, users receive notifications 
for: + +- Prolonged disconnection (>5 minutes) +- Large queue buildup +- Option to manually trigger retry or disable notifications + +### Commands + +The following commands are available: + +- `roo-code.telemetry.showQueue`: Display queue status and management options +- `roo-code.telemetry.retryNow`: Manually trigger retry processing +- `roo-code.telemetry.clearQueue`: Clear all queued events + +## Monitoring + +### Connection Status + +The system tracks: + +- `isConnected`: Current connection state +- `lastSuccessfulSend`: Timestamp of last successful telemetry send +- `consecutiveFailures`: Number of consecutive send failures + +Connection is considered lost after 3 consecutive failures. + +### Metrics + +Internal metrics tracked: + +- Queue size over time +- Retry success/failure rates +- Average retry delays +- Event priority distribution + +## Error Handling + +### Graceful Degradation + +- If retry queue initialization fails, telemetry continues without retry +- Storage errors are logged but don't prevent telemetry operation +- Invalid queue data is automatically cleaned up + +### Error Logging + +Errors are logged with appropriate levels: + +- Warnings: Temporary failures, retry attempts +- Errors: Persistent failures, configuration issues +- Info: Successful operations, queue status changes + +## Testing + +### Unit Tests + +Comprehensive test coverage includes: + +- Queue operations (enqueue, dequeue, prioritization) +- Retry logic (exponential backoff, max retries) +- Storage persistence +- Configuration handling +- Error scenarios + +### Integration Tests + +- End-to-end telemetry flow with retry +- VSCode extension integration +- Configuration changes +- Network failure simulation + +## Performance Considerations + +### Memory Usage + +- Queue size is limited to prevent unbounded growth +- Events are stored efficiently with minimal metadata +- Automatic cleanup of processed events + +### CPU Usage + +- Retry processing runs on a 30-second interval +- Batch processing minimizes overhead +- Exponential backoff reduces server load + +### Network Usage + +- Failed events are not retried immediately +- Batch processing reduces connection overhead +- Exponential backoff prevents server overload + +## Security + +### Data Protection + +- Telemetry events may contain sensitive information +- Events are stored locally only +- No additional network exposure beyond normal telemetry + +### Privacy + +- Retry queue respects user telemetry preferences +- Queue is cleared when telemetry is disabled +- No additional data collection beyond original events + +## Troubleshooting + +### Common Issues + +1. **Queue not working**: Check `telemetryRetryEnabled` setting +2. **Too many notifications**: Disable `telemetryRetryNotifications` +3. **Queue growing too large**: Reduce `telemetryRetryQueueSize` +4. **Slow retry processing**: Reduce `telemetryRetryBaseDelay` + +### Debugging + +Enable debug logging by setting the telemetry client debug flag: + +```typescript +const client = new PostHogTelemetryClient(true) // Enable debug +``` + +### Queue Inspection + +Use the command palette: + +1. Open Command Palette (Ctrl/Cmd + Shift + P) +2. Run "Roo Code: Show Telemetry Queue" +3. View queue status and management options + +## Migration + +### Existing Installations + +The retry queue is automatically enabled for existing installations with default settings. No user action is required. 
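+
+If you want to tune this behavior, the shipped defaults can be overridden in your user `settings.json`. A minimal sketch (the values shown are the defaults documented above):
+
+```json
+{
+	"roo-cline.telemetryRetryEnabled": true,
+	"roo-cline.telemetryRetryMaxRetries": 5,
+	"roo-cline.telemetryRetryBaseDelay": 1000,
+	"roo-cline.telemetryRetryMaxDelay": 300000,
+	"roo-cline.telemetryRetryQueueSize": 1000,
+	"roo-cline.telemetryRetryNotifications": true
+}
+```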
+ +### Upgrading + +When upgrading from versions without retry queue: + +- Existing telemetry behavior is preserved +- Retry queue is enabled with default settings +- Users can disable via settings if desired + +## Future Enhancements + +Potential future improvements: + +- Configurable retry strategies (linear, custom) +- Queue analytics and reporting +- Network condition detection +- Intelligent batching based on connection quality +- Event compression for large queues diff --git a/packages/cloud/src/CloudService.ts b/packages/cloud/src/CloudService.ts index 08a270bfc3..ee43ff160f 100644 --- a/packages/cloud/src/CloudService.ts +++ b/packages/cloud/src/CloudService.ts @@ -50,6 +50,9 @@ export class CloudService { this.telemetryClient = new TelemetryClient(this.authService, this.settingsService) + // Initialize retry queue for cloud telemetry client + this.telemetryClient.initializeRetryQueue(this.context) + this.shareService = new ShareService(this.authService, this.settingsService, this.log) try { diff --git a/packages/cloud/src/TelemetryClient.ts b/packages/cloud/src/TelemetryClient.ts index 1ad892cb97..50a7f43d88 100644 --- a/packages/cloud/src/TelemetryClient.ts +++ b/packages/cloud/src/TelemetryClient.ts @@ -1,11 +1,15 @@ import { TelemetryEventName, type TelemetryEvent, rooCodeTelemetryEventSchema } from "@roo-code/types" -import { BaseTelemetryClient } from "@roo-code/telemetry" +import { BaseTelemetryClient, TelemetryRetryQueue } from "@roo-code/telemetry" +import * as vscode from "vscode" import { getRooCodeApiUrl } from "./Config" import { AuthService } from "./AuthService" import { SettingsService } from "./SettingsService" export class TelemetryClient extends BaseTelemetryClient { + private retryQueue: TelemetryRetryQueue | null = null + private context: vscode.ExtensionContext | null = null + constructor( private authService: AuthService, private settingsService: SettingsService, @@ -20,6 +24,22 @@ export class TelemetryClient extends BaseTelemetryClient { ) } + /** + * Initialize the retry queue with VSCode extension context + */ + public initializeRetryQueue(context: vscode.ExtensionContext): void { + this.context = context + const retrySettings = context.globalState.get("telemetryRetrySettings") as Record | undefined + this.retryQueue = new TelemetryRetryQueue(context, retrySettings) + + // Start periodic retry processing + setInterval(async () => { + if (this.retryQueue) { + await this.retryQueue.processQueue((event) => this.attemptDirectSend(event)) + } + }, 30000) // 30 seconds + } + private async fetch(path: string, options: RequestInit) { if (!this.authService.isAuthenticated()) { return @@ -53,32 +73,60 @@ export class TelemetryClient extends BaseTelemetryClient { return } - const payload = { - type: event.event, - properties: await this.getEventProperties(event), - } + // Try to send immediately first + const success = await this.attemptDirectSend(event) - if (this.debug) { - console.info(`[TelemetryClient#capture] ${JSON.stringify(payload)}`) + if (!success && this.retryQueue) { + // If immediate send fails, add to retry queue + const priority = this.isHighPriorityEvent(event.event) ? 
"high" : "normal" + await this.retryQueue.enqueue(event, priority) } + } - const result = rooCodeTelemetryEventSchema.safeParse(payload) + /** + * Attempts to send a telemetry event directly without retry logic + */ + private async attemptDirectSend(event: TelemetryEvent): Promise { + try { + const payload = { + type: event.event, + properties: await this.getEventProperties(event), + } - if (!result.success) { - console.error( - `[TelemetryClient#capture] Invalid telemetry event: ${result.error.message} - ${JSON.stringify(payload)}`, - ) + if (this.debug) { + console.info(`[TelemetryClient#attemptDirectSend] ${JSON.stringify(payload)}`) + } - return - } + const result = rooCodeTelemetryEventSchema.safeParse(payload) + + if (!result.success) { + console.error( + `[TelemetryClient#attemptDirectSend] Invalid telemetry event: ${result.error.message} - ${JSON.stringify(payload)}`, + ) + return false + } - try { await this.fetch(`events`, { method: "POST", body: JSON.stringify(result.data) }) + return true } catch (error) { - console.error(`[TelemetryClient#capture] Error sending telemetry event: ${error}`) + console.warn(`[TelemetryClient#attemptDirectSend] Error sending telemetry event: ${error}`) + return false } } + /** + * Determines if an event should be treated as high priority + */ + private isHighPriorityEvent(eventName: TelemetryEventName): boolean { + const highPriorityEvents = new Set([ + TelemetryEventName.SCHEMA_VALIDATION_ERROR, + TelemetryEventName.DIFF_APPLICATION_ERROR, + TelemetryEventName.SHELL_INTEGRATION_ERROR, + TelemetryEventName.CONSECUTIVE_MISTAKE_ERROR, + ]) + return highPriorityEvents.has(eventName) + } + public override updateTelemetryState(_didUserOptIn: boolean) {} public override isTelemetryEnabled(): boolean { @@ -100,5 +148,9 @@ export class TelemetryClient extends BaseTelemetryClient { return true } - public override async shutdown() {} + public override async shutdown() { + if (this.retryQueue) { + this.retryQueue.dispose() + } + } } diff --git a/packages/telemetry/src/ResilientTelemetryClient.ts b/packages/telemetry/src/ResilientTelemetryClient.ts new file mode 100644 index 0000000000..d72ee0f434 --- /dev/null +++ b/packages/telemetry/src/ResilientTelemetryClient.ts @@ -0,0 +1,192 @@ +import * as vscode from "vscode" +import { TelemetryEvent, TelemetryClient, TelemetryPropertiesProvider, TelemetryEventName } from "@roo-code/types" +import { TelemetryRetryQueue, RetryQueueConfig } from "./TelemetryRetryQueue" + +/** + * ResilientTelemetryClient wraps any TelemetryClient with retry functionality. 
+ * It provides:
+ * - Automatic retry with exponential backoff for failed sends
+ * - Persistent queue that survives extension restarts
+ * - Connection status monitoring
+ * - Priority handling for critical events
+ * - User notifications for prolonged disconnection
+ */
+export class ResilientTelemetryClient implements TelemetryClient {
+	private wrappedClient: TelemetryClient
+	private retryQueue: TelemetryRetryQueue
+	private context: vscode.ExtensionContext
+	private isOnline = true
+	private retryInterval: NodeJS.Timeout | null = null
+
+	// Events that should be treated as high priority
+	private readonly highPriorityEvents = new Set<TelemetryEventName>([
+		TelemetryEventName.SCHEMA_VALIDATION_ERROR,
+		TelemetryEventName.DIFF_APPLICATION_ERROR,
+		TelemetryEventName.SHELL_INTEGRATION_ERROR,
+		TelemetryEventName.CONSECUTIVE_MISTAKE_ERROR,
+	])
+
+	constructor(
+		wrappedClient: TelemetryClient,
+		context: vscode.ExtensionContext,
+		config: Partial<RetryQueueConfig> = {},
+	) {
+		this.wrappedClient = wrappedClient
+		this.context = context
+		this.retryQueue = new TelemetryRetryQueue(context, config)
+
+		// Start periodic retry processing
+		this.startRetryProcessor()
+
+		// Register commands for manual control
+		this.registerCommands()
+	}
+
+	public get subscription() {
+		return this.wrappedClient.subscription
+	}
+
+	public setProvider(provider: TelemetryPropertiesProvider): void {
+		this.wrappedClient.setProvider(provider)
+	}
+
+	public async capture(event: TelemetryEvent): Promise<void> {
+		// Always try to send immediately first, regardless of telemetry state
+		// The wrapped client will handle telemetry state checking
+		const success = await this.attemptSend(event)
+
+		// Only queue if telemetry is enabled and send failed
+		if (!success && this.wrappedClient.isTelemetryEnabled()) {
+			const priority = this.highPriorityEvents.has(event.event) ? "high" : "normal"
+			await this.retryQueue.enqueue(event, priority)
+		}
+	}
+
+	public updateTelemetryState(didUserOptIn: boolean): void {
+		this.wrappedClient.updateTelemetryState(didUserOptIn)
+	}
+
+	public isTelemetryEnabled(): boolean {
+		return this.wrappedClient.isTelemetryEnabled()
+	}
+
+	public async shutdown(): Promise<void> {
+		// Stop retry processor
+		if (this.retryInterval) {
+			clearInterval(this.retryInterval)
+			this.retryInterval = null
+		}
+
+		// Dispose retry queue
+		this.retryQueue.dispose()
+
+		// Shutdown wrapped client
+		await this.wrappedClient.shutdown()
+	}
+
+	/**
+	 * Gets the current retry queue status
+	 */
+	public async getQueueStatus(): Promise<{
+		queueSize: number
+		connectionStatus: ReturnType<TelemetryRetryQueue["getConnectionStatus"]>
+	}> {
+		return {
+			queueSize: await this.retryQueue.getQueueSize(),
+			connectionStatus: this.retryQueue.getConnectionStatus(),
+		}
+	}
+
+	/**
+	 * Manually triggers a retry of queued events
+	 */
+	public async retryNow(): Promise<void> {
+		await this.retryQueue.triggerRetry((event) => this.attemptSend(event))
+	}
+
+	/**
+	 * Clears all queued events
+	 */
+	public async clearQueue(): Promise<void> {
+		await this.retryQueue.clearQueue()
+	}
+
+	/**
+	 * Updates the retry queue configuration
+	 */
+	public updateRetryConfig(config: Partial<RetryQueueConfig>): void {
+		this.retryQueue.updateConfig(config)
+	}
+
+	private async attemptSend(event: TelemetryEvent): Promise<boolean> {
+		try {
+			await this.wrappedClient.capture(event)
+			return true
+		} catch (error) {
+			// Only log as warning if telemetry is actually enabled
+			if (this.wrappedClient.isTelemetryEnabled()) {
+				console.warn(`[ResilientTelemetryClient] Failed to send telemetry event: ${error}`)
+			}
+			return false
+		}
+	}
+
+	private startRetryProcessor(): void {
+		// Process retry queue every 30 seconds
+		this.retryInterval = setInterval(async () => {
+			try {
+				await this.retryQueue.processQueue((event) => this.attemptSend(event))
+			} catch (error) {
+				console.error(`[ResilientTelemetryClient] Error processing retry queue: ${error}`)
+			}
+		}, 30000) // 30 seconds
+	}
+
+	private registerCommands(): void {
+		// Register command to show queue status
+		vscode.commands.registerCommand("roo-code.telemetry.showQueue", async () => {
+			const status = await this.getQueueStatus()
+			const connectionStatus = status.connectionStatus.isConnected ?
"Connected" : "Disconnected" + const lastSuccess = new Date(status.connectionStatus.lastSuccessfulSend).toLocaleString() + + const message = `Telemetry Queue Status: +• Queue Size: ${status.queueSize} events +• Connection: ${connectionStatus} +• Last Successful Send: ${lastSuccess} +• Consecutive Failures: ${status.connectionStatus.consecutiveFailures}` + + const actions = ["Retry Now", "Clear Queue", "Close"] + const selection = await vscode.window.showInformationMessage(message, ...actions) + + switch (selection) { + case "Retry Now": + await this.retryNow() + vscode.window.showInformationMessage("Telemetry retry triggered") + break + case "Clear Queue": + await this.clearQueue() + vscode.window.showInformationMessage("Telemetry queue cleared") + break + } + }) + + // Register command to manually retry now + vscode.commands.registerCommand("roo-code.telemetry.retryNow", async () => { + await this.retryNow() + }) + + // Register command to clear queue + vscode.commands.registerCommand("roo-code.telemetry.clearQueue", async () => { + const confirmation = await vscode.window.showWarningMessage( + "Are you sure you want to clear all queued telemetry events?", + "Yes", + "No", + ) + + if (confirmation === "Yes") { + await this.clearQueue() + vscode.window.showInformationMessage("Telemetry queue cleared") + } + }) + } +} diff --git a/packages/telemetry/src/TelemetryRetryQueue.ts b/packages/telemetry/src/TelemetryRetryQueue.ts new file mode 100644 index 0000000000..6817127a91 --- /dev/null +++ b/packages/telemetry/src/TelemetryRetryQueue.ts @@ -0,0 +1,355 @@ +import * as vscode from "vscode" +import { TelemetryEvent } from "@roo-code/types" + +export interface QueuedTelemetryEvent { + id: string + event: TelemetryEvent + timestamp: number + retryCount: number + nextRetryAt: number + priority: "high" | "normal" +} + +export interface RetryQueueConfig { + maxRetries: number + baseDelayMs: number + maxDelayMs: number + maxQueueSize: number + batchSize: number + enableNotifications: boolean +} + +export const DEFAULT_RETRY_CONFIG: RetryQueueConfig = { + maxRetries: 5, + baseDelayMs: 1000, // 1 second + maxDelayMs: 300000, // 5 minutes + maxQueueSize: 1000, + batchSize: 10, + enableNotifications: true, +} + +export interface ConnectionStatus { + isConnected: boolean + lastSuccessfulSend: number + consecutiveFailures: number +} + +/** + * TelemetryRetryQueue manages persistent storage and retry logic for failed telemetry events. 
+ * Features: + * - Persistent storage using VSCode's globalState + * - Exponential backoff retry strategy + * - Priority-based event handling + * - Connection status monitoring + * - Configurable queue limits and retry behavior + */ +export class TelemetryRetryQueue { + private context: vscode.ExtensionContext + private config: RetryQueueConfig + private connectionStatus: ConnectionStatus + private retryTimer: NodeJS.Timeout | null = null + private isProcessing = false + private statusBarItem: vscode.StatusBarItem | null = null + + constructor(context: vscode.ExtensionContext, config: Partial = {}) { + this.context = context + this.config = { ...DEFAULT_RETRY_CONFIG, ...config } + this.connectionStatus = { + isConnected: true, + lastSuccessfulSend: Date.now(), + consecutiveFailures: 0, + } + + // Initialize status bar item for connection status + this.statusBarItem = vscode.window.createStatusBarItem(vscode.StatusBarAlignment.Right, 100) + this.updateStatusBar() + } + + /** + * Adds a telemetry event to the retry queue + */ + public async enqueue(event: TelemetryEvent, priority: "high" | "normal" = "normal"): Promise { + const queue = await this.getQueue() + + // Check queue size limit + if (queue.length >= this.config.maxQueueSize) { + // Remove oldest normal priority events to make room + const normalPriorityIndex = queue.findIndex((item) => item.priority === "normal") + if (normalPriorityIndex !== -1) { + queue.splice(normalPriorityIndex, 1) + } else { + // If no normal priority events, remove oldest event + queue.shift() + } + } + + const queuedEvent: QueuedTelemetryEvent = { + id: this.generateId(), + event, + timestamp: Date.now(), + retryCount: 0, + nextRetryAt: Date.now(), + priority, + } + + // Insert based on priority (high priority events go first) + if (priority === "high") { + const firstNormalIndex = queue.findIndex((item) => item.priority === "normal") + if (firstNormalIndex === -1) { + queue.push(queuedEvent) + } else { + queue.splice(firstNormalIndex, 0, queuedEvent) + } + } else { + queue.push(queuedEvent) + } + + await this.saveQueue(queue) + this.scheduleNextRetry() + } + + /** + * Processes the retry queue, attempting to send failed events + */ + public async processQueue(sendFunction: (event: TelemetryEvent) => Promise): Promise { + if (this.isProcessing) { + return + } + + this.isProcessing = true + + try { + const queue = await this.getQueue() + const now = Date.now() + const eventsToRetry = queue.filter((item) => item.nextRetryAt <= now) + + if (eventsToRetry.length === 0) { + return + } + + // Process events in batches + const batch = eventsToRetry.slice(0, this.config.batchSize) + const results = await Promise.allSettled( + batch.map(async (queuedEvent) => { + const success = await sendFunction(queuedEvent.event) + return { queuedEvent, success } + }), + ) + + let hasSuccessfulSend = false + const updatedQueue = [...queue] + + for (const result of results) { + if (result.status === "fulfilled") { + const { queuedEvent, success } = result.value + + if (success) { + // Remove successful event from queue + const index = updatedQueue.findIndex((item) => item.id === queuedEvent.id) + if (index !== -1) { + updatedQueue.splice(index, 1) + } + hasSuccessfulSend = true + } else { + // Update retry information for failed event + const index = updatedQueue.findIndex((item) => item.id === queuedEvent.id) + if (index !== -1) { + updatedQueue[index].retryCount++ + + if (updatedQueue[index].retryCount >= this.config.maxRetries) { + // Remove event that has exceeded max retries 
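+								// Note: the event is dropped permanently here (no further retries),
+								// matching "Max retries exceeded" under Failure Handling in docs/telemetry-retry-queue.md.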
+ updatedQueue.splice(index, 1) + } else { + // Calculate next retry time with exponential backoff + const delay = Math.min( + this.config.baseDelayMs * Math.pow(2, updatedQueue[index].retryCount), + this.config.maxDelayMs, + ) + updatedQueue[index].nextRetryAt = now + delay + } + } + } + } + } + + await this.saveQueue(updatedQueue) + this.updateConnectionStatus(hasSuccessfulSend) + this.scheduleNextRetry() + } finally { + this.isProcessing = false + } + } + + /** + * Gets the current queue size + */ + public async getQueueSize(): Promise { + const queue = await this.getQueue() + return queue.length + } + + /** + * Clears all events from the queue + */ + public async clearQueue(): Promise { + await this.saveQueue([]) + this.updateConnectionStatus(true) + } + + /** + * Gets connection status information + */ + public getConnectionStatus(): ConnectionStatus { + return { ...this.connectionStatus } + } + + /** + * Updates the retry queue configuration + */ + public updateConfig(newConfig: Partial): void { + this.config = { ...this.config, ...newConfig } + } + + /** + * Disposes of the retry queue and cleans up resources + */ + public dispose(): void { + if (this.retryTimer) { + clearTimeout(this.retryTimer) + this.retryTimer = null + } + if (this.statusBarItem) { + this.statusBarItem.dispose() + this.statusBarItem = null + } + } + + /** + * Manually triggers a retry attempt + */ + public async triggerRetry(sendFunction: (event: TelemetryEvent) => Promise): Promise { + await this.processQueue(sendFunction) + } + + private async getQueue(): Promise { + const stored = this.context.globalState.get("telemetryRetryQueue", []) + return stored + } + + private async saveQueue(queue: QueuedTelemetryEvent[]): Promise { + await this.context.globalState.update("telemetryRetryQueue", queue) + this.updateStatusBar() + } + + private generateId(): string { + return `${Date.now()}-${Math.random().toString(36).substr(2, 9)}` + } + + private scheduleNextRetry(): void { + if (this.retryTimer) { + clearTimeout(this.retryTimer) + } + + // Schedule next retry based on the earliest nextRetryAt time + this.getQueue().then((queue) => { + if (queue.length === 0) { + return + } + + const now = Date.now() + const nextRetryTime = Math.min(...queue.map((item) => item.nextRetryAt)) + const delay = Math.max(0, nextRetryTime - now) + + this.retryTimer = setTimeout(() => { + // The actual retry will be triggered by the telemetry client + this.retryTimer = null + }, delay) + }) + } + + private updateConnectionStatus(hasSuccessfulSend: boolean): void { + if (hasSuccessfulSend) { + this.connectionStatus.isConnected = true + this.connectionStatus.lastSuccessfulSend = Date.now() + this.connectionStatus.consecutiveFailures = 0 + } else { + this.connectionStatus.consecutiveFailures++ + + // Consider disconnected after 3 consecutive failures + if (this.connectionStatus.consecutiveFailures >= 3) { + this.connectionStatus.isConnected = false + } + } + + this.updateStatusBar() + this.showNotificationIfNeeded() + } + + private updateStatusBar(): void { + if (!this.statusBarItem) { + return + } + + this.getQueue() + .then((queue) => { + if (!this.statusBarItem) { + return + } + + if (queue.length === 0) { + this.statusBarItem.hide() + return + } + + const queueSize = queue.length + const isConnected = this.connectionStatus.isConnected + + if (!isConnected) { + this.statusBarItem.text = `$(warning) Telemetry: ${queueSize} queued` + this.statusBarItem.tooltip = `${queueSize} telemetry events queued due to connection issues` + 
this.statusBarItem.backgroundColor = new vscode.ThemeColor("statusBarItem.warningBackground") + } else { + this.statusBarItem.text = `$(sync) Telemetry: ${queueSize} pending` + this.statusBarItem.tooltip = `${queueSize} telemetry events pending retry` + this.statusBarItem.backgroundColor = undefined + } + + this.statusBarItem.command = "roo-code.telemetry.showQueue" + this.statusBarItem.show() + }) + .catch((error) => { + console.warn("[TelemetryRetryQueue] Error updating status bar:", error) + }) + } + + private showNotificationIfNeeded(): void { + if (!this.config.enableNotifications) { + return + } + + const timeSinceLastSuccess = Date.now() - this.connectionStatus.lastSuccessfulSend + const fiveMinutes = 5 * 60 * 1000 + + // Show notification if disconnected for more than 5 minutes + if (!this.connectionStatus.isConnected && timeSinceLastSuccess > fiveMinutes) { + this.getQueue().then((queue) => { + if (queue.length > 0) { + vscode.window + .showWarningMessage( + `Telemetry connection issues detected. ${queue.length} events queued for retry.`, + "Retry Now", + "Disable Notifications", + ) + .then((selection) => { + if (selection === "Retry Now") { + // Trigger manual retry - this will be handled by the telemetry client + vscode.commands.executeCommand("roo-code.telemetry.retryNow") + } else if (selection === "Disable Notifications") { + this.config.enableNotifications = false + } + }) + } + }) + } + } +} diff --git a/packages/telemetry/src/__tests__/ResilientTelemetryClient.test.ts b/packages/telemetry/src/__tests__/ResilientTelemetryClient.test.ts new file mode 100644 index 0000000000..ae1e157346 --- /dev/null +++ b/packages/telemetry/src/__tests__/ResilientTelemetryClient.test.ts @@ -0,0 +1,230 @@ +import { describe, it, expect, beforeEach, afterEach, vi } from "vitest" +import * as vscode from "vscode" +import { ResilientTelemetryClient } from "../ResilientTelemetryClient" +import { TelemetryEventName, TelemetryClient } from "@roo-code/types" + +// Mock VSCode +vi.mock("vscode", () => ({ + window: { + createStatusBarItem: vi.fn(() => ({ + text: "", + tooltip: "", + backgroundColor: undefined, + command: "", + show: vi.fn(), + hide: vi.fn(), + dispose: vi.fn(), + })), + showWarningMessage: vi.fn(), + showInformationMessage: vi.fn(), + }, + StatusBarAlignment: { + Right: 2, + }, + ThemeColor: vi.fn(), + commands: { + executeCommand: vi.fn(), + registerCommand: vi.fn(), + }, +})) + +describe("ResilientTelemetryClient", () => { + let mockWrappedClient: TelemetryClient + let mockContext: vscode.ExtensionContext + let resilientClient: ResilientTelemetryClient + + beforeEach(() => { + mockWrappedClient = { + capture: vi.fn().mockResolvedValue(undefined), + setProvider: vi.fn(), + updateTelemetryState: vi.fn(), + isTelemetryEnabled: vi.fn().mockReturnValue(true), + shutdown: vi.fn().mockResolvedValue(undefined), + } + + mockContext = { + globalState: { + get: vi.fn().mockReturnValue([]), + update: vi.fn().mockResolvedValue(undefined), + }, + } as unknown as vscode.ExtensionContext + + resilientClient = new ResilientTelemetryClient(mockWrappedClient, mockContext) + }) + + afterEach(() => { + resilientClient.shutdown() + vi.clearAllMocks() + }) + + describe("constructor", () => { + it("should initialize with wrapped client", () => { + expect(resilientClient).toBeDefined() + }) + + it("should register commands", () => { + expect(vscode.commands.registerCommand).toHaveBeenCalledWith( + "roo-code.telemetry.showQueue", + expect.any(Function), + ) + 
expect(vscode.commands.registerCommand).toHaveBeenCalledWith( + "roo-code.telemetry.retryNow", + expect.any(Function), + ) + expect(vscode.commands.registerCommand).toHaveBeenCalledWith( + "roo-code.telemetry.clearQueue", + expect.any(Function), + ) + }) + }) + + describe("capture", () => { + it("should try immediate send first", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test" }, + } + + await resilientClient.capture(event) + + expect(mockWrappedClient.capture).toHaveBeenCalledWith(event) + }) + + it("should queue event if immediate send fails", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test" }, + } + + // Make wrapped client throw error + vi.mocked(mockWrappedClient.capture).mockRejectedValue(new Error("Network error")) + + await resilientClient.capture(event) + + expect(mockWrappedClient.capture).toHaveBeenCalledWith(event) + // Event should be queued (we can't directly test this without exposing internals) + }) + + it("should prioritize high priority events", async () => { + const highPriorityEvent = { + event: TelemetryEventName.SCHEMA_VALIDATION_ERROR, + properties: { error: "test" }, + } + + // Make wrapped client fail + vi.mocked(mockWrappedClient.capture).mockRejectedValue(new Error("Network error")) + + await resilientClient.capture(highPriorityEvent) + + expect(mockWrappedClient.capture).toHaveBeenCalledWith(highPriorityEvent) + }) + + it("should not queue if telemetry is disabled", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test" }, + } + + vi.mocked(mockWrappedClient.isTelemetryEnabled).mockReturnValue(false) + + await resilientClient.capture(event) + + // When telemetry is disabled, the wrapped client's capture should still be called + // but it should return early and not queue anything + expect(mockWrappedClient.capture).toHaveBeenCalledWith(event) + }) + }) + + describe("delegation methods", () => { + it("should delegate setProvider to wrapped client", () => { + const mockProvider = {} as Parameters[0] + resilientClient.setProvider(mockProvider) + + expect(mockWrappedClient.setProvider).toHaveBeenCalledWith(mockProvider) + }) + + it("should delegate updateTelemetryState to wrapped client", () => { + resilientClient.updateTelemetryState(true) + + expect(mockWrappedClient.updateTelemetryState).toHaveBeenCalledWith(true) + }) + + it("should delegate isTelemetryEnabled to wrapped client", () => { + const result = resilientClient.isTelemetryEnabled() + + expect(mockWrappedClient.isTelemetryEnabled).toHaveBeenCalled() + expect(result).toBe(true) + }) + + it("should return subscription from wrapped client", () => { + const mockSubscription = { type: "exclude", events: [] } as typeof mockWrappedClient.subscription + mockWrappedClient.subscription = mockSubscription + + expect(resilientClient.subscription).toBe(mockSubscription) + }) + }) + + describe("getQueueStatus", () => { + it("should return queue status", async () => { + const status = await resilientClient.getQueueStatus() + + expect(status).toHaveProperty("queueSize") + expect(status).toHaveProperty("connectionStatus") + expect(typeof status.queueSize).toBe("number") + expect(status.connectionStatus).toHaveProperty("isConnected") + }) + }) + + describe("retryNow", () => { + it("should trigger manual retry", async () => { + await expect(resilientClient.retryNow()).resolves.not.toThrow() + }) + }) + + describe("clearQueue", () => { + it("should clear the retry 
queue", async () => { + await expect(resilientClient.clearQueue()).resolves.not.toThrow() + }) + }) + + describe("updateRetryConfig", () => { + it("should update retry configuration", () => { + const newConfig = { maxRetries: 10, enableNotifications: false } + + expect(() => resilientClient.updateRetryConfig(newConfig)).not.toThrow() + }) + }) + + describe("shutdown", () => { + it("should shutdown wrapped client and cleanup", async () => { + await resilientClient.shutdown() + + expect(mockWrappedClient.shutdown).toHaveBeenCalled() + }) + }) + + describe("high priority events", () => { + const highPriorityEvents = [ + TelemetryEventName.SCHEMA_VALIDATION_ERROR, + TelemetryEventName.DIFF_APPLICATION_ERROR, + TelemetryEventName.SHELL_INTEGRATION_ERROR, + TelemetryEventName.CONSECUTIVE_MISTAKE_ERROR, + ] + + highPriorityEvents.forEach((eventName) => { + it(`should treat ${eventName} as high priority`, async () => { + const event = { + event: eventName, + properties: { test: "data" }, + } + + // Make wrapped client fail to trigger queueing + vi.mocked(mockWrappedClient.capture).mockRejectedValue(new Error("Network error")) + + await resilientClient.capture(event) + + expect(mockWrappedClient.capture).toHaveBeenCalledWith(event) + }) + }) + }) +}) diff --git a/packages/telemetry/src/__tests__/TelemetryRetryQueue.test.ts b/packages/telemetry/src/__tests__/TelemetryRetryQueue.test.ts new file mode 100644 index 0000000000..68d4f62619 --- /dev/null +++ b/packages/telemetry/src/__tests__/TelemetryRetryQueue.test.ts @@ -0,0 +1,286 @@ +import { describe, it, expect, beforeEach, afterEach, vi } from "vitest" +import * as vscode from "vscode" +import { TelemetryRetryQueue, DEFAULT_RETRY_CONFIG } from "../TelemetryRetryQueue" +import { TelemetryEventName } from "@roo-code/types" + +// Mock VSCode +vi.mock("vscode", () => ({ + window: { + createStatusBarItem: vi.fn(() => ({ + text: "", + tooltip: "", + backgroundColor: undefined, + command: "", + show: vi.fn(), + hide: vi.fn(), + dispose: vi.fn(), + })), + showWarningMessage: vi.fn(), + showInformationMessage: vi.fn(), + }, + StatusBarAlignment: { + Right: 2, + }, + ThemeColor: vi.fn(), + commands: { + executeCommand: vi.fn(), + registerCommand: vi.fn(), + }, +})) + +describe("TelemetryRetryQueue", () => { + let mockContext: vscode.ExtensionContext + let retryQueue: TelemetryRetryQueue + + beforeEach(() => { + mockContext = { + globalState: { + get: vi.fn().mockReturnValue([]), + update: vi.fn().mockResolvedValue(undefined), + }, + } as unknown as vscode.ExtensionContext + + retryQueue = new TelemetryRetryQueue(mockContext) + }) + + afterEach(() => { + retryQueue.dispose() + vi.clearAllMocks() + }) + + describe("constructor", () => { + it("should initialize with default config", () => { + expect(retryQueue).toBeDefined() + }) + + it("should accept custom config", () => { + const customConfig = { maxRetries: 3, baseDelayMs: 500 } + const customQueue = new TelemetryRetryQueue(mockContext, customConfig) + expect(customQueue).toBeDefined() + customQueue.dispose() + }) + }) + + describe("enqueue", () => { + it("should add event to queue", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test-123" }, + } + + await retryQueue.enqueue(event) + + expect(mockContext.globalState.update).toHaveBeenCalledWith( + "telemetryRetryQueue", + expect.arrayContaining([ + expect.objectContaining({ + event, + priority: "normal", + retryCount: 0, + }), + ]), + ) + }) + + it("should prioritize high priority events", async () => { + 
const normalEvent = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "normal" }, + } + + const highEvent = { + event: TelemetryEventName.SCHEMA_VALIDATION_ERROR, + properties: { error: "test" }, + } + + await retryQueue.enqueue(normalEvent, "normal") + await retryQueue.enqueue(highEvent, "high") + + // High priority event should be inserted before normal priority + const calls = vi.mocked(mockContext.globalState.update).mock.calls + const lastCall = calls[calls.length - 1] + const queue = lastCall[1] + + expect(queue[0].priority).toBe("high") + expect(queue[1].priority).toBe("normal") + }) + + it("should respect queue size limit", async () => { + const smallQueue = new TelemetryRetryQueue(mockContext, { maxQueueSize: 2 }) + + const event1 = { event: TelemetryEventName.TASK_CREATED, properties: { taskId: "1" } } + const event2 = { event: TelemetryEventName.TASK_CREATED, properties: { taskId: "2" } } + const event3 = { event: TelemetryEventName.TASK_CREATED, properties: { taskId: "3" } } + + await smallQueue.enqueue(event1) + await smallQueue.enqueue(event2) + await smallQueue.enqueue(event3) // Should remove oldest + + const queueSize = await smallQueue.getQueueSize() + expect(queueSize).toBe(2) + + smallQueue.dispose() + }) + }) + + describe("processQueue", () => { + it("should process events and remove successful ones", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test" }, + } + + // Mock existing queue with one event + vi.mocked(mockContext.globalState.get).mockReturnValue([ + { + id: "test-id", + event, + timestamp: Date.now(), + retryCount: 0, + nextRetryAt: Date.now() - 1000, // Ready for retry + priority: "normal", + }, + ]) + + const sendFunction = vi.fn().mockResolvedValue(true) // Success + + await retryQueue.processQueue(sendFunction) + + expect(sendFunction).toHaveBeenCalledWith(event) + expect(mockContext.globalState.update).toHaveBeenCalledWith("telemetryRetryQueue", []) + }) + + it("should increment retry count for failed events", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test" }, + } + + const queuedEvent = { + id: "test-id", + event, + timestamp: Date.now(), + retryCount: 0, + nextRetryAt: Date.now() - 1000, + priority: "normal", + } + + vi.mocked(mockContext.globalState.get).mockReturnValue([queuedEvent]) + + const sendFunction = vi.fn().mockResolvedValue(false) // Failure + + await retryQueue.processQueue(sendFunction) + + expect(sendFunction).toHaveBeenCalledWith(event) + + const updateCalls = vi.mocked(mockContext.globalState.update).mock.calls + const lastCall = updateCalls[updateCalls.length - 1] + const updatedQueue = lastCall[1] + + expect(updatedQueue[0].retryCount).toBe(1) + expect(updatedQueue[0].nextRetryAt).toBeGreaterThan(Date.now()) + }) + + it("should remove events that exceed max retries", async () => { + const event = { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: "test" }, + } + + const queuedEvent = { + id: "test-id", + event, + timestamp: Date.now(), + retryCount: DEFAULT_RETRY_CONFIG.maxRetries, // Already at max + nextRetryAt: Date.now() - 1000, + priority: "normal", + } + + vi.mocked(mockContext.globalState.get).mockReturnValue([queuedEvent]) + + const sendFunction = vi.fn().mockResolvedValue(false) // Failure + + await retryQueue.processQueue(sendFunction) + + expect(mockContext.globalState.update).toHaveBeenCalledWith("telemetryRetryQueue", []) + }) + + it("should process events in batches", async () 
=> { + const events = Array.from({ length: 15 }, (_, i) => ({ + id: `test-id-${i}`, + event: { + event: TelemetryEventName.TASK_CREATED, + properties: { taskId: `test-${i}` }, + }, + timestamp: Date.now(), + retryCount: 0, + nextRetryAt: Date.now() - 1000, + priority: "normal" as const, + })) + + vi.mocked(mockContext.globalState.get).mockReturnValue(events) + + const sendFunction = vi.fn().mockResolvedValue(true) + + await retryQueue.processQueue(sendFunction) + + // Should only process batch size (default 10) + expect(sendFunction).toHaveBeenCalledTimes(DEFAULT_RETRY_CONFIG.batchSize) + }) + }) + + describe("getQueueSize", () => { + it("should return correct queue size", async () => { + const events = [ + { id: "1", event: {}, timestamp: 0, retryCount: 0, nextRetryAt: 0, priority: "normal" }, + { id: "2", event: {}, timestamp: 0, retryCount: 0, nextRetryAt: 0, priority: "normal" }, + ] + + vi.mocked(mockContext.globalState.get).mockReturnValue(events) + + const size = await retryQueue.getQueueSize() + expect(size).toBe(2) + }) + }) + + describe("clearQueue", () => { + it("should clear all events from queue", async () => { + await retryQueue.clearQueue() + + expect(mockContext.globalState.update).toHaveBeenCalledWith("telemetryRetryQueue", []) + }) + }) + + describe("getConnectionStatus", () => { + it("should return connection status", () => { + const status = retryQueue.getConnectionStatus() + + expect(status).toHaveProperty("isConnected") + expect(status).toHaveProperty("lastSuccessfulSend") + expect(status).toHaveProperty("consecutiveFailures") + }) + }) + + describe("updateConfig", () => { + it("should update configuration", () => { + const newConfig = { maxRetries: 10, enableNotifications: false } + + retryQueue.updateConfig(newConfig) + + // Config should be updated (we can't directly test private properties, + // but we can test behavior changes) + expect(() => retryQueue.updateConfig(newConfig)).not.toThrow() + }) + }) + + describe("triggerRetry", () => { + it("should manually trigger retry processing", async () => { + const sendFunction = vi.fn().mockResolvedValue(true) + + await retryQueue.triggerRetry(sendFunction) + + // Should not throw and should call processQueue internally + expect(() => retryQueue.triggerRetry(sendFunction)).not.toThrow() + }) + }) +}) diff --git a/packages/telemetry/src/index.ts b/packages/telemetry/src/index.ts index 8795ad46a2..7fbf917d57 100644 --- a/packages/telemetry/src/index.ts +++ b/packages/telemetry/src/index.ts @@ -1,3 +1,5 @@ export * from "./BaseTelemetryClient" export * from "./PostHogTelemetryClient" export * from "./TelemetryService" +export * from "./TelemetryRetryQueue" +export * from "./ResilientTelemetryClient" diff --git a/packages/types/src/global-settings.ts b/packages/types/src/global-settings.ts index 5b729a125f..5e29a0cb7f 100644 --- a/packages/types/src/global-settings.ts +++ b/packages/types/src/global-settings.ts @@ -10,7 +10,7 @@ import { import { historyItemSchema } from "./history.js" import { codebaseIndexModelsSchema, codebaseIndexConfigSchema } from "./codebase-index.js" import { experimentsSchema } from "./experiment.js" -import { telemetrySettingsSchema } from "./telemetry.js" +import { telemetrySettingsSchema, telemetryRetrySettingsSchema } from "./telemetry.js" import { modeConfigSchema } from "./mode.js" import { customModePromptsSchema, customSupportPromptsSchema } from "./mode.js" import { languagesSchema } from "./vscode.js" @@ -92,6 +92,7 @@ export const globalSettingsSchema = z.object({ language: 
languagesSchema.optional(), telemetrySetting: telemetrySettingsSchema.optional(), + telemetryRetrySettings: telemetryRetrySettingsSchema.optional(), mcpEnabled: z.boolean().optional(), enableMcpServerCreation: z.boolean().optional(), diff --git a/packages/types/src/telemetry.ts b/packages/types/src/telemetry.ts index 9861f4425d..ae5fca1eed 100644 --- a/packages/types/src/telemetry.ts +++ b/packages/types/src/telemetry.ts @@ -172,3 +172,29 @@ export interface TelemetryClient { isTelemetryEnabled(): boolean shutdown(): Promise } + +/** + * TelemetryRetrySettings + */ + +export const telemetryRetrySettingsSchema = z.object({ + maxRetries: z.number().min(0).max(10).optional(), + baseDelayMs: z.number().min(100).max(10000).optional(), + maxDelayMs: z.number().min(1000).max(600000).optional(), + maxQueueSize: z.number().min(10).max(10000).optional(), + batchSize: z.number().min(1).max(100).optional(), + enableNotifications: z.boolean().optional(), + enableRetryQueue: z.boolean().optional(), +}) + +export type TelemetryRetrySettings = z.infer + +export const DEFAULT_TELEMETRY_RETRY_SETTINGS: Required = { + maxRetries: 5, + baseDelayMs: 1000, + maxDelayMs: 300000, + maxQueueSize: 1000, + batchSize: 10, + enableNotifications: true, + enableRetryQueue: true, +} diff --git a/src/extension.ts b/src/extension.ts index 64963ac4d8..d0a5f09ff6 100644 --- a/src/extension.ts +++ b/src/extension.ts @@ -13,7 +13,7 @@ try { } import { CloudService } from "@roo-code/cloud" -import { TelemetryService, PostHogTelemetryClient } from "@roo-code/telemetry" +import { TelemetryService, PostHogTelemetryClient, ResilientTelemetryClient } from "@roo-code/telemetry" import "./utils/path" // Necessary to have access to String.prototype.toPosix. import { createOutputChannelLogger, createDualLogger } from "./utils/outputChannelLogger" @@ -64,9 +64,25 @@ export async function activate(context: vscode.ExtensionContext) { const telemetryService = TelemetryService.createInstance() try { - telemetryService.register(new PostHogTelemetryClient()) + // Get telemetry retry settings from VSCode configuration + const config = vscode.workspace.getConfiguration("roo-cline") + const retryConfig = { + enableRetryQueue: config.get("telemetryRetryEnabled", true), + maxRetries: config.get("telemetryRetryMaxRetries", 5), + baseDelayMs: config.get("telemetryRetryBaseDelay", 1000), + maxDelayMs: config.get("telemetryRetryMaxDelay", 300000), + maxQueueSize: config.get("telemetryRetryQueueSize", 1000), + enableNotifications: config.get("telemetryRetryNotifications", true), + batchSize: 10, // Not configurable via UI, use default + } + + // Create PostHog client and wrap it with resilient retry functionality + const postHogClient = new PostHogTelemetryClient() + const resilientClient = new ResilientTelemetryClient(postHogClient, context, retryConfig) + + telemetryService.register(resilientClient) } catch (error) { - console.warn("Failed to register PostHogTelemetryClient:", error) + console.warn("Failed to register ResilientTelemetryClient:", error) } // Create logger for cloud services diff --git a/src/package.json b/src/package.json index 5e2fd096e1..e8fa0915fb 100644 --- a/src/package.json +++ b/src/package.json @@ -344,6 +344,44 @@ "type": "boolean", "default": false, "description": "%settings.rooCodeCloudEnabled.description%" + }, + "roo-cline.telemetryRetryEnabled": { + "type": "boolean", + "default": true, + "description": "Enable automatic retry for failed telemetry events" + }, + "roo-cline.telemetryRetryMaxRetries": { + "type": "number", 
+ "default": 5, + "minimum": 0, + "maximum": 10, + "description": "Maximum number of retry attempts for failed telemetry events" + }, + "roo-cline.telemetryRetryBaseDelay": { + "type": "number", + "default": 1000, + "minimum": 100, + "maximum": 10000, + "description": "Base delay in milliseconds between retry attempts (exponential backoff)" + }, + "roo-cline.telemetryRetryMaxDelay": { + "type": "number", + "default": 300000, + "minimum": 1000, + "maximum": 600000, + "description": "Maximum delay in milliseconds between retry attempts (5 minutes default)" + }, + "roo-cline.telemetryRetryQueueSize": { + "type": "number", + "default": 1000, + "minimum": 10, + "maximum": 10000, + "description": "Maximum number of telemetry events to queue for retry" + }, + "roo-cline.telemetryRetryNotifications": { + "type": "boolean", + "default": true, + "description": "Show notifications when telemetry connection issues are detected" } } }