Commit c33d2e2

Update the coffee voice agent README
1 parent 32f5a5b commit c33d2e2

1 file changed: +201 -0 lines changed

coffee_ws/src/coffee_voice_agent/README.md

Lines changed: 201 additions & 0 deletions
@@ -91,6 +91,10 @@ The refactored version was created through careful **file-based modular extracti
- **🌐 ROS2 Bridge**: WebSocket-based integration with Coffee Buddy system
- **📡 Virtual Requests**: External coffee requests via ROS2 topics
- **🔧 Tool Events**: Real-time function tool call tracking for UI feedback
- **👑 VIP Session Management**: Automatic VIP user detection with unlimited conversation time
- **🎯 Smart Conversation Timing**: Event-driven ending system with proper TTS completion
- **🧠 Context-Aware Admin Messages**: Identity-aware conversation guidance
- **⚙️ Admin Override UI**: Real-time VIP detection and extension monitoring

## Architecture

@@ -147,6 +151,8 @@ coffee_voice_agent/
- **`AGENT_STATUS`**: Comprehensive status updates (behavioral mode, speech status, emotion, text, etc.)
- **`TOOL_EVENT`**: Function tool execution tracking (started, completed, failed)
- **`USER_SPEECH`**: Real-time STT transcription events (user speech text)
- **`VIP_DETECTED`**: VIP user identification with matched keywords and importance level
- **`EXTENSION_GRANTED`**: Conversation extension events (VIP sessions, manual extensions)
- **`STARTUP`**: Agent initialization and version info
- **`ACKNOWLEDGMENT`**: Command confirmations

@@ -275,6 +281,157 @@ async for text_chunk in text:
- **Low Latency**: Real-time processing for responsive conversations
- **Unified Events**: Single AgentStatus message provides complete context

## VIP Session Management & Smart Timing

### **👑 VIP User Detection System**

The voice agent includes intelligent VIP user detection that automatically provides enhanced service for important users:

#### **🔍 VIP Detection Keywords**
```python
vip_keywords = [
    "alice", "bob", "sui foundation", "event organizer", "staff", "organizer",
    "speaker", "sponsor", "mysten labs", "team", "developer", "builder"
]
```

298+
#### **🎯 VIP Session Flow**
299+
```
300+
User: "I'm from Sui Foundation"
301+
302+
check_user_status tool called
303+
304+
VIP detected → set_vip_session()
305+
306+
Hard timeout cancelled (7-minute limit removed)
307+
308+
Only inactivity timeout applies (15 seconds of silence)
309+
310+
Unlimited conversation time while user is engaged
311+
```
312+
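The "hard timeout cancelled" step in the flow above could be realized by cancelling a pending `asyncio` task. The sketch below illustrates that idea only: `SessionTimers` is a hypothetical class, not the project's `StateManager`, and the timeout value is passed in rather than taken from the agent's configuration.

```python
import asyncio
from typing import Optional


class SessionTimers:
    """Hypothetical holder for the hard-timeout task (illustration only)."""

    def __init__(self, hard_timeout_s: float) -> None:
        self.hard_timeout_s = hard_timeout_s
        self.is_vip = False
        self.timed_out = False
        self._hard_task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # Must be called from inside a running event loop.
        self._hard_task = asyncio.create_task(self._hard_timeout())

    async def _hard_timeout(self) -> None:
        await asyncio.sleep(self.hard_timeout_s)
        self.timed_out = True  # a real agent would end the conversation here

    def set_vip_session(self) -> None:
        # Mark the session as VIP and drop the hard limit if it is still pending.
        self.is_vip = True
        if self._hard_task is not None and not self._hard_task.done():
            self._hard_task.cancel()
```

In this sketch the inactivity timeout is left out; per the flow above it would keep running independently of the hard limit.
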
### **⏰ Event-Driven Conversation Ending**

The system uses **event-driven TTS completion detection** instead of magic number delays:

#### **🚫 Old Approach (Magic Numbers)**
```python
# Problematic: Guessing TTS completion time
await asyncio.sleep(2)  # ❌ Magic number
await self.end_conversation()
```

#### **✅ New Approach (Event-Driven)**
```python
# Proper: Wait for actual TTS completion
self.end_after_current_speech = True  # Set flag

# In agent_state_changed handler:
if event.old_state == "speaking" and event.new_state != "speaking":
    if self.end_after_current_speech:
        # TTS actually completed
        await self.end_conversation()
```

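The fragments above can be assembled into a small self-contained example. This is a sketch under assumptions: `StateChange` and `ConversationEnder` are hypothetical stand-ins for the agent's real event and session classes, and resetting the flag after use is an addition here so that the conversation ends only once.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StateChange:
    """Hypothetical stand-in for the agent_state_changed event payload."""
    old_state: str
    new_state: str


class ConversationEnder:
    def __init__(self) -> None:
        self.end_after_current_speech = False
        self.ended = False

    def request_end_after_speech(self) -> None:
        # Ask for the conversation to end once the current utterance finishes.
        self.end_after_current_speech = True

    async def on_agent_state_changed(self, event: StateChange) -> None:
        finished_speaking = (
            event.old_state == "speaking" and event.new_state != "speaking"
        )
        if finished_speaking and self.end_after_current_speech:
            self.end_after_current_speech = False  # consume the flag
            await self.end_conversation()

    async def end_conversation(self) -> None:
        self.ended = True  # placeholder for the real teardown
```

Because the ending is triggered by the state transition itself, no delay has to be tuned to the length of the final utterance.
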
### **🧠 Smart Admin Message System**

Context-aware admin messages guide the LLM to make better tool choices:

#### **🔄 Identity Detection Logic**
```python
# Check user message for identity claims
identity_keywords = [
    "foundation", "team", "staff", "organizer", "speaker", "sponsor",
    "developer", "builder", "employee", "contractor", "member", "labs"
]

if user_mentions_identity:
    admin_message = "ADMIN: User mentioned their identity. Call check_user_status FIRST to verify their status, then manage_conversation_time if needed."
else:
    admin_message = "ADMIN: You MUST call the manage_conversation_time tool now to either extend or end gracefully."
```

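The snippet above does not show how `user_mentions_identity` is computed. One plausible reading, offered only as a sketch, is a case-insensitive substring check against the keyword list; `build_admin_message` is a hypothetical function name, while the two message strings are copied from the README.

```python
IDENTITY_KEYWORDS = [
    "foundation", "team", "staff", "organizer", "speaker", "sponsor",
    "developer", "builder", "employee", "contractor", "member", "labs",
]

IDENTITY_MESSAGE = (
    "ADMIN: User mentioned their identity. Call check_user_status FIRST to "
    "verify their status, then manage_conversation_time if needed."
)
DEFAULT_MESSAGE = (
    "ADMIN: You MUST call the manage_conversation_time tool now to either "
    "extend or end gracefully."
)


def build_admin_message(user_message: str) -> str:
    """Pick the admin message based on whether the user made an identity claim."""
    text = user_message.lower()
    # Assumption: any keyword appearing as a substring counts as an identity claim.
    user_mentions_identity = any(kw in text for kw in IDENTITY_KEYWORDS)
    return IDENTITY_MESSAGE if user_mentions_identity else DEFAULT_MESSAGE
```
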
#### **⚙️ VIP vs Regular User Flows**

**Regular User Timeline:**
- 5 minutes: Gentle time awareness message
- 6 minutes: Admin message to call time management tool
- 7 minutes: Hard timeout with conversation ending

**VIP User Timeline:**
- VIP detected: Hard timeout cancelled permanently
- Admin messages disabled for VIP sessions
- Only 15-second inactivity timeout applies
- Unlimited conversation duration while engaged

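The two timelines can be summarized as a single lookup. The function below is a sketch: the thresholds come from the lists above, but `timeline_stage` and its stage labels are hypothetical names chosen for illustration.

```python
# Thresholds in minutes, taken from the regular-user timeline above.
TIME_AWARENESS_MIN = 5
ADMIN_MESSAGE_MIN = 6
HARD_TIMEOUT_MIN = 7


def timeline_stage(elapsed_minutes: float, is_vip: bool) -> str:
    """Return which stage of the conversation timeline applies (labels are illustrative)."""
    if is_vip:
        # VIP sessions have no hard limit; only the inactivity timeout applies.
        return "unlimited"
    if elapsed_minutes >= HARD_TIMEOUT_MIN:
        return "hard_timeout"
    if elapsed_minutes >= ADMIN_MESSAGE_MIN:
        return "admin_message"
    if elapsed_minutes >= TIME_AWARENESS_MIN:
        return "time_awareness"
    return "normal"
```
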
### **📡 Enhanced WebSocket Events**

New WebSocket events support VIP detection and session management:

#### **🔄 VIP Detection Event**
```json
{
  "type": "VIP_DETECTED",
  "data": {
    "user_identifier": "Sui Foundation",
    "matched_keywords": ["sui foundation"],
    "importance_level": "vip",
    "recommended_extension_minutes": 0,
    "timestamp": "2025-07-31T20:38:48.000Z"
  }
}
```

#### **🔄 Extension Granted Event**
```json
{
  "type": "EXTENSION_GRANTED",
  "data": {
    "action": "vip_session",
    "extension_minutes": 0,
    "reason": "VIP user detected: Sui Foundation",
    "granted_by": "auto_vip_detection",
    "timestamp": "2025-07-31T20:38:48.100Z"
  }
}
```

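A client consuming these events might dispatch on the `type` field. The sketch below uses only the field names shown in the two payloads above; `describe_event` is a hypothetical helper, and treating `extension_minutes == 0` as "unlimited" follows the ExtensionEvent message comments elsewhere in this README.

```python
import json


def describe_event(raw: str) -> str:
    """Turn a raw WebSocket message into a short human-readable summary."""
    event = json.loads(raw)
    event_type = event.get("type")
    data = event.get("data", {})

    if event_type == "VIP_DETECTED":
        keywords = ", ".join(data["matched_keywords"])
        return f"VIP detected: {data['user_identifier']} ({keywords})"

    if event_type == "EXTENSION_GRANTED":
        minutes = data["extension_minutes"]
        duration = "unlimited" if minutes == 0 else f"{minutes} min"
        return f"Extension {data['action']}: {duration}, granted by {data['granted_by']}"

    return f"Unhandled event type: {event_type}"
```
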
### **⚙️ Admin Override UI Widget**

The UI includes a new `AdminOverrideWidget` that displays:

- **VIP Status**: Real-time VIP user detection
- **Extension Status**: Active conversation extensions with progress
- **VIP History**: Recent VIP detections and actions
- **Visual Indicators**: Extension progress bars and status updates

**UI Layout Integration:**
```
Left Column:
├── Agent Status Widget
├── Emotion Display Widget
└── Admin Override Widget (new)
```

### **🎯 Technical Benefits**

1. **Eliminates Magic Numbers**: No more hardcoded delays for TTS completion
2. **Proper VIP Treatment**: Unlimited time for important users
3. **Race Condition Free**: Event-driven coordination prevents timing conflicts
4. **Context-Aware Guidance**: Smart admin messages improve LLM tool selection
5. **Real-Time Monitoring**: UI shows VIP detection and extension status
6. **Natural Conversation Flow**: Conversations end gracefully after speech completes

### **🔧 Architecture Improvements**

The improvements maintain clean separation of concerns:

- **VIP Detection**: Handled by `check_user_status` tool
- **Session Management**: Managed by `StateManager.set_vip_session()`
- **Event Coordination**: Uses existing `agent_state_changed` events
- **UI Integration**: New WebSocket events feed Admin Override widget
- **Timing Logic**: Single flag-based system replaces multiple timers

### **STT and Speech Recognition Flow**

Understanding how speech-to-text processing and user speech capture works:
@@ -450,6 +607,8 @@ ros2 launch coffee_voice_agent voice_agent_system.launch.py
- `/voice_agent/status` (`coffee_voice_agent_msgs/AgentStatus`) - **Unified agent status for robot coordination**
- `/voice_agent/tool_events` (`coffee_voice_agent_msgs/ToolEvent`) - **Function tool call tracking**
- `/voice_agent/user_speech` (`std_msgs/String`) - **Real-time STT transcription events**
- `/voice_agent/vip_detections` (`coffee_voice_agent_msgs/VipDetection`) - **VIP user detection events**
- `/voice_agent/extension_events` (`coffee_voice_agent_msgs/ExtensionEvent`) - **Conversation extension events**
- `/voice_agent/connected` (`std_msgs/Bool`) - Bridge connection status

### Subscribers (ROS2 → Voice Agent)
@@ -503,6 +662,42 @@ string status
builtin_interfaces/Time timestamp
```

### VipDetection Message
```
# User identifier mentioned by the user
string user_identifier

# Keywords that matched for VIP detection
string[] matched_keywords

# Importance level: "vip", "high", "normal"
string importance_level

# Recommended extension minutes (0 for unlimited VIP sessions)
int32 recommended_extension_minutes

# Timestamp when VIP was detected
builtin_interfaces/Time timestamp
```

### ExtensionEvent Message
```
# Action type: "vip_session", "granted", "expired", "updated"
string action

# Extension duration in minutes (0 for unlimited)
int32 extension_minutes

# Reason for the extension
string reason

# Who granted the extension: "auto_vip_detection", "tool", "manual"
string granted_by

# Timestamp when extension was granted/updated
builtin_interfaces/Time timestamp
```

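Both messages carry a `builtin_interfaces/Time` field, which stores `sec` and `nanosec`, whereas the WebSocket events carry ISO 8601 strings. How the bridge converts between the two is not shown here; the sketch below is one way it could be done with the standard library alone. `iso_to_ros_time` is a hypothetical helper, and it returns a plain tuple rather than a ROS message so that it runs without `rclpy`.

```python
from datetime import datetime, timezone
from typing import Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_ros_time(iso: str) -> Tuple[int, int]:
    """Convert an ISO 8601 UTC timestamp into (sec, nanosec)."""
    # Older Python versions do not accept a trailing "Z", so normalize it first.
    moment = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    delta = moment - EPOCH
    sec = delta.days * 86400 + delta.seconds
    nanosec = delta.microseconds * 1000
    return sec, nanosec


if __name__ == "__main__":
    print(iso_to_ros_time("2025-07-31T20:38:48.100Z"))
```

The two integers map directly onto the `sec` and `nanosec` fields of `builtin_interfaces/Time`.
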
## Virtual Requests

Send coffee requests to the voice agent via ROS2:
@@ -599,6 +794,12 @@ ros2 topic echo /voice_agent/tool_events
# Monitor user speech transcriptions
ros2 topic echo /voice_agent/user_speech

# Monitor VIP detection events
ros2 topic echo /voice_agent/vip_detections

# Monitor conversation extension events
ros2 topic echo /voice_agent/extension_events

# Monitor bridge connection
ros2 topic echo /voice_agent/connected
```
