diff --git a/python/packages/devui/README.md b/python/packages/devui/README.md
Lines changed: 42 additions & 5 deletions
diff --git a/python/packages/devui/agent_framework_devui/_executor.py b/python/packages/devui/agent_framework_devui/_executor.py
Lines changed: 34 additions & 1 deletion
diff --git a/python/packages/devui/agent_framework_devui/_mapper.py b/python/packages/devui/agent_framework_devui/_mapper.py
Lines changed: 83 additions & 58 deletions
@@ -78,7 +78,7 @@ devui ./agents --tracing framework
 
 ## OpenAI-Compatible API
 
-DevUI provides a clean OpenAI-compatible API. Simply use your **agent/workflow name as the model**!
+For convenience, DevUI provides an OpenAI Responses backend API, so you can run the backend and connect to it with the OpenAI client SDK. Use the **agent/workflow name as the model**, and enable streaming (`stream=True`) as needed.
 
 ```bash
 # Simple - use your entity name as the model
@@ -89,7 +89,6 @@ curl -X POST http://localhost:8080/v1/responses \
   "model": "weather_agent",
   "input": "Hello world"
 }
-
 ```
 
 Or use the OpenAI Python SDK:
@@ -102,7 +101,6 @@ client = OpenAI(
     api_key="not-needed"  # API key not required for local DevUI
 )
 
-# Simple - just use your agent/workflow name as the model!
 response = client.responses.create(
     model="weather_agent",  # Your agent/workflow name
     input="What's the weather in Seattle?"
@@ -137,7 +135,7 @@ response2 = client.responses.create(
 )
 ```
 
-**How it works:** OpenAI automatically prepends previous conversation items to each request and appends new items after completion. You don't need to manually pass message history.
+**How it works:** DevUI automatically retrieves the conversation's message history from the stored thread and passes it to the agent. You don't need to manually manage message history - just provide the same `conversation` ID for follow-up requests.
 
 ## CLI Options
 
@@ -155,26 +153,65 @@ Options:
 
 ## Key Endpoints
 
+## API Mapping
+
+Given that DevUI offers an OpenAI Responses API, it internally maps messages and events from Agent Framework to OpenAI Responses API events (in `_mapper.py`). For transparency, this mapping is shown below:
+
+| Agent Framework Content         | OpenAI Event/Type                        | Status   |
+| ------------------------------- | ---------------------------------------- | -------- |
+| `TextContent`                   | `response.output_text.delta`             | Standard |
+| `TextReasoningContent`          | `response.reasoning.delta`               | Standard |
+| `FunctionCallContent` (initial) | `response.output_item.added`             | Standard |
+| `FunctionCallContent` (args)    | `response.function_call_arguments.delta` | Standard |
+| `FunctionResultContent`         | `response.function_result.complete`      | DevUI    |
+| `ErrorContent`                  | `response.error`                         | Standard |
+| `UsageContent`                  | Final `Response.usage` field (not streamed) | Standard |
+| `WorkflowEvent`                 | `response.workflow_event.complete`       | DevUI    |
+| `DataContent`, `UriContent`     | `response.trace.complete`                | DevUI    |
+
+- **Standard** = OpenAI Responses API spec
+- **DevUI** = Custom extensions for Agent Framework features (workflows, traces, function results)
+
+### OpenAI Responses API Compliance
+
+DevUI follows the OpenAI Responses API specification for maximum compatibility:
+
+**Standard OpenAI Types Used:**
+- `ResponseOutputItemAddedEvent` - Output item notifications (function calls)
+- `Response.usage` - Token usage (in final response, not streamed)
+- All standard text, reasoning, and function call events
+
+**Custom DevUI Extensions:**
+- `response.function_result.complete` - Function execution results (DevUI executes functions, OpenAI doesn't)
+- `response.workflow_event.complete` - Agent Framework workflow events
+- `response.trace.complete` - Execution traces for debugging
+
+These custom extensions are clearly namespaced and can be safely ignored by standard OpenAI clients.
+
 ### Entity Management
+
 - `GET /v1/entities` - List discovered agents/workflows
 - `GET /v1/entities/{entity_id}/info` - Get detailed entity information
 - `POST /v1/entities/add` - Add entity from URL (for gallery samples)
 - `DELETE /v1/entities/{entity_id}` - Remove remote entity
 
 ### Execution (OpenAI Responses API)
+
 - `POST /v1/responses` - Execute agent/workflow (streaming or sync)
 
 ### Conversations (OpenAI Standard)
+
 - `POST /v1/conversations` - Create conversation
 - `GET /v1/conversations/{id}` - Get conversation
 - `POST /v1/conversations/{id}` - Update conversation metadata
 - `DELETE /v1/conversations/{id}` - Delete conversation
-- `GET /v1/conversations?agent_id={id}` - List conversations *(DevUI extension)*
+- `GET /v1/conversations?agent_id={id}` - List conversations _(DevUI extension)_
 - `POST /v1/conversations/{id}/items` - Add items to conversation
 - `GET /v1/conversations/{id}/items` - List conversation items
 - `GET /v1/conversations/{id}/items/{item_id}` - Get conversation item
 
 ### Health
+
 - `GET /health` - Health check
 
 ## Implementation
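The mapping table in the README hunk above says that standard OpenAI clients can ignore the three custom DevUI event types. A minimal sketch of that idea follows. It assumes events have already been decoded into plain dicts with a `type` key (the OpenAI SDK yields typed objects instead), and `split_events` is a hypothetical helper, not part of DevUI.

```python
# Custom DevUI event types, copied from the mapping table in the README hunk.
DEVUI_EXTENSION_TYPES = frozenset({
    "response.function_result.complete",
    "response.workflow_event.complete",
    "response.trace.complete",
})


def split_events(events):
    """Partition event dicts into (standard, devui_extension) lists by their "type" key."""
    standard, extensions = [], []
    for event in events:
        if event.get("type") in DEVUI_EXTENSION_TYPES:
            extensions.append(event)
        else:
            standard.append(event)
    return standard, extensions


# Hypothetical sample stream; real events carry more fields than shown here.
sample = [
    {"type": "response.output_text.delta", "delta": "Sunny"},
    {"type": "response.function_result.complete", "call_id": "call_1", "output": "72F"},
    {"type": "response.trace.complete"},
]
standard, extensions = split_events(sample)
print([e["type"] for e in standard])
print([e["type"] for e in extensions])
```

A client that only understands the Responses spec would process the first list and drop the second. A debugging UI would render both.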
 
@@ -232,7 +232,6 @@ async def _execute_agent(
                 logger.debug(f"Executing agent with text input: {user_message[:100]}...")
             else:
                 logger.debug(f"Executing agent with multimodal ChatMessage: {type(user_message)}")
-
             # Check if agent supports streaming
             if hasattr(agent, "run_stream") and callable(agent.run_stream):
                 # Use Agent Framework's native streaming with optional thread
@@ -433,6 +432,40 @@ def _convert_openai_input_to_chat_message(
                                     elif file_url:
                                         contents.append(DataContent(uri=file_url, media_type=media_type))
 
+                                elif content_type == "function_approval_response":
+                                    # Handle function approval response (DevUI extension)
+                                    try:
+                                        from agent_framework import FunctionApprovalResponseContent, FunctionCallContent
+
+                                        request_id = content_item.get("request_id", "")
+                                        approved = content_item.get("approved", False)
+                                        function_call_data = content_item.get("function_call", {})
+
+                                        # Create FunctionCallContent from the function_call data
+                                        function_call = FunctionCallContent(
+                                            call_id=function_call_data.get("id", ""),
+                                            name=function_call_data.get("name", ""),
+                                            arguments=function_call_data.get("arguments", {}),
+                                        )
+
+                                        # Create FunctionApprovalResponseContent with correct signature
+                                        approval_response = FunctionApprovalResponseContent(
+                                            approved,  # positional argument
+                                            id=request_id,  # keyword argument 'id', NOT 'request_id'
+                                            function_call=function_call,  # FunctionCallContent object
+                                        )
+                                        contents.append(approval_response)
+                                        logger.info(
+                                            f"Added FunctionApprovalResponseContent: id={request_id}, "
+                                            f"approved={approved}, call_id={function_call.call_id}"
+                                        )
+                                    except ImportError:
+                                        logger.warning(
+                                            "FunctionApprovalResponseContent not available in agent_framework"
+                                        )
+                                    except Exception as e:
+                                        logger.error(f"Failed to create FunctionApprovalResponseContent: {e}")
+
             # Handle other OpenAI input item types as needed
             # (tool calls, function results, etc.)
 
 
@@ -27,7 +27,6 @@
     ResponseTextDeltaEvent,
     ResponseTraceEventComplete,
     ResponseUsage,
-    ResponseUsageEventComplete,
     ResponseWorkflowEventComplete,
 )
 
@@ -37,9 +36,8 @@
 EventType = Union[
     ResponseStreamEvent,
     ResponseWorkflowEventComplete,
-    ResponseFunctionResultComplete,
+    ResponseOutputItemAddedEvent,
     ResponseTraceEventComplete,
-    ResponseUsageEventComplete,
 ]
 
 
@@ -56,6 +54,9 @@ def __init__(self, max_contexts: int = 1000) -> None:
         self._conversion_contexts: OrderedDict[int, dict[str, Any]] = OrderedDict()
         self._max_contexts = max_contexts
 
+        # Track usage per request for final Response.usage (OpenAI standard)
+        self._usage_accumulator: dict[str, dict[str, int]] = {}
+
         # Register content type mappers for all 12 Agent Framework content types
         self.content_mappers = {
             "TextContent": self._map_text_content,
@@ -171,17 +172,31 @@ async def aggregate_to_response(self, events: Sequence[Any], request: AgentFrame
                 status="completed",
             )
 
-            # Create usage object
-            input_token_count = len(str(request.input)) // 4 if request.input else 0
-            output_token_count = len(full_content) // 4
-
-            usage = ResponseUsage(
-                input_tokens=input_token_count,
-                output_tokens=output_token_count,
-                total_tokens=input_token_count + output_token_count,
-                input_tokens_details=InputTokensDetails(cached_tokens=0),
-                output_tokens_details=OutputTokensDetails(reasoning_tokens=0),
-            )
+            # Get usage from accumulator (OpenAI standard)
+            request_id = str(id(request))
+            usage_data = self._usage_accumulator.get(request_id)
+
+            if usage_data:
+                usage = ResponseUsage(
+                    input_tokens=usage_data["input_tokens"],
+                    output_tokens=usage_data["output_tokens"],
+                    total_tokens=usage_data["total_tokens"],
+                    input_tokens_details=InputTokensDetails(cached_tokens=0),
+                    output_tokens_details=OutputTokensDetails(reasoning_tokens=0),
+                )
+                # Cleanup accumulator
+                del self._usage_accumulator[request_id]
+            else:
+                # Fallback: estimate if no usage was tracked
+                input_token_count = len(str(request.input)) // 4 if request.input else 0
+                output_token_count = len(full_content) // 4
+                usage = ResponseUsage(
+                    input_tokens=input_token_count,
+                    output_tokens=output_token_count,
+                    total_tokens=input_token_count + output_token_count,
+                    input_tokens_details=InputTokensDetails(cached_tokens=0),
+                    output_tokens_details=OutputTokensDetails(reasoning_tokens=0),
+                )
 
             return OpenAIResponse(
                 id=f"resp_{uuid.uuid4().hex[:12]}",
@@ -229,6 +244,7 @@ def _get_or_create_context(self, request: AgentFrameworkRequest) -> dict[str, An
                 "item_id": f"msg_{uuid.uuid4().hex[:8]}",
                 "content_index": 0,
                 "output_index": 0,
+                "request_id": str(request_key),  # For usage accumulation
                 # Track active function calls: {call_id: {name, item_id, args_chunks}}
                 "active_function_calls": {},
             }
@@ -272,10 +288,11 @@ async def _convert_agent_update(self, update: Any, context: dict[str, Any]) -> S
 
                 if content_type in self.content_mappers:
                     mapped_events = await self.content_mappers[content_type](content, context)
-                    if isinstance(mapped_events, list):
-                        events.extend(mapped_events)
-                    else:
-                        events.append(mapped_events)
+                    if mapped_events is not None:  # Handle None returns (e.g., UsageContent)
+                        if isinstance(mapped_events, list):
+                            events.extend(mapped_events)
+                        else:
+                            events.append(mapped_events)
                 else:
                     # Graceful fallback for unknown content types
                     events.append(await self._create_unknown_content_event(content, context))
@@ -315,10 +332,11 @@ async def _convert_agent_response(self, response: Any, context: dict[str, Any])
 
                         if content_type in self.content_mappers:
                             mapped_events = await self.content_mappers[content_type](content, context)
-                            if isinstance(mapped_events, list):
-                                events.extend(mapped_events)
-                            else:
-                                events.append(mapped_events)
+                            if mapped_events is not None:  # Handle None returns (e.g., UsageContent)
+                                if isinstance(mapped_events, list):
+                                    events.extend(mapped_events)
+                                else:
+                                    events.append(mapped_events)
                         else:
                             # Graceful fallback for unknown content types
                             events.append(await self._create_unknown_content_event(content, context))
@@ -331,8 +349,8 @@ async def _convert_agent_response(self, response: Any, context: dict[str, Any])
                 from agent_framework import UsageContent
 
                 usage_content = UsageContent(details=usage_details)
-                usage_event = await self._map_usage_content(usage_content, context)
-                events.append(usage_event)
+                await self._map_usage_content(usage_content, context)
+                # Note: _map_usage_content returns None - it accumulates usage for final Response.usage
 
         except Exception as e:
             logger.warning(f"Error converting agent response: {e}")
@@ -506,7 +524,11 @@ def _get_active_function_call(self, content: Any, context: dict[str, Any]) -> di
     async def _map_function_result_content(
         self, content: Any, context: dict[str, Any]
     ) -> ResponseFunctionResultComplete:
-        """Map FunctionResultContent to structured event.
+        """Map FunctionResultContent to custom DevUI event.
+
+        This is a DevUI extension - OpenAI doesn't stream function execution results
+        because in their model, applications execute functions, not the API.
+        Agent Framework executes functions, so we emit this event for debugging visibility.
 
         IMPORTANT: Always use Agent Framework's call_id from the content.
         Do NOT generate a new call_id - it must match the one from the function call event.
@@ -518,16 +540,22 @@ async def _map_function_result_content(
             logger.warning("FunctionResultContent missing call_id - this will break call/result pairing")
             call_id = f"call_{uuid.uuid4().hex[:8]}"  # Fallback only if truly missing
 
+        # Extract result
+        result = getattr(content, "result", None)
+        exception = getattr(content, "exception", None)
+
+        # Convert result to string
+        output = result if isinstance(result, str) else (json.dumps(result) if result is not None else "")
+
+        # Determine status
+        status = "incomplete" if exception else "completed"
+
+        # Return custom DevUI event
         return ResponseFunctionResultComplete(
             type="response.function_result.complete",
-            data={
-                "call_id": call_id,
-                "result": getattr(content, "result", None),
-                "status": "completed" if not getattr(content, "exception", None) else "failed",
-                "exception": str(getattr(content, "exception", None)) if getattr(content, "exception", None) else None,
-                "timestamp": datetime.now().isoformat(),
-            },
             call_id=call_id,
+            output=output,
+            status=status,
             item_id=context["item_id"],
             output_index=context["output_index"],
             sequence_number=self._next_sequence(context),
@@ -543,37 +571,34 @@ async def _map_error_content(self, content: Any, context: dict[str, Any]) -> Res
             sequence_number=self._next_sequence(context),
         )
 
-    async def _map_usage_content(self, content: Any, context: dict[str, Any]) -> ResponseUsageEventComplete:
-        """Map UsageContent to structured usage event."""
-        # Store usage data in context for aggregation
-        if "usage_data" not in context:
-            context["usage_data"] = []
-        context["usage_data"].append(content)
+    async def _map_usage_content(self, content: Any, context: dict[str, Any]) -> None:
+        """Accumulate usage data for final Response.usage field.
 
+        OpenAI does NOT stream usage events. Usage appears only in final Response.
+        This method accumulates usage data per request for later inclusion in Response.usage.
+
+        Returns:
+            None - no event emitted (usage goes in final Response.usage)
+        """
         # Extract usage from UsageContent.details (UsageDetails object)
         details = getattr(content, "details", None)
-        total_tokens = 0
-        prompt_tokens = 0
-        completion_tokens = 0
+        total_tokens = getattr(details, "total_token_count", 0) or 0
+        prompt_tokens = getattr(details, "input_token_count", 0) or 0
+        completion_tokens = getattr(details, "output_token_count", 0) or 0
 
-        if details:
-            total_tokens = getattr(details, "total_token_count", 0) or 0
-            prompt_tokens = getattr(details, "input_token_count", 0) or 0
-            completion_tokens = getattr(details, "output_token_count", 0) or 0
+        # Accumulate for final Response.usage
+        request_id = context.get("request_id", "default")
+        if request_id not in self._usage_accumulator:
+            self._usage_accumulator[request_id] = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
 
-        return ResponseUsageEventComplete(
-            type="response.usage.complete",
-            data={
-                "usage_data": details.to_dict() if details and hasattr(details, "to_dict") else {},
-                "total_tokens": total_tokens,
-                "completion_tokens": completion_tokens,
-                "prompt_tokens": prompt_tokens,
-                "timestamp": datetime.now().isoformat(),
-            },
-            item_id=context["item_id"],
-            output_index=context["output_index"],
-            sequence_number=self._next_sequence(context),
-        )
+        self._usage_accumulator[request_id]["input_tokens"] += prompt_tokens
+        self._usage_accumulator[request_id]["output_tokens"] += completion_tokens
+        self._usage_accumulator[request_id]["total_tokens"] += total_tokens
+
+        logger.debug(f"Accumulated usage for {request_id}: {self._usage_accumulator[request_id]}")
+
+        # NO EVENT RETURNED - usage goes in final Response only
+        return
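The usage changes in this file follow an accumulate-then-finalize pattern: `_map_usage_content` adds token counts to a per-request bucket and emits no event, and `aggregate_to_response` later reads the bucket, deletes it, and falls back to a length-based estimate when nothing was tracked. A minimal, self-contained sketch of that flow follows. `UsageAccumulator` and its method names are hypothetical, and plain dicts stand in for `ResponseUsage`.

```python
class UsageAccumulator:
    """Hypothetical helper mirroring the accumulate-then-finalize flow in the diff."""

    def __init__(self):
        # request_id -> running token totals
        self._totals = {}

    def add(self, request_id, input_tokens=0, output_tokens=0, total_tokens=0):
        """Add one usage report to the request's running totals (None counts as 0)."""
        totals = self._totals.setdefault(
            request_id, {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
        )
        totals["input_tokens"] += input_tokens or 0
        totals["output_tokens"] += output_tokens or 0
        totals["total_tokens"] += total_tokens or 0

    def finalize(self, request_id, request_input="", full_content=""):
        """Return the totals and drop the bucket, or estimate if nothing was tracked."""
        tracked = self._totals.pop(request_id, None)
        if tracked is not None:
            return tracked
        # Fallback from the diff: roughly four characters per token.
        input_tokens = len(str(request_input)) // 4 if request_input else 0
        output_tokens = len(full_content) // 4
        return {
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens,
        }


acc = UsageAccumulator()
acc.add("req_a", input_tokens=10, output_tokens=5, total_tokens=15)
acc.add("req_a", input_tokens=2, output_tokens=3, total_tokens=5)
print(acc.finalize("req_a"))
print(acc.finalize("req_b", request_input="abcdefgh", full_content="abcd"))
```

Popping the bucket in `finalize` keeps the accumulator from growing across requests, which is the same job the `del self._usage_accumulator[request_id]` line does in the diff.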
 
     async def _map_data_content(self, content: Any, context: dict[str, Any]) -> ResponseTraceEventComplete:
         """Map DataContent to structured trace event."""