Problem
Currently, interacting with the SystemAgent and observing its complex orchestrations (including component evolution) is primarily done through logs and script outputs. A visual, interactive interface would significantly improve usability, debugging, and demonstration capabilities. Furthermore, we need a comprehensive demo that ties together the various agent evolution mechanisms.
Solution
This issue proposes a two-part solution:
- Implement a Gradio-based User Interface: Create a web UI that allows users to:
  - Chat directly with the `SystemAgent`.
  - Observe the `SystemAgent`'s thought process, including which tools it's using and their inputs/outputs.
  - Potentially trigger specific `SystemAgent` tasks or evolution processes.
  - Manage chat sessions and clear conversation history.
- Develop the Unified Agent Evolution Demo (#109): Create a comprehensive demonstration script that showcases the full lifecycle of an agent (e.g., "Product Description Manager"). This demo will illustrate:
  - Initial agent creation by `SystemAgent` (e.g., based on GPT-4.1).
  - `SystemAgent`-driven evolution to a different LLM (e.g., GPT-4.1-nano) and refinement of agent/tool descriptions.
  - (Conceptually or via API calls) Integration with Reinforcement Fine-Tuning (RFT) to improve the agent's policy for specific criteria (e.g., output format adherence).
  - `SmartLibrary` storing different versions of the agent reflecting these evolutionary stages (an illustrative record shape is sketched below).
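For orientation only, a versioned `SmartLibrary` record might look roughly like the sketch below. All field names here are assumptions for illustration, not the actual schema; the real persistence shape is governed by the MongoDB work in #110.

```python
# Illustrative shape of a versioned SmartLibrary record (field names are assumptions,
# not the actual schema; see the MongoDB persistence work in #110).
product_description_manager_v1_1 = {
    "name": "ProductDescriptionManager",
    "record_type": "AGENT",
    "version": "1.1",
    "parent_version": "1.0",
    "model": "gpt-4.1-nano",          # changed from "gpt-4.1" during Stage 2 evolution
    "description": "Manages CRUD operations on product descriptions ...",
    "tools": ["CreateProductTool", "UpdateProductTool"],  # hypothetical tool names
    "evolution_notes": "Descriptions rewritten via EvolveComponentTool; model downsized.",
}
```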
Inspiration & Reference for Gradio UI:
The provided Gradio code snippet serves as an excellent starting point for the UI. Key features to adapt/integrate:
- Session management (`session_id`, `get_or_create_agent`, `remove_agent`).
- Asynchronous agent execution (`async def run_agent`).
- Display of the agent's thought process and tool usage (the `process_agent_events` and `detailed_response` logic).
- Chat history and clearing functionality.
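As a rough illustration, the sketch below adapts the referenced snippet's session handling and async execution to drive the EAT `SystemAgent` from a Gradio chat interface. The `SystemAgentFactory` import and the shape of `SystemAgent.run()`'s return value are assumptions; the actual EAT initialization path may differ.

```python
# Minimal sketch: Gradio chat wired to the EAT SystemAgent (EAT APIs assumed, not verified).
import uuid
import gradio as gr

_agents: dict[str, object] = {}  # session_id -> SystemAgent instance

async def get_or_create_agent(session_id: str):
    """Return the SystemAgent bound to this session, creating it on first use."""
    if session_id not in _agents:
        # Hypothetical factory; replace with the real EAT initialization path.
        from evolving_agents.agents import SystemAgentFactory  # assumed import
        _agents[session_id] = await SystemAgentFactory.create()
    return _agents[session_id]

def remove_agent(session_id: str) -> None:
    """Drop the session's agent so a fresh conversation can start."""
    _agents.pop(session_id, None)

async def run_agent(message: str, history: list, session_id: str) -> str:
    """Run one turn and return the final answer (tool-trace handling omitted here)."""
    agent = await get_or_create_agent(session_id)
    result = await agent.run(message)  # assumed to expose a final_answer attribute
    return getattr(result, "final_answer", str(result))

with gr.Blocks() as demo:
    session_id = gr.State(lambda: str(uuid.uuid4()))
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Message")
    clear = gr.Button("New session")

    async def respond(message, history, sid):
        answer = await run_agent(message, history, sid)
        return "", history + [(message, answer)]

    msg.submit(respond, [msg, chatbot, session_id], [msg, chatbot])
    clear.click(lambda sid: (remove_agent(sid), [])[1], [session_id], [chatbot])

if __name__ == "__main__":
    demo.launch()
```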
Key Tasks for Gradio UI Implementation:
- Adapt the provided Gradio UI script to integrate with the EAT `SystemAgent`:
  - Replace `create_agent_with_memory()` with logic to fetch/instantiate the EAT `SystemAgent` (likely a singleton or session-based instance managed by the EAT framework).
  - Modify `run_agent` to correctly call `SystemAgent.run()` and handle its output.
- Implement robust observation/event handling to capture the `SystemAgent`'s internal "thought process":
  - Which tools is it selecting? (`SearchComponentTool`, `EvolveComponentTool`, etc.)
  - What are the inputs to these tools?
  - What are the key outputs or decisions made by these tools?
  - This might involve enhancing `SystemAgent` or its tools to emit more structured events, or refining how `SmartAgentBus` logs can be displayed (see the event-handling sketch after this list).
- Ensure the UI clearly distinguishes between user messages, the `SystemAgent`'s final responses, and its internal "thought process" / tool usage trace.
- Implement proper state management for different user sessions if the UI is to be multi-user.
- Style the UI for clarity (the provided CSS is a good start).
- Add functionality to the UI to potentially trigger specific `SystemAgent` commands beyond simple chat (e.g., "Evolve component X with these requirements"). (Stretch Goal)
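To make the "thought process" requirement concrete, here is a minimal sketch of how structured tool events could be collected and rendered as a trace in the UI. The event shape and the emit hook are hypothetical; they stand in for whatever structured events `SystemAgent` or `SmartAgentBus` ends up exposing.

```python
# Sketch: structured tool-usage events for the UI trace (event shape is hypothetical).
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

@dataclass
class ToolEvent:
    tool_name: str                 # e.g. "SearchComponentTool", "EvolveComponentTool"
    inputs: dict[str, Any]
    outputs: Any = None
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

class TraceCollector:
    """Collects ToolEvents during one SystemAgent.run() call for display in the UI."""

    def __init__(self) -> None:
        self.events: list[ToolEvent] = []

    def emit(self, event: ToolEvent) -> None:
        self.events.append(event)

    def as_markdown(self) -> str:
        """Render the trace as Markdown suitable for a collapsible block in the chat UI."""
        lines = []
        for e in self.events:
            lines.append(f"**{e.tool_name}** ({e.timestamp})")
            lines.append(f"- inputs: `{e.inputs}`")
            lines.append(f"- outputs: `{e.outputs}`")
        return "\n".join(lines) or "_no tool calls recorded_"

# Usage idea: pass a TraceCollector into SystemAgent.run() (or subscribe to SmartAgentBus
# logs) and show collector.as_markdown() alongside the final answer in the Gradio chatbot.
```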
Key Tasks for Unified Agent Evolution Demo (#109) (Task 6 of Overall Plan):
- Define the "Product Description Manager" agent's initial capabilities and tools (e.g., CRUD operations on product data, potentially using a mock `entity_store.py`).
- Stage 1 (Initial Creation):
  - Write script logic for `SystemAgent` to create the initial "Product Description Manager" (v1.0) using a base LLM (e.g., GPT-4.1).
  - Verify basic functionality.
  - Ensure `SmartLibrary` correctly stores this version.
- Stage 2 (Model Optimization & Description Enhancement):
  - Write script logic for `SystemAgent` to evolve the agent to v1.1 using a different LLM (e.g., GPT-4.1-nano, or `o4-mini` if RFT requires it).
  - `SystemAgent` uses `EvolveComponentTool` to prompt an LLM to rewrite the agent's `AgentMeta` description and its tools' descriptions based on new refinement goals.
  - Log the "before" and "after" descriptions.
  - Verify functionality with the new model.
  - Ensure `SmartLibrary` stores this evolved version with appropriate metadata.
- Stage 3 (Reinforcement Fine-Tuning - RFT):
  - Define a clear RFT goal (e.g., strict JSON output schema adherence for product descriptions).
  - Define the RFT `grader` configuration (e.g., Python/`json_schema` grader for structure, `string_check` for categories, `score_model` for conciseness).
  - Create sample `training_set.jsonl` and `validation_set.jsonl` files.
  - Implement script logic to (simulate or actually) make OpenAI API calls to:
    - Upload dataset files.
    - Create an RFT fine-tuning job targeting the Stage 2 model (e.g., GPT-4.1-nano or `o4-mini`).
    - Log the submitted job ID.
    - (Optional/Simulated) Retrieve the fine-tuned model ID.
  - Update `SmartLibrary` to associate this fine-tuned model ID with a new agent version (e.g., v1.1-RFT or v1.2).
  - Demonstrate (by prompting the agent version using the fine-tuned model) its improved adherence to the RFT goal compared to the pre-RFT version. (A rough sketch of the RFT setup calls appears after this task list.)
- Ensure the demo script logs key actions, component versions, and outcomes clearly.
- All EAT components (`SystemAgent`, `SmartLibrary`, `EvolveComponentTool`, `LLMService`) must be utilized appropriately.
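As a rough, hedged sketch of what the Stage 3 setup could look like: `client.files.create` and `client.fine_tuning.jobs.create` are real OpenAI SDK calls, but the grader shape, the JSONL row format, and the `method` payload for a reinforcement fine-tuning job are assumptions that should be verified against the current OpenAI RFT documentation.

```python
# Sketch of the Stage 3 RFT setup. Grader/dataset/job payloads are assumptions to confirm
# against current OpenAI RFT docs; the SDK calls themselves exist in the openai package.
import json
from openai import OpenAI

client = OpenAI()

# 1. Example grader: reward outputs that parse as JSON and contain the required fields.
#    The exact grader schema accepted by the RFT API should be confirmed in the docs.
grader = {
    "type": "python",
    "name": "product_description_schema_grader",
    "source": (
        "import json\n"
        "def grade(sample, item):\n"
        "    try:\n"
        "        data = json.loads(sample['output_text'])\n"
        "    except Exception:\n"
        "        return 0.0\n"
        "    required = {'name', 'category', 'description'}\n"
        "    return 1.0 if required.issubset(data) else 0.5\n"
    ),
}

# 2. Write a tiny sample training set (one JSON object per line).
examples = [
    {
        "messages": [{"role": "user", "content": "Describe product SKU-123 as JSON."}],
        "reference": {"category": "electronics"},
    },
]
with open("training_set.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 3. Upload the dataset and create the RFT job against the Stage 2 model.
train_file = client.files.create(file=open("training_set.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    model="o4-mini",  # Stage 2 model; placeholder
    training_file=train_file.id,
    method={"type": "reinforcement", "reinforcement": {"grader": grader}},  # assumed payload
)
print("Submitted RFT job:", job.id)  # log the job ID, as the demo script requires
```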
Acceptance Criteria:
- Gradio UI:
  - Users can successfully chat with the EAT `SystemAgent`.
  - The UI displays a comprehensible trace of the `SystemAgent`'s tool usage and intermediate thoughts leading to a final answer.
  - Session management (clearing chat, starting new session) works correctly.
- Unified Agent Evolution Demo (#109):
  - The demo script runs end-to-end, completing Stages 1, 2, and (at least initiating) Stage 3.
  - `SmartLibrary` correctly reflects the different agent versions created through evolution and RFT.
  - Logs clearly show the orchestration by `SystemAgent` for Stages 1 & 2, and the RFT setup for Stage 3.
  - The RFT-enhanced agent version demonstrably performs better according to the defined RFT goal (e.g., consistent JSON output).
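One way to make the last criterion measurable is to validate pre- and post-RFT agent outputs against the target schema and compare pass rates. The snippet below is a minimal sketch using the `jsonschema` package; the schema itself is illustrative.

```python
# Sketch: measuring JSON-schema adherence for pre- vs post-RFT agent outputs.
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

PRODUCT_SCHEMA = {  # illustrative schema for the "Product Description Manager" output
    "type": "object",
    "required": ["name", "category", "description"],
    "properties": {
        "name": {"type": "string"},
        "category": {"type": "string"},
        "description": {"type": "string", "maxLength": 500},
    },
}

def adherence_rate(outputs: list[str]) -> float:
    """Fraction of agent responses that parse as JSON and satisfy the schema."""
    ok = 0
    for text in outputs:
        try:
            validate(json.loads(text), PRODUCT_SCHEMA)
            ok += 1
        except (json.JSONDecodeError, ValidationError):
            pass
    return ok / len(outputs) if outputs else 0.0

# Expectation for the demo: adherence_rate(post_rft_outputs) > adherence_rate(pre_rft_outputs)
```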
Dependencies:
- Successful completion of the MongoDB backend (#110: Consolidate All Data Persistence and Vector Search into MongoDB, Retiring ChromaDB) and at least partial completion of Core Evolution Mechanics (#112: Enhance EAT Framework with AlphaEvolve Principles for Advanced Self-Improvement and Algorithmic Discovery) are highly recommended, as this phase builds directly upon those capabilities.
- `SmartMemory` (#111: Implement Advanced "Smart Memory" as an Agent-Tool Ecosystem for Persistent, Goal-Driven Context) would be beneficial for the demo to show how RFT datasets could be informed by past interactions, but it is not a hard blocker for the RFT API call demonstration itself.
Notes:
- The RFT part of the demo (#109: Agent Evolution Demo: GPT-4.1 -> GPT-4.1-nano (Agent Prompt & Tool Description) -> GPT-4.1-nano (RFT) for Product Description Manager Agent) might be partially simulated (e.g., submitting the job but not waiting for full completion, due to time/cost) for CI/CD and quick demo runs. The focus is on showing the process and EAT's role in managing RFT-enhanced models.
- The Gradio UI will provide a valuable "window" into the complex operations performed by the EAT framework, making these advanced features more accessible and understandable.