Skip to content

Implement Gradio UI for SystemAgent and Showcase Unified Agent Evolution #113

@matiasmolinas

Description

@matiasmolinas

Problem
Currently, interacting with the SystemAgent and observing its complex orchestrations (including component evolution) is primarily done through logs and script outputs. A visual, interactive interface would significantly improve usability, debugging, and demonstration capabilities. Furthermore, we need a comprehensive demo that ties together the various agent evolution mechanisms.

Solution
This issue proposes a two-part solution:

  1. Implement a Gradio-based User Interface: Create a web UI that allows users to:
    • Chat directly with the SystemAgent.
    • Observe the SystemAgent's thought process, including which tools it's using and their inputs/outputs.
    • Potentially trigger specific SystemAgent tasks or evolution processes.
    • Manage chat sessions and clear conversation history.
  2. Develop the #109: Unified Agent Evolution Demo: Create a comprehensive demonstration script that showcases the full lifecycle of an agent (e.g., "Product Description Manager"). This demo will illustrate:
    • Initial agent creation by SystemAgent (e.g., based on GPT-4.1).
    • SystemAgent-driven evolution to a different LLM (e.g., GPT-4.1-nano) and refinement of agent/tool descriptions.
    • (Conceptually or via API calls) Integration with Reinforcement Fine-Tuning (RFT) to improve the agent's policy for specific criteria (e.g., output format adherence).
    • SmartLibrary storing different versions of the agent reflecting these evolutionary stages.

Inspiration & Reference for Gradio UI:
The provided Gradio code snippet serves as an excellent starting point for the UI. Key features to adapt/integrate:

  • Session management (session_id, get_or_create_agent, remove_agent).
  • Asynchronous agent execution (async def run_agent).
  • Display of agent's thought process and tool usage (the process_agent_events and detailed_response logic).
  • Chat history and clearing functionality.

Key Tasks for Gradio UI Implementation:

  • Adapt the provided Gradio UI script to integrate with the EAT SystemAgent.
    • Replace create_agent_with_memory() with logic to fetch/instantiate the EAT SystemAgent (likely a singleton or session-based instance managed by the EAT framework).
    • Modify run_agent to correctly call SystemAgent.run() and handle its output.
  • Implement robust observation/event handling to capture SystemAgent's internal "thought process":
    • Which tools is it selecting? (SearchComponentTool, EvolveComponentTool, etc.)
    • What are the inputs to these tools?
    • What are the key outputs or decisions made by these tools?
    • This might involve enhancing SystemAgent or its tools to emit more structured events, or refining how SmartAgentBus logs can be displayed.
  • Ensure the UI clearly distinguishes between user messages, SystemAgent's final responses, and its internal "thought process" / tool usage trace.
  • Implement proper state management for different user sessions if the UI is to be multi-user.
  • Style the UI for clarity (the provided CSS is a good start).
  • Add functionality to the UI to potentially trigger specific SystemAgent commands beyond simple chat (e.g., "Evolve component X with these requirements"). (Stretch Goal)

Key Tasks for Unified Agent Evolution Demo (#109) (Task 6 of Overall Plan):

  • Define the "Product Description Manager" agent's initial capabilities and tools (e.g., CRUD operations on product data, potentially using a mock entity_store.py).
  • Stage 1 (Initial Creation):
    • Write script logic for SystemAgent to create the initial "Product Description Manager" (v1.0) using a base LLM (e.g., GPT-4.1).
    • Verify basic functionality.
    • Ensure SmartLibrary correctly stores this version.
  • Stage 2 (Model Optimization & Description Enhancement):
    • Write script logic for SystemAgent to evolve the agent to v1.1 using a different LLM (e.g., GPT-4.1-nano, or o4-mini if RFT requires it).
    • SystemAgent uses EvolveComponentTool to prompt an LLM to rewrite the agent's AgentMeta description and its tools' descriptions based on new refinement goals.
    • Log the "before" and "after" descriptions.
    • Verify functionality with the new model.
    • Ensure SmartLibrary stores this evolved version with appropriate metadata.
  • Stage 3 (Reinforcement Fine-Tuning - RFT):
    • Define a clear RFT goal (e.g., strict JSON output schema adherence for product descriptions).
    • Define the RFT grader configuration (e.g., Python/json_schema grader for structure, string_check for categories, score_model for conciseness).
    • Create sample training_set.jsonl and validation_set.jsonl files.
    • Implement script logic to (simulate or actually) make OpenAI API calls to:
      • Upload dataset files.
      • Create an RFT fine-tuning job targeting the Stage 2 model (e.g., GPT-4.1-nano or o4-mini).
      • Log the submitted job ID.
    • (Optional/Simulated) Retrieve the fine-tuned model ID.
    • Update SmartLibrary to associate this fine-tuned model ID with a new agent version (e.g., v1.1-RFT or v1.2).
    • Demonstrate (by prompting the agent version using the fine-tuned model) its improved adherence to the RFT goal compared to the pre-RFT version.
  • Ensure the demo script logs key actions, component versions, and outcomes clearly.
  • All EAT components (SystemAgent, SmartLibrary, EvolveComponentTool, LLMService) must be utilized appropriately.

Acceptance Criteria:

  • Gradio UI:
    • Users can successfully chat with the EAT SystemAgent.
    • The UI displays a comprehensible trace of the SystemAgent's tool usage and intermediate thoughts leading to a final answer.
    • Session management (clearing chat, starting new session) works correctly.
  • Unified Agent Evolution Demo (Agent Evolution Demo: GPT-4.1 -> GPT-4.1-nano (Agent Prompt & Tool Description) -> GPT-4.1-nano (RFT) for Product Description Manager Agent #109):
    • The demo script runs end-to-end, completing Stages 1, 2, and (at least initiating) Stage 3.
    • SmartLibrary correctly reflects the different agent versions created through evolution and RFT.
    • Logs clearly show the orchestration by SystemAgent for Stages 1 & 2, and the RFT setup for Stage 3.
    • The RFT-enhanced agent version demonstrably performs better according to the defined RFT goal (e.g., consistent JSON output).

Dependencies:

Notes:

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions