Problem
Currently, interacting with the SystemAgent and observing its complex orchestrations (including component evolution) is primarily done through logs and script outputs. A visual, interactive interface would significantly improve usability, debugging, and demonstration capabilities. Furthermore, we need a comprehensive demo that ties together the various agent evolution mechanisms.
Solution
This issue proposes a two-part solution:
- Implement a Gradio-based User Interface: Create a web UI that allows users to:
  - Chat directly with the `SystemAgent`.
  - Observe the `SystemAgent`'s thought process, including which tools it's using and their inputs/outputs.
  - Potentially trigger specific `SystemAgent` tasks or evolution processes.
  - Manage chat sessions and clear conversation history.
- Develop the Unified Agent Evolution Demo (#109): Create a comprehensive demonstration script that showcases the full lifecycle of an agent (e.g., "Product Description Manager"). This demo will illustrate:
  - Initial agent creation by `SystemAgent` (e.g., based on GPT-4.1).
  - `SystemAgent`-driven evolution to a different LLM (e.g., GPT-4.1-nano) and refinement of agent/tool descriptions.
  - (Conceptually or via API calls) Integration with Reinforcement Fine-Tuning (RFT) to improve the agent's policy for specific criteria (e.g., output format adherence).
  - `SmartLibrary` storing different versions of the agent reflecting these evolutionary stages (an illustrative record shape is sketched below).
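For orientation only, a versioned `SmartLibrary` record might look roughly like the sketch below. All field names here are assumptions for illustration, not the actual schema; the real persistence shape is governed by the MongoDB work in #110.

```python
# Illustrative shape of a versioned SmartLibrary record (field names are assumptions,
# not the actual schema; see the MongoDB persistence work in #110).
product_description_manager_v1_1 = {
    "name": "ProductDescriptionManager",
    "record_type": "AGENT",
    "version": "1.1",
    "parent_version": "1.0",
    "model": "gpt-4.1-nano",          # changed from "gpt-4.1" during Stage 2 evolution
    "description": "Manages CRUD operations on product descriptions ...",
    "tools": ["CreateProductTool", "UpdateProductTool"],  # hypothetical tool names
    "evolution_notes": "Descriptions rewritten via EvolveComponentTool; model downsized.",
}
```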
Inspiration & Reference for Gradio UI:
The provided Gradio code snippet serves as an excellent starting point for the UI. Key features to adapt/integrate:
- Session management (`session_id`, `get_or_create_agent`, `remove_agent`).
- Asynchronous agent execution (`async def run_agent`).
- Display of the agent's thought process and tool usage (the `process_agent_events` and `detailed_response` logic).
- Chat history and clearing functionality.
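As a rough illustration, the sketch below adapts the referenced snippet's session handling and async execution to drive the EAT `SystemAgent` from a Gradio chat interface. The `SystemAgentFactory` import and the shape of `SystemAgent.run()`'s return value are assumptions; the actual EAT initialization path may differ.

```python
# Minimal sketch: Gradio chat wired to the EAT SystemAgent (EAT APIs assumed, not verified).
import uuid
import gradio as gr

_agents: dict[str, object] = {}  # session_id -> SystemAgent instance

async def get_or_create_agent(session_id: str):
    """Return the SystemAgent bound to this session, creating it on first use."""
    if session_id not in _agents:
        # Hypothetical factory; replace with the real EAT initialization path.
        from evolving_agents.agents import SystemAgentFactory  # assumed import
        _agents[session_id] = await SystemAgentFactory.create()
    return _agents[session_id]

def remove_agent(session_id: str) -> None:
    """Drop the session's agent so a fresh conversation can start."""
    _agents.pop(session_id, None)

async def run_agent(message: str, history: list, session_id: str) -> str:
    """Run one turn and return the final answer (tool-trace handling omitted here)."""
    agent = await get_or_create_agent(session_id)
    result = await agent.run(message)  # assumed to expose a final_answer attribute
    return getattr(result, "final_answer", str(result))

with gr.Blocks() as demo:
    session_id = gr.State(lambda: str(uuid.uuid4()))
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Message")
    clear = gr.Button("New session")

    async def respond(message, history, sid):
        answer = await run_agent(message, history, sid)
        return "", history + [(message, answer)]

    msg.submit(respond, [msg, chatbot, session_id], [msg, chatbot])
    clear.click(lambda sid: (remove_agent(sid), [])[1], [session_id], [chatbot])

if __name__ == "__main__":
    demo.launch()
```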
Key Tasks for Gradio UI Implementation:
- Adapt the provided Gradio UI script to integrate with the EAT `SystemAgent`:
  - Replace `create_agent_with_memory()` with logic to fetch/instantiate the EAT `SystemAgent` (likely a singleton or session-based instance managed by the EAT framework).
  - Modify `run_agent` to correctly call `SystemAgent.run()` and handle its output.
- Implement robust observation/event handling to capture the `SystemAgent`'s internal "thought process":
  - Which tools is it selecting? (`SearchComponentTool`, `EvolveComponentTool`, etc.)
  - What are the inputs to these tools?
  - What are the key outputs or decisions made by these tools?
  - This might involve enhancing `SystemAgent` or its tools to emit more structured events, or refining how `SmartAgentBus` logs can be displayed (see the event-handling sketch after this list).
- Ensure the UI clearly distinguishes between user messages, the `SystemAgent`'s final responses, and its internal "thought process" / tool usage trace.
- Implement proper state management for different user sessions if the UI is to be multi-user.
- Style the UI for clarity (the provided CSS is a good start).
- Add functionality to the UI to potentially trigger specific `SystemAgent` commands beyond simple chat (e.g., "Evolve component X with these requirements"). (Stretch Goal)
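To make the "thought process" requirement concrete, here is a minimal sketch of how structured tool events could be collected and rendered as a trace in the UI. The event shape and the emit hook are hypothetical; they stand in for whatever structured events `SystemAgent` or `SmartAgentBus` ends up exposing.

```python
# Sketch: structured tool-usage events for the UI trace (event shape is hypothetical).
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

@dataclass
class ToolEvent:
    tool_name: str                 # e.g. "SearchComponentTool", "EvolveComponentTool"
    inputs: dict[str, Any]
    outputs: Any = None
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

class TraceCollector:
    """Collects ToolEvents during one SystemAgent.run() call for display in the UI."""

    def __init__(self) -> None:
        self.events: list[ToolEvent] = []

    def emit(self, event: ToolEvent) -> None:
        self.events.append(event)

    def as_markdown(self) -> str:
        """Render the trace as Markdown suitable for a collapsible block in the chat UI."""
        lines = []
        for e in self.events:
            lines.append(f"**{e.tool_name}** ({e.timestamp})")
            lines.append(f"- inputs: `{e.inputs}`")
            lines.append(f"- outputs: `{e.outputs}`")
        return "\n".join(lines) or "_no tool calls recorded_"

# Usage idea: pass a TraceCollector into SystemAgent.run() (or subscribe to SmartAgentBus
# logs) and show collector.as_markdown() alongside the final answer in the Gradio chatbot.
```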
Key Tasks for Unified Agent Evolution Demo (#109) (Task 6 of Overall Plan):
- Define the "Product Description Manager" agent's initial capabilities and tools (e.g., CRUD operations on product data, potentially using a mock `entity_store.py`).
- Stage 1 (Initial Creation):
  - Write script logic for `SystemAgent` to create the initial "Product Description Manager" (v1.0) using a base LLM (e.g., GPT-4.1).
  - Verify basic functionality.
  - Ensure `SmartLibrary` correctly stores this version.
- Stage 2 (Model Optimization & Description Enhancement):
  - Write script logic for `SystemAgent` to evolve the agent to v1.1 using a different LLM (e.g., GPT-4.1-nano, or `o4-mini` if RFT requires it).
  - `SystemAgent` uses `EvolveComponentTool` to prompt an LLM to rewrite the agent's `AgentMeta` description and its tools' descriptions based on new refinement goals.
  - Log the "before" and "after" descriptions.
  - Verify functionality with the new model.
  - Ensure `SmartLibrary` stores this evolved version with appropriate metadata.
- Stage 3 (Reinforcement Fine-Tuning - RFT):
  - Define a clear RFT goal (e.g., strict JSON output schema adherence for product descriptions).
  - Define the RFT `grader` configuration (e.g., Python/`json_schema` grader for structure, `string_check` for categories, `score_model` for conciseness).
  - Create sample `training_set.jsonl` and `validation_set.jsonl` files.
  - Implement script logic to (simulate or actually) make OpenAI API calls to:
    - Upload dataset files.
    - Create an RFT fine-tuning job targeting the Stage 2 model (e.g., GPT-4.1-nano or `o4-mini`).
    - Log the submitted job ID.
    - (Optional/Simulated) Retrieve the fine-tuned model ID.
  - Update `SmartLibrary` to associate this fine-tuned model ID with a new agent version (e.g., v1.1-RFT or v1.2).
  - Demonstrate (by prompting the agent version using the fine-tuned model) its improved adherence to the RFT goal compared to the pre-RFT version. (A rough sketch of the RFT setup calls appears after this task list.)
- Ensure the demo script logs key actions, component versions, and outcomes clearly.
- All EAT components (`SystemAgent`, `SmartLibrary`, `EvolveComponentTool`, `LLMService`) must be utilized appropriately.
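As a rough, hedged sketch of what the Stage 3 setup could look like: `client.files.create` and `client.fine_tuning.jobs.create` are real OpenAI SDK calls, but the grader shape, the JSONL row format, and the `method` payload for a reinforcement fine-tuning job are assumptions that should be verified against the current OpenAI RFT documentation.

```python
# Sketch of the Stage 3 RFT setup. Grader/dataset/job payloads are assumptions to confirm
# against current OpenAI RFT docs; the SDK calls themselves exist in the openai package.
import json
from openai import OpenAI

client = OpenAI()

# 1. Example grader: reward outputs that parse as JSON and contain the required fields.
#    The exact grader schema accepted by the RFT API should be confirmed in the docs.
grader = {
    "type": "python",
    "name": "product_description_schema_grader",
    "source": (
        "import json\n"
        "def grade(sample, item):\n"
        "    try:\n"
        "        data = json.loads(sample['output_text'])\n"
        "    except Exception:\n"
        "        return 0.0\n"
        "    required = {'name', 'category', 'description'}\n"
        "    return 1.0 if required.issubset(data) else 0.5\n"
    ),
}

# 2. Write a tiny sample training set (one JSON object per line).
examples = [
    {
        "messages": [{"role": "user", "content": "Describe product SKU-123 as JSON."}],
        "reference": {"category": "electronics"},
    },
]
with open("training_set.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 3. Upload the dataset and create the RFT job against the Stage 2 model.
train_file = client.files.create(file=open("training_set.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    model="o4-mini",  # Stage 2 model; placeholder
    training_file=train_file.id,
    method={"type": "reinforcement", "reinforcement": {"grader": grader}},  # assumed payload
)
print("Submitted RFT job:", job.id)  # log the job ID, as the demo script requires
```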
Acceptance Criteria:
- Gradio UI:
  - Users can successfully chat with the EAT `SystemAgent`.
  - The UI displays a comprehensible trace of the `SystemAgent`'s tool usage and intermediate thoughts leading to a final answer.
  - Session management (clearing chat, starting new session) works correctly.
- Unified Agent Evolution Demo (#109):
  - The demo script runs end-to-end, completing Stages 1, 2, and (at least initiating) Stage 3.
  - `SmartLibrary` correctly reflects the different agent versions created through evolution and RFT.
  - Logs clearly show the orchestration by `SystemAgent` for Stages 1 & 2, and the RFT setup for Stage 3.
  - The RFT-enhanced agent version demonstrably performs better according to the defined RFT goal (e.g., consistent JSON output).
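One way to make the last criterion measurable is to validate pre- and post-RFT agent outputs against the target schema and compare pass rates. The snippet below is a minimal sketch using the `jsonschema` package; the schema itself is illustrative.

```python
# Sketch: measuring JSON-schema adherence for pre- vs post-RFT agent outputs.
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

PRODUCT_SCHEMA = {  # illustrative schema for the "Product Description Manager" output
    "type": "object",
    "required": ["name", "category", "description"],
    "properties": {
        "name": {"type": "string"},
        "category": {"type": "string"},
        "description": {"type": "string", "maxLength": 500},
    },
}

def adherence_rate(outputs: list[str]) -> float:
    """Fraction of agent responses that parse as JSON and satisfy the schema."""
    ok = 0
    for text in outputs:
        try:
            validate(json.loads(text), PRODUCT_SCHEMA)
            ok += 1
        except (json.JSONDecodeError, ValidationError):
            pass
    return ok / len(outputs) if outputs else 0.0

# Expectation for the demo: adherence_rate(post_rft_outputs) > adherence_rate(pre_rft_outputs)
```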
Dependencies:
- Successful completion of the MongoDB backend (#110: Consolidate All Data Persistence and Vector Search into MongoDB, Retiring ChromaDB) and at least partial completion of Core Evolution Mechanics (#112: Enhance EAT Framework with AlphaEvolve Principles for Advanced Self-Improvement and Algorithmic Discovery) are highly recommended, as this phase builds directly upon those capabilities.
- `SmartMemory` (#111: Implement Advanced "Smart Memory" as an Agent-Tool Ecosystem for Persistent, Goal-Driven Context) would be beneficial for the demo to show how RFT datasets could be informed by past interactions, but it is not a hard blocker for the RFT API call demonstration itself.
Notes:
- The RFT part of the demo (#109: Agent Evolution Demo: GPT-4.1 -> GPT-4.1-nano (Agent Prompt & Tool Description) -> GPT-4.1-nano (RFT) for Product Description Manager Agent) might be partially simulated (e.g., submitting the job but not waiting for full completion, due to time/cost) for CI/CD and quick demo runs. The focus is on showing the process and EAT's role in managing RFT-enhanced models.
- The Gradio UI will provide a valuable "window" into the complex operations performed by the EAT framework, making these advanced features more accessible and understandable.