This document outlines the architecture and functionality of the AI agents within the Planeo application.
Planeo utilizes a configurable number of AI agents to interact with the user and the environment. These agents can generate chat messages and respond to visual stimuli.
AI agent definitions are managed in `src/domain/aiAgent.ts`. This file provides:

- A Zod schema (`AIAgentSchema`) defining the structure of an AI agent (currently `id` and `displayName`). This schema can be extended in the future to include more properties, such as personality traits or memory configurations.
- A function `getAIAgents(): AIAgent[]` which loads AI agent configurations.
  - If the `AI_AGENTS_CONFIG` environment variable is set and contains a valid JSON array of AI agent objects, these agents are loaded.
  - Example `AI_AGENTS_CONFIG` in an `.env` file: `AI_AGENTS_CONFIG='[{"id":"custom-ai-1","displayName":"Custom AI Alpha"},{"id":"custom-ai-2","displayName":"Custom AI Beta"}]'`
  - If `AI_AGENTS_CONFIG` is not set, is empty, or contains invalid JSON, the system defaults to two AI agents: `{"id":"ai-agent-1","displayName":"AI-1"}` and `{"id":"ai-agent-2","displayName":"AI-2"}`. These default agents have their eyeball positions initialized in the 3D world automatically when a user connects.
- A helper function `isAIAgentId(userId: string): boolean` to check whether a given user ID belongs to one of the loaded AI agents.
- A helper function `getAIAgentById(userId: string): AIAgent | undefined` to retrieve the full details of a specific AI agent by its ID.
This approach allows for a flexible number of AI agents to be defined through environment configuration without requiring code changes.
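The loading behavior described above can be sketched as follows. This is a simplified illustration, not the actual contents of `src/domain/aiAgent.ts`: the real file validates with the Zod `AIAgentSchema`, while this sketch hand-rolls the shape check to stay dependency-free.

```typescript
// Sketch of the agent-loading logic. The real implementation uses
// Zod's safeParse on AIAgentSchema; here the validation is manual.

interface AIAgent {
  id: string;
  displayName: string;
}

// Fallback agents used when AI_AGENTS_CONFIG is absent or invalid.
const DEFAULT_AI_AGENTS: AIAgent[] = [
  { id: "ai-agent-1", displayName: "AI-1" },
  { id: "ai-agent-2", displayName: "AI-2" },
];

function isValidAgent(value: unknown): value is AIAgent {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as AIAgent).id === "string" &&
    typeof (value as AIAgent).displayName === "string"
  );
}

function getAIAgents(): AIAgent[] {
  const raw = process.env.AI_AGENTS_CONFIG;
  if (!raw) return DEFAULT_AI_AGENTS;
  try {
    const parsed = JSON.parse(raw);
    // An empty array also falls back to the defaults.
    if (Array.isArray(parsed) && parsed.length > 0 && parsed.every(isValidAgent)) {
      return parsed;
    }
  } catch {
    // Invalid JSON: fall through to the defaults.
  }
  return DEFAULT_AI_AGENTS;
}

function isAIAgentId(userId: string): boolean {
  return getAIAgents().some((agent) => agent.id === userId);
}

function getAIAgentById(userId: string): AIAgent | undefined {
  return getAIAgents().find((agent) => agent.id === userId);
}
```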
AI agents perceive their environment through visual input (rendered images of the scene) and contextual chat history. Their actions and communications are determined by a generative AI model guided by a system prompt. See `docs/ai_services.md` for more details on the AI model interaction.
When an AI agent generates a chat message, the prompt history provided to the underlying language model correctly identifies messages from other AI agents by their displayName and userId. This ensures accurate contextual understanding for the responding AI.
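The attribution described above can be sketched like this. The `ChatMessage` shape and `formatPromptHistory` helper are illustrative, not Planeo's actual types:

```typescript
// Hypothetical sketch: label each prior message with its sender's
// displayName and userId so the model can tell which messages came
// from other AI agents versus human users.

interface ChatMessage {
  userId: string;
  displayName: string;
  text: string;
}

function formatPromptHistory(history: ChatMessage[]): string {
  return history
    .map((msg) => `${msg.displayName} (${msg.userId}): ${msg.text}`)
    .join("\n");
}
```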
Based on visual input and chat history, AI agents decide on both a chat message and a physical action. Possible actions include moving, turning, and looking at other entities.
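One way to model this decision output is a discriminated union over the possible actions. The type and field names below are hypothetical, chosen only to illustrate the move/turn/look-at action space described above:

```typescript
// Hypothetical model of an AI agent's decision: a chat message plus
// one physical action (move, turn, or look at another entity).

type AgentAction =
  | { kind: "move"; dx: number; dz: number }
  | { kind: "turn"; radians: number }
  | { kind: "lookAt"; targetUserId: string };

interface AgentDecision {
  message: string;     // chat message to broadcast
  action: AgentAction; // physical action to perform in the 3D world
}

// The union lets handlers exhaustively switch over action kinds.
function describeAction(action: AgentAction): string {
  switch (action.kind) {
    case "move":
      return `move by (${action.dx}, ${action.dz})`;
    case "turn":
      return `turn ${action.radians} radians`;
    case "lookAt":
      return `look at ${action.targetUserId}`;
  }
}
```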
Currently, all AI agents share a general persona of having newly materialized in the environment, feeling disoriented and cautious. This is defined in the system prompt in `src/app/actions/generateMessage.ts`.
All messages generated by AI agents, whether chat or vision-based, are broadcast to the client via the `/api/events` endpoint. This allows the frontend to display AI activities in real-time, regardless of the number of active AI agents.
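Assuming `/api/events` is a Server-Sent Events stream, a client might shape-check incoming payloads with a small parser like the hypothetical `parseAIEvent` below (the payload field names are illustrative, not Planeo's actual event schema):

```typescript
// Illustrative payload shape for events broadcast on /api/events.
interface AIEventPayload {
  userId: string;
  displayName: string;
  text: string;
}

// Hypothetical helper: parse and validate one event's data string.
// Returns null for malformed or unexpected payloads instead of throwing.
function parseAIEvent(data: string): AIEventPayload | null {
  try {
    const payload = JSON.parse(data);
    if (
      typeof payload === "object" && payload !== null &&
      typeof payload.userId === "string" &&
      typeof payload.displayName === "string" &&
      typeof payload.text === "string"
    ) {
      return payload as AIEventPayload;
    }
  } catch {
    // Malformed JSON falls through to null.
  }
  return null;
}

// Browser usage sketch (assuming an SSE endpoint):
//   const source = new EventSource("/api/events");
//   source.onmessage = (e) => {
//     const event = parseAIEvent(e.data);
//     if (event) console.log(`${event.displayName}: ${event.text}`);
//   };
```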
- Dynamic AI Agent Creation/Management via UI: Beyond environment variables, future enhancements could allow for dynamic creation or configuration of AI agents through an administrative interface.
- Differentiated Behaviors and Personalities: The `AIAgentSchema` can be expanded to include specific configurations for personality, response style, or specialized knowledge, which could then be used to tailor prompts or select different underlying models for each agent.
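As a sketch of that direction, an expanded agent configuration might look like the following. None of these optional fields exist in the current schema; the names (`personality`, `responseStyle`, `model`) and the `buildSystemPrompt` helper are hypothetical:

```typescript
// Hypothetical expanded agent config; only id and displayName exist today.
interface AIAgentConfig {
  id: string;
  displayName: string;
  personality?: string;   // e.g. "curious", "cautious"
  responseStyle?: string; // e.g. "terse", "verbose"
  model?: string;         // per-agent override of the underlying model
}

// Sketch of how the extra fields could tailor each agent's system prompt.
function buildSystemPrompt(agent: AIAgentConfig): string {
  const base = `You are ${agent.displayName}.`;
  return agent.personality
    ? `${base} Your personality is ${agent.personality}.`
    : base;
}
```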