This project is an experimental multi-agent AI system built using the CrewAI framework. It is designed to act as a fully autonomous Wall Street research team. Given a stock ticker (e.g., TSLA), the system dynamically queries real-time web data, parses complex financial metrics (revenue, gross margins, debt levels), and synthesizes a professional three-part investment report culminating in a concrete BUY, HOLD, or SELL recommendation.
The system operates using a sequential, multi-agent pipeline—functioning much like a multi-stage control system where the output of the first node dictates the behavior of the second.
- Agent 1: The Senior Stock Research Analyst
- Function: The "Sensor" of the system.
- Tools: Equipped with `SerperDevTool` to query Google Search for real-time SEC filings (10-K/10-Q) and news.
- Constraints: Heavily prompted with negative constraints to prevent chronological hallucinations (e.g., pulling data from previous years) and forced into a sequential execution loop to avoid exceeding API rate limits.
- Agent 2: The Senior Investment Advisor
- Function: The "Controller" of the system.
- Task: Ingests the raw financial telemetry gathered by the Analyst and evaluates the macroeconomic environment to output a formatted Markdown report.
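The two-stage data contract above can be sketched in plain Python (function names and the telemetry fields are illustrative; the real project wires this up with CrewAI `Agent`/`Task`/`Crew` objects):

```python
def research_analyst(ticker: str) -> dict:
    """The "Sensor": in the real system this queries Serper/Google for
    filings and news. Placeholder telemetry shows the data contract."""
    return {"ticker": ticker, "revenue": None, "news": []}

def investment_advisor(telemetry: dict) -> str:
    """The "Controller": turns raw telemetry into a Markdown report,
    reporting missing data instead of fabricating it."""
    lines = [f"# Investment Report: {telemetry['ticker']}"]
    if telemetry["revenue"] is None:
        lines.append("Revenue: data unavailable (reported, not fabricated).")
    return "\n".join(lines)

# Sequential pipeline: the Analyst's output feeds the Advisor.
report = investment_advisor(research_analyst("TSLA"))
```

The key design point is that the second stage consumes only the structured output of the first, which is what makes the pipeline behave like a two-node control loop.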
Building autonomous agents using smaller, highly optimized models (like llama-3.1-8b-instant via the Groq API) presented unique system-level challenges:
- Chronological Drift (Knowledge Cutoff): Smaller LLMs naturally default to their static training data. To solve this, system variables (`current_date`, `current_year`) were injected directly from Python's `datetime` module into the YAML prompt templates, anchoring the agents strictly to the present day.
- The "Phantom Tool" Problem: Open-weight models occasionally hallucinate tool schemas (e.g., attempting to call `<function=brave_search>`). This was mitigated by strict, hardcoded overrides in the agent's backstory, forcing compliance with the provided CrewAI toolset.
- API Rate Limits (Tokens Per Minute): Initial iterations allowed the agent to execute parallel web searches, which quickly exceeded the 6,000 tokens-per-minute (TPM) limit of the Groq API. The system was optimized by forcing the agent into a tightly constrained, sequential search loop with a maximum iteration limit.
- Predictive Hallucination: When faced with missing data (e.g., searching for a future, unreleased earnings report), predictive text engines will fabricate numbers rather than returning a `NULL` state. Explicit "Anti-Hallucination Constraints" were required to teach the agent to fail gracefully and report missing data accurately.
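The date-injection fix for chronological drift can be sketched as follows (the template text and variable names are illustrative; the real project renders them into YAML prompt templates):

```python
from datetime import date

# Prompt template with placeholders for the injected system variables.
PROMPT_TEMPLATE = (
    "You are a stock research analyst. Today's date is {current_date} "
    "(year {current_year}). Only cite filings and news from {current_year}; "
    "if data is unavailable, say so explicitly instead of guessing."
)

def render_prompt(template: str) -> str:
    """Anchor the agent to the present day at render time."""
    today = date.today()
    return template.format(
        current_date=today.isoformat(),
        current_year=today.year,
    )

prompt = render_prompt(PROMPT_TEMPLATE)
```

Because the values come from `datetime` at runtime rather than the model's training data, the agent's notion of "now" stays correct no matter when the pipeline runs.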
While the pipeline logic and tool-calling architecture are fully functional, running complex financial synthesis on an 8-billion-parameter model often produces internally inconsistent output (e.g., accurate top-line revenue alongside hallucinated sub-metrics).
Future iterations of this architecture should point the CrewAI LLM configuration toward a larger model (e.g., GPT-4o, Claude 3.5 Sonnet, or a local Llama 3 70B via Ollama) or add a RAG (Retrieval-Augmented Generation) pipeline over raw SEC PDF filings to substantially improve data fidelity.
- Python 3.10+
- CrewAI (Agent orchestration)
- LiteLLM / Groq API (Inference engine)
- Serper API (Real-time Google Search results)
- Clone the repository.
- Install dependencies: `pip install -r requirements.txt`
- Set your `.env` variables: `GROQ_API_KEY` and `SERPER_API_KEY`.
- Run the pipeline: `python src/my_first_agent_project/main.py`
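A minimal `.env` sketch (the values below are placeholders; substitute your own keys):

```
GROQ_API_KEY=your-groq-api-key
SERPER_API_KEY=your-serper-api-key
```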