This project simulates a team of autonomous LLM agents collaboratively drafting a cross-functional project proposal. It evaluates how well agents like a Project Manager, Technical Lead, and Business Analyst emulate real-world collaboration.
Built using:
- Mistral-7B Instruct v0.2 (local GGUF) for fast inference via
llama-cpp-python - AutoGen-style roles, with shared memory and log analysis
- Full support for hallucination detection, repetition tracking, and concept growth metrics
- Can LLM agents emulate realistic cross-functional decision-making?
- Does reasoning improve across multiple rounds of conversation?
- How prone are autonomous agents to hallucination or topic drift?
- What evaluation strategies can benchmark multi-agent collaboration?
| Component | Description |
|---|---|
conversation_log.txt |
Transcript of 5 rounds of agent collaboration |
Sarthak_Research_1.ipynb |
Main notebook with simulation, visualizations |
hallucination_report.csv |
Optional export of hallucination flags |
llm_topics_by_round.csv |
Optional export of LLM-extracted round topics |
- ProjectManager: Sets timeline, phases, deliverables
- TechnicalLead: Designs architecture, picks tech stack
- BusinessAnalyst: Understands user needs, defines KPIs
| Metric | Description |
|---|---|
| 🔁 Repetition | Detects repeated phrases across agent replies |
| 🎯 Role Alignment | Checks if agents stay within their domain keywords |
| 🧠 New Concepts | Tracks keyword and topic diversity across rounds |
| 🚨 Hallucination | Flags off-topic or unverifiable terms |
| 🧩 LLM Topics | Mistral-based topic summarization per round (semantic view) |
- Concept growth per round (line chart)
- New concepts per agent/round (bar chart)
- Repetition and hallucination heatmaps
- LLM-generated topic bar chart (stacked horizontal)
All generated using
matplotlibin the main notebook
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-namePlace your mistral-7b-instruct-v0.2.Q4_K_M.gguf model inside a /models folder.
pip install llama-cpp-python matplotlibOpen Sarthak_Research_1.ipynb in Jupyter or Colab
Make sure to point to your model path correctly.
MIT License — feel free to fork and build upon this!
Sarthak Chandarana
LLM Systems Researcher
Find me on GitHub or LinkedIn for questions or collaborations!