Hi, and thanks for FinRobot. It is a very helpful platform for building LLM-based financial analysis agents.
I maintain an MIT-licensed open-source project called WFGY (~1.5k GitHub stars).
One of its main components is a 16-problem “ProblemMap” for RAG and LLM pipelines, which catalogues common failure modes across:
- data and document ingestion
- embeddings and vector stores
- retrievers and ranking
- tool routing and multi-step reasoning
- evaluation gaps and guardrails
ProblemMap overview:
https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md
This checklist is already referenced or integrated by several external projects, including:
- Harvard MIMS Lab: ToolUniverse (LLM robustness and RAG debugging entry)
- QCRI LLM Lab: Multimodal RAG Survey
- curated lists such as Awesome AI in Finance and AI Agents for Cybersecurity
Why this matters for FinRobot
FinRobot is explicitly positioned as an AI agent platform for financial analysis using LLMs, where agents read filings, reports and market data, then produce decisions or summaries. In these agent stacks, reliability issues such as tool routing mistakes, misinterpreted indicators, retrieval failures on documents and long-horizon reasoning errors can be quite costly.
Many FinRobot users already have working agents, but still struggle to explain where the failure lives when a financial answer looks plausible yet is wrong. Is it the vector store, the document ingestion, the prompt, the tool orchestration or the evaluation loop?
The 16-problem map acts as a semantic firewall and debugging guide for such LLM agents and can give FinRobot users a structured way to label and triage failures.
Concrete proposal
If you think this is aligned, I would be happy to:
- Draft a docs page such as “Robustness and debugging with WFGY ProblemMap” that maps common FinRobot patterns to a subset of the 16 problems, for example:
  - retrieval and grounding issues on financial documents
  - misrouted tools and wrong trade explanations
  - long-horizon reasoning collapse in multi-step agent workflows
- Provide a small example notebook or script that shows how to log agent incidents and tag them with ProblemMap numbers, so teams can build their own incident library on top of FinRobot.
- Add a concise troubleshooting table:
  Symptom in a FinRobot agent → ProblemMap number → which part of the agent configuration or pipeline to inspect.
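To make the incident-logging idea more concrete, here is a minimal sketch of what such a script could look like. Everything in it is an assumption for illustration: `Incident`, `log_incident`, `count_by_problem` and the JSONL file layout are hypothetical names, not part of FinRobot or WFGY, and the tag is treated as a plain integer from 1 to 16 without encoding what each ProblemMap number means.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path

# Assumption: ProblemMap entries are numbered 1..16.
PROBLEM_COUNT = 16


@dataclass
class Incident:
    """One observed agent failure, tagged with a ProblemMap number (hypothetical schema)."""

    agent: str        # name of the agent that misbehaved
    symptom: str      # free-text description of what went wrong
    problem_no: int   # ProblemMap number chosen during triage
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self) -> None:
        # Reject tags outside the 16-problem range early.
        if not 1 <= self.problem_no <= PROBLEM_COUNT:
            raise ValueError(f"problem_no must be in 1..{PROBLEM_COUNT}")


def log_incident(path: str, incident: Incident) -> None:
    """Append one incident as a JSON line to the incident library file."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(incident)) + "\n")


def count_by_problem(path: str) -> Counter:
    """Count logged incidents per ProblemMap number."""
    counts: Counter = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                counts[json.loads(line)["problem_no"]] += 1
    return counts


if __name__ == "__main__":
    # Example usage with made-up agent names and tags.
    log_file = "incidents.jsonl"
    log_incident(log_file, Incident("report_agent", "cited a figure not in the filing", 1))
    log_incident(log_file, Incident("market_agent", "called the wrong data tool", 2))
    print(count_by_problem(log_file))
```

An append-only JSONL file keeps the sketch dependency-free and easy to diff; a real notebook could swap in whatever logging or storage a team already uses.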
This would be an optional, docs-only integration. FinRobot would stay framework-neutral, and users who are interested in robustness tooling can opt in.
If this sounds interesting, I can open a PR with a first draft and iterate based on your feedback.