Date: 2023-12-15
Topic: Building an intelligent document review system with ReAct architecture
Today I designed and implemented a regulatory compliance review system using the ReAct (Reasoning + Acting) architecture. The system automatically parses legal provisions and uses multimodal AI for intelligent review.
The system consists of four main modules:
- Rule Parser: Converts legal text into structured rules
- Review Controller: Coordinates the workflow using a ReAct loop
- Tool Integration: Provides OCR, image comparison, etc.
- Prompt Engine: Manages LLM interaction templates
The Review Controller's core ReAct loop:

    import json
    from typing import Dict

    async def _execute_react_loop(self, prompt: str, max_steps: int = 5) -> Dict:
        """Run the Thought -> Action -> Observation cycle for at most max_steps."""
        current_prompt = prompt
        history = []
        for step in range(max_steps):
            response = await llm_chat(current_prompt, temperature=0.7)
            # Parse tool call from response
            tool_name, tool_args = await self._parse_latest_plugin_call(response)
            if not tool_name:
                # No tool requested: treat the reply as final and stop.
                # (`continue` here would resend the unchanged prompt.)
                break
            # Execute tool
            result = await self._execute_tool(tool_name, tool_args)
            # Record history
            history.append({
                "step": step + 1,
                "thought": self._extract_thought(response),
                "action": f"{tool_name}: {json.dumps(tool_args)}",
                "result": result
            })
            # Append the model's reply and the observation so the next step sees both
            current_prompt = (
                f"{current_prompt}\n{response}\n"
                f"Observation: {json.dumps(result)}"
            )
        return self._generate_final_result(history)

Parsed rules follow a structured format:

    {
      "rules": [
        {
          "rule_id": "RULE_001",
          "rule_type": "content_requirement",
          "source": "Article 7",
          "description": "Health food advertisement content management",
          "requirements": [
            {
              "type": "content_check",
              "content": "Must match registration certificate",
              "mandatory": true
            }
          ],
          "tools_required": [
            {"tool": "text_recognition", "purpose": "Check text content"}
          ]
        }
      ]
    }

The Prompt Engine builds the audit prompt from the tool descriptions, rules, and extracted content:

    import json

    def get_audit_prompt(self, text_data, tools, rules, context=None):
        """Fill the ReAct template with tool descriptions, rules, and content."""
        context = context or {}  # context defaults to None; avoid calling .get on None
        tool_descriptions = []
        for name, info in tools.items():
            desc = f"{name}: {info['description']}\n"
            desc += f"Parameters: {', '.join(info['required_params'])}"
            tool_descriptions.append(desc)
        return self.react_template.format(
            tools='\n\n'.join(tool_descriptions),
            tool_names=','.join(tools.keys()),
            text_content=text_data,
            image_info=json.dumps(context.get('image_info', {}), ensure_ascii=False),
            rules=json.dumps(rules, ensure_ascii=False, indent=2)
        )

The review workflow runs in five stages:
- Scenario Recognition: Identify document type (health food ad, medical device, etc.)
- Rule Loading: Load applicable regulatory rules
- Content Extraction: OCR for text, VL model for images
- Compliance Check: Verify against rules using ReAct loop
- Risk Assessment: Generate report with findings
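The stages listed above can be chained as a simple pipeline. The following is only a minimal sketch: `ReviewContext`, `RULES_BY_TYPE`, and every stage function are hypothetical names, and each stage body is a stand-in (keyword matching instead of a classifier, a placeholder instead of OCR or the ReAct loop), not the system's actual implementation.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ReviewContext:
    """Carries intermediate results from one stage to the next."""
    document: str
    doc_type: str = ""
    rules: List[Dict[str, Any]] = field(default_factory=list)
    extracted: Dict[str, Any] = field(default_factory=dict)
    findings: List[Dict[str, Any]] = field(default_factory=list)
    report: Dict[str, Any] = field(default_factory=dict)


# Hypothetical in-memory rule store keyed by document type.
RULES_BY_TYPE = {
    "health_food_ad": [{"rule_id": "RULE_001", "source": "Article 7"}],
}


async def recognize_scenario(ctx: ReviewContext) -> ReviewContext:
    # Stand-in classifier; a real system would likely ask a model.
    is_ad = "health food" in ctx.document.lower()
    ctx.doc_type = "health_food_ad" if is_ad else "unknown"
    return ctx


async def load_rules(ctx: ReviewContext) -> ReviewContext:
    ctx.rules = RULES_BY_TYPE.get(ctx.doc_type, [])
    return ctx


async def extract_content(ctx: ReviewContext) -> ReviewContext:
    # Stand-in for OCR text and vision-language model output.
    ctx.extracted = {"text": ctx.document, "image_info": {}}
    return ctx


async def check_compliance(ctx: ReviewContext) -> ReviewContext:
    # Stand-in for the ReAct loop: one placeholder finding per rule.
    ctx.findings = [
        {"rule_id": rule["rule_id"], "status": "needs_review"}
        for rule in ctx.rules
    ]
    return ctx


async def assess_risk(ctx: ReviewContext) -> ReviewContext:
    ctx.report = {
        "doc_type": ctx.doc_type,
        "rules_checked": len(ctx.rules),
        "findings": ctx.findings,
    }
    return ctx


STAGES = [recognize_scenario, load_rules, extract_content,
          check_compliance, assess_risk]


async def run_review(document: str) -> Dict[str, Any]:
    """Run every stage in order and return the final report."""
    ctx = ReviewContext(document=document)
    for stage in STAGES:
        ctx = await stage(ctx)
    return ctx.report
```

Calling `asyncio.run(run_review("Health food advertisement: ..."))` returns the report dictionary. Keeping the stages in a list makes it easy to insert or skip a stage per document type.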
The ReAct architecture works well for this use case because:
- Each document may require different tools
- The reasoning chain adapts based on findings
- The process is transparent and auditable
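Because each document may call for different tools, the dispatch step benefits from a registry keyed by tool name. The sketch below reuses the `description` and `required_params` fields seen in the prompt-building code; the `func` field, the `execute_tool` helper, and the placeholder `text_recognition` body are assumptions made for illustration. Errors are returned as data rather than raised, so they can be fed back to the model as an observation and kept in the audit history.

```python
from typing import Any, Callable, Dict


def text_recognition(image_path: str) -> Dict[str, Any]:
    # Placeholder: a real tool would run OCR on the image.
    return {"text": f"<text recognized from {image_path}>"}


# Hypothetical registry; "func" is an assumed field holding the callable.
TOOLS: Dict[str, Dict[str, Any]] = {
    "text_recognition": {
        "description": "Extract text from an image",
        "required_params": ["image_path"],
        "func": text_recognition,
    },
}


def execute_tool(name: str, args: Dict[str, Any],
                 tools: Dict[str, Dict[str, Any]] = TOOLS) -> Dict[str, Any]:
    """Look up a tool by name, validate its arguments, and run it."""
    tool = tools.get(name)
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    missing = [p for p in tool["required_params"] if p not in args]
    if missing:
        return {"ok": False,
                "error": f"missing parameters: {', '.join(missing)}"}
    try:
        func: Callable[..., Any] = tool["func"]
        return {"ok": True, "result": func(**args)}
    except Exception as exc:  # report the failure instead of crashing the loop
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

For example, `execute_tool("text_recognition", {"image_path": "ad.png"})` returns a dictionary with `"ok": True`, while an unknown tool name yields an error entry the model can react to in its next step.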
The rule parsing was the hardest part. Legal text is ambiguous, and converting it to structured rules requires careful prompt engineering. Using progressive temperature (starting high, reducing on retry) helped improve consistency.
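The progressive-temperature idea can be sketched as a retry loop that lowers the temperature until the output validates against the rule structure. Everything here is an assumption for illustration: the `generate` callable stands in for the LLM call (synchronous for brevity), the schedule `(0.7, 0.4, 0.1)` is an arbitrary example, and `validate_rules` checks only the top-level keys of the rule format.

```python
import json
from typing import Any, Callable, Dict, Optional, Sequence

# Keys every rule is assumed to need, taken from the structured format.
REQUIRED_RULE_KEYS = {"rule_id", "rule_type", "source",
                      "description", "requirements"}


def validate_rules(raw: str) -> Optional[Dict[str, Any]]:
    """Return the parsed document if it has the expected shape, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    rules = data.get("rules")
    if not isinstance(rules, list) or not rules:
        return None
    for rule in rules:
        if not isinstance(rule, dict) or not REQUIRED_RULE_KEYS.issubset(rule):
            return None
    return data


def parse_with_progressive_temperature(
    legal_text: str,
    generate: Callable[[str, float], str],
    temperatures: Sequence[float] = (0.7, 0.4, 0.1),
) -> Dict[str, Any]:
    """Try each temperature in order and return the first valid parse."""
    for temperature in temperatures:
        parsed = validate_rules(generate(legal_text, temperature))
        if parsed is not None:
            return parsed
    raise ValueError("no valid rule structure after all retries")
```

Lower temperatures make sampling more deterministic, so later attempts trade variety for a better chance of well-formed output.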
One insight: the ReAct loop should have escape hatches. Without max_steps limits and timeout protection, the system could loop indefinitely.
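A minimal sketch of such escape hatches, assuming an async `step_fn` that performs one ReAct step and returns `None` when it is finished. The name `run_bounded_loop` and the default limits are hypothetical. It combines a step cap, a per-step timeout, and an overall time budget using `asyncio.wait_for`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional


async def run_bounded_loop(
    step_fn: Callable[[int], Awaitable[Optional[Any]]],
    max_steps: int = 5,
    step_timeout: float = 30.0,
    total_timeout: float = 120.0,
) -> List[Dict[str, Any]]:
    """Call step_fn(step) until it returns None or a limit is reached."""
    results: List[Dict[str, Any]] = []

    async def _loop() -> None:
        for step in range(max_steps):  # escape hatch 1: step cap
            try:
                # escape hatch 2: no single step may hang forever
                outcome = await asyncio.wait_for(step_fn(step),
                                                 timeout=step_timeout)
            except asyncio.TimeoutError:
                results.append({"step": step + 1, "error": "step timed out"})
                break
            if outcome is None:  # the step signalled completion
                break
            results.append({"step": step + 1, "result": outcome})

    try:
        # escape hatch 3: overall time budget for the whole loop
        await asyncio.wait_for(_loop(), timeout=total_timeout)
    except asyncio.TimeoutError:
        results.append({"error": "total time budget exceeded"})
    return results
```

Recording the timeout in the results, rather than raising, keeps the partial history available for the final report.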
Related topics to explore:
- Legal NLP and rule extraction
- Multi-agent systems for complex tasks
- Tool-use optimization in LLMs
- Regulatory technology (RegTech) patterns