Date: 2024-12-05
Topic: Implementing a regulatory compliance scene with VL+LLM
Today I implemented a scene for reviewing prescription drug advertisements - checking if they contain required compliance statements.
Prescription drug ads must prominently display: "This advertisement is for medical and pharmaceutical professionals only."
The scene needs to:
- Extract text from the advertisement image/PDF
- Check if the required statement is present and prominent
- Return a compliance judgment
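The three steps above can be sketched without any model calls, as a deterministic pre-check on text that has already been extracted. This is only an illustration under stated assumptions: `REQUIRED_STATEMENT`, `normalize`, `PresenceCheck`, and `check_statement_present` are hypothetical names that are not part of the actual scene, and plain text matching can only confirm presence, not visual prominence (that still needs the VL model).

```python
import re
from dataclasses import dataclass

# Required wording, taken from the regulation text quoted above.
REQUIRED_STATEMENT = (
    "This advertisement is for medical and pharmaceutical professionals only"
)


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace,
    so OCR line breaks and trailing periods do not affect matching."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


@dataclass
class PresenceCheck:
    # Hypothetical result type; the real scene returns BaseJudgement instead.
    statement_found: bool
    reason: str


def check_statement_present(extracted_text: str) -> PresenceCheck:
    """Report whether the required statement appears in the extracted text.

    This checks presence only. Prominence (size, position, contrast)
    cannot be judged from plain text.
    """
    found = normalize(REQUIRED_STATEMENT) in normalize(extracted_text)
    if found:
        reason = "Required statement found in extracted text."
    else:
        reason = "Required statement not found in extracted text."
    return PresenceCheck(statement_found=found, reason=reason)
```

A check like this could run before the LLM call to short-circuit obvious failures, though whether that is worth the extra code depends on how often ads omit the statement entirely.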
The initial implementation was overly complex. After refactoring:
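    @router.post("/prescription_ad_review", response_model=BaseJudgementResponse)
    async def recognize_prescription_ad_api(
        advertisement_file: UploadFile = File(...),
        batch_size: int = Form(default=1)
    ) -> BaseJudgementResponse:
        # Prompt for the vision-language model (abbreviated in these notes)
        vl_prompt = """Please identify if there is a prominent statement
    'This advertisement is for medical and pharmaceutical professionals only'..."""
        try:
            # Step 1: run the VL model over the uploaded image/PDF pages
            processed_doc = await recognizeImageTextVL(
                advertisement_file,
                prompt=vl_prompt,
                batch_size=batch_size
            )
            # Step 2: merge the per-page output into a single text block
            content = "\n".join([page["content"] for page in processed_doc["pages"]])
            # NOTE: analysis_query was not defined in the snippet as recorded; the
            # wording below is an assumed placeholder, not the actual prompt.
            analysis_query = (
                "Based on the recognized text below, state whether the required "
                "statement is prominently displayed. If it is not, include the "
                "phrase 'not prominently stated'.\n\n" + content
            )
            # Step 3: ask the LLM for the compliance judgment
            result = await llm_chat(query=analysis_query)
            return BaseJudgementResponse(
                code=200,
                msg="Recognition successful",
                data=BaseJudgement(
                    # The ad is flagged when the LLM reply contains this exact phrase
                    is_illegal="not prominently stated" in result,
                    reason=result
                )
            )
        except Exception as e:
            return BaseJudgementResponse(
                code=400,
                msg=f"Processing failed: {str(e)}",
                data=None
            )

The supported upload types are defined as a constant:

    SUPPORTED_IMAGE_TYPES = frozenset({
        'image/jpeg',
        'image/png',
        'image/gif',
        'image/bmp',
        'application/pdf'
    })

The batch_size parameter controls how PDF pages are processed in batches, trading speed against resource usage.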
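To make the two ideas above concrete, here is a minimal sketch of validating an upload's content type against the supported set and splitting pages into batches. The helper names `is_supported_upload` and `iter_page_batches` are hypothetical, and how `recognizeImageTextVL` actually uses `batch_size` internally is not recorded in these notes, so the batching shown is an assumption about its general shape.

```python
from typing import Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")

SUPPORTED_IMAGE_TYPES = frozenset({
    "image/jpeg",
    "image/png",
    "image/gif",
    "image/bmp",
    "application/pdf",
})


def is_supported_upload(content_type: Optional[str]) -> bool:
    """Return True if the MIME type is in the supported set.

    Parameters such as '; charset=binary' are ignored and the
    comparison is case-insensitive.
    """
    if not content_type:
        return False
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in SUPPORTED_IMAGE_TYPES


def iter_page_batches(pages: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size pages.

    Larger batches mean fewer model calls but more memory per call.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(pages), batch_size):
        yield list(pages[start:start + batch_size])
```

In the endpoint, a check like `is_supported_upload(advertisement_file.content_type)` could reject unsupported files before any model call is made.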
Standardized error responses with exception handlers:
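    from fastapi.exceptions import RequestValidationError
    from fastapi.responses import JSONResponse

    @app.exception_handler(RequestValidationError)
    async def validation_exception_handler(request, exc):
        # Wrap FastAPI's validation errors in the same envelope as other responses
        return JSONResponse(
            status_code=422,
            content=BaseResponse(
                code=422,
                msg="Validation error",
                data={"detail": exc.errors()}
            ).dict()  # .dict() is the Pydantic v1 API; v2 names it .model_dump()
        )

Also spent time debugging SSH access to the deployment server: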
    # Check service status
    systemctl status sshd
    # Test with verbose output
    ssh -vvv username@hostname
    # Verify firewall
    sudo iptables -L

The issue was a firewall rule blocking the SSH port.
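The connection-testing step can also be scripted. Below is a minimal sketch of a TCP reachability check using only the standard library. The function name `is_port_open` is hypothetical, and port 22 is assumed as the SSH port (the notes do not say which port the server used).

```python
import socket


def is_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A firewall rule that drops packets typically shows up as a timeout,
    while a rejected or unserved port shows up as 'connection refused'.
    Both are subclasses of OSError and are reported as False here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result alone does not distinguish a blocked port from a stopped service, so it complements the `systemctl` and `iptables` checks rather than replacing them.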
The refactoring process was satisfying. The original code had multiple layers of unnecessary JSON parsing and data model conversions. The simplified version does the same thing in half the code.
The key insight: start simple, add complexity only when needed. The original implementation anticipated edge cases that never occurred, adding complexity without value.
The SSH debugging was a reminder that deployment issues are often infrastructure, not code. Knowing basic network troubleshooting (firewall rules, service status, connection testing) is essential for full-stack development.
- Regulatory text analysis with NLP
- Image text detection (OCR) improvements
- FastAPI middleware patterns
- Infrastructure automation with Ansible