This MVP implements a comprehensive motor claim liability determination system that automates document collation, analysis, and decision-readiness assessment. The system processes multiple claim documents, extracts key information, validates against baseline expectations, identifies risks and gaps, and generates actionable recommendations.
The system is built as a modular Python application following the 7 Functional Requirements (FR-1 through FR-7):
- **Data Models** (`src/models.py`): Comprehensive Pydantic models for all entities
  - Document, Entity, TimelineEvent, Gap, Conflict, Risk, Recommendation
  - ConsolidatedFacts, ReadinessIndex, ScoringResult, ClaimDossier
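As a sketch of how these entities might be modeled, here is a minimal Pydantic example; the field names are illustrative assumptions, not the actual `src/models.py` schema:

```python
from typing import List, Optional
from pydantic import BaseModel

class Document(BaseModel):
    """One claim document; fields here are illustrative, not the real schema."""
    doc_id: str
    doc_type: str                    # e.g. "fnol", "police_report"
    received_date: str
    author: Optional[str] = None

class Gap(BaseModel):
    """A missing document or field, with a severity label."""
    description: str
    severity: str = "medium"

class ClaimDossier(BaseModel):
    """Top-level record collating documents and findings for one claim."""
    claim_id: str
    documents: List[Document] = []
    gaps: List[Gap] = []

dossier = ClaimDossier(
    claim_id="CLM-001",
    documents=[Document(doc_id="D1", doc_type="fnol", received_date="2024-05-01")],
)
```

Pydantic validates each field on construction, so malformed documents fail fast at the model boundary rather than deep inside the pipeline.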
- **Synthetic Data Generators** (`src/data_generators/`)
  - `synthetic_claim_generator.py`: Generates realistic synthetic motor claims
  - Creates FNOL forms, statements, police reports, repair quotes, correspondence, call transcripts, and photos
  - Uses the Faker library for realistic data generation
- **Document Processing** (`src/document_processing/`)
  - `document_retriever.py` (FR-1): Retrieves documents for a claim
  - `normalizer.py` (FR-2): Normalizes, classifies, and deduplicates documents
  - `summarizer.py` (FR-3): Summarizes documents and extracts structured facts
- **Analysis** (`src/analysis/`)
  - `bed_checker.py` (FR-4): Cross-checks facts against the Baseline Expectations Document
  - `risk_detector.py` (FR-5): Detects gaps, conflicts, and risks
  - `recommendations.py` (FR-6): Generates actionable next steps
- **Output** (`src/output/`)
  - `dossier_generator.py` (FR-7): Generates comprehensive claim dossiers in Markdown format
- **Orchestration** (`src/liability_processor.py`)
  - Main processor that orchestrates all modules
  - Processes claims through the full pipeline
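The orchestration pattern can be sketched as a chain of stage functions, one per FR module; the function bodies below are stubs, not the real implementations:

```python
# Hypothetical stage functions; each mirrors one FR module with a stub body.
def retrieve(claim_id):
    """FR-1: fetch the raw documents for a claim (stubbed)."""
    return [{"type": "fnol", "text": "..."}]

def normalize(docs):
    """FR-2: classify and deduplicate documents (stubbed)."""
    return [{**d, "normalized": True} for d in docs]

def summarize(docs):
    """FR-3: extract structured facts from normalized documents (stubbed)."""
    return {"doc_count": len(docs), "facts": {}}

def process_claim(claim_id):
    """Run one claim through the pipeline (analysis and output stages elided)."""
    docs = normalize(retrieve(claim_id))
    return summarize(docs)

result = process_claim("CLM-001")
```

Keeping each stage a pure function over plain data makes it easy to swap the stubbed retriever for a real ClaimCenter-backed one later.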
The system handles the following document types:

- First Notice of Loss (FNOL)
- Claimant Statements
- Insured Statements
- Police Reports
- Repair Quotes/Invoices
- Photo Metadata
- Email/SMS Correspondence
- Call Transcripts
- Broker Submissions
- Third Party Statements
- Medical Reports (structure ready)
- Telematics (structure ready)
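These document types are typically captured in an enum; a minimal sketch, assuming a `DocumentType` enum along the lines of the one in `src/models.py` (members shown are an illustrative subset):

```python
from enum import Enum

class DocumentType(str, Enum):
    """Illustrative subset; the full enum lives in src/models.py."""
    FNOL = "fnol"
    CLAIMANT_STATEMENT = "claimant_statement"
    POLICE_REPORT = "police_report"
    REPAIR_QUOTE = "repair_quote"
    CALL_TRANSCRIPT = "call_transcript"

# Round-trip from the string stored on a document back to the enum member.
doc_type = DocumentType("police_report")
```

Subclassing `str` lets the enum values serialize directly in Pydantic models and YAML/JSON without custom encoders.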
Analysis capabilities include:

- **Baseline Expectations Cross-Check**
  - Existence scoring (are required documents/fields present?)
  - Completeness scoring (are all required fields filled?)
  - Quality scoring (are quality criteria met?)
  - Overall Readiness Index (0-100)
- **Risk Detection**
  - Late FNOL submission
  - Missing police reports
  - Template-like statements
  - High repair costs
  - Stale documents
- **Gap Detection**
  - Missing required documents
  - Missing signatures
  - Missing third-party details
  - Stale documentation
- **Conflict Detection**
  - Date inconsistencies
  - Location inconsistencies
  - Registration inconsistencies
  - Damage description inconsistencies
- **Recommendations**
  - Prioritized action items
  - Templated outreach messages
  - Links to related gaps/conflicts/risks
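The Readiness Index can be illustrated as a weighted combination of the three component scores; the weights below are illustrative defaults, as the real values come from the BED YAML:

```python
def readiness_index(existence, completeness, quality,
                    weights=(0.4, 0.35, 0.25)):
    """Combine three component scores (each 0.0-1.0) into a 0-100 index.

    The weights are illustrative; in the system they are configured in
    the BED YAML file.
    """
    w_e, w_c, w_q = weights
    score = w_e * existence + w_c * completeness + w_q * quality
    return round(score * 100)

score = readiness_index(1.0, 0.9, 0.5)
```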
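A risk indicator such as late FNOL submission reduces to a simple date rule; a minimal sketch, assuming a 7-day threshold (the actual threshold is configured in `config/config.yaml`):

```python
from datetime import date

def check_late_fnol(incident_date, fnol_date, max_days=7):
    """Return a risk flag when FNOL arrives more than max_days after the incident."""
    delay = (fnol_date - incident_date).days
    if delay > max_days:
        return {"risk": "late_fnol", "delay_days": delay}
    return None

flag = check_late_fnol(date(2024, 5, 1), date(2024, 5, 15))
```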
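Conflict detection amounts to comparing the same extracted field across documents; a sketch under the assumption that facts are held as per-document dictionaries:

```python
def detect_field_conflicts(docs, field):
    """Report a conflict when documents disagree on a single fact.

    `docs` maps document type -> extracted facts (an assumed shape).
    """
    values = {}
    for doc_type, facts in docs.items():
        if field in facts:
            values.setdefault(facts[field], []).append(doc_type)
    if len(values) > 1:
        return {"field": field, "values": values}
    return None

conflict = detect_field_conflicts(
    {"fnol": {"incident_date": "2024-05-01"},
     "police_report": {"incident_date": "2024-05-02"}},
    "incident_date",
)
```

Grouping document types by value (rather than just counting distinct values) preserves the evidence needed to show *which* documents disagree in the dossier.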
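Templated outreach can be sketched with plain string templates; the template text and gap key below are hypothetical, not the ones shipped in the recommendations module:

```python
# Hypothetical outreach templates keyed by gap type.
TEMPLATES = {
    "missing_police_report": (
        "Dear {claimant}, to progress claim {claim_id} we need a copy of "
        "the police report for the incident on {incident_date}."
    ),
}

def draft_outreach(gap_type, **context):
    """Render an outreach message for a detected gap."""
    return TEMPLATES[gap_type].format(**context)

msg = draft_outreach(
    "missing_police_report",
    claimant="A. Smith", claim_id="CLM-001", incident_date="2024-05-01",
)
```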
Each claim generates a comprehensive Markdown dossier containing:
- Executive Summary with Readiness Index
- Scoring Dashboard (requirement-by-requirement breakdown)
- Master Timeline of events
- Consolidated Facts (entities, dates, locations, etc.)
- Gaps Identified (with severity and evidence)
- Conflicts Detected (with conflicting values)
- Risks Flagged (with indicators)
- Recommended Next Steps (with templated outreach)
- Document Summaries
- Appendix with metadata
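Dossier generation reduces to assembling Markdown sections; a minimal sketch covering only the executive summary and gaps sections (the real generator adds the timeline, conflicts, risks, and appendices):

```python
def render_dossier(claim_id, readiness, gaps):
    """Assemble a minimal Markdown dossier from already-computed results."""
    lines = [
        f"# Claim Dossier: {claim_id}",
        "",
        "## Executive Summary",
        f"Readiness Index: **{readiness}/100**",
        "",
        "## Gaps Identified",
    ]
    lines += [f"- {g}" for g in gaps] or ["- None"]
    return "\n".join(lines)

md = render_dossier("CLM-001", 84, ["Missing police report"])
```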
The Baseline Expectations Document (BED) is located at `data/bed/motor_claims_bed.yaml`:
- Defines 10 requirements for motor claims
- Specifies expected fields, quality criteria, thresholds
- Configurable scoring weights and readiness thresholds
- Risk indicator definitions
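Loading and querying the BED is straightforward with PyYAML; the fragment below is an illustrative stand-in for `motor_claims_bed.yaml`, not its actual contents:

```python
import yaml  # PyYAML

# Illustrative BED fragment; the real file defines 10 requirements.
BED_YAML = """
requirements:
  - id: fnol
    required: true
    expected_fields: [incident_date, location, policy_number]
scoring:
  weights: {existence: 0.4, completeness: 0.35, quality: 0.25}
readiness_thresholds: {ready: 80, review: 60}
"""

bed = yaml.safe_load(BED_YAML)
first = bed["requirements"][0]
```

Using `safe_load` (never `load`) avoids executing arbitrary YAML tags from a configuration file.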
System configuration is located at `config/config.yaml`:
- Document processing settings
- BED path and scoring weights
- Risk detection thresholds
- Output format preferences
The system includes comprehensive synthetic data generators that create:
- Realistic claim scenarios with multiple document types
- Varied incident dates, locations, and circumstances
- Realistic names, addresses, vehicle details (via Faker)
- Document metadata (dates, authors, sources)
- Optional documents (correspondence, photos, transcripts)
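A synthetic claim generator can be sketched with the standard library alone (the real generator uses Faker for realistic names and addresses); every field choice below is illustrative:

```python
import random

def generate_claim(seed=None):
    """Generate a minimal synthetic claim skeleton.

    The production generator uses Faker and emits full document sets;
    this sketch only shows the seeded-randomness pattern.
    """
    rng = random.Random(seed)  # seeded for reproducible test data
    return {
        "claim_id": f"CLM-{rng.randint(1000, 9999)}",
        "incident_date": f"2024-{rng.randint(1, 12):02d}-{rng.randint(1, 28):02d}",
        "documents": rng.sample(
            ["fnol", "statement", "police_report", "repair_quote"],
            k=rng.randint(2, 4),
        ),
    }

claim = generate_claim(seed=42)
```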
```bash
# Install dependencies
pip install -r requirements.txt

# Run demo
python demo.py

# Generate summary charts
python visualize_results.py
```

Project structure:

```
LiabilityIQ/
├── src/
│   ├── models.py                    # Data models
│   ├── liability_processor.py       # Main orchestrator
│   ├── data_generators/
│   │   └── synthetic_claim_generator.py
│   ├── document_processing/
│   │   ├── document_retriever.py    # FR-1
│   │   ├── normalizer.py            # FR-2
│   │   └── summarizer.py            # FR-3
│   ├── analysis/
│   │   ├── bed_checker.py           # FR-4
│   │   ├── risk_detector.py         # FR-5
│   │   └── recommendations.py       # FR-6
│   └── output/
│       └── dossier_generator.py     # FR-7
├── data/
│   ├── bed/
│   │   └── motor_claims_bed.yaml    # Baseline Expectations
│   └── synthetic/                   # Generated claims
├── config/
│   └── config.yaml                  # System configuration
├── output/                          # Generated dossiers
├── demo.py                          # Demo script
├── visualize_results.py             # Visualization helper
├── requirements.txt
└── README.md
```
The system is designed for easy extension:
- New Document Types: Add to the `DocumentType` enum and update classifiers
- New Requirements: Add to the BED YAML file
- New Risk Indicators: Add to risk detector with custom logic
- New Output Formats: Extend dossier generator (PDF, HTML, etc.)
- Integration: Replace document retriever with ClaimCenter API calls
Potential improvements for production:
- Real OCR/NLP for document processing
- Machine learning for entity extraction
- Integration with ClaimCenter database
- PDF/HTML output formats
- Web UI for interactive review
- Real-time processing capabilities
- Advanced conflict resolution algorithms
- Historical claim pattern analysis
The MVP includes:
- Synthetic data generation for testing
- End-to-end pipeline demonstration
- Visualization tools for result analysis
For production, add:
- Unit tests for each module
- Integration tests for full pipeline
- Performance benchmarks
- Accuracy validation against real claims
Key dependencies:

- pydantic: Data validation and models
- faker: Synthetic data generation
- pyyaml: Configuration file parsing
- python-dateutil: Date parsing
- matplotlib: Visualization (optional)
- reportlab: PDF generation (optional, for future)
Notes and limitations:

- This is an MVP focused on demonstrating the concept and architecture
- Document processing uses simplified text extraction (production would use OCR/NLP)
- Entity extraction uses pattern matching (production would use NER models)
- All analysis is rule-based (production could add ML models)
- Output is Markdown (easily extensible to PDF/HTML)
The MVP delivers:

✅ Document collation into a single consolidated record
✅ Document understanding and analytical assessment
✅ Confidence scoring of available evidence
✅ Gap and completeness analysis
✅ Recommended next steps to support claim progression
✅ Synthetic data generation for demonstration
✅ Visualization of the system in action
✅ Actual output generation (Markdown dossiers)