
Commit 2b9a7ac

Merge pull request #155 from Daylily-Informatics/feature/complete-domain-refactor
2 parents 4640179 + b8bbf1b commit 2b9a7ac

5 files changed

+388
-972
lines changed

BLOOM_SPECIFICATION.md

Lines changed: 175 additions & 131 deletions
@@ -1,20 +1,23 @@
11
# BLOOM LIMS - Comprehensive System Specification
22

3+
> **Last Updated**: 2024-12-24 | **Version**: Dynamically fetched from [GitHub Releases](https://github.com/Daylily-Informatics/bloom/releases)
4+
35
## Table of Contents
46
1. [Executive Summary](#executive-summary)
57
2. [System Architecture](#system-architecture)
6-
3. [Core Data Model](#core-data-model)
7-
4. [Database Schema](#database-schema)
8-
5. [Object Hierarchy](#object-hierarchy)
9-
6. [Template System](#template-system)
10-
7. [Workflow Engine](#workflow-engine)
11-
8. [Action System](#action-system)
12-
9. [File Management](#file-management)
13-
10. [API Layer](#api-layer)
14-
11. [Web Interface](#web-interface)
15-
12. [External Integrations](#external-integrations)
16-
13. [Configuration](#configuration)
17-
14. [Deployment](#deployment)
8+
3. [Domain Layer](#domain-layer)
9+
4. [Core Data Model](#core-data-model)
10+
5. [Database Schema](#database-schema)
11+
6. [Object Hierarchy](#object-hierarchy)
12+
7. [Template System](#template-system)
13+
8. [Workflow Engine](#workflow-engine)
14+
9. [Action System](#action-system)
15+
10. [File Management](#file-management)
16+
11. [API Layer](#api-layer)
17+
12. [Web Interface](#web-interface)
18+
13. [External Integrations](#external-integrations)
19+
14. [Configuration](#configuration)
20+
15. [Deployment](#deployment)
1821

1922
---
2023

@@ -30,14 +33,18 @@ BLOOM (Bioinformatics Laboratory Operations and Object Management) is a Laborato
3033
- **File management**: S3-compatible file storage with metadata tracking
3134
- **Barcode/label printing**: Integration with Zebra label printers via zebra_day
3235
- **FedEx tracking**: Package tracking integration via fedex_tracking_day
33-
- **Multi-interface support**: Flask web UI, FastAPI REST API, and CherryPy admin interface
36+
- **Multi-interface support**: FastAPI REST API (primary), Flask web UI
37+
- **Domain-driven architecture**: Clean separation with 8 specialized domain modules
38+
- **Pydantic validation**: Comprehensive input/output validation with schema modules
39+
- **Health monitoring**: Kubernetes-ready health check endpoints
3440

3541
### Technology Stack
36-
- **Language**: Python 3.x
37-
- **Database**: PostgreSQL (via SQLAlchemy ORM)
38-
- **Web Frameworks**: Flask (UI), FastAPI (API), CherryPy (Admin)
42+
- **Language**: Python 3.12+
43+
- **Database**: PostgreSQL 15+ (via SQLAlchemy ORM with Alembic migrations)
44+
- **Web Frameworks**: FastAPI (primary API), Flask (legacy UI)
3945
- **Storage**: AWS S3 / Supabase Storage
40-
- **Authentication**: Supabase Auth
46+
- **Authentication**: Supabase Auth (OAuth2 with social providers)
47+
- **Validation**: Pydantic v2 with pydantic-settings
4148
- **Label Printing**: zebra_day library
4249
- **Package Tracking**: fedex_tracking_day library
4350

@@ -100,25 +107,114 @@ BLOOM (Bioinformatics Laboratory Operations and Object Management) is a Laborato
100107

101108
```
102109
bloom_lims/
110+
├── _version.py # Dynamic version from GitHub releases
111+
├── __init__.py # Package initialization & exports
112+
├── config.py # Pydantic settings & configuration
113+
├── exceptions.py # Typed exception hierarchy
103114
├── bdb.py # SQLAlchemy ORM models and base classes
104115
├── db.py # Database connection and session management (BLOOMdb3)
105-
├── bobjs.py # Business logic classes (BloomObj, BloomWorkflow, etc.)
116+
├── bobjs.py # Legacy business logic (transitioning to domain/)
106117
├── bfile.py # File management (BloomFile, BloomFileSet)
107118
├── bequip.py # Equipment management (BloomEquipment)
108119
├── env.py # Environment configuration
109-
├── config/ # Configuration files
110-
│ ├── assay_config.yaml
111-
│ └── fedex_config.yaml
112-
└── templates/ # Jinja2 HTML templates for Flask UI
120+
121+
├── domain/ # Domain-driven business logic (NEW)
122+
│ ├── __init__.py
123+
│ ├── base.py # BaseDomainService with common patterns
124+
│ ├── utils.py # Shared domain utilities
125+
│ ├── containers.py # Container operations
126+
│ ├── content.py # Sample/specimen operations
127+
│ ├── equipment.py # Equipment management
128+
│ ├── files.py # File handling service
129+
│ ├── lineage.py # Object relationship tracking
130+
│ ├── templates.py # Template management
131+
│ └── workflows.py # Workflow orchestration
132+
133+
├── schemas/ # Pydantic validation schemas (NEW)
134+
│ ├── __init__.py
135+
│ ├── base.py # Common schema patterns
136+
│ ├── containers.py # Container validation
137+
│ ├── content.py # Content/sample validation
138+
│ ├── equipment.py # Equipment validation
139+
│ ├── files.py # File validation
140+
│ ├── lineage.py # Lineage validation
141+
│ ├── templates.py # Template validation
142+
│ ├── workflows.py # Workflow validation
143+
│ └── api/ # API request/response schemas
144+
145+
├── core/ # Cross-cutting concerns (NEW)
146+
│ ├── cache.py # Caching abstraction
147+
│ ├── exceptions.py # Core exceptions
148+
│ └── validation.py # Validation utilities
149+
150+
├── api/ # API layer (NEW)
151+
│ ├── versioning.py # API version negotiation
152+
│ └── rate_limiting.py # Request rate limiting
153+
154+
├── migrations/ # Alembic database migrations (NEW)
155+
│ ├── env.py
156+
│ └── versions/
157+
158+
├── health.py # Health check endpoints (NEW)
159+
├── backup/ # Backup CLI and utilities
160+
└── config/ # Configuration files
161+
├── assay_config.yaml
162+
└── fedex_config.yaml
113163
```
114164

115165
### 2.3 Entry Points
116166

117167
| Entry Point | File | Port | Purpose |
118168
|-------------|------|------|---------|
169+
| Main App | `main.py` | 5000 | Combined UI + API |
119170
| Flask UI | `bloom_lims/bkend/bkend.py` | 5000 | Web-based user interface |
120171
| FastAPI | `bloom_lims/bkend/fastapi_bkend.py` | 8000 | REST API |
121-
| CherryPy | `bloom_lims/bkend/cherrypy_bkend.py` | 8080 | Admin interface |
172+
| Health | `/health/*` | - | Kubernetes probes |
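
The split between a liveness probe and a readiness probe can be sketched in plain Python. This is a minimal illustration, not the contents of `bloom_lims/health.py`: `ProbeResult`, `liveness`, `readiness`, and the check names are hypothetical, and the real module presumably exposes these through web routes rather than bare functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ProbeResult:
    status_code: int  # HTTP status the endpoint would return
    body: dict        # JSON-serializable payload


def liveness() -> ProbeResult:
    """Liveness only reports that the process is up; it checks no dependencies."""
    return ProbeResult(200, {"status": "alive"})


def readiness(checks: Dict[str, Callable[[], bool]]) -> ProbeResult:
    """Run every dependency check; a False result or an exception means not ready."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A crashing check is treated the same as a failing one.
            results[name] = False
    ok = all(results.values())
    status = "ready" if ok else "not_ready"
    return ProbeResult(200 if ok else 503, {"status": status, "checks": results})
```

Keeping liveness free of dependency checks means a database outage marks the service as not ready without causing Kubernetes to restart a process that is otherwise healthy.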
173+
174+
### 2.4 Domain Layer (NEW)
175+
176+
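The domain layer isolates business logic from the API and data layers. Each service builds on `BaseDomainService`, and only `BaseDomainService` talks to `BLOOMdb3`: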
177+
178+
```mermaid
179+
graph TB
180+
subgraph "API Layer"
181+
API[FastAPI Endpoints]
182+
Flask[Flask UI Routes]
183+
end
184+
185+
subgraph "Domain Layer"
186+
Base[BaseDomainService]
187+
Containers[ContainerService]
188+
Content[ContentService]
189+
Equipment[EquipmentService]
190+
Files[FileService]
191+
Lineage[LineageService]
192+
Templates[TemplateService]
193+
Workflows[WorkflowService]
194+
end
195+
196+
subgraph "Data Layer"
197+
DB[BLOOMdb3]
198+
Models[SQLAlchemy Models]
199+
end
200+
201+
API --> Containers
202+
API --> Content
203+
API --> Workflows
204+
Flask --> Containers
205+
Flask --> Content
206+
207+
Containers --> Base
208+
Content --> Base
209+
Equipment --> Base
210+
Files --> Base
211+
Lineage --> Base
212+
Templates --> Base
213+
Workflows --> Base
214+
215+
Base --> DB
216+
DB --> Models
217+
```
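
The layering in the diagram can be illustrated with a small runnable sketch. Everything here is an assumption made for illustration: `InMemoryDB` stands in for `BLOOMdb3`, and the method names (`get_by_euid`, `create_container`, `add_content`) and the EUID values are hypothetical rather than taken from `bloom_lims/domain/`.

```python
from typing import Any, Dict, List


class InMemoryDB:
    """Hypothetical stand-in for BLOOMdb3 so the sketch runs without PostgreSQL."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def save(self, euid: str, row: Dict[str, Any]) -> None:
        self._rows[euid] = row

    def get(self, euid: str) -> Dict[str, Any]:
        if euid not in self._rows:
            raise KeyError(f"no object with EUID {euid!r}")
        return self._rows[euid]


class BaseDomainService:
    """Owns the data-layer handle; subclasses never touch storage directly."""

    def __init__(self, db: InMemoryDB) -> None:
        self._db = db

    def get_by_euid(self, euid: str) -> Dict[str, Any]:
        return self._db.get(euid)

    def _persist(self, row: Dict[str, Any]) -> Dict[str, Any]:
        self._db.save(row["euid"], row)
        return row


class ContainerService(BaseDomainService):
    """Container-specific rules layered on the shared base (names are illustrative)."""

    def create_container(self, euid: str, container_type: str) -> Dict[str, Any]:
        return self._persist({"euid": euid, "type": container_type, "contents": []})

    def add_content(self, container_euid: str, content_euid: str) -> List[str]:
        container = self.get_by_euid(container_euid)
        # Business rule lives in the domain layer, not in the API handler.
        if content_euid in container["contents"]:
            raise ValueError(f"{content_euid} is already in {container_euid}")
        container["contents"].append(content_euid)
        self._persist(container)
        return container["contents"]
```

Because the service receives its database handle through the constructor, an API route and a unit test can use the same class with different backends.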
122218

123219
---
124220

@@ -1181,119 +1277,67 @@ results = bobj.search_objects(
11811277

11821278
---
11831279

1184-
## Appendix C: System Strengths and Weaknesses Analysis
1185-
1186-
### C.1 Strengths (Ranked by Impact)
1187-
1188-
#### 🟢 HIGH IMPACT STRENGTHS
1189-
1190-
| Rank | Strength | Description | Business Value |
1191-
|------|----------|-------------|----------------|
1192-
| 1 | **Template-Driven Architecture** | All objects created from JSON templates without code changes | Enables rapid customization for different lab workflows |
1193-
| 2 | **Flexible JSON Storage** | `json_addl` field allows arbitrary properties without schema changes | Adapts to evolving requirements without migrations |
1194-
| 3 | **Comprehensive Lineage Tracking** | Full parent-child relationships with audit trail | Complete sample provenance and regulatory compliance |
1195-
| 4 | **Multi-Interface Support** | Flask UI, FastAPI, CherryPy all available | Different interfaces for different use cases |
1196-
| 5 | **Action System** | Configurable actions defined in templates | Business logic changes without code deployment |
1197-
1198-
#### 🟡 MEDIUM IMPACT STRENGTHS
1199-
1200-
| Rank | Strength | Description | Business Value |
1201-
|------|----------|-------------|----------------|
1202-
| 6 | **SQLAlchemy ORM** | Mature, well-tested database abstraction | Reliable data access, potential DB portability |
1203-
| 7 | **External Integrations** | zebra_day, fedex_tracking_day built-in | Ready-to-use lab equipment integration |
1204-
| 8 | **Soft Delete Pattern** | `is_deleted` flag preserves data | Data recovery, audit compliance |
1205-
| 9 | **Hierarchical Classification** | super_type/btype/b_sub_type/version | Organized, queryable object taxonomy |
1206-
| 10 | **EUID System** | Human-readable unique identifiers | Easy barcode scanning and manual entry |
1207-
1208-
### C.2 Weaknesses (Ranked by Urgency)
1209-
1210-
#### 🔴 CRITICAL - Address Immediately
1211-
1212-
| Rank | Weakness | Description | Risk | Recommended Fix |
1213-
|------|----------|-------------|------|-----------------|
1214-
| 1 | **No Database Migrations** | Schema changes require manual intervention | Data loss, deployment failures | Implement Alembic migrations |
1215-
| 2 | **Hardcoded Values in Business Logic** | Many methods have hardcoded types/versions | Brittle, hard to maintain | Extract to configuration |
1216-
| 3 | **Inconsistent Error Handling** | Mix of exceptions, logging, silent failures | Unpredictable behavior | Standardize error handling |
1217-
| 4 | **Session Management Issues** | Commented `session.flush()`, manual commits | Transaction integrity risks | Implement proper UoW pattern |
1218-
| 5 | **No Input Validation** | Limited validation on API inputs | Security vulnerabilities | Add Pydantic models |
1219-
1220-
#### 🟠 HIGH - Address Soon
1221-
1222-
| Rank | Weakness | Description | Risk | Recommended Fix |
1223-
|------|----------|-------------|------|-----------------|
1224-
| 6 | **Limited Test Coverage** | Tests exist but coverage unclear | Regression bugs | Add comprehensive test suite |
1225-
| 7 | **No API Versioning** | API endpoints not versioned | Breaking changes affect clients | Implement /v1/ prefix |
1226-
| 8 | **Monolithic bobjs.py** | 3800+ lines in single file | Hard to maintain/test | Split into focused modules |
1227-
| 9 | **No Caching Layer** | Every query hits database | Performance issues at scale | Add Redis/memcached |
1228-
| 10 | **Synchronous Operations** | All operations blocking | Poor scalability | Add async support |
1280+
## Appendix C: Architecture Improvements Summary
1281+
1282+
### C.1 Strengths
1283+
1284+
| Strength | Description | Business Value |
1285+
|----------|-------------|----------------|
1286+
| **Template-Driven Architecture** | All objects created from JSON templates without code changes | Enables rapid customization for different lab workflows |
1287+
| **Domain-Driven Design** | Clean separation into 8 specialized domain modules | Maintainable, testable, scalable codebase |
1288+
| **Flexible JSON Storage** | `json_addl` field allows arbitrary properties without schema changes | Adapts to evolving requirements without migrations |
1289+
| **Comprehensive Lineage Tracking** | Full parent-child relationships with audit trail | Complete sample provenance and regulatory compliance |
1290+
| **Pydantic Validation** | Strong typing and validation on all inputs | Security, reliability, developer experience |
1291+
| **Health Monitoring** | Kubernetes-ready health probes | Production-grade deployment support |
1292+
| **SQLAlchemy + Alembic** | Mature ORM with migration support | Reliable data access, safe schema evolution |
1293+
| **External Integrations** | zebra_day, fedex_tracking_day built-in | Ready-to-use lab equipment integration |
1294+
| **EUID System** | Human-readable unique identifiers | Easy barcode scanning and manual entry |
1295+
| **Dynamic Versioning** | Version pulled from GitHub releases | Always accurate version display |
1296+
1297+
### C.2 Remaining Work
1298+
1299+
| Priority | Item | Status |
1300+
|----------|------|--------|
1301+
| Medium | Redis caching integration | Module exists at `bloom_lims/core/cache.py`; Redis connection not yet wired in |
1302+
| Medium | Full async support | Partial implementation |
1303+
| Medium | Batch API endpoints | Planned |
1304+
| Low | Plugin architecture | Planned |
1305+
| Low | GraphQL API | Planned |
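
To show what a caching abstraction without a Redis connection might look like, here is a hedged sketch of an in-memory backend with per-key expiry. It is not the code in `bloom_lims/core/cache.py`; `InMemoryCache` and `cached_lookup` are hypothetical names, and a Redis-backed class would only need to offer the same `get`/`set` methods.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Process-local cache with per-key TTL; the clock is injectable for testing."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily on read.
            del self._store[key]
            return None
        return value


def cached_lookup(cache: InMemoryCache, key: str,
                  loader: Callable[[], Any], ttl_seconds: float = 60.0) -> Any:
    """Return the cached value, calling loader() only on a miss.

    Note: a loader result of None is never cached, because None signals a miss.
    """
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, ttl_seconds)
    return value
```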
1306+
1307+
### C.3 Architecture Progress
1308+
1309+
```mermaid
1310+
pie title Technical Debt Resolution
1311+
"Completed" : 75
1312+
"In Progress" : 15
1313+
"Remaining" : 10
1314+
```
12291315

1230-
#### 🟡 MEDIUM - Plan for Future
1316+
### C.4 Completed Improvements ✅
12311317

1232-
| Rank | Weakness | Description | Risk | Recommended Fix |
1233-
|------|----------|-------------|------|-----------------|
1234-
| 11 | **No Rate Limiting** | API has no request limits | DoS vulnerability | Add rate limiting middleware |
1235-
| 12 | **Limited Logging Structure** | Inconsistent log formats | Hard to debug/monitor | Implement structured logging |
1236-
| 13 | **No Health Checks** | No endpoint for service health | Deployment monitoring gaps | Add /health endpoint |
1237-
| 14 | **Documentation Gaps** | Limited inline documentation | Onboarding difficulty | Add docstrings, type hints |
1238-
| 15 | **No Batch Operations** | Single-object operations only | Slow bulk processing | Add batch API endpoints |
1318+
The following items from the original roadmap have been completed:
12391319

1240-
### C.3 Technical Debt Summary
1320+
#### Phase 1: Stability
1321+
- ✅ **Alembic database migrations** - `bloom_lims/migrations/`
1322+
- ✅ **Standardized error handling** - `bloom_lims/exceptions.py`, `bloom_lims/core/exceptions.py`
1323+
- ✅ **Pydantic input validation** - `bloom_lims/schemas/` (10 schema modules)
1324+
- ✅ **Session management patterns** - `_TransactionContext`, context managers in `BLOOMdb3`
1325+
- ✅ **Comprehensive logging** - Structured logging throughout
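
The session-management item relies on the commit-or-rollback context manager pattern. The sketch below illustrates that pattern only: it uses the standard-library `sqlite3` module in place of SQLAlchemy and PostgreSQL, and the `transaction` helper, the `samples` table, and the EUID values are hypothetical, not the actual `_TransactionContext` implementation.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def transaction(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Commit if the block succeeds; roll back and re-raise if it fails."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


# Usage: the second block fails, so its insert is rolled back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (euid TEXT PRIMARY KEY)")
conn.commit()

with transaction(conn):
    conn.execute("INSERT INTO samples VALUES ('CX1')")

try:
    with transaction(conn):
        conn.execute("INSERT INTO samples VALUES ('CX2')")
        raise ValueError("simulated failure mid-transaction")
except ValueError:
    pass

rows = [r[0] for r in conn.execute("SELECT euid FROM samples ORDER BY euid")]
```

Centralizing commit and rollback in one place removes the scattered manual commits that the earlier revision of this appendix flagged as a transaction-integrity risk.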
12411326

1242-
```
1243-
┌─────────────────────────────────────────────────────────────────┐
1244-
│ TECHNICAL DEBT MATRIX │
1245-
├─────────────────────────────────────────────────────────────────┤
1246-
│ │
1247-
│ HIGH IMPACT │
1248-
│ ▲ │
1249-
│ │ ┌─────────────┐ ┌─────────────┐ │
1250-
│ │ │ Migrations │ │ Error │ │
1251-
│ │ │ (CRITICAL) │ │ Handling │ │
1252-
│ │ └─────────────┘ └─────────────┘ │
1253-
│ │ │
1254-
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
1255-
│ │ │ Session │ │ Input │ │ Test │ │
1256-
│ │ │ Management │ │ Validation │ │ Coverage │ │
1257-
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │
1258-
│ │ │
1259-
│ │ ┌─────────────┐ ┌─────────────┐ │
1260-
│ │ │ Monolithic │ │ API │ │
1261-
│ │ │ Code │ │ Versioning │ │
1262-
│ │ └─────────────┘ └─────────────┘ │
1263-
│ │ │
1264-
│ LOW IMPACT │
1265-
│ └──────────────────────────────────────────────────────▶ │
1266-
│ LOW EFFORT HIGH EFFORT │
1267-
│ │
1268-
└─────────────────────────────────────────────────────────────────┘
1269-
```
1327+
#### Phase 2: Quality
1328+
- ✅ **Domain module refactor** - `bloom_lims/domain/` (8 specialized modules)
1329+
- ✅ **API versioning** - `/api/v1/` prefix with `bloom_lims/api/versioning.py`
1330+
- ✅ **Health check endpoints** - `/health`, `/health/live`, `/health/ready`, `/health/metrics`
1331+
- ✅ **Type hints** - Added throughout new modules
12701332

1271-
### C.4 Recommended Improvement Roadmap
1272-
1273-
#### Phase 1: Stability (1-2 months)
1274-
1. Implement Alembic database migrations
1275-
2. Standardize error handling across all modules
1276-
3. Add input validation with Pydantic
1277-
4. Fix session management patterns
1278-
5. Add comprehensive logging
1279-
1280-
#### Phase 2: Quality (2-3 months)
1281-
1. Increase test coverage to 80%+
1282-
2. Split bobjs.py into focused modules
1283-
3. Add API versioning
1284-
4. Implement health check endpoints
1285-
5. Add type hints throughout
1286-
1287-
#### Phase 3: Scale (3-6 months)
1288-
1. Add caching layer (Redis)
1289-
2. Implement async operations
1290-
3. Add batch API endpoints
1291-
4. Implement rate limiting
1292-
5. Add performance monitoring
1333+
#### Phase 3: Scale (In Progress)
1334+
- 🔲 **Caching layer** - Module exists at `bloom_lims/core/cache.py`, needs Redis integration
1335+
- 🔲 **Async operations** - Partial support, full implementation pending
1336+
- 🔲 **Batch API endpoints** - Planned
1337+
- ✅ **Rate limiting** - `bloom_lims/api/rate_limiting.py`
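
This document does not say which algorithm `bloom_lims/api/rate_limiting.py` uses. As one common option, the sketch below shows a token bucket; the `TokenBucket` class and its parameters are hypothetical and chosen for illustration only.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second` tokens per second."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller should be throttled."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(float(self.capacity),
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

In an API layer, one bucket would typically be kept per client key (for example per API token), with a rejected call mapped to an HTTP 429 response.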
12931338

12941339
---
12951340

1296-
*Document Version: 1.0*
1297-
*Last Updated: 2024-12-23*
1298-
*Generated for BLOOM LIMS v0.10.12*
1299-
1341+
*Document Version: 2.0*
1342+
*Last Updated: 2024-12-24*
1343+
*BLOOM LIMS - Version dynamically fetched from GitHub releases*
