| 1 | +# TAPDB Development Guidelines (AGENT.md) |
| 2 | + |
| 3 | +## AI Assistant Instructions |
| 4 | + |
| 5 | +This document provides guidelines for AI assistants working on the TAPDB (Templated Abstract Polymorphic Database) library. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Project Overview |
| 10 | + |
| 11 | +TAPDB is a standalone library extracted from BLOOM LIMS that implements a **three-table polymorphic object model** with JSON-driven template configuration. The core innovation is enabling new object types through JSON templates without code changes. |
| 12 | + |
| 13 | +### Key Files |
| 14 | + |
| 15 | +| File/Directory | Purpose | |
| 16 | +|----------------|---------| |
| 17 | +| `tapdb/models/` | SQLAlchemy ORM classes | |
| 18 | +| `tapdb/connection.py` | Database connection manager | |
| 19 | +| `tapdb/templates/` | Template loading and management | |
| 20 | +| `tapdb/factory/` | Instance creation logic | |
| 21 | +| `schema/tapdb_schema.sql` | PostgreSQL DDL | |
| 22 | +| `config/` | JSON template definitions | |
| 23 | + |
| 24 | +--- |
| 25 | + |
| 26 | +## Architecture Principles |
| 27 | + |
| 28 | +### 1. Three-Table Model |
| 29 | + |
| 30 | +All data lives in three tables: |
| 31 | +- `generic_template` - Blueprints/definitions |
| 32 | +- `generic_instance` - Concrete objects |
| 33 | +- `generic_instance_lineage` - Relationships |
| 34 | + |
| 35 | +**Never add new tables for new object types.** New types are defined via JSON templates. |
| 36 | + |
| 37 | +### 2. Polymorphic Inheritance |
| 38 | + |
| 39 | +Use SQLAlchemy's single-table inheritance: |
| 40 | + |
| 41 | +```python |
| 42 | +class container_instance(generic_instance): |
| 43 | +    __mapper_args__ = { |
| 44 | +        'polymorphic_identity': 'container_instance' |
| 45 | +    } |
| 46 | +``` |
| 47 | + |
| 48 | +### 3. JSON-Driven Configuration |
| 49 | + |
| 50 | +All customization happens in `json_addl`: |
| 51 | +- Properties |
| 52 | +- Instantiation layouts |
| 53 | +- Actions |
| 54 | +- Metadata |
| 55 | + |
| 56 | +### 4. Soft Deletes |
| 57 | + |
| 58 | +**Never issue a physical `DELETE`.** Mark rows as removed by setting `is_deleted = TRUE`, and filter on `is_deleted` when querying. |
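The soft-delete rule can be illustrated with a small, self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL, and the table is reduced to three columns; the `name` column and the helper functions `soft_delete` and `live_names` are assumptions made for illustration, not part of TAPDB.

```python
import sqlite3

# In-memory stand-in for the real PostgreSQL table (simplified, hypothetical columns)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE generic_instance ("
    "uuid TEXT PRIMARY KEY, name TEXT, is_deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO generic_instance (uuid, name) VALUES (?, ?)",
    [("u1", "plate-1"), ("u2", "plate-2")],
)


def soft_delete(conn, uuid):
    """Mark a row as deleted instead of removing it."""
    conn.execute("UPDATE generic_instance SET is_deleted = 1 WHERE uuid = ?", (uuid,))


def live_names(conn):
    """Return the names of rows that have not been soft-deleted."""
    rows = conn.execute(
        "SELECT name FROM generic_instance WHERE is_deleted = 0 ORDER BY name"
    )
    return [name for (name,) in rows]


soft_delete(conn, "u1")
total = conn.execute("SELECT COUNT(*) FROM generic_instance").fetchone()[0]
print(live_names(conn))  # ['plate-2']
print(total)             # 2, the soft-deleted row is still stored
```

The row stays available for audit and lineage purposes, while ordinary queries exclude it through the `is_deleted` filter.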
| 59 | + |
| 60 | +### 5. Audit Trail |
| 61 | + |
| 62 | +All changes are tracked by database triggers. Set `session.current_username` before any write so that the audit records can be attributed to a user. |
| 63 | + |
| 64 | +--- |
| 65 | + |
| 66 | +## Coding Standards |
| 67 | + |
| 68 | +### Python Style |
| 69 | + |
| 70 | +- Python 3.10+ required |
| 71 | +- Type hints on all public functions |
| 72 | +- Docstrings for all public APIs |
| 73 | +- Format with `black`, lint with `ruff` |
| 74 | + |
| 75 | +### Naming Conventions |
| 76 | + |
| 77 | +| Element | Convention | Example | |
| 78 | +|---------|------------|---------| |
| 79 | +| ORM classes | snake_case | `generic_instance` | |
| 80 | +| Other Python classes | PascalCase | `TemplateManager` | |
| 81 | +| Functions | snake_case | `create_instance` | |
| 82 | +| Constants | UPPER_SNAKE | `DEFAULT_BATCH_SIZE` | |
| 83 | +| Template codes | slash-separated | `container/plate/fixed-plate-96/1.0/` | |
| 84 | + |
| 85 | +### Import Order |
| 86 | + |
| 87 | +```python |
| 88 | +# Standard library |
| 89 | +import os |
| 90 | +from pathlib import Path |
| 91 | + |
| 92 | +# Third-party |
| 93 | +from sqlalchemy import Column, Text |
| 94 | +from sqlalchemy.orm import relationship |
| 95 | + |
| 96 | +# Local |
| 97 | +from tapdb.models import generic_instance |
| 98 | +from tapdb.connection import TAPDBConnection |
| 99 | +``` |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## Database Guidelines |
| 104 | + |
| 105 | +### Schema Changes |
| 106 | + |
| 107 | +1. **Never modify core table structure** without a migration |
| 108 | +2. Use Alembic for all schema changes |
| 109 | +3. Test every migration in both directions (upgrade and downgrade) |
| 110 | +4. Preserve existing trigger functions |
| 111 | + |
| 112 | +### JSON Queries |
| 113 | + |
| 114 | +Prefer query patterns that can use a GIN index on `json_addl`: |
| 115 | + |
| 116 | +```python |
| 117 | +# Good - containment (@>) can use the GIN index |
| 118 | +session.query(generic_instance).filter( |
| 119 | +    generic_instance.json_addl.contains({'properties': {'type': 'blood'}}) |
| 120 | +) |
| 121 | + |
| 122 | +# Avoid - a text comparison on an extracted value generally cannot use the GIN index |
| 123 | +session.query(generic_instance).filter( |
| 124 | +    generic_instance.json_addl['properties']['type'].astext == 'blood' |
| 125 | +) |
| 126 | +``` |
| 127 | + |
| 128 | +### Session Management |
| 129 | + |
| 130 | +```python |
| 131 | +# Always use context manager for transactions |
| 132 | +with db.session_scope() as session: |
| 133 | +    instance = factory.create_instance(...) |
| 134 | +    # Auto-commits on success, rolls back on exception |
| 135 | +``` |
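The commit-on-success, rollback-on-exception behaviour described in that comment is a common context-manager pattern. The sketch below shows how such a scope could be written. It is not TAPDB's actual `session_scope` implementation, which is not shown in this document; the free function `session_scope(session_factory)` and the `FakeSession` stub are assumptions used so the example runs without a database.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stub that records calls; a real SQLAlchemy Session
    also provides commit(), rollback() and close()."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


@contextmanager
def session_scope(session_factory):
    """Commit on success, roll back on any exception, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


ok = FakeSession()
with session_scope(lambda: ok):
    pass  # work that succeeds

failed = FakeSession()
try:
    with session_scope(lambda: failed):
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(ok.calls)      # ['commit', 'close']
print(failed.calls)  # ['rollback', 'close']
```

Keeping the commit inside the scope means code called within the `with` block never needs to commit on its own.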
| 136 | + |
| 137 | +--- |
| 138 | + |
| 139 | +## Template System |
| 140 | + |
| 141 | +### Template Code Format |
| 142 | + |
| 143 | +``` |
| 144 | +{super_type}/{btype}/{b_sub_type}/{version}/ |
| 145 | +``` |
| 146 | + |
| 147 | +Always include the trailing slash. |
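A minimal sketch of how a template code in this format could be validated and split into its four segments. The function `parse_template_code` and the `TemplateCode` tuple are hypothetical names chosen for illustration; the exception classes are local stand-ins that mirror the names used in the Error Handling section.

```python
from typing import NamedTuple


class TAPDBError(Exception):
    """Local stand-in for the TAPDB base exception."""


class InvalidTemplateCodeError(TAPDBError):
    """Local stand-in: raised when the template code format is invalid."""


class TemplateCode(NamedTuple):
    super_type: str
    btype: str
    b_sub_type: str
    version: str


def parse_template_code(code: str) -> TemplateCode:
    """Split '{super_type}/{btype}/{b_sub_type}/{version}/' into its parts."""
    if not code.endswith("/"):
        raise InvalidTemplateCodeError(f"Missing trailing slash: {code!r}")
    parts = code[:-1].split("/")
    if len(parts) != 4 or not all(parts):
        raise InvalidTemplateCodeError(
            f"Expected 4 non-empty segments, got {parts!r} from {code!r}"
        )
    return TemplateCode(*parts)


parsed = parse_template_code("container/plate/fixed-plate-96/1.0/")
print(parsed.btype)    # plate
print(parsed.version)  # 1.0
```

Rejecting a code without the trailing slash up front keeps lookups consistent, since `a/b/c/1.0` and `a/b/c/1.0/` would otherwise be treated as different keys.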
| 148 | + |
| 149 | + |
| 150 | +--- |
| 151 | + |
| 152 | +## Common Tasks |
| 153 | + |
| 154 | +### Adding a New Polymorphic Type |
| 155 | + |
| 156 | +1. Add ORM class in `tapdb/models/`: |
| 157 | + |
| 158 | +```python |
| 159 | +class new_type_instance(generic_instance): |
| 160 | +    __mapper_args__ = { |
| 161 | +        'polymorphic_identity': 'new_type_instance' |
| 162 | +    } |
| 163 | +``` |
| 164 | + |
| 165 | +2. Add template class if needed: |
| 166 | + |
| 167 | +```python |
| 168 | +class new_type_template(generic_template): |
| 169 | +    __mapper_args__ = { |
| 170 | +        'polymorphic_identity': 'new_type_template' |
| 171 | +    } |
| 172 | +``` |
| 173 | + |
| 174 | +3. Register the new classes in the `__init__.py` exports |
| 175 | +4. Add an EUID sequence to the schema if a new prefix is needed |
| 176 | +5. Create JSON templates in `config/new_type/` |
| 177 | + |
| 178 | +### Adding a New Action |
| 179 | + |
| 180 | +1. Define action template in `config/action/core.json`: |
| 181 | + |
| 182 | +```json |
| 183 | +{ |
| 184 | +  "new_action": { |
| 185 | +    "1.0": { |
| 186 | +      "action_template": { |
| 187 | +        "action_name": "New Action", |
| 188 | +        "method_name": "do_action_new_action", |
| 189 | +        "action_enabled": "1", |
| 190 | +        ... |
| 191 | +      } |
| 192 | +    } |
| 193 | +  } |
| 194 | +} |
| 195 | +``` |
| 196 | + |
| 197 | +2. Implement handler method: |
| 198 | + |
| 199 | +```python |
| 200 | +def do_action_new_action(self, instance, data): |
| 201 | +    # Implementation |
| 202 | +    return {'status': 'success'} |
| 203 | +``` |
| 204 | + |
| 205 | +3. Import the action into the target templates via `action_imports` |
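The steps above connect a JSON `method_name` to a Python handler. The sketch below shows one way such a lookup could work. The `ActionHandlers` class, the `run_action` function, the `do_action_` prefix check, and the choice of exceptions are all assumptions made for illustration; TAPDB's real dispatch code is not shown in this document.

```python
class ActionHandlers:
    """Hypothetical class holding action handler methods."""

    def do_action_new_action(self, instance, data):
        # Placeholder implementation, as in the handler shown above
        return {"status": "success"}


def run_action(handlers, action_template, instance, data):
    """Call the handler named by the template's `method_name`."""
    if action_template.get("action_enabled") != "1":
        raise PermissionError(f"Action disabled: {action_template.get('action_name')}")
    method_name = action_template["method_name"]
    # Assumed convention: only names starting with do_action_ may be dispatched
    if not method_name.startswith("do_action_"):
        raise ValueError(f"Not an action handler name: {method_name!r}")
    handler = getattr(handlers, method_name, None)
    if not callable(handler):
        raise LookupError(f"No handler named {method_name!r}")
    return handler(instance, data)


template = {
    "action_name": "New Action",
    "method_name": "do_action_new_action",
    "action_enabled": "1",
}
result = run_action(ActionHandlers(), template, instance=None, data={})
print(result)  # {'status': 'success'}
```

Restricting dispatch to a naming prefix prevents a template from invoking arbitrary methods on the handler object.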
| 206 | + |
| 207 | +### Querying Lineage |
| 208 | + |
| 209 | +```python |
| 210 | +# Get all children of an instance |
| 211 | +children = session.query(generic_instance).join( |
| 212 | +    generic_instance_lineage, |
| 213 | +    generic_instance_lineage.child_instance_uuid == generic_instance.uuid |
| 214 | +).filter( |
| 215 | +    generic_instance_lineage.parent_instance_uuid == parent.uuid, |
| 216 | +    generic_instance_lineage.is_deleted.is_(False) |
| 217 | +).all() |
| 218 | + |
| 219 | +# Get all parents |
| 220 | +parents = session.query(generic_instance).join( |
| 221 | +    generic_instance_lineage, |
| 222 | +    generic_instance_lineage.parent_instance_uuid == generic_instance.uuid |
| 223 | +).filter( |
| 224 | +    generic_instance_lineage.child_instance_uuid == child.uuid, |
| 225 | +    generic_instance_lineage.is_deleted.is_(False) |
| 226 | +).all() |
| 227 | +``` |
| 228 | + |
| 229 | +--- |
| 230 | + |
| 231 | +## Performance Guidelines |
| 232 | + |
| 233 | +### Batch Operations |
| 234 | + |
| 235 | +Always batch large operations: |
| 236 | + |
| 237 | +```python |
| 238 | +BATCH_SIZE = 100 |
| 239 | + |
| 240 | +for i in range(0, len(items), BATCH_SIZE): |
| 241 | +    batch = items[i:i + BATCH_SIZE] |
| 242 | +    for item in batch: |
| 243 | +        session.add(item) |
| 244 | +    session.flush()  # one flush per batch, not per item |
| 245 | +``` |
| 246 | + |
| 247 | +### Eager Loading |
| 248 | + |
| 249 | +Use `joinedload` for relationships you know you will access, so they load in the same query instead of one extra query per row: |
| 250 | + |
| 251 | +```python |
| 252 | +from sqlalchemy.orm import joinedload |
| 253 | + |
| 254 | +instances = session.query(generic_instance).options( |
| 255 | +    joinedload(generic_instance.template) |
| 256 | +).filter(...).all() |
| 257 | +``` |
| 258 | + |
| 259 | +### JSON Field Updates |
| 260 | + |
| 261 | +SQLAlchemy does not detect in-place mutations of a plain JSON column, so always call `flag_modified` after updating `json_addl`: |
| 262 | + |
| 263 | +```python |
| 264 | +from sqlalchemy.orm.attributes import flag_modified |
| 265 | + |
| 266 | +instance.json_addl['properties']['key'] = 'value' |
| 267 | +flag_modified(instance, 'json_addl') |
| 268 | +``` |
| 269 | + |
| 270 | +--- |
| 271 | + |
| 272 | +## Error Handling |
| 273 | + |
| 274 | +### Standard Exceptions |
| 275 | + |
| 276 | +```python |
| 277 | +class TAPDBError(Exception): |
| 278 | +    """Base exception for TAPDB.""" |
| 279 | +    pass |
| 280 | + |
| 281 | +class TemplateNotFoundError(TAPDBError): |
| 282 | +    """Raised when a template code doesn't resolve.""" |
| 283 | +    pass |
| 284 | + |
| 285 | +class InvalidTemplateCodeError(TAPDBError): |
| 286 | +    """Raised when template code format is invalid.""" |
| 287 | +    pass |
| 288 | + |
| 289 | +class LineageError(TAPDBError): |
| 290 | +    """Raised for lineage-related errors.""" |
| 291 | +    pass |
| 292 | +``` |
| 293 | + |
| 294 | +### Error Messages |
| 295 | + |
| 296 | +Include context in error messages: |
| 297 | + |
| 298 | +```python |
| 299 | +raise TemplateNotFoundError( |
| 300 | +    f"Template not found: {template_code}. " |
| 301 | +    f"Searched in super_type={parts['super_type']}, btype={parts['btype']}" |
| 302 | +) |
| 303 | +``` |
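To show where such a message might be raised and how a caller could handle it, here is a small sketch built around an in-memory dictionary. The `find_template` function and the `registry` dictionary are hypothetical; the exception classes are local copies of the names listed under Standard Exceptions.

```python
class TAPDBError(Exception):
    """Base exception, mirroring the hierarchy above."""


class TemplateNotFoundError(TAPDBError):
    """Raised when a template code doesn't resolve."""


def find_template(registry: dict, template_code: str) -> dict:
    """Return the template stored under `template_code`, or raise with context."""
    try:
        return registry[template_code]
    except KeyError:
        segments = template_code.strip("/").split("/")
        parts = dict(zip(("super_type", "btype", "b_sub_type", "version"), segments))
        raise TemplateNotFoundError(
            f"Template not found: {template_code}. "
            f"Searched in super_type={parts.get('super_type')}, btype={parts.get('btype')}"
        ) from None


registry = {"container/plate/fixed-plate-96/1.0/": {"properties": {}}}

try:
    find_template(registry, "container/tube/tube-2ml/1.0/")
except TAPDBError as exc:  # catching the base class covers every TAPDB error
    message = str(exc)
    print(message)
```

Because every exception derives from `TAPDBError`, callers can catch the base class at an API boundary and still see the specific context in the message.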
| 304 | + |
| 305 | +--- |
| 306 | + |
| 307 | +## Security Considerations |
| 308 | + |
| 309 | +### SQL Injection |
| 310 | + |
| 311 | +Always use parameterized queries: |
| 312 | + |
| 313 | +```python |
| 314 | +# Good - bound parameter (needs `from sqlalchemy import text`) |
| 315 | +session.execute(text("SELECT * FROM t WHERE id = :id"), {"id": user_input}) |
| 316 | + |
| 317 | +# Bad - never interpolate user input into SQL |
| 318 | +session.execute(text(f"SELECT * FROM t WHERE id = {user_input}")) |
| 319 | +``` |
| 320 | + |
| 321 | +### JSON Validation |
| 322 | + |
| 323 | +Validate JSON input before storing: |
| 324 | + |
| 325 | +```python |
| 326 | +import jsonschema |
| 327 | + |
| 328 | +def validate_properties(properties: dict, schema: dict) -> None: |
| 329 | +    jsonschema.validate(properties, schema)  # raises ValidationError on mismatch |
| 330 | +``` |
| 331 | + |
| 332 | +--- |
| 333 | + |
| 334 | +## Debugging Tips |
| 335 | + |
| 336 | +### Enable SQL Logging |
| 337 | + |
| 338 | +```python |
| 339 | +db = TAPDBConnection(echo=True) # Logs all SQL |
| 340 | +``` |
| 341 | + |
| 342 | +### Inspect Polymorphic Type |
| 343 | + |
| 344 | +```python |
| 345 | +print(instance.polymorphic_discriminator) |
| 346 | +print(type(instance).__name__) |
| 347 | +``` |
| 348 | + |
| 349 | +### Check Audit Trail |
| 350 | + |
| 351 | +```sql |
| 352 | +SELECT * FROM audit_log |
| 353 | +WHERE rel_table_uuid_fk = 'instance-uuid' |
| 354 | +ORDER BY changed_at DESC; |
| 355 | +``` |
| 356 | + |
| 357 | +--- |
| 358 | + |
| 359 | +## Do Not |
| 360 | + |
| 361 | +- ❌ Add new tables for new object types |
| 362 | +- ❌ Use physical DELETE operations |
| 363 | +- ❌ Modify `json_addl` without `flag_modified` |
| 364 | +- ❌ Skip audit logging (always set `session.current_username`) |
| 365 | +- ❌ Create circular lineage relationships |
| 366 | +- ❌ Use raw SQL without parameterization |
| 367 | +- ❌ Commit in library code (let caller manage transactions) |
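One of the rules above, avoiding circular lineage, can be checked before a new lineage row is written. The sketch below works on plain `(parent_uuid, child_uuid)` pairs; the function name `would_create_cycle` and the in-memory edge list are assumptions for illustration, and a real check would read the non-deleted rows of `generic_instance_lineage`.

```python
from collections import defaultdict


def would_create_cycle(edges, parent, child):
    """Return True if adding parent -> child would close a cycle.

    `edges` is an iterable of existing (parent_uuid, child_uuid) pairs.
    """
    if parent == child:
        return True
    children = defaultdict(set)
    for p, c in edges:
        children[p].add(c)
    # A cycle appears exactly when `parent` is already reachable from `child`
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return False


existing = [("A", "B"), ("B", "C")]
print(would_create_cycle(existing, "C", "A"))  # True: A -> B -> C -> A
print(would_create_cycle(existing, "A", "C"))  # False: only adds a shortcut
```

A caller could raise `LineageError` when the check returns `True` instead of inserting the row.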
| 368 | + |
| 369 | +--- |
| 370 | + |
| 371 | +## Do |
| 372 | + |
| 373 | +- ✅ Define new types via JSON templates |
| 374 | +- ✅ Use soft deletes (`is_deleted = TRUE`) |
| 375 | +- ✅ Use `flag_modified` for JSON updates |
| 376 | +- ✅ Set `session.current_username` for audit |
| 377 | +- ✅ Use batch operations for bulk inserts |
| 378 | +- ✅ Write tests for all new functionality |
| 379 | +- ✅ Use type hints and docstrings |
| 380 | + |
| 381 | +--- |
| 382 | + |
| 383 | +*AGENT.md for TAPDB Library - AI Assistant Development Guidelines* |
| 384 | + |
| 385 | + |