Commit 7803271

author
John Major
committed
X
1 parent 9fe6758 commit 7803271


4 files changed: +2397 -0 lines changed

docs/AGENT.md

Lines changed: 385 additions & 0 deletions
@@ -0,0 +1,385 @@
# TAPDB Development Guidelines (AGENT.md)

## AI Assistant Instructions

This document provides guidelines for AI assistants working on the TAPDB (Templated Abstract Polymorphic Database) library.

---

## Project Overview

TAPDB is a standalone library extracted from BLOOM LIMS that implements a **three-table polymorphic object model** with JSON-driven template configuration. The core innovation is enabling new object types through JSON templates without code changes.

### Key Files

| File/Directory | Purpose |
|----------------|---------|
| `tapdb/models/` | SQLAlchemy ORM classes |
| `tapdb/connection.py` | Database connection manager |
| `tapdb/templates/` | Template loading and management |
| `tapdb/factory/` | Instance creation logic |
| `schema/tapdb_schema.sql` | PostgreSQL DDL |
| `config/` | JSON template definitions |

---

## Architecture Principles

### 1. Three-Table Model

All data lives in three tables:
- `generic_template` - Blueprints/definitions
- `generic_instance` - Concrete objects
- `generic_instance_lineage` - Relationships

**Never add new tables for new object types.** New types are defined via JSON templates.
36+
37+
### 2. Polymorphic Inheritance
38+
39+
Use SQLAlchemy's single-table inheritance:
40+
41+
```python
42+
class container_instance(generic_instance):
43+
__mapper_args__ = {
44+
'polymorphic_identity': 'container_instance'
45+
}
46+
```
47+
48+
### 3. JSON-Driven Configuration
49+
50+
All customization happens in `json_addl`:
51+
- Properties
52+
- Instantiation layouts
53+
- Actions
54+
- Metadata
55+
56+
### 4. Soft Deletes
57+
58+
**Never use DELETE.** Always set `is_deleted = TRUE`.
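
A soft delete can be sketched as follows. This is a minimal illustration only: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for PostgreSQL, and the reduced table layout and the `soft_delete` helper are assumptions, not TAPDB's actual schema or API.

```python
import sqlite3


def soft_delete(conn: sqlite3.Connection, uuid: str) -> int:
    """Mark a live row as deleted instead of removing it; return rows affected."""
    cur = conn.execute(
        "UPDATE generic_instance SET is_deleted = 1 "
        "WHERE uuid = ? AND is_deleted = 0",
        (uuid,),
    )
    return cur.rowcount


# Hypothetical, heavily reduced stand-in for the real generic_instance table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE generic_instance ("
    "uuid TEXT PRIMARY KEY, name TEXT, is_deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO generic_instance (uuid, name) VALUES (?, ?)",
    [("u1", "plate A"), ("u2", "plate B")],
)

affected = soft_delete(conn, "u1")

# Readers filter on the flag; the soft-deleted row still exists physically.
live = [
    row[0]
    for row in conn.execute(
        "SELECT uuid FROM generic_instance WHERE is_deleted = 0 ORDER BY uuid"
    )
]
total = conn.execute("SELECT COUNT(*) FROM generic_instance").fetchone()[0]
```

Because the row is kept, lineage and audit history that reference it remain resolvable; queries simply have to filter on `is_deleted`.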
59+
60+
### 5. Audit Trail
61+
62+
All changes are tracked via database triggers. Set `session.current_username` before operations.
63+
64+
---
65+
66+
## Coding Standards
67+
68+
### Python Style
69+
70+
- Python 3.10+ required
71+
- Type hints on all public functions
72+
- Docstrings for all public APIs
73+
- Format with `black`, lint with `ruff`
74+
75+
### Naming Conventions
76+
77+
| Element | Convention | Example |
78+
|---------|------------|---------|
79+
| ORM classes | snake_case | `generic_instance` |
80+
| Python classes | PascalCase | `TemplateManager` |
81+
| Functions | snake_case | `create_instance` |
82+
| Constants | UPPER_SNAKE | `DEFAULT_BATCH_SIZE` |
83+
| Template codes | slash-separated | `container/plate/fixed-plate-96/1.0/` |
84+
85+
### Import Order
86+
87+
```python
88+
# Standard library
89+
import os
90+
from pathlib import Path
91+
92+
# Third-party
93+
from sqlalchemy import Column, Text
94+
from sqlalchemy.orm import relationship
95+
96+
# Local
97+
from tapdb.models import generic_instance
98+
from tapdb.connection import TAPDBConnection
99+
```
100+
101+
---
102+
103+
## Database Guidelines
104+
105+
### Schema Changes
106+
107+
1. **Never modify core table structure** without migration
108+
2. Use Alembic for all schema changes
109+
3. Test migrations both up and down
110+
4. Preserve trigger functions
111+
112+
### JSON Queries
113+
114+
Use GIN-indexed patterns:
115+
116+
```python
117+
# Good - uses GIN index
118+
session.query(generic_instance).filter(
119+
generic_instance.json_addl.contains({'properties': {'type': 'blood'}})
120+
)
121+
122+
# Avoid - doesn't use index efficiently
123+
session.query(generic_instance).filter(
124+
generic_instance.json_addl['properties']['type'].astext == 'blood'
125+
)
126+
```
127+
128+
### Session Management
129+
130+
```python
131+
# Always use context manager for transactions
132+
with db.session_scope() as session:
133+
instance = factory.create_instance(...)
134+
# Auto-commits on success, rolls back on exception
135+
```
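
The commit-or-rollback behaviour described in the comment above is commonly implemented with `contextlib.contextmanager`. The sketch below shows that general pattern only; `FakeSession` is a hypothetical stand-in for a SQLAlchemy session, and the real `session_scope` in TAPDB may be implemented differently.

```python
from contextlib import contextmanager
from typing import Iterator


class FakeSession:
    """Hypothetical stand-in that records which session methods were called."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def commit(self) -> None:
        self.calls.append("commit")

    def rollback(self) -> None:
        self.calls.append("rollback")

    def close(self) -> None:
        self.calls.append("close")


@contextmanager
def session_scope(session: FakeSession) -> Iterator[FakeSession]:
    """Commit if the block succeeds, roll back if it raises, always close."""
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


ok = FakeSession()
with session_scope(ok):
    pass  # work that succeeds

failed = FakeSession()
try:
    with session_scope(failed):
        raise ValueError("simulated failure")
except ValueError:
    pass  # the original exception is re-raised to the caller
```

Keeping the commit inside the scope means code called within the `with` block never needs to commit on its own.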
136+
137+
---
138+
139+
## Template System
140+
141+
### Template Code Format
142+
143+
```
144+
{super_type}/{btype}/{b_sub_type}/{version}/
145+
```
146+
147+
Always include trailing slash.
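
The format rule can be checked mechanically. The following is a minimal sketch, not TAPDB's own parser: `parse_template_code` and `TemplateCode` are hypothetical names, and the exception class is a local stand-in for the `InvalidTemplateCodeError` described under Error Handling.

```python
from typing import NamedTuple


class InvalidTemplateCodeError(ValueError):
    """Local stand-in for the TAPDB exception of the same name."""


class TemplateCode(NamedTuple):
    super_type: str
    btype: str
    b_sub_type: str
    version: str


def parse_template_code(code: str) -> TemplateCode:
    """Split '{super_type}/{btype}/{b_sub_type}/{version}/' into its parts."""
    if not code.endswith("/"):
        raise InvalidTemplateCodeError(f"Missing trailing slash: {code!r}")
    parts = code[:-1].split("/")
    if len(parts) != 4 or not all(parts):
        raise InvalidTemplateCodeError(
            f"Expected 4 non-empty segments, got {parts!r} from {code!r}"
        )
    return TemplateCode(*parts)


parsed = parse_template_code("container/plate/fixed-plate-96/1.0/")
```

Rejecting codes without the trailing slash up front keeps lookups from silently missing templates that differ only in that final character.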
148+
149+
150+
---
151+
152+
## Common Tasks
153+
154+
### Adding a New Polymorphic Type
155+
156+
1. Add ORM class in `tapdb/models/`:
157+
158+
```python
159+
class new_type_instance(generic_instance):
160+
__mapper_args__ = {
161+
'polymorphic_identity': 'new_type_instance'
162+
}
163+
```
164+
165+
2. Add template class if needed:
166+
167+
```python
168+
class new_type_template(generic_template):
169+
__mapper_args__ = {
170+
'polymorphic_identity': 'new_type_template'
171+
}
172+
```
173+
174+
3. Register in `__init__.py` exports
175+
4. Add EUID sequence in schema if new prefix needed
176+
5. Create JSON templates in `config/new_type/`
177+
178+
### Adding a New Action
179+
180+
1. Define action template in `config/action/core.json`:
181+
182+
```json
183+
{
184+
"new_action": {
185+
"1.0": {
186+
"action_template": {
187+
"action_name": "New Action",
188+
"method_name": "do_action_new_action",
189+
"action_enabled": "1",
190+
...
191+
}
192+
}
193+
}
194+
}
195+
```
196+
197+
2. Implement handler method:
198+
199+
```python
200+
def do_action_new_action(self, instance, data):
201+
# Implementation
202+
return {'status': 'success'}
203+
```
204+
205+
3. Import in target templates via `action_imports`
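
One way the `method_name` field could tie an action template to its handler is a `getattr` lookup. The sketch below is an assumption about how such dispatch might work, not TAPDB's actual mechanism; `ActionHandlers` and `run_action` are hypothetical names, and treating `"action_enabled": "1"` as "enabled" is inferred from the template above.

```python
from typing import Any


class ActionHandlers:
    """Hypothetical container for do_action_* handler methods."""

    def do_action_new_action(self, instance: Any, data: dict) -> dict:
        return {"status": "success", "echo": data}


def run_action(
    handlers: object, action_template: dict, instance: Any, data: dict
) -> dict:
    """Look up the handler named by the template and call it."""
    if action_template.get("action_enabled") != "1":
        return {"status": "disabled"}
    method_name = action_template["method_name"]
    # Only dispatch to methods that follow the do_action_ naming pattern.
    if not method_name.startswith("do_action_"):
        raise ValueError(f"Refusing to call non-action method: {method_name}")
    handler = getattr(handlers, method_name, None)
    if handler is None:
        raise AttributeError(f"No handler implemented for {method_name}")
    return handler(instance, data)


template = {
    "action_name": "New Action",
    "method_name": "do_action_new_action",
    "action_enabled": "1",
}
result = run_action(ActionHandlers(), template, instance=None, data={"k": "v"})
```

The prefix check matters because the method name comes from JSON configuration; without it, a template could name any attribute on the handler object.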
206+
207+
### Querying Lineage
208+
209+
```python
210+
# Get all children of an instance
211+
children = session.query(generic_instance).join(
212+
generic_instance_lineage,
213+
generic_instance_lineage.child_instance_uuid == generic_instance.uuid
214+
).filter(
215+
generic_instance_lineage.parent_instance_uuid == parent.uuid,
216+
generic_instance_lineage.is_deleted == False
217+
).all()
218+
219+
# Get all parents
220+
parents = session.query(generic_instance).join(
221+
generic_instance_lineage,
222+
generic_instance_lineage.parent_instance_uuid == generic_instance.uuid
223+
).filter(
224+
generic_instance_lineage.child_instance_uuid == child.uuid,
225+
generic_instance_lineage.is_deleted == False
226+
).all()
227+
```
228+
229+
---
230+
231+
## Performance Guidelines
232+
233+
### Batch Operations
234+
235+
Always batch large operations:
236+
237+
```python
238+
BATCH_SIZE = 100
239+
240+
for i in range(0, len(items), BATCH_SIZE):
241+
batch = items[i:i + BATCH_SIZE]
242+
for item in batch:
243+
session.add(item)
244+
session.flush()
245+
```
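
The slicing logic can be factored into a small reusable generator. This is a sketch under stated assumptions: `batched` is a hypothetical helper, not part of TAPDB, and `DEFAULT_BATCH_SIZE` reuses the constant name from the naming-conventions table with an assumed value of 100.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

DEFAULT_BATCH_SIZE = 100  # assumed value, matching BATCH_SIZE above


def batched(
    items: Sequence[T], size: int = DEFAULT_BATCH_SIZE
) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


batch_lengths = [len(batch) for batch in batched(list(range(250)))]
```

In the loop above, each yielded batch would be passed to `session.add_all(batch)` followed by a single `session.flush()`.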

### Eager Loading

Use `joinedload` for known relationships:

```python
from sqlalchemy.orm import joinedload

instances = session.query(generic_instance).options(
    joinedload(generic_instance.template)
).filter(...).all()
```

### JSON Field Updates

By default, SQLAlchemy does not detect in-place changes to a plain JSON column, so always call `flag_modified` after updating `json_addl`:

```python
from sqlalchemy.orm.attributes import flag_modified

instance.json_addl['properties']['key'] = 'value'
flag_modified(instance, 'json_addl')
```

---

## Error Handling

### Standard Exceptions

```python
class TAPDBError(Exception):
    """Base exception for TAPDB."""


class TemplateNotFoundError(TAPDBError):
    """Raised when a template code doesn't resolve."""


class InvalidTemplateCodeError(TAPDBError):
    """Raised when template code format is invalid."""


class LineageError(TAPDBError):
    """Raised for lineage-related errors."""
```

### Error Messages

Include context in error messages:

```python
raise TemplateNotFoundError(
    f"Template not found: {template_code}. "
    f"Searched in super_type={parts['super_type']}, btype={parts['btype']}"
)
```

---

## Security Considerations

### SQL Injection

Always use parameterized queries:

```python
from sqlalchemy import text

# Good - the value is passed as a bound parameter
session.execute(text("SELECT * FROM t WHERE id = :id"), {"id": user_input})

# Bad - never do this
session.execute(text(f"SELECT * FROM t WHERE id = {user_input}"))
```

### JSON Validation

Validate JSON input before storing:

```python
import jsonschema


def validate_properties(properties: dict, schema: dict) -> None:
    """Raise jsonschema.ValidationError if properties do not conform to schema."""
    jsonschema.validate(properties, schema)
```

---

## Debugging Tips

### Enable SQL Logging

```python
db = TAPDBConnection(echo=True)  # Logs all SQL
```

### Inspect Polymorphic Type

```python
print(instance.polymorphic_discriminator)
print(type(instance).__name__)
```

### Check Audit Trail

```sql
SELECT * FROM audit_log
WHERE rel_table_uuid_fk = 'instance-uuid'
ORDER BY changed_at DESC;
```

---

## Do Not

- ❌ Add new tables for new object types
- ❌ Use physical DELETE operations
- ❌ Modify `json_addl` without `flag_modified`
- ❌ Skip audit logging (always set `session.current_username`)
- ❌ Create circular lineage relationships
- ❌ Use raw SQL without parameterization
- ❌ Commit in library code (let caller manage transactions)
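
The rule against circular lineage can be enforced with a reachability check before a new link is written. The sketch below is illustrative only: `would_create_cycle` is a hypothetical helper working on plain `(parent_uuid, child_uuid)` pairs held in memory, not a TAPDB function, and a real check would read the non-deleted rows of `generic_instance_lineage`.

```python
from collections import defaultdict


def would_create_cycle(
    edges: list[tuple[str, str]], parent: str, child: str
) -> bool:
    """Return True if adding parent -> child would close a loop.

    `edges` holds existing (parent_uuid, child_uuid) pairs. A cycle appears
    exactly when `parent` is already reachable from `child`.
    """
    if parent == child:
        return True
    children = defaultdict(list)
    for p, c in edges:
        children[p].append(c)
    # Depth-first search downward from the proposed child.
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return False


existing = [("a", "b"), ("b", "c")]
closes_loop = would_create_cycle(existing, "c", "a")
safe_link = would_create_cycle(existing, "a", "c")
```

When the check returns `True`, raising `LineageError` would fit the exception hierarchy described under Error Handling.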

---

## Do

- ✅ Define new types via JSON templates
- ✅ Use soft deletes (`is_deleted = TRUE`)
- ✅ Use `flag_modified` for JSON updates
- ✅ Set `session.current_username` for audit
- ✅ Use batch operations for bulk inserts
- ✅ Write tests for all new functionality
- ✅ Use type hints and docstrings

---

*AGENT.md for TAPDB Library - AI Assistant Development Guidelines*
