Commit 5cb5ae4

Rename AttributeTypes to Codec Types in documentation
Terminology changes in spec and user docs:
- "AttributeTypes" → "Codec Types" (category name)
- "AttributeType" → "Codec" (base class)
- "@register_type" → "@dj.codec" (decorator)
- "type_name" → "name" (class attribute)

The term "Codec" better conveys the encode/decode semantics of these types, drawing on the familiar audio/video codec analogy.

Code changes (class renaming, backward-compat aliases) to follow.

Co-authored-by: dimitri-yatsenko <[email protected]>
1 parent 40d7871 commit 5cb5ae4

2 files changed: +52 -52 lines changed


docs/src/design/tables/attributes.md

Lines changed: 6 additions & 6 deletions
@@ -76,10 +76,10 @@ for portable pipelines. Using native types will generate a warning.
7676

7777
See the [storage types spec](storage-types-spec.md) for complete mappings.
7878

79-
## AttributeTypes (special datatypes)
79+
## Codec types (special datatypes)
8080

81-
AttributeTypes provide `encode()`/`decode()` semantics for complex data that doesn't
82-
fit native database types. They are denoted with angle brackets: `<type_name>`.
81+
Codecs provide `encode()`/`decode()` semantics for complex data that doesn't
82+
fit native database types. They are denoted with angle brackets: `<name>`.
8383

8484
### Storage mode: `@` convention
8585

@@ -90,7 +90,7 @@ The `@` character indicates **external storage** (object store vs database):
9090
- **`@` alone**: Use default store - e.g., `<blob@>`
9191
- **`@name`**: Use named store - e.g., `<blob@cold>`
9292

93-
### Built-in AttributeTypes
93+
### Built-in codecs
9494

9595
**Serialization types** - for Python objects:
9696

@@ -123,9 +123,9 @@ The `@` character indicates **external storage** (object store vs database):
123123
- `<filepath@store>`: Reference to existing file in a configured store. No file
124124
copying occurs. Returns `ObjectRef` for lazy access. External only. See [filepath](filepath.md).
125125

126-
### User-defined AttributeTypes
126+
### User-defined codecs
127127

128-
- `<custom_type>`: Define your own [custom attribute type](customtype.md) with
128+
- `<custom_type>`: Define your own [custom codec](customtype.md) with
129129
bidirectional conversion between Python objects and database storage. Use for
130130
graphs, domain-specific objects, or custom data structures.
131131

docs/src/design/tables/storage-types-spec.md

Lines changed: 46 additions & 46 deletions
@@ -6,13 +6,13 @@ This document defines a three-layer type architecture:
66

77
1. **Native database types** - Backend-specific (`FLOAT`, `TINYINT UNSIGNED`, `LONGBLOB`). Discouraged for direct use.
88
2. **Core DataJoint types** - Standardized across backends, scientist-friendly (`float32`, `uint8`, `bool`, `json`).
9-
3. **AttributeTypes** - Programmatic types with `encode()`/`decode()` semantics. Composable.
9+
3. **Codec Types** - Programmatic types with `encode()`/`decode()` semantics. Composable.
1010

1111
```
1212
┌───────────────────────────────────────────────────────────────────┐
13-
│ AttributeTypes (Layer 3) │
13+
│ Codec Types (Layer 3) │
1414
│ │
15-
│ Built-in: <blob> <attach> <object@> <hash@> <filepath@> │
15+
│ Built-in: <blob> <attach> <object@> <hash@> <filepath@> │
1616
│ User: <custom> <mytype> ... │
1717
├───────────────────────────────────────────────────────────────────┤
1818
│ Core DataJoint Types (Layer 2) │
@@ -31,7 +31,7 @@ This document defines a three-layer type architecture:
3131

3232
**Syntax distinction:**
3333
- Core types: `int32`, `float64`, `varchar(255)` - no brackets
34-
- AttributeTypes: `<blob>`, `<object@store>`, `<filepath@main>` - angle brackets
34+
- Codec types: `<blob>`, `<object@store>`, `<filepath@main>` - angle brackets
3535
- The `@` character indicates external storage (object store vs database)
3636

3737
### OAS Storage Regions
@@ -106,7 +106,7 @@ created_at : datetime = CURRENT_TIMESTAMP
106106

107107
### Binary Types
108108

109-
The core `bytes` type stores raw bytes without any serialization. Use `<blob>` AttributeType
109+
The core `bytes` type stores raw bytes without any serialization. Use the `<blob>` codec
110110
for serialized Python objects.
111111

112112
| Core Type | Description | MySQL | PostgreSQL |
@@ -193,25 +193,25 @@ definitions. This ensures consistent behavior across all tables and simplifies p
193193
- **No per-column overrides**: `CHARACTER SET` and `COLLATE` are rejected in type definitions
194194
- **Like timezone**: Encoding is infrastructure configuration, not part of the data model
195195

196-
## AttributeTypes (Layer 3)
196+
## Codec Types (Layer 3)
197197

198-
AttributeTypes provide `encode()`/`decode()` semantics on top of core types. They are
198+
Codec types provide `encode()`/`decode()` semantics on top of core types. They are
199199
composable and can be built-in or user-defined.
200200

201201
### Storage Mode: `@` Convention
202202

203-
The `@` character in AttributeType syntax indicates **external storage** (object store):
203+
The `@` character in codec syntax indicates **external storage** (object store):
204204

205205
- **No `@`**: Internal storage (database) - e.g., `<blob>`, `<attach>`
206206
- **`@` present**: External storage (object store) - e.g., `<blob@>`, `<attach@store>`
207207
- **`@` alone**: Use default store - e.g., `<blob@>`
208208
- **`@name`**: Use named store - e.g., `<blob@cold>`
209209

210-
Some types support both modes (`<blob>`, `<attach>`), others are external-only (`<object@>`, `<hash@>`, `<filepath@>`).
210+
Some codecs support both modes (`<blob>`, `<attach>`), others are external-only (`<object@>`, `<hash@>`, `<filepath@>`).
211211

212-
### Type Resolution and Chaining
212+
### Codec Resolution and Chaining
213213

214-
AttributeTypes resolve to core types through chaining. The `get_dtype(is_external)` method
214+
Codecs resolve to core types through chaining. The `get_dtype(is_external)` method
215215
returns the appropriate dtype based on storage mode:
216216

217217
```
@@ -233,7 +233,7 @@ Resolution at declaration time:
233233

234234
### `<object@>` / `<object@store>` - Path-Addressed Storage
235235

236-
**Built-in AttributeType. External only.**
236+
**Built-in codec. External only.**
237237

238238
OAS (Object-Augmented Schema) storage for files and folders:
239239

@@ -257,9 +257,9 @@ class Analysis(dj.Computed):
257257
#### Implementation
258258

259259
```python
260-
class ObjectType(AttributeType):
260+
class ObjectCodec(dj.Codec):
261261
"""Path-addressed OAS storage. External only."""
262-
type_name = "object"
262+
name = "object"
263263

264264
def get_dtype(self, is_external: bool) -> str:
265265
if not is_external:
@@ -278,7 +278,7 @@ class ObjectType(AttributeType):
278278

279279
### `<hash@>` / `<hash@store>` - Hash-Addressed Storage
280280

281-
**Built-in AttributeType. External only.**
281+
**Built-in codec. External only.**
282282

283283
Hash-addressed storage with deduplication:
284284

@@ -303,9 +303,9 @@ store_root/
303303
#### Implementation
304304

305305
```python
306-
class HashType(AttributeType):
306+
class HashCodec(dj.Codec):
307307
"""Hash-addressed storage. External only."""
308-
type_name = "hash"
308+
name = "hash"
309309

310310
def get_dtype(self, is_external: bool) -> str:
311311
if not is_external:
@@ -346,7 +346,7 @@ features JSONB NOT NULL
346346

347347
### `<filepath@store>` - Portable External Reference
348348

349-
**Built-in AttributeType. External only (store required).**
349+
**Built-in codec. External only (store required).**
350350

351351
Relative path references within configured stores:
352352

@@ -397,9 +397,9 @@ just use `varchar`. A string is simpler and more transparent.
397397
#### Implementation
398398

399399
```python
400-
class FilepathType(AttributeType):
400+
class FilepathCodec(dj.Codec):
401401
"""Store-relative file references. External only."""
402-
type_name = "filepath"
402+
name = "filepath"
403403

404404
def get_dtype(self, is_external: bool) -> str:
405405
if not is_external:
@@ -452,12 +452,12 @@ column_name JSONB NOT NULL
452452
```
453453

454454
The `json` database type:
455-
- Used as dtype by built-in AttributeTypes (`<object@>`, `<hash@>`, `<filepath@store>`)
455+
- Used as dtype by built-in codecs (`<object@>`, `<hash@>`, `<filepath@store>`)
456456
- Stores arbitrary JSON-serializable data
457457
- Automatically uses appropriate type for database backend
458458
- Supports JSON path queries where available
459459

460-
## Built-in AttributeTypes
460+
## Built-in Codecs
461461

462462
### `<blob>` / `<blob@>` - Serialized Python Objects
463463

@@ -471,10 +471,10 @@ blob format. Compatible with MATLAB.
471471
- **`<blob@store>`**: Stored in specific named store
472472

473473
```python
474-
@dj.register_type
475-
class BlobType(AttributeType):
474+
@dj.codec
475+
class BlobCodec(dj.Codec):
476476
"""Serialized Python objects. Supports internal and external."""
477-
type_name = "blob"
477+
name = "blob"
478478

479479
def get_dtype(self, is_external: bool) -> str:
480480
return "<hash>" if is_external else "bytes"
@@ -511,10 +511,10 @@ Stores files with filename preserved. On fetch, extracts to configured download
511511
- **`<attach@store>`**: Stored in specific named store
512512

513513
```python
514-
@dj.register_type
515-
class AttachType(AttributeType):
514+
@dj.codec
515+
class AttachCodec(dj.Codec):
516516
"""File attachment with filename. Supports internal and external."""
517-
type_name = "attach"
517+
name = "attach"
518518

519519
def get_dtype(self, is_external: bool) -> str:
520520
return "<hash>" if is_external else "bytes"
@@ -543,15 +543,15 @@ class Attachments(dj.Manual):
543543
"""
544544
```
545545

546-
## User-Defined AttributeTypes
546+
## User-Defined Codecs
547547

548-
Users can define custom AttributeTypes for domain-specific data:
548+
Users can define custom codecs for domain-specific data:
549549

550550
```python
551-
@dj.register_type
552-
class GraphType(AttributeType):
551+
@dj.codec
552+
class GraphCodec(dj.Codec):
553553
"""Store NetworkX graphs. Internal only (no external support)."""
554-
type_name = "graph"
554+
name = "graph"
555555

556556
def get_dtype(self, is_external: bool) -> str:
557557
if is_external:
@@ -568,13 +568,13 @@ class GraphType(AttributeType):
568568
return G
569569
```
570570

571-
Custom types can support both modes by returning different dtypes:
571+
Custom codecs can support both modes by returning different dtypes:
572572

573573
```python
574-
@dj.register_type
575-
class ImageType(AttributeType):
574+
@dj.codec
575+
class ImageCodec(dj.Codec):
576576
"""Store images. Supports both internal and external."""
577-
type_name = "image"
577+
name = "image"
578578

579579
def get_dtype(self, is_external: bool) -> str:
580580
return "<hash>" if is_external else "bytes"
@@ -632,7 +632,7 @@ def garbage_collect(store_name):
632632
store.delete(hash_path(hash_id))
633633
```
634634

635-
## Built-in AttributeType Comparison
635+
## Built-in Codec Comparison
636636

637637
| Feature | `<blob>` | `<attach>` | `<object@>` | `<hash@>` | `<filepath@>` |
638638
|---------|----------|------------|-------------|--------------|---------------|
@@ -658,13 +658,13 @@ def garbage_collect(store_name):
658658
1. **Three-layer architecture**:
659659
- Layer 1: Native database types (backend-specific, discouraged)
660660
- Layer 2: Core DataJoint types (standardized, scientist-friendly)
661-
- Layer 3: AttributeTypes (encode/decode, composable)
661+
- Layer 3: Codec types (encode/decode, composable)
662662
2. **Core types are scientist-friendly**: `float32`, `uint8`, `bool`, `bytes` instead of `FLOAT`, `TINYINT UNSIGNED`, `LONGBLOB`
663-
3. **AttributeTypes use angle brackets**: `<blob>`, `<object@store>`, `<filepath@main>` - distinguishes from core types
663+
3. **Codecs use angle brackets**: `<blob>`, `<object@store>`, `<filepath@main>` - distinguishes from core types
664664
4. **`@` indicates external storage**: No `@` = database, `@` present = object store
665-
5. **`get_dtype(is_external)` method**: Types resolve dtype at declaration time based on storage mode
666-
6. **AttributeTypes are composable**: `<blob@>` uses `<hash@>`, which uses `json`
667-
7. **Built-in external types use JSON dtype**: Stores metadata (path, hash, store name, etc.)
665+
5. **`get_dtype(is_external)` method**: Codecs resolve dtype at declaration time based on storage mode
666+
6. **Codecs are composable**: `<blob@>` uses `<hash@>`, which uses `json`
667+
7. **Built-in external codecs use JSON dtype**: Stores metadata (path, hash, store name, etc.)
668668
8. **Two OAS regions**: object (PK-addressed) and hash (hash-addressed) within managed stores
669669
9. **Filepath for portability**: `<filepath@store>` uses relative paths within stores for environment portability
670670
10. **No `uri` type**: For arbitrary URLs, use `varchar`—simpler and more transparent
@@ -673,9 +673,9 @@ def garbage_collect(store_name):
673673
- No `@` = internal storage (database)
674674
- `@` alone = default store
675675
- `@name` = named store
676-
12. **Dual-mode types**: `<blob>` and `<attach>` support both internal and external storage
677-
13. **External-only types**: `<object@>`, `<hash@>`, `<filepath@>` require `@`
678-
14. **Transparent access**: AttributeTypes return Python objects or file paths
676+
12. **Dual-mode codecs**: `<blob>` and `<attach>` support both internal and external storage
677+
13. **External-only codecs**: `<object@>`, `<hash@>`, `<filepath@>` require `@`
678+
14. **Transparent access**: Codecs return Python objects or file paths
679679
15. **Lazy access**: `<object@>` and `<filepath@store>` return ObjectRef
680680
16. **MD5 for content hashing**: See [Hash Algorithm Choice](#hash-algorithm-choice) below
681681
17. **No separate registry**: Hash metadata stored in JSON columns, not a separate table
