Commit 5922d01

Add documentation for PairedFieldDescriptor implementation
1 parent 6651bae commit 5922d01

1 file changed: 90 additions, 0 deletions

@@ -0,0 +1,90 @@
# Paired Field Descriptor System

## Overview

The Paired Field Descriptor is a Django model descriptor that manages fields with both a manually entered variant and a machine learning (ML) generated variant. It supports various field types, but its main use here is tag management, where manual values take priority over ML values.

## Core Concepts

### Field Pairing Mechanism
Each descriptor automatically creates two associated fields:
- **Manual Field**: Manually entered or curated metadata
- **ML Field**: Machine learning generated metadata

### Key Characteristics
- The manual field takes precedence over the ML field
- Various Django field types are supported
- Empty arrays and `None` values in the manual field defer to the ML field
- ML fields must be set explicitly

## Implementation

### Creating a Paired Field Descriptor

```python
from django.contrib.postgres.fields import ArrayField
from django.db import models

# PairedFieldDescriptor and TDAMMTags come from this project; their import paths are not shown here.
tdamm_tag = PairedFieldDescriptor(
    field_name="tdamm_tag",
    field_type=ArrayField(models.CharField(max_length=255, choices=TDAMMTags.choices), blank=True, null=True),
    verbose_name="TDAMM Tags",
)
```

#### Parameters
- `field_name`: Base name for the descriptor; the paired field names are derived from it
- `field_type`: Django field to pair (various field types are supported)
- `verbose_name`: Optional human-readable name

### Field Naming Convention
Defining a descriptor automatically creates two additional fields:
- `{field_name}_manual`: For manually entered values
- `{field_name}_ml`: For machine learning generated values

For example, `field_name="tdamm_tag"` yields `tdamm_tag_manual` and `tdamm_tag_ml`.

## Characteristics

### Field Priority
1. The manual field always takes precedence
2. The ML field serves as a fallback
3. A manual field that is empty or `None` defers to the ML field

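The priority rules above can be sketched as a small standalone function. This is only an illustration of the rule as described, not the descriptor's actual code; the name `resolve_paired_value` is hypothetical.

```python
from typing import Optional, Sequence


def resolve_paired_value(
    manual: Optional[Sequence[str]], ml: Optional[Sequence[str]]
) -> Optional[Sequence[str]]:
    """Return the manual value unless it is None or empty, else the ML value."""
    # Assumption: None and an empty list both count as "no manual value".
    if manual is not None and len(manual) > 0:
        return manual
    return ml


print(resolve_paired_value(["MMA_M_EM"], ["MMA_O_BH"]))  # ['MMA_M_EM']
print(resolve_paired_value([], ["MMA_O_BH"]))            # ['MMA_O_BH']
print(resolve_paired_value(None, None))                  # None
```

Note that when both values are unset the result is `None`, so callers that need a list still have to supply their own default.
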
### Field Retrieval
```python
# Retrieval automatically prioritizes the manual field
tags = url.tdamm_tag  # Returns manual tags if they exist, otherwise ML tags
```

### Field Setting
```python
# Sets only the manual field
url.tdamm_tag = ["MMA_M_EM", "MMA_M_G"]

# The ML field must be set explicitly
url.tdamm_tag_ml = ["MMA_O_BH"]
```

### Field Deletion
```python
# Deletes both the manual and ML fields
del url.tdamm_tag
```

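The retrieval, setting, and deletion behavior described above maps onto Python's descriptor protocol (`__get__`, `__set__`, `__delete__`). The sketch below uses plain Python objects instead of Django models so it runs on its own. `PairedValueSketch` and `Url` are hypothetical stand-ins, and resetting both values to `None` on deletion is an assumption about what "deletes" means here.

```python
class PairedValueSketch:
    """Plain-Python stand-in for the paired-field behavior (not the project's class)."""

    def __init__(self, field_name):
        self.manual_attr = f"{field_name}_manual"
        self.ml_attr = f"{field_name}_ml"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class rather than an instance
        manual = getattr(instance, self.manual_attr, None)
        # None and empty lists are both falsy, so they fall through to the ML value.
        if manual:
            return manual
        return getattr(instance, self.ml_attr, None)

    def __set__(self, instance, value):
        # Assignment through the descriptor only touches the manual value.
        setattr(instance, self.manual_attr, value)

    def __delete__(self, instance):
        # Assumption: deletion resets both values to None.
        setattr(instance, self.manual_attr, None)
        setattr(instance, self.ml_attr, None)


class Url:
    tdamm_tag = PairedValueSketch("tdamm_tag")

    def __init__(self):
        self.tdamm_tag_manual = None
        self.tdamm_tag_ml = None


url = Url()
url.tdamm_tag_ml = ["MMA_O_BH"]          # ML value set explicitly
ml_only = url.tdamm_tag                  # falls back to the ML value
url.tdamm_tag = ["MMA_M_EM", "MMA_M_G"]  # sets the manual value only
with_manual = url.tdamm_tag              # manual value now wins
ml_kept = url.tdamm_tag_ml               # ML value is untouched by the assignment
del url.tdamm_tag
after_delete = url.tdamm_tag             # both values cleared
```

Because the class defines `__set__`, it is a data descriptor, which is why `url.tdamm_tag = ...` is routed to the manual attribute instead of shadowing the descriptor on the instance.
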
### Data Preservation
- Paired fields maintain their state during:
  - Dump to Delta migration
  - Delta to Curated promotion
- Manual entries take precedence in all migration stages

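How the migration code preserves paired fields is not shown here. One plausible approach, sketched below under that caveat, is to copy the two underlying fields separately instead of copying the resolved value, since copying the resolved value would write ML tags into the manual field. The helper `copy_paired_field` and the `SimpleNamespace` objects are hypothetical stand-ins for the real models.

```python
from types import SimpleNamespace


def copy_paired_field(source, target, field_name):
    """Copy the manual and ML values separately so neither is lost or merged."""
    for suffix in ("manual", "ml"):
        attr = f"{field_name}_{suffix}"
        setattr(target, attr, getattr(source, attr, None))


# Stand-ins for a Delta record and the Curated record it is promoted to.
delta = SimpleNamespace(tdamm_tag_manual=["MMA_M_EM"], tdamm_tag_ml=["MMA_O_BH"])
curated = SimpleNamespace()
copy_paired_field(delta, curated, "tdamm_tag")
```
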
## Serializer Integration

Configure the serializer to read the paired field through the descriptor, so the output contains the manual tags when present and the ML tags otherwise, following the descriptor's priority rules.
```python
from rest_framework import serializers

# DeltaUrl is the project's model; its import path is not shown here.


class DeltaUrlSerializer(serializers.ModelSerializer):
    tdamm_tag = serializers.SerializerMethodField()

    class Meta:
        model = DeltaUrl
        fields = ("url", "tdamm_tag")

    def get_tdamm_tag(self, obj):
        # The descriptor applies the manual-over-ML priority; default to an empty list
        tags = obj.tdamm_tag
        return tags if tags is not None else []
```
