Commit 93392df

add clarification about pattern behavior to the lifecycle readme
1 parent 430a1c1 commit 93392df

1 file changed

+25
-57
lines changed

sde_collections/models/README_LIFECYCLE.md

Lines changed: 25 additions & 57 deletions
@@ -13,7 +13,7 @@ This document explains the lifecycle of URLs in the system, focusing on two crit
1313
- **CuratedUrls**: Production-ready, approved content
1414

1515
### Fields That Transfer
16-
All fields are transferred between states, including:
16+
All fields transfer between states, including:
1717
- URL
1818
- Scraped Title
1919
- Generated Title
@@ -23,6 +23,21 @@ All fields are transferred between states, including:
2323
- Scraped Text
2424
- Any additional metadata
2525

26+
## Pattern Application
27+
28+
### When Patterns Are Applied
29+
Patterns are applied in two scenarios:
30+
1. During migration from Dump to Delta
31+
2. When a new pattern is created/updated
32+
33+
Patterns are NOT applied during promotion. The effects of patterns (modified titles, document types, etc.) are carried through to CuratedUrls during promotion, but the patterns themselves don't reapply.
34+
35+
### Pattern Effects
36+
- Patterns modify DeltaUrls when the patterns are created or updated, or when DeltaUrls are created through migration
37+
- Pattern-modified fields (titles, document types, etc.) become part of the DeltaUrl's data
38+
- These modifications persist through promotion to CuratedUrls
39+
- Pattern relationships (which patterns affect which URLs) are maintained for tracking purposes
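
The two application scenarios can be sketched in plain Python. This is a minimal illustration, not the project's implementation: `TitlePattern`, `apply`, `migrate`, and the glob-style matching are all hypothetical stand-ins, and the real models likely differ.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Dict, List


@dataclass
class DeltaUrl:
    # Simplified stand-in for the real model (assumption).
    url: str
    scraped_title: str = ""
    generated_title: str = ""


@dataclass
class TitlePattern:
    # Hypothetical pattern type: sets generated_title on matching URLs.
    match_pattern: str
    title: str
    affected_urls: List[str] = field(default_factory=list)

    def apply(self, delta_urls: List[DeltaUrl]) -> None:
        """Scenario 2: run when the pattern is created or updated."""
        for delta in delta_urls:
            if fnmatchcase(delta.url, self.match_pattern):
                # The effect becomes ordinary field data on the DeltaUrl.
                delta.generated_title = self.title
                # Track which URLs this pattern affects.
                if delta.url not in self.affected_urls:
                    self.affected_urls.append(delta.url)


def migrate(dump_rows: List[Dict[str, str]], patterns: List[TitlePattern]) -> List[DeltaUrl]:
    """Scenario 1: build DeltaUrls from dump data, then apply every pattern."""
    deltas = [DeltaUrl(**row) for row in dump_rows]
    for pattern in patterns:
        pattern.apply(deltas)
    return deltas
```

Promotion is deliberately absent from this sketch: it would copy `generated_title` as a regular value and never call `apply`.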
40+
2641
## Migration Process (Dump → Delta)
2742

2843
### Overview
@@ -43,49 +58,7 @@ Migration converts DumpUrls to DeltaUrls, preserving all fields and applying pat
4358

4459
### Examples
4560

46-
#### Example 1: Basic Migration
47-
```python
48-
# Starting State
49-
dump_url = DumpUrl(
50-
url="example.com/doc",
51-
scraped_title="Original Title",
52-
document_type=DocumentTypes.DOCUMENTATION
53-
)
54-
55-
# After Migration
56-
delta_url = DeltaUrl(
57-
url="example.com/doc",
58-
scraped_title="Original Title",
59-
document_type=DocumentTypes.DOCUMENTATION,
60-
to_delete=False
61-
)
62-
```
63-
64-
#### Example 2: Migration with Existing Curated
65-
```python
66-
# Starting State
67-
dump_url = DumpUrl(
68-
url="example.com/doc",
69-
scraped_title="New Title",
70-
document_type=DocumentTypes.DOCUMENTATION
71-
)
72-
73-
curated_url = CuratedUrl(
74-
url="example.com/doc",
75-
scraped_title="Old Title",
76-
document_type=DocumentTypes.DOCUMENTATION
77-
)
78-
79-
# After Migration
80-
delta_url = DeltaUrl(
81-
url="example.com/doc",
82-
scraped_title="New Title", # Different from curated
83-
document_type=DocumentTypes.DOCUMENTATION,
84-
to_delete=False
85-
)
86-
```
87-
88-
#### Example 3: Migration with Pattern Application
61+
#### Example 1: Migration with Pattern Application
8962
```python
9063
# Starting State
9164
dump_url = DumpUrl(
@@ -111,15 +84,15 @@ delta_url = DeltaUrl(
11184
## Promotion Process (Delta → Curated)
11285

11386
### Overview
114-
Promotion moves DeltaUrls to CuratedUrls, applying all changes including explicit NULL values. This occurs when:
115-
- A curator marks a collection as Curated.
87+
Promotion moves DeltaUrls to CuratedUrls, carrying forward all changes including pattern-applied modifications. This occurs when:
88+
- A curator marks a collection as Curated
11689

11790
### Steps
11891
1. Process each DeltaUrl:
11992
- If marked for deletion: Remove matching CuratedUrl
12093
- Otherwise: Update/create CuratedUrl with ALL fields
12194
2. Clear all DeltaUrls
122-
3. Refresh pattern relationships
95+
3. Update pattern relationship tracking
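
The first two steps above can be sketched as follows. This is a rough illustration under stated assumptions: the dataclasses are simplified stand-ins for the real models, `promote` and the `curated_by_url` dictionary are hypothetical, and step 3 (pattern relationship tracking) is left out.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class DeltaUrl:
    # Simplified stand-in for the real model (assumption).
    url: str
    scraped_title: Optional[str] = None
    document_type: Optional[str] = None
    to_delete: bool = False


@dataclass
class CuratedUrl:
    # Simplified stand-in for the real model (assumption).
    url: str
    scraped_title: Optional[str] = None
    document_type: Optional[str] = None


def promote(delta_urls: List[DeltaUrl], curated_by_url: Dict[str, CuratedUrl]) -> None:
    """Apply every DeltaUrl to the curated set, then clear the deltas."""
    for delta in delta_urls:
        if delta.to_delete:
            # Marked for deletion: remove the matching CuratedUrl, if any.
            curated_by_url.pop(delta.url, None)
        else:
            # Copy ALL fields, including None (NULL) values, as-is.
            fields = asdict(delta)
            fields.pop("to_delete")
            curated_by_url[delta.url] = CuratedUrl(**fields)
    # Step 2: clear all DeltaUrls.
    delta_urls.clear()
```

Note that a `None` title on a DeltaUrl overwrites the curated title rather than being skipped, which mirrors the "explicit value" treatment of NULLs described in this README.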
12396

12497
### Examples
12598

@@ -186,18 +159,13 @@ curated_url = CuratedUrl(
186159

187160
## Important Notes
188161

162+
189163
### Field Handling
190164
- ALL fields are copied during migration and promotion
191165
- NULL values in DeltaUrls are treated as explicit values
192166
- Pattern-set values take precedence over original values
193167

194-
### Pattern Application
195-
- Patterns are applied after migration
196-
- Pattern effects persist through promotion
197-
- Multiple patterns can affect the same URL
198-
199-
### Data Integrity
200-
- Migrations preserve all field values
201-
- Promotions apply all changes
202-
- Deletion flags are honored during promotion
203-
- Pattern relationships are maintained
168+
### Pattern Behavior
169+
- Patterns only apply during migration or when patterns themselves are created/updated
170+
- Pattern effects are preserved during promotion as regular field values
171+
- Patterns are NOT re-applied during promotion. This means you can't add a DeltaUrl outside of the migration process and expect patterns to apply. In that case, either add it as a DumpUrl and migrate it, or add it as a DeltaUrl and apply the pattern manually.
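
The manual-addition case in the last bullet might look like the sketch below. All names here (`DocumentTypePattern`, `apply`, `promote`) are hypothetical and only model the behavior described in this README; they are not the project's API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List


@dataclass
class DeltaUrl:
    # Simplified stand-in for the real model (assumption).
    url: str
    document_type: str = ""


@dataclass
class DocumentTypePattern:
    # Hypothetical pattern that assigns a document type to matching URLs.
    match_pattern: str
    document_type: str

    def apply(self, delta: DeltaUrl) -> bool:
        """Apply the pattern to one DeltaUrl; return True if it matched."""
        if fnmatchcase(delta.url, self.match_pattern):
            delta.document_type = self.document_type
            return True
        return False


def promote(deltas: List[DeltaUrl]) -> Dict[str, Dict[str, str]]:
    """Copy fields as-is into curated records; patterns are never re-applied."""
    return {d.url: {"url": d.url, "document_type": d.document_type} for d in deltas}


pattern = DocumentTypePattern("example.com/data/*", "Data")

# A DeltaUrl added by hand, outside migration: no pattern has touched it.
manual = DeltaUrl(url="example.com/data/new")
untouched = promote([manual])  # document_type is still empty here

# Applying the pattern explicitly before promotion gives the expected result.
pattern.apply(manual)
fixed = promote([manual])
```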
