Commit e2650e5

update the pattern resolution readme
1 parent e285697 commit e2650e5

sde_collections/models/README_PATTERN_RESOLUTION.md

Lines changed: 92 additions & 20 deletions
@@ -4,45 +4,117 @@
44
The pattern system uses a "smallest set priority" strategy for resolving conflicts between overlapping patterns. This applies to title patterns, division patterns, and document type patterns. The pattern that matches the smallest set of URLs takes precedence.
55

66
## How It Works
7-
87
When multiple patterns match a URL, the system:
98
1. Counts how many total URLs each pattern matches
109
2. Compares the counts
1110
3. Applies the pattern that matches the fewest URLs
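The three steps above can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the function name `resolve_pattern`, the use of `fnmatchcase` for wildcard matching, and the tie-breaking rule (first listed pattern wins) are all assumptions, since the README does not specify them.

```python
from fnmatch import fnmatchcase


def resolve_pattern(url, patterns, all_urls):
    """Return the matching pattern that covers the fewest URLs, or None.

    Hypothetical helper: `patterns` is a list of glob strings and
    `all_urls` is the full set of URLs the patterns are counted against.
    """
    # Keep only the patterns that match this URL.
    matching = [p for p in patterns if fnmatchcase(url, p)]
    if not matching:
        return None

    # Step 1: count how many URLs in total each matching pattern covers.
    counts = {p: sum(fnmatchcase(u, p) for u in all_urls) for p in matching}

    # Steps 2 and 3: compare the counts and pick the smallest.
    # Assumption: on a tie, min() keeps the pattern listed first.
    return min(matching, key=lambda p: counts[p])


ALL_URLS = [
    "https://example.com/docs/overview.html",
    "https://example.com/docs/api/endpoints.html",
    "https://example.com/docs/api/v2/users.html",
]
PATTERNS = ["*/docs/*", "*/docs/api/*", "*/docs/api/v2/*"]

winner = resolve_pattern(ALL_URLS[2], PATTERNS, ALL_URLS)
```

Note that `fnmatchcase` lets `*` match across `/`, which is what makes `*/docs/*` cover nested paths in this sketch; the real matcher may behave differently.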
1211

13-
### Example
12+
### Example Pattern Hierarchy
1413
```
1514
Pattern A: */docs/* # Matches 100 URLs
1615
Pattern B: */docs/api/* # Matches 20 URLs
1716
Pattern C: */docs/api/v2/* # Matches 5 URLs
1817
19-
For URL "/docs/api/v2/users":
20-
- All patterns match
21-
- Pattern C wins (5 URLs < 20 URLs < 100 URLs)
18+
Example URLs and Which Patterns Apply:
19+
1. https://example.com/docs/overview.html
20+
✓ Matches Pattern A
21+
✗ Doesn't match Pattern B or C
22+
Result: Pattern A applies (only match)
23+
24+
2. https://example.com/docs/api/endpoints.html
25+
✓ Matches Pattern A
26+
✓ Matches Pattern B
27+
✗ Doesn't match Pattern C
28+
Result: Pattern B applies (20 < 100 URLs)
29+
30+
3. https://example.com/docs/api/v2/users.html
31+
✓ Matches Pattern A
32+
✓ Matches Pattern B
33+
✓ Matches Pattern C
34+
Result: Pattern C applies (5 < 20 < 100 URLs)
2235
```
2336

2437
## Pattern Types and Resolution
2538

2639
### Title Patterns
27-
```python
28-
# More specific title pattern takes precedence
29-
Pattern A: */docs/* → title="Documentation" # 100 URLs
30-
Pattern B: */docs/api/* → title="API Reference" # 20 URLs
31-
Result: URL gets title "API Reference"
40+
```
41+
Patterns:
42+
A: */docs/* → title="Documentation" # Matches 100 URLs
43+
B: */docs/api/* → title="API Reference" # Matches 20 URLs
44+
C: */docs/api/v2/* → title="V2 API Guide" # Matches 5 URLs
45+
46+
Example URLs:
47+
1. https://example.com/docs/getting-started.html
48+
• Matches: Pattern A
49+
• Result: title="Documentation"
50+
51+
2. https://example.com/docs/api/authentication.html
52+
• Matches: Patterns A, B
53+
• Result: title="API Reference"
54+
55+
3. https://example.com/docs/api/v2/oauth.html
56+
• Matches: Patterns A, B, C
57+
• Result: title="V2 API Guide"
3258
```
3359

3460
### Division Patterns
35-
```python
36-
# More specific division assignment wins
37-
Pattern A: *.pdf → division="GENERAL" # 500 URLs
38-
Pattern B: */specs/*.pdf → division="ENGINEERING" # 50 URLs
39-
Result: URL gets division "ENGINEERING"
61+
```
62+
Patterns:
63+
A: *.pdf → division="GENERAL" # Matches 500 URLs
64+
B: */specs/*.pdf → division="ENGINEERING" # Matches 50 URLs
65+
C: */specs/2024/*.pdf → division="RESEARCH" # Matches 10 URLs
66+
67+
Example URLs:
68+
1. https://example.com/docs/report.pdf
69+
• Matches: Pattern A
70+
• Result: division="GENERAL"
71+
72+
2. https://example.com/specs/architecture.pdf
73+
• Matches: Patterns A, B
74+
• Result: division="ENGINEERING"
75+
76+
3. https://example.com/specs/2024/roadmap.pdf
77+
• Matches: Patterns A, B, C
78+
• Result: division="RESEARCH"
4079
```
4180

4281
### Document Type Patterns
43-
```python
44-
# Most specific document type classification applies
45-
Pattern A: */docs/* → type="DOCUMENTATION" # 200 URLs
45-
Pattern B: */docs/data/* → type="DATA" # 30 URLs
47-
Result: URL gets type "DATA"
82+
```
83+
Patterns:
84+
A: */docs/* → type="DOCUMENTATION" # Matches 200 URLs
85+
B: */docs/data/* → type="DATA" # Matches 30 URLs
86+
C: */docs/data/schemas/* → type="SCHEMA" # Matches 8 URLs
87+
88+
Example URLs:
89+
1. https://example.com/docs/guide.html
90+
• Matches: Pattern A
91+
• Result: type="DOCUMENTATION"
92+
93+
2. https://example.com/docs/data/metrics.json
94+
• Matches: Patterns A, B
95+
• Result: type="DATA"
96+
97+
3. https://example.com/docs/data/schemas/user.json
98+
• Matches: Patterns A, B, C
99+
• Result: type="SCHEMA"
100+
```
101+
102+
## Special Cases
103+
104+
### Mixed Pattern Types
105+
```
106+
When different pattern types overlap, each is resolved independently:
107+
108+
URL: https://example.com/docs/api/v2/schema.json
109+
Matching Patterns:
110+
1. */docs/* → title="Documentation", 100 matches
111+
2. */docs/* → doc_type="DOCUMENTATION", 100 matches
112+
3. */docs/api/* → title="API Reference", 50 matches
113+
4. */docs/api/v2/* → division="ENGINEERING", 10 matches
114+
5. */docs/api/v2/*.json → doc_type="DATA", 3 matches
115+
116+
Final Result:
117+
• title="API Reference" (from pattern 3, most specific title pattern)
118+
• division="ENGINEERING" (from pattern 4, only matching division pattern)
119+
• doc_type="DATA" (from pattern 5, most specific doc_type pattern)
48120
```
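The independent resolution described in the mixed-type example can be sketched as follows. This is a hedged illustration rather than the project's code: `resolve_by_type`, the tuple layout of `RULES`, and the precomputed match counts (copied from the example above) are assumptions made for the sketch.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (pattern, field, value, total URLs matched).
# The counts are taken from the README example, not computed here.
RULES = [
    ("*/docs/*", "title", "Documentation", 100),
    ("*/docs/*", "doc_type", "DOCUMENTATION", 100),
    ("*/docs/api/*", "title", "API Reference", 50),
    ("*/docs/api/v2/*", "division", "ENGINEERING", 10),
    ("*/docs/api/v2/*.json", "doc_type", "DATA", 3),
]


def resolve_by_type(url, rules):
    """Pick, per field, the matching rule with the smallest match count."""
    best = {}  # field -> (value, match_count)
    for pattern, field, value, match_count in rules:
        if not fnmatchcase(url, pattern):
            continue
        # Each field is compared only against rules for the same field.
        if field not in best or match_count < best[field][1]:
            best[field] = (value, match_count)
    return {field: value for field, (value, _) in best.items()}


result = resolve_by_type("https://example.com/docs/api/v2/schema.json", RULES)
```

Keeping one "best so far" entry per field means a very specific `doc_type` rule never displaces a broader `title` rule, which is the behaviour the example describes.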
