14 | 14 | - Delta URL exists with pattern effect
15 | 15 | - Pattern is removed
16 | 16 | ```
17 | | -Curated: None
18 | | -Delta: division=BIOLOGY (from pattern)
19 | | -[Pattern removed]
20 | | -Result: Delta remains with division=None
| 17 | +Curated: None exists
| 18 | +Delta: url=new.com, division=None
| 19 | +```
| 20 | +`[Pattern: division=BIOLOGY], created`
| 21 | +```
| 22 | +Curated: None exists
| 23 | +Delta: url=new.com, division=BIOLOGY
| 24 | +```
| 25 | +`[Pattern: division=BIOLOGY], deleted`
| 26 | +```
| 27 | +Curated: None exists
| 28 | +Delta: url=new.com, division=None
21 | 29 | ```
22 | 30 |
23 | | -### Case 2: Delta and Curated Exist
| 31 | +### Case 2: Delta Created to Apply Pattern
24 | 32 | **Scenario:**
25 | | -- Both curated and delta URLs exist
| 33 | +- A Curated with no division already exists
| 34 | +- A pattern is created
| 35 | +- A delta is created to apply a pattern
26 | 36 | - Pattern is removed
| 37 | +- Delta should be deleted
| 38 | +```
| 39 | +Curated: division=None
27 | 40 | ```
28 | | -Curated: division=GENERAL
| 41 | +`[Pattern: division=BIOLOGY], created`
| 42 | +```
| 43 | +Curated: division=None
29 | 44 | Delta: division=BIOLOGY (from pattern)
30 | | -[Pattern removed]
31 | | -Result: Delta reverts to curated value (division=GENERAL)
32 | | -If delta now matches curated exactly, delta is deleted
| 45 | +```
| 46 | +`[Pattern: division=BIOLOGY], deleted`
| 47 | +```
| 48 | +Curated: division=None
33 | 49 | ```
34 | 50 |
35 | | -### Case 3: Curated Only
36 | | -**Scenario:**
37 | | -- Only curated URL exists
| 51 | +### Case 3: Pre-existing Delta
| 52 | +- A Curated with no division already exists
| 53 | +- A Delta with an updated scraped_title exists
| 54 | +- A pattern is created to set division
| 55 | +- A delta is created to apply a pattern
38 | 56 | - Pattern is removed
| 57 | +- Delta should be maintained because of scraped_title
| 58 | +
| 59 | +```
| 60 | +Curated: division=None
| 61 | +Delta: scraped_title="Modified", division=None
| 62 | +```
| 63 | +`[Pattern: division=BIOLOGY], created`
| 64 | +```
| 65 | +Curated: division=None
| 66 | +Delta: scraped_title="Modified", division=BIOLOGY (from pattern)
| 67 | +```
| 68 | +`[Pattern: division=BIOLOGY], deleted`
39 | 69 | ```
40 | | -Curated: division=GENERAL
41 | | -Delta: None
42 | | -[Pattern removed]
43 | | -Result: New delta created with division=None
| 70 | +Curated: division=None
| 71 | +Delta: scraped_title="Modified", division=None
44 | 72 | ```
45 | 73 |
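
The delta lifecycle in the added Cases 1 through 3 could be sketched roughly as follows. This is only an illustration, not the project's implementation: the dict-based `Url` record, the `revert_field` helper, and the example URLs and titles are all hypothetical, and the fallback to the curated value is an assumption (every example in the diff reverts to `None`).

```python
from typing import Optional

# Hypothetical stand-in for a URL record: field name -> value.
Url = dict


def revert_field(curated: Optional[Url], delta: Url, field: str) -> Optional[Url]:
    """Undo a deleted pattern's effect on `field`.

    Returns the updated delta, or None when the delta should be deleted.
    """
    reverted = dict(delta)
    # Assumption: fall back to the curated value when a curated record exists.
    reverted[field] = curated.get(field) if curated is not None else None
    if curated is None:
        # Case 1: no curated record, so the delta is the only copy and stays.
        return reverted
    # Cases 2 and 3: keep the delta only if it still differs from curated.
    keys = reverted.keys() | curated.keys()
    if all(reverted.get(k) == curated.get(k) for k in keys):
        return None
    return reverted


curated = {"url": "a.com", "scraped_title": "Original", "division": None}

# Case 2: the delta existed only to carry the pattern, so it is dropped.
case2 = revert_field(curated, {**curated, "division": "BIOLOGY"}, "division")

# Case 3: the scraped_title change keeps the delta alive.
case3 = revert_field(
    curated,
    {**curated, "scraped_title": "Modified", "division": "BIOLOGY"},
    "division",
)
```

Returning `None` to signal deletion keeps the three cases in one function; a real model layer would more likely delete the row and return nothing.
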
46 | 74 | ### Case 4: Multiple Pattern Effects
47 | 75 | **Scenario:**
48 | 76 | - Delta has changes from multiple patterns
49 | 77 | - One pattern is removed
50 | 78 | ```
51 | | -Curated: division=GENERAL, doc_type=DOCUMENTATION
52 | 79 | Delta: division=BIOLOGY, doc_type=DATA (from two patterns)
53 | | -[Division pattern removed]
54 | | -Result: Delta remains with division=GENERAL, doc_type=DATA preserved
| 80 | +Pattern: division=BIOLOGY
| 81 | +Pattern: doc_type=DATA
| 82 | +```
| 83 | +`[Pattern: division=BIOLOGY], deleted`
| 84 | +```
| 85 | +Delta: division=None, doc_type=DATA
| 86 | +Pattern: doc_type=DATA
55 | 87 | ```
56 | 88 |
57 | | -### Case 5: Pattern Removal with Manual Changes
58 | | -**Scenario:**
59 | | -- Delta has both pattern effect and manual changes
60 | | -- Pattern is removed
| 89 | +### Case 5: Overlapping Patterns, Specific Deleted
61 | 90 | ```
62 | | -Curated: division=GENERAL, title="Original"
63 | | -Delta: division=BIOLOGY, title="Modified" (pattern + manual)
64 | | -[Pattern removed]
65 | | -Result: Delta remains with division=GENERAL, title="Modified" preserved
| 91 | +Curated: division=ASTROPHYSICS (because of specific pattern)
| 92 | +Specific Pattern: division=ASTROPHYSICS
| 93 | +General Pattern: division=BIOLOGY
66 | 94 | ```
| 95 | +`[Specific Pattern: division=ASTROPHYSICS], deleted`
| 96 | +
| 97 | +```
| 98 | +Curated: division=BIOLOGY (because of general pattern)
| 99 | +General Pattern: division=BIOLOGY
| 100 | +```
| 101 | +
| 102 | +
| 103 | +### Case 6: Overlapping Patterns, General Deleted
| 104 | +```
| 105 | +Curated: division=ASTROPHYSICS (because of specific pattern)
| 106 | +Specific Pattern: division=ASTROPHYSICS
| 107 | +General Pattern: division=BIOLOGY
| 108 | +```
| 109 | +`[General Pattern: division=BIOLOGY], deleted`
| 110 | +
| 111 | +```
| 112 | +Curated: division=ASTROPHYSICS (because of specific pattern)
| 113 | +Specific Pattern: division=ASTROPHYSICS
| 114 | +```
| 115 | +
67 | 116 |
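
One way to read the added Cases 4 through 6 is that each field's value is recomputed from whichever patterns remain, with the most specific pattern winning. The sketch below illustrates that reading only. The `Pattern` class, its integer `specificity` rank, and `effective_values` are hypothetical names, and how the real code decides that one pattern is more specific than another is not shown in this diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    field: str        # field the pattern sets, e.g. "division"
    value: str        # value it assigns, e.g. "BIOLOGY"
    specificity: int  # hypothetical rank: higher means a narrower match


def effective_values(patterns, fields=("division", "doc_type")):
    """For each field, use the most specific remaining pattern, else None."""
    result = {}
    for name in fields:
        candidates = [p for p in patterns if p.field == name]
        winner = max(candidates, key=lambda p: p.specificity, default=None)
        result[name] = winner.value if winner is not None else None
    return result


specific = Pattern("division", "ASTROPHYSICS", specificity=2)
general = Pattern("division", "BIOLOGY", specificity=1)
doc_type = Pattern("doc_type", "DATA", specificity=1)

# Case 4: deleting the division pattern leaves doc_type untouched.
after_case4 = effective_values([doc_type])
# Case 5: deleting the specific pattern lets the general one take over.
after_case5 = effective_values([general])
# Case 6: deleting the general pattern changes nothing.
after_case6 = effective_values([specific])
```

Recomputing from the surviving patterns avoids tracking which pattern last wrote a field, at the cost of re-evaluating every pattern that matches the URL.
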
68 | 117 | ## Implementation Steps
69 | 118 |