Commit 2ce8fc6

resolutio extraction
1 parent 8e9d726 commit 2ce8fc6

6 files changed

+942
-57
lines changed

docs/extraction_analysis.md

Lines changed: 335 additions & 0 deletions
@@ -0,0 +1,335 @@
1+
# Entity and Frequency Extraction Analysis
2+
3+
**Date:** January 27, 2026
4+
**Analyzed:** 5 SG Reports + 8 Resolutions (expanded)
5+
6+
## Executive Summary
7+
8+
This analysis examined UN Secretary-General (SG) reports and their mandating resolutions to determine where information about the **authoring entity** and **reporting frequency** can be reliably extracted, with a focus on **LLM-based extraction** rather than regex patterns.
9+
10+
### Key Findings
11+
12+
| Information Type | Best Source | Extraction Method | Reliability |
|------------------|-------------|-------------------|-------------|
| **Authoring Entity** | MARC 710__a + Manual list | Direct lookup | High |
| **Report Frequency** | Resolution operative paragraphs | LLM extraction | Medium-High |
| **Report Scope/Content** | Resolution operative paragraphs | LLM extraction | High |
| **Previous Resolutions** | Resolution preamble | LLM extraction | High |
| **Mandate Duration** | Resolution operative paragraphs | LLM extraction | High |
| **Target Session/Date** | Resolution operative paragraphs | LLM extraction | High |
20+
21+
---
22+
23+
## 1. Resolution Structure Analysis
24+
25+
### Consistent Document Structure
26+
27+
All eight resolutions analyzed follow a predictable structure that LLMs can reliably parse:
28+
29+
```
[Header: Symbol, Title, Adopting Body, Date]

The [Body],

[PREAMBLE - "Recalling", "Recognizing", "Noting", etc.]
- References to previous resolutions
- Context and background
- Concerns and observations

[OPERATIVE PARAGRAPHS - numbered]
1. Decides/Takes note/Welcomes...
2. Requests the Secretary-General to...
3. Invites Member States to...
...
```
45+
46+
### Different Body Patterns
47+
48+
| Body | Reference Type | Phrase Pattern | Example |
|------|----------------|----------------|---------|
| **General Assembly** | Session numbers (78th, 79th) | "at its [Nth] session" | "at its eightieth session" |
| **Security Council** | Specific dates | "until [date]" | "until 31 October 2025" |
| **ECOSOC** | Session years | "[year] session" | "2020 session" |
| **Human Rights Council** | Session numbers | "its [Nth] session" | "its thirty-first session" |
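
Because the reference pattern depends on the adopting body, a pipeline could pick the body from the resolution symbol before choosing how to interpret the text. The sketch below is illustrative only: the prefix list is taken from the symbols cited in this analysis, and `adopting_body` is a hypothetical helper, not part of the existing codebase.

```python
from typing import Optional

# Symbol prefixes seen in the resolutions analyzed here (not an exhaustive list).
BODY_PREFIXES = [
    ("A/HRC/RES/", "Human Rights Council"),
    ("A/RES/", "General Assembly"),
    ("S/RES/", "Security Council"),
    ("E/RES/", "ECOSOC"),
]


def adopting_body(symbol: str) -> Optional[str]:
    """Return the adopting body for a resolution symbol, or None if unknown."""
    for prefix, body in BODY_PREFIXES:
        if symbol.startswith(prefix):
            return body
    return None


print(adopting_body("S/RES/2754 (2024)"))  # Security Council
```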
54+
55+
---
56+
57+
## 2. Extractable Data Fields (LLM Targets)
58+
59+
### 2.1 Frequency Information
60+
61+
**Finding:** Frequency is often **implicit** rather than explicit. LLMs must infer from context.
62+
63+
#### Explicit Frequency Patterns
64+
Some resolutions contain explicit frequency terms:
65+
- "annual report" / "annually"
66+
- "biennial" / "every two years"
67+
- "on a regular basis"
68+
69+
**Example from A/RES/75/233 (QCPR):**
70+
> "Reiterates its request to present **annual reports** to the Economic and Social Council"
71+
72+
#### Implicit Frequency (Target Session)
73+
More commonly, frequency must be inferred from target session:
74+
- "submit to the General Assembly at its **eightieth session**" (from 78th → 80th = biennial)
75+
- "report to the Human Rights Council, starting from its **thirty-first session**"
76+
77+
**Example from A/RES/78/70:**
78+
> "Requests the Secretary-General to submit to the General Assembly at its **eightieth session** a report on the implementation"
79+
80+
This implies biennial reporting (78th session resolution → 80th session report).
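
The session arithmetic can be sketched in a few lines of Python. This is a minimal illustration, assuming modern General Assembly symbols of the form `A/RES/<session>/<number>` and ordinals written as words; the helper names and the gap-to-frequency mapping are hypothetical, and ordinals such as "eleventh" or "hundredth" are deliberately not handled.

```python
import re
from typing import Optional

UNITS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
         "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
# "twenty" -> "twentieth", "eighty" -> "eightieth", and so on.
ROUND_TENS = {name[:-1] + "ieth": value for name, value in TENS.items()}
# Assumed mapping from session gap to frequency label.
FREQUENCY_BY_GAP = {1: "annual", 2: "biennial", 3: "triennial"}


def ordinal_to_int(word: str) -> Optional[int]:
    """Parse ordinals such as 'eightieth' or 'seventy-eighth' (1-9 and 20-99 only)."""
    word = word.strip().lower()
    if word in UNITS:
        return UNITS[word]
    if word in ROUND_TENS:
        return ROUND_TENS[word]
    tens, _, unit = word.partition("-")
    if tens in TENS and unit in UNITS:
        return TENS[tens] + UNITS[unit]
    return None


def session_from_symbol(symbol: str) -> Optional[int]:
    """Extract the adopting session from a symbol like 'A/RES/78/70'."""
    match = re.fullmatch(r"A/RES/(\d+)/\d+", symbol)
    return int(match.group(1)) if match else None


def infer_frequency(symbol: str, target_ordinal: str) -> Optional[str]:
    """Infer reporting frequency from the gap between adopting and target session."""
    adopted = session_from_symbol(symbol)
    target = ordinal_to_int(target_ordinal)
    if adopted is None or target is None:
        return None
    return FREQUENCY_BY_GAP.get(target - adopted)


print(infer_frequency("A/RES/78/70", "eightieth"))  # biennial
```

Returning `None` for anything outside the small mapping keeps unusual gaps visible for manual review instead of forcing them into a label.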
81+
82+
#### Mandate Extension Patterns (Security Council)
83+
SC resolutions often extend mandates by specific periods:
84+
> "Decides to extend the mandate of the Verification Mission until **31 October 2025**"
85+
86+
This implies the next report/review is due around that date.
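
For cross-checking an LLM-extracted target date, the "until [date]" phrase can also be parsed directly. The following is a rough sketch, assuming English month names and the default C locale; `mandate_end_date` is a hypothetical helper.

```python
import re
from datetime import date, datetime
from typing import Optional

# Matches phrases such as "until 31 October 2025".
UNTIL_DATE = re.compile(r"until\s+(\d{1,2} [A-Z][a-z]+ \d{4})")


def mandate_end_date(text: str) -> Optional[date]:
    """Return the first 'until <day> <Month> <year>' date found, if any."""
    match = UNTIL_DATE.search(text)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%d %B %Y").date()
    except ValueError:
        # The capitalized word was not a month name.
        return None


paragraph = ("Decides to extend the mandate of the Verification Mission "
             "until 31 October 2025")
print(mandate_end_date(paragraph))  # 2025-10-31
```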
87+
88+
### 2.2 Report Content/Scope Requirements
89+
90+
Resolutions often specify what reports should cover:
91+
92+
**Example from A/RES/78/70 (Mine Action):**
93+
> "report on the implementation of the present resolution and the progress made in mine action"
94+
95+
**Example from A/HRC/RES/28/6 (Albinism):**
96+
> Mandate includes:
97+
> - "(c) To promote and report on developments towards and the challenges and obstacles to the realization of the enjoyment of human rights"
98+
> - "(d) To gather, request, receive and exchange information... on violations of the rights"
99+
100+
### 2.3 Previous/Related Resolutions
101+
102+
The preamble typically lists all predecessor resolutions:
103+
104+
**Example from A/RES/79/150:**
105+
> "Recalling its resolutions 44/82 of 8 December 1989, 50/142 of 21 December 1995, 52/81 of 12 December 1997, 54/124 of 17 December 1999, 56/113 of 19 December 2001..."
106+
107+
This provides:
108+
- Complete resolution chain history
109+
- Dates showing reporting pattern over time
110+
- Related topic references
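
As an illustration of how the chain yields a reporting pattern, the "Recalling" clause quoted above can be turned into symbol/date pairs and year gaps. This sketch assumes the short references (for example `44/82`) are General Assembly resolutions that expand to `A/RES/44/82`; the function names are hypothetical, and in the proposed pipeline the LLM would produce this list, with code like this serving only as a cross-check.

```python
import re
from datetime import datetime
from typing import Dict, List

# Matches references such as "44/82 of 8 December 1989".
RES_REF = re.compile(r"(\d+/\d+) of (\d{1,2} [A-Z][a-z]+ \d{4})")


def parse_chain(preamble: str) -> List[Dict[str, str]]:
    """Extract referenced resolutions with ISO dates from a 'Recalling' clause."""
    chain = []
    for number, day in RES_REF.findall(preamble):
        iso = datetime.strptime(day, "%d %B %Y").date().isoformat()
        chain.append({"symbol": f"A/RES/{number}", "date": iso})
    return chain


def year_gaps(chain: List[Dict[str, str]]) -> List[int]:
    """Gaps in years between consecutive resolutions, in chronological order."""
    years = sorted(int(entry["date"][:4]) for entry in chain)
    return [later - earlier for earlier, later in zip(years, years[1:])]


clause = ("Recalling its resolutions 44/82 of 8 December 1989, 50/142 of "
          "21 December 1995, 52/81 of 12 December 1997, 54/124 of "
          "17 December 1999, 56/113 of 19 December 2001...")
chain = parse_chain(clause)
print(chain[0])          # {'symbol': 'A/RES/44/82', 'date': '1989-12-08'}
print(year_gaps(chain))  # [6, 2, 2, 2]
```

In this excerpt the gaps settle at two years from 1995 onward, which is the kind of pattern that supports a biennial inference.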
111+
112+
### 2.4 Responsible Entity
113+
114+
**Primary entity assignment:**
115+
> "**Requests the Secretary-General** to submit..."
116+
> "**Requests the Independent Expert** to..."
117+
> "**Invites the Special Rapporteur** to..."
118+
119+
**Supporting entities:**
120+
> "in collaboration with relevant stakeholders"
121+
> "working with the Special Jurisdiction for Peace"
122+
123+
### 2.5 Mandate Duration (for Special Procedures)
124+
125+
**Example from A/HRC/RES/28/6:**
126+
> "Decides to appoint, **for a period of three years**, an Independent Expert"
127+
128+
---
129+
130+
## 3. LLM Extraction Strategy
131+
132+
### Recommended Prompt Structure
133+
134+
```
Given the following UN resolution text, extract:

1. REPORTING MANDATE (if any):
   - Target session/date for next report
   - Implied frequency (annual/biennial/etc.)
   - Reporting entity (Secretary-General, Special Rapporteur, etc.)
   - Report topic/scope requirements

2. RESOLUTION CHAIN:
   - List of previous related resolutions mentioned
   - Pattern of sessions/years (for frequency inference)

3. MANDATE DETAILS (if establishing/extending a mandate):
   - Duration
   - Key tasks assigned
   - Review/renewal date

Return structured JSON with confidence scores.
```
154+
155+
### Expected Output Schema
156+
157+
```json
{
  "reporting_mandate": {
    "exists": true,
    "target_session": "80th session",
    "target_date": null,
    "implied_frequency": "biennial",
    "frequency_confidence": 0.85,
    "responsible_entity": "Secretary-General",
    "scope": "implementation of the present resolution and progress made in mine action",
    "explicit_frequency_mentioned": false
  },
  "resolution_chain": [
    {"symbol": "A/RES/76/74", "date": "2021-12-09"},
    {"symbol": "A/RES/74/80", "date": "2019-12-13"}
  ],
  "mandate_details": {
    "duration": null,
    "tasks": [],
    "review_date": null
  }
}
```
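
Since LLM output may not always match this schema, a light structural check before storage seems prudent. The sketch below is one possible approach using only the standard library; `validate_extraction` and the specific checks are assumptions for illustration, not an agreed contract.

```python
from datetime import date
from typing import Any, Dict, List


def validate_extraction(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an extraction payload (empty if none)."""
    mandate = payload.get("reporting_mandate")
    if not isinstance(mandate, dict) or not isinstance(mandate.get("exists"), bool):
        return ["reporting_mandate.exists must be a boolean"]

    problems = []
    if mandate["exists"]:
        confidence = mandate.get("frequency_confidence")
        is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
        if not is_number or not 0.0 <= confidence <= 1.0:
            problems.append("frequency_confidence must be a number in [0, 1]")

    for entry in payload.get("resolution_chain", []):
        try:
            date.fromisoformat(entry["date"])
        except (KeyError, TypeError, ValueError):
            problems.append(f"bad resolution_chain entry: {entry!r}")
    return problems


example = {
    "reporting_mandate": {"exists": True, "target_session": "80th session",
                          "implied_frequency": "biennial",
                          "frequency_confidence": 0.85},
    "resolution_chain": [{"symbol": "A/RES/76/74", "date": "2021-12-09"}],
}
print(validate_extraction(example))  # []
```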
180+
181+
### Validation Approach
182+
183+
1. **Cross-reference with historical data**: Compare LLM-inferred frequency with actual publication history
184+
2. **Session arithmetic**: Verify target session math (78th + 2 = 80th for biennial)
185+
3. **Entity normalization**: Map extracted entities to canonical department names
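
Entity normalization could start as a simple alias lookup. The mapping below is a hypothetical placeholder built from the entity phrases quoted in this document; the real canonical list would come from the manually maintained department list, which is not reproduced here.

```python
import re
from typing import Optional

# Hypothetical alias table; keys are lowercase with punctuation removed.
CANONICAL_ENTITIES = {
    "secretary general": "Secretary-General",
    "independent expert": "Independent Expert",
    "special rapporteur": "Special Rapporteur",
}


def normalize_entity(raw: str) -> Optional[str]:
    """Map an extracted entity string to a canonical name, or None if unmatched."""
    key = re.sub(r"[^a-z ]+", " ", raw.lower())
    key = re.sub(r"\s+", " ", key).strip()
    if key.startswith("the "):
        key = key[len("the "):]
    for alias, canonical in CANONICAL_ENTITIES.items():
        if key.startswith(alias):
            return canonical
    return None


print(normalize_entity("the Secretary-General"))  # Secretary-General
```

Unmatched strings return `None` so they can be queued for manual mapping instead of being guessed.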
186+
187+
---
188+
189+
## 4. Resolutions Without Report Mandates
190+
191+
Not all resolutions mandate reports. Some resolutions:
192+
- Establish procedures (E/RES/2020/5 on statistics coordination)
193+
- Make declarations
194+
- Request actions other than reporting
195+
196+
**LLM should indicate `reporting_mandate.exists: false` for these.**
197+
198+
---
199+
200+
## 5. Sample Analyses
201+
202+
### A/RES/78/70 (Assistance in Mine Action)
203+
204+
| Field | Extracted Value |
|-------|-----------------|
| Target session | 80th session |
| Implied frequency | Biennial |
| Entity | Secretary-General |
| Scope | "implementation of the present resolution and progress made in mine action" |
| Previous resolutions | 76/74 and many others since the 1990s |
211+
212+
### S/RES/2754 (2024) (Colombia Verification Mission)
213+
214+
| Field | Extracted Value |
|-------|-----------------|
| Target date | 31 October 2025 |
| Implied frequency | Annual (mandate extension) |
| Entity | Secretary-General (via Verification Mission) |
| Scope | Implementation of 2016 Final Peace Agreement |
| Mandate duration | Extended until Oct 2025 |
221+
222+
### A/HRC/RES/28/6 (Independent Expert on Albinism)
223+
224+
| Field | Extracted Value |
|-------|-----------------|
| Target session | 31st session (HRC) |
| Implied frequency | Annual (to HRC and GA) |
| Entity | Independent Expert on Albinism |
| Mandate duration | 3 years |
| Scope | 8 specific mandate areas listed |
231+
232+
---
233+
234+
## 6. Implementation Recommendations
235+
236+
### Phase 1: LLM Extraction Pipeline
237+
238+
1. **Fetch resolution fulltext** (already implemented)
239+
2. **Run LLM extraction** with structured prompt
240+
3. **Store extracted fields** in new table `resolution_mandates`:
241+
```sql
CREATE TABLE resolution_mandates (
    resolution_symbol TEXT PRIMARY KEY,
    target_session TEXT,
    target_date DATE,
    inferred_frequency TEXT,
    frequency_confidence FLOAT,
    responsible_entity TEXT,
    report_scope TEXT,
    mandate_duration TEXT,
    previous_resolutions TEXT[],
    extracted_at TIMESTAMPTZ,
    llm_model TEXT
);
```
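
To show how an extraction payload maps onto these columns, here is a small sketch using Python's built-in `sqlite3`. It is an approximation for local experimentation only: SQLite has no array or `TIMESTAMPTZ` type, so `previous_resolutions` is stored as JSON text and timestamps as ISO strings, and `store_extraction` and the model name are hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS resolution_mandates (
    resolution_symbol TEXT PRIMARY KEY,
    target_session TEXT,
    target_date TEXT,
    inferred_frequency TEXT,
    frequency_confidence REAL,
    responsible_entity TEXT,
    report_scope TEXT,
    mandate_duration TEXT,
    previous_resolutions TEXT,  -- JSON-encoded list
    extracted_at TEXT,
    llm_model TEXT
)
"""


def store_extraction(conn, symbol, payload, llm_model):
    """Insert or replace one row built from an extraction payload."""
    mandate = payload["reporting_mandate"]
    chain = [entry["symbol"] for entry in payload.get("resolution_chain", [])]
    conn.execute(
        "INSERT OR REPLACE INTO resolution_mandates VALUES (?,?,?,?,?,?,?,?,?,?,?)",
        (symbol, mandate.get("target_session"), mandate.get("target_date"),
         mandate.get("implied_frequency"), mandate.get("frequency_confidence"),
         mandate.get("responsible_entity"), mandate.get("scope"),
         payload.get("mandate_details", {}).get("duration"),
         json.dumps(chain), datetime.now(timezone.utc).isoformat(), llm_model),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
payload = {
    "reporting_mandate": {"exists": True, "target_session": "80th session",
                          "implied_frequency": "biennial",
                          "frequency_confidence": 0.85,
                          "responsible_entity": "Secretary-General"},
    "resolution_chain": [{"symbol": "A/RES/76/74", "date": "2021-12-09"}],
}
store_extraction(conn, "A/RES/78/70", payload, "example-model")
```

Using the resolution symbol as the primary key with `INSERT OR REPLACE` means re-running extraction overwrites the earlier row, which fits the incremental processing described later.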
256+
257+
### Phase 2: Report-Resolution Linking Enhancement
258+
259+
1. **Join reports to resolution mandates**
260+
2. **Compare inferred vs. actual frequency**
261+
3. **Flag discrepancies** for review
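
The comparison step might look like the sketch below, which derives an "actual" frequency from the years in which linked reports were published and flags a mismatch with the inferred value. The function names, the use of the median gap, and the sample years are all illustrative assumptions.

```python
from statistics import median
from typing import List, Optional

# Assumed mapping from typical gap (in years) to frequency label.
FREQUENCY_BY_GAP = {1: "annual", 2: "biennial", 3: "triennial"}


def actual_frequency(publication_years: List[int]) -> Optional[str]:
    """Estimate frequency from the median gap between publication years."""
    years = sorted(set(publication_years))
    if len(years) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(years, years[1:])]
    return FREQUENCY_BY_GAP.get(median(gaps))


def needs_review(inferred: Optional[str], publication_years: List[int]) -> bool:
    """True when both frequencies are known and they disagree."""
    actual = actual_frequency(publication_years)
    return inferred is not None and actual is not None and inferred != actual


# Hypothetical publication years, not real report data.
print(needs_review("biennial", [2019, 2021, 2023]))        # False
print(needs_review("biennial", [2020, 2021, 2022, 2023]))  # True
```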
262+
263+
### Phase 3: Survey Integration
264+
265+
Use the extracted data to inform the survey:
266+
- "This report is mandated to cover: [scope]"
267+
- "Current frequency: [actual] vs. Mandated: [inferred]"
268+
- "Mandate expires: [date]"
269+
270+
---
271+
272+
## 7. Cost-Benefit Analysis
273+
274+
### LLM vs. Regex
275+
276+
| Aspect | Regex | LLM |
|--------|-------|-----|
| **Development time** | High (many patterns) | Low (one prompt) |
| **Maintenance** | High (new patterns) | Low (self-adapting) |
| **Accuracy** | Medium (misses context) | High (understands intent) |
| **Cost per resolution** | ~$0 | ~$0.01-0.05 |
| **Handles edge cases** | Poor | Good |
| **Confidence scoring** | No | Yes (self-reported) |
284+
285+
**Recommendation:** Use LLM extraction with validation against historical data.
286+
287+
### Estimated Costs
288+
289+
- ~5,000 resolutions to process
290+
- ~$0.03 per resolution (GPT-4o mini or Claude Haiku)
291+
- **Total: ~$150 one-time extraction**
292+
- Can be run incrementally as new resolutions are added
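
The estimate is simple arithmetic, shown here so the total can be recomputed if the resolution count or per-resolution price changes. The low and high figures reuse the ~$0.01-0.05 range from the comparison table; `extraction_cost` is a hypothetical helper.

```python
def extraction_cost(n_resolutions: int, cost_per_resolution: float) -> float:
    """Total one-time extraction cost in dollars, rounded to cents."""
    return round(n_resolutions * cost_per_resolution, 2)


print(extraction_cost(5000, 0.03))  # 150.0
print(extraction_cost(5000, 0.01), extraction_cost(5000, 0.05))  # 50.0 250.0
```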
293+
294+
---
295+
296+
## Appendix: Resolution Examples
297+
298+
### Example 1: Explicit Reporting Request
299+
300+
From **A/RES/78/70**:
301+
```
Requests the Secretary-General to submit to the General Assembly at its
eightieth session a report on the implementation of the present resolution
and the progress made in mine action, and to include in that report an
appendix containing information provided by Member States.
```
307+
308+
### Example 2: Mandate Establishment
309+
310+
From **A/HRC/RES/28/6**:
311+
```
Decides to appoint, for a period of three years, an Independent Expert on the
enjoyment of human rights by persons with albinism, with the following mandate:
(a) To engage in dialogue and consult with States...
(b) To identify, exchange and promote good practices...
(c) To promote and report on developments...
...
(h) To report to the Human Rights Council, starting from its thirty-first session,
and to the General Assembly;
```
320+
321+
### Example 3: Mandate Extension (Security Council)
322+
323+
From **S/RES/2754 (2024)**:
324+
```
1. Decides to extend the mandate of the Verification Mission until 31 October 2025;
2. Expresses its willingness to work with the Government of Colombia on
   the further extension of the mandate...
```
329+
330+
### Example 4: No Report Mandate
331+
332+
From **E/RES/2020/5** (Statistics Coordination):
333+
- Contains procedural requests to coordinate statistical programmes
334+
- No reporting mandate to the Secretary-General
335+
- LLM should return `reporting_mandate.exists: false`