Commit 84636a0

Improve entity extraction prompts for relationship array alignment
1 parent b0e973f commit 84636a0

2 files changed: 54 additions & 16 deletions

libs/langchain-mongodb/langchain_mongodb/graphrag/example_templates.py

Lines changed: 21 additions & 15 deletions
@@ -25,7 +25,7 @@
 "startDate": ["2018-01-01"]
 }},
 "relationships": {{
-"targets": ["Jasbinder Kaur", "Jarnail Singh"],
+"target_ids": ["Jasbinder Kaur", "Jarnail Singh"],
 "types": ["Friend", "Friend"],
 "attributes": [
 {{ "since": ["2019-05-01"] }},
@@ -37,7 +37,7 @@
 "_id": "Jarnail Singh",
 "type": "Person",
 "relationships": {{
-"targets": ["Alice Palace"],
+"target_ids": ["Alice Palace"],
 "types": ["Friend"],
 "attributes": [{{ "since": ["2019-05-01"] }}]
 }}
@@ -46,7 +46,7 @@
 "_id": "Jasbinder Kaur",
 "type": "Person",
 "relationships": {{
-"targets": ["Alice Palace"],
+"target_ids": ["Alice Palace"],
 "types": ["Friend"],
 "attributes": [{{ "since": ["2015-05-01"], "frequency": ["weekly"] }}]
 }}
@@ -71,8 +71,9 @@
 "location": ["San Francisco"]
 }},
 "relationships": {{
-"targets": ["Elon Musk", "Sam Altman"],
-"types": ["Speaker", "Speaker"]
+"target_ids": ["Elon Musk", "Sam Altman"],
+"types": ["Speaker", "Speaker"],
+"attributes": [{{}}, {{}}]
 }}
 }},
 {{ "_id": "Elon Musk", "type": "Person" }},
@@ -92,8 +93,9 @@
 "_id": "Quantum Computing",
 "type": "Concept",
 "relationships": {{
-"targets": ["Quantum Mechanics"],
-"types": ["Based On"]
+"target_ids": ["Quantum Mechanics"],
+"types": ["Based On"],
+"attributes": [{{}}]
 }}
 }},
 {{ "_id": "Quantum Mechanics", "type": "Concept" }}
@@ -114,8 +116,9 @@
 "type": "Event",
 "attributes": {{ "date": ["2023-03-01"] }},
 "relationships": {{
-"targets": ["NASA"],
-"types": ["Managed By"]
+"target_ids": ["NASA"],
+"types": ["Managed By"],
+"attributes": [{{}}]
 }}
 }},
 {{
@@ -126,8 +129,9 @@
 "_id": "Bill Nelson",
 "type": "Person",
 "relationships": {{
-"targets": ["Artemis II Mission"],
-"types": ["Praised By"]
+"target_ids": ["Artemis II Mission"],
+"types": ["Praised By"],
+"attributes": [{{}}]
 }}
 }}
 ]
@@ -146,16 +150,18 @@
 "_id": "Rust",
 "type": "Programming Language",
 "relationships": {{
-"targets": ["Memory Safety"],
-"types": ["Ensures"]
+"target_ids": ["Memory Safety"],
+"types": ["Ensures"],
+"attributes": [{{}}]
 }}
 }},
 {{
 "_id": "Memory Safety",
 "type": "Concept",
 "relationships": {{
-"targets": ["Ownership Model"],
-"types": ["Uses"]
+"target_ids": ["Ownership Model"],
+"types": ["Uses"],
+"attributes": [{{}}]
 }}
 }},
 {{ "_id": "Ownership Model", "type": "Concept" }}
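
Reviewer note: the doubled braces in these examples look like `str.format`-style escapes, so `{{}}` should reach the model as an empty JSON object `{}`. The sketch below assumes the example strings are rendered with Python's `str.format` (or an equivalent formatter); the `template` string is a hypothetical, trimmed-down stand-in for one of the examples, not the file's actual contents.

```python
import json

# Hypothetical excerpt in the style of the examples above (not the real file).
template = """{{
  "_id": "Rust",
  "type": "Programming Language",
  "relationships": {{
    "target_ids": ["Memory Safety"],
    "types": ["Ensures"],
    "attributes": [{{}}]
  }}
}}"""

# str.format collapses "{{" to "{" and "}}" to "}".
rendered = template.format()

# The rendered text is plain JSON, so it can be parsed and inspected.
entity = json.loads(rendered)
rels = entity["relationships"]

print(rels["attributes"])
print(len(rels["target_ids"]), len(rels["types"]), len(rels["attributes"]))
```

After rendering, the three relationship arrays each hold one element, which is the alignment this commit is aiming for.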

libs/langchain-mongodb/langchain_mongodb/graphrag/prompts.py

Lines changed: 33 additions & 1 deletion
@@ -46,6 +46,37 @@
 Instead of using specific and momentary types such as 'worked_at', use more general and timeless relationship types
 like 'employee'. Add details as attributes. Make sure to use general and timeless relationship types!
 
+### CRITICAL: Array Length Alignment
+The relationships object contains three arrays: `target_ids`, `types`, and `attributes`.
+**These three arrays MUST have EXACTLY the same length.**
+- Each position (index) in these arrays describes ONE complete relationship.
+- Position 0 in `target_ids`, `types`, and `attributes` together describe the first relationship.
+- Position 1 in `target_ids`, `types`, and `attributes` together describe the second relationship.
+- And so on...
+
+If a relationship has no attributes, you MUST still include an empty object `{{}}` in the `attributes` array at that position.
+
+Example of CORRECT alignment:
+```json
+"relationships": {{
+"target_ids": ["Entity A", "Entity B"],
+"types": ["partners", "supplier"],
+"attributes": [
+{{"since": ["2020"]}},
+{{}}
+]
+}}
+```
+
+Example of INCORRECT (DO NOT DO THIS):
+```json
+"relationships": {{
+"target_ids": ["Entity A", "Entity B"],
+"types": ["partners"],
+"attributes": [{{"since": ["2020"]}}]
+}}
+```
+
 **Allowed Relationship Types**:
 - Extract ONLY relationships whose `type` matches one of the following: {allowed_relationship_types}.
 - If this list is empty, ANY relationship type is permitted.
@@ -64,7 +95,8 @@
 1. Validate that all extracted entities have an `_id` and `type`.
 2. Validate that all `type` values are in {allowed_entity_types}.
 3. Validate that all relationships use keys in {allowed_relationship_types}.
-4. Exclude any entities or relationships failing validation.
+4. **CRITICAL**: For each entity with relationships, verify that `target_ids`, `types`, and `attributes` arrays have EXACTLY the same length.
+5. Exclude any entities or relationships failing validation.
 
 ## Output Schema
 Output a valid JSON document with a single top-level key, `entities`, as an array of objects.