Commit a4ddd14

fix: change order of JSON Schema to search mapper transformations (#32)
* fix: change order of JSON Schema to search mapper transformations

  In the JSON Schema to search mapper, the suppression flags need to be addressed first, otherwise the jsonref.replace_refs function may remove them.

  Signed-off-by: Cesar Berrospi Ramis <[email protected]>

* build: update dependencies

  Updating pydantic from 2.8.2 to 2.9.2 triggers a change in JSON Schema from models: the lists with only 1 element get flattened.

  Signed-off-by: Cesar Berrospi Ramis <[email protected]>

* chore: improve verbose in JSON Schema to search mapper test

  Signed-off-by: Cesar Berrospi Ramis <[email protected]>

---------

Signed-off-by: Cesar Berrospi Ramis <[email protected]>
1 parent b49e93e commit a4ddd14

7 files changed: +597 -498 lines changed

docling_core/search/json_schema_to_search_mapper.py

Lines changed: 6 additions & 4 deletions
@@ -8,7 +8,7 @@
 from copy import deepcopy
 from typing import Any, Optional, Pattern, Tuple, TypedDict
 
-from jsonref import JsonRef
+from jsonref import replace_refs
 
 
 class SearchIndexDefinition(TypedDict):
@@ -95,7 +95,11 @@ def get_index_definition(self, schema: dict) -> SearchIndexDefinition:
         which define the fields, their data types, and other specifications to index
         JSON documents into a Lucene index.
         """
-        mapping = JsonRef.replace_refs(schema)
+        mapping = deepcopy(schema)
+
+        mapping = self._suppress(mapping, self._suppress_key)
+
+        mapping = replace_refs(mapping)
 
         mapping = self._merge_unions(mapping)
 
@@ -105,8 +109,6 @@ def get_index_definition(self, schema: dict) -> SearchIndexDefinition:
 
         mapping = self._remove_keys(mapping, self._rm_keys)
 
-        mapping = self._suppress(mapping, self._suppress_key)
-
         mapping = self._translate_keys_re(mapping)
 
         mapping = self._clean(mapping)
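
Why the order of the two steps matters can be sketched with plain dictionaries. The snippet below is a minimal illustration, not the project's code: `resolve_refs` is a hypothetical, simplified stand-in for reference resolution that replaces any object containing `$ref` with its target (so keys sitting next to `$ref` are dropped, which is the behavior the commit message attributes to `jsonref.replace_refs`), and `suppress` is a hypothetical step that empties any object flagged with `x-es-suppress`. The real `_suppress` method may behave differently.

```python
from copy import deepcopy

SUPPRESS_KEY = "x-es-suppress"


def resolve_refs(node, root):
    """Simplified stand-in for reference resolution (hypothetical helper).

    An object containing "$ref" is replaced wholesale by its target,
    so sibling keys next to "$ref" are lost.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            target = root
            for part in node["$ref"].lstrip("#/").split("/"):
                target = target[part]
            return resolve_refs(target, root)
        return {key: resolve_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, root) for item in node]
    return node


def suppress(node):
    """Hypothetical suppression step: empty any object flagged with SUPPRESS_KEY."""
    if isinstance(node, dict):
        if node.get(SUPPRESS_KEY) is True:
            return {}
        return {key: suppress(value) for key, value in node.items()}
    if isinstance(node, list):
        return [suppress(item) for item in node]
    return node


defs = {"BoundingBoxContainer": {"type": "object", "properties": {"min": {"type": "number"}}}}

# Flattened shape: "$ref" and the flag are siblings.
flat = {
    "$defs": defs,
    "properties": {
        "bounding_box": {"$ref": "#/$defs/BoundingBoxContainer", SUPPRESS_KEY: True}
    },
}
# Wrapped shape: the "$ref" sits inside a one-element "allOf" list.
wrapped = deepcopy(flat)
wrapped["properties"]["bounding_box"] = {
    "allOf": [{"$ref": "#/$defs/BoundingBoxContainer"}],
    SUPPRESS_KEY: True,
}

# Resolve first, then suppress: the flag on the flattened shape is already gone.
resolved_first = suppress(resolve_refs(flat, flat))
# Suppress first, then resolve: the flag is still present when suppress runs.
suppressed = suppress(flat)
suppressed_first = resolve_refs(suppressed, suppressed)
# With the wrapped shape, the flag survives resolution, so either order works.
wrapped_resolved_first = suppress(resolve_refs(wrapped, wrapped))

print(resolved_first["properties"]["bounding_box"])
print(suppressed_first["properties"]["bounding_box"])
print(wrapped_resolved_first["properties"]["bounding_box"])
```

Under these assumptions, only the flattened shape combined with the resolve-first order leaves the field unsuppressed, which matches the combination of the pydantic upgrade and the original transformation order described in the commit message.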

docs/Document.json

Lines changed: 1 addition & 5 deletions
@@ -323,11 +323,7 @@
   "type": "string"
 },
 "bounding_box": {
-  "allOf": [
-    {
-      "$ref": "#/$defs/BoundingBoxContainer"
-    }
-  ],
+  "$ref": "#/$defs/BoundingBoxContainer",
   "x-es-suppress": true
 },
 "prov": {

docs/Document.md

Lines changed: 1 addition & 1 deletion
@@ -6052,7 +6052,7 @@ Must be one of:
 | **Type** | `object` |
 | **Required** | Yes |
 | **Additional properties** | [[Any type: allowed]](# "Additional Properties of any type are allowed.") |
-| **Defined in** | |
+| **Defined in** | #/$defs/BoundingBoxContainer |
 
 **Description:** Bounding box container.
 
docs/Generic.json

Lines changed: 1 addition & 5 deletions
@@ -58,11 +58,7 @@
   "x-es-type": "text"
 },
 "file-info": {
-  "allOf": [
-    {
-      "$ref": "#/$defs/FileInfoObject"
-    }
-  ],
+  "$ref": "#/$defs/FileInfoObject",
   "description": "Minimal identification information of the document within a collection.",
   "title": "Document information"
 }

docs/Generic.md

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@
 | **Type** | `object` |
 | **Required** | Yes |
 | **Additional properties** | [[Any type: allowed]](# "Additional Properties of any type are allowed.") |
-| **Defined in** | |
+| **Defined in** | #/$defs/FileInfoObject |
 
 **Description:** Minimal identification information of the document within a collection.
 
poetry.lock

Lines changed: 579 additions & 476 deletions
Some generated files are not rendered by default.

test/test_json_schema_to_search_mapper.py

Lines changed: 8 additions & 6 deletions
@@ -51,9 +51,10 @@ def test_json_schema_to_search_mapper_0():
     index_ref = _load(filename)
 
     diff = jsondiff.diff(index_ref, index_def)
-    print(json.dumps(index_def, indent=2))
-    print(diff)
-    assert index_def == index_ref
+    # print(json.dumps(index_def, indent=2))
+    assert (
+        index_def == index_ref
+    ), f"Error in search mappings of ExportedCCSDocument. Difference:\n{json.dumps(diff, indent=2)}"
 
 
 def test_json_schema_to_search_mapper_1():
@@ -99,6 +100,7 @@ def test_json_schema_to_search_mapper_1():
     index_ref = _load(filename)
 
     diff = jsondiff.diff(index_ref, index_def)
-    # print(json.dumps(index_def,indent=2))
-    print(diff)
-    assert index_def == index_ref
+    # print(json.dumps(index_def, indent=2))
+    assert (
+        index_def == index_ref
+    ), f"Error in search mappings of Record. Difference:\n{json.dumps(diff, indent=2)}"
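
The test change above moves the difference from a `print` call into the assertion message, so it appears only when the comparison fails. A rough standard-library sketch of the same pattern follows. It uses `difflib` as a stand-in for `jsondiff`, and `describe_difference` together with the sample mappings are hypothetical names and data made up for illustration.

```python
import difflib
import json


def describe_difference(expected: dict, actual: dict) -> str:
    """Return a unified diff of two JSON-serializable dicts (hypothetical helper)."""
    expected_lines = json.dumps(expected, indent=2, sort_keys=True).splitlines()
    actual_lines = json.dumps(actual, indent=2, sort_keys=True).splitlines()
    return "\n".join(
        difflib.unified_diff(
            expected_lines, actual_lines, "expected", "actual", lineterm=""
        )
    )


# Made-up reference and generated mappings that differ in one field type.
index_ref = {"properties": {"name": {"type": "keyword"}}}
index_def = {"properties": {"name": {"type": "text"}}}

message = describe_difference(index_ref, index_def)
try:
    assert index_def == index_ref, f"Error in search mappings. Difference:\n{message}"
except AssertionError as error:
    # A test runner would report this message next to the failing test.
    print(error)
```

Keeping the diff in the assertion message means passing runs stay quiet while a failure reports exactly which keys diverged.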
