Commit b78ecfb

- Treat ext: keys as valid identifiers in OSW normalizer filters
- Add/extend unit tests for ext-based filter classification
- Refresh TESTING_OVERVIEW totals and per-suite counts to include async tests
- Note ext-based classification coverage in test focus summary
1 parent 9ceba6c commit b78ecfb

9 files changed

+533
-77
lines changed

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -6,9 +6,10 @@
 - Add unit coverage for OSW 0.3 natural feature handling.
 - Expand OSM normalizer coverage and robustness: preserve non-compliant/unknown tags as `ext:*`, canonicalize JSON ext values, normalize elevation from 3D geometries, tolerate string IDs, and harden edge-case handling with tests.
 - OSW→OSM improvements: promote invalid/unknown fields (incl. dict/list) to `ext:*`, set `version="1"` for visible elements, derive `ext:elevation` from Z coords, and keep invalid incline/climb values under `ext:` instead of dropping them.
-- OSM→OSW improvements: verify OSW 0.3 `$schema` headers, export tree/tree_row/wood features, and add multi-exterior handling tests for zones/polygons plus line parsing guards.
+- OSM→OSW improvements: verify OSW 0.3 `$schema` headers, export tree/tree_row/wood features, treat `ext:` tags as valid identifiers in OSW normalizers for filtering, and add multi-exterior handling tests for zones/polygons plus line parsing guards.
 - Added extensive unit tests for osm/osw normalizers and graph serializers (filters, geojson import/export, zebra crossing mapping, kerb/foot validators, invalid line/polygon/zone branches, ref normalization, etc.).
 - Added fixtures for vegetation and 3D elevation scenarios (`tree-test.xml`) and custom-property round-trip checks.
+- Implemented collision-free ID handling: sequential remapping of nodes/ways/relations on OSW→OSM export with reference rewrites, plus tests confirming sequential IDs and schema/tag updates.

 ### 0.2.13
 - Added default `version="1"` attribute to all nodes, ways, and relations generated during OSW→OSM conversion.
Lines changed: 6 additions & 6 deletions
@@ -1,6 +1,6 @@
-Below is a breakdown of all test scenarios (`def test_`) across the suite and the main areas they cover.
+Below is a breakdown of all test scenarios (`def test_` and `async def test_`) across the suite and the main areas they cover.

-Total scenarios: 202
+Total scenarios: 211
 Code coverage: 98% (`coverage report`)

 Test module | Scenarios | Focus / scenarios covered (key checks)
@@ -11,12 +11,12 @@ tests/unit_tests/helpers/test_response.py | 6 | Response dataclass defaults, att
 tests/unit_tests/test_formatter.py | 7 | Formatter success/error paths; converter delegation; cleanup lifecycle; exception surfacing; mocking converter calls; workdir creation/idempotence; cleanup of existing vs missing files
 tests/unit_tests/test_osm2osw/test_osm2osw.py | 15 | OSM→OSW: file counts/types; width/incline validation/cleaning; `$schema`=0.3 headers; ext-tag passthrough; point geometry enforcement; duplicate vs unique IDs; schema header verification; bad-input path; type assertions; tree/tree_row/wood export coverage; invalid-node-tag skip logic
 tests/unit_tests/test_osm_compliance/test_osm_compliance.py | 2 | OSW validation of OSW→OSM→OSW roundtrip; incline tag preservation using official validator
-tests/unit_tests/test_osw2osm/test_osw2osm.py | 14 | OSW→OSM: version attrs (visible/non-visible); incline/climb handling; invalid incline to `ext:`; custom dict/list props to `ext:`; 3D elevation to `ext:elevation`; missing-zip error; invalid width to `ext:`; XML output type assertions; ext tags on invalid properties; climb suppression when incline present
+tests/unit_tests/test_osw2osm/test_osw2osm.py | 16 | OSW→OSM: version attrs (visible/non-visible); incline/climb handling; invalid incline to `ext:`; custom dict/list props to `ext:`; 3D elevation to `ext:elevation`; missing-zip error; invalid width to `ext:`; XML output type assertions; ext tags on invalid properties; climb suppression when incline present; sequential ID remap and ref rewrite assertions
 tests/unit_tests/test_roundtrip/test_roundtrip.py | 2 | Full roundtrip (OSW zip → OSM XML → OSW → OSM) smoke checks, ID preservation, schema continuity, ext:* tag parity for both OSW-zip and raw-OSM starting points
 tests/unit_tests/test_serializer/test_osm_graph.py | 37 | Parsers (ways/nodes/points/lines/zones/polygons) incl. invalid locations & multi-exteriors; tagged node parsing; simplify/construct geometries; to_geojson ID/export rules; to_undirected variants; filter_edges node-copy; from_geojson import/export; point ID prefix trimming; progress callbacks; invalid location skip logic; duplicate-id protection; empty-graph exports
 tests/unit_tests/test_serializer/test_osm_normalizer.py | 19 | OSM normalizer tag filtering: datatype coercion/NaN removal; incline/climb/foot handling; ext tag retention; zebra crossing mapping; kerb/foot validators; width/incline edge cases; implied foot removal; `_id` sourcing when tags absent/empty
 tests/unit_tests/test_serializer/test_osm_osm_normalizer.py | 11 | OSM normalizer edge cases: `_stash_ext` JSON canonicalization/errors/unknown keys; dict/list promotion to `ext:`; zone area tagging; elevation extraction fallbacks; ID normalization across nodes/ways/relations and refs/nodeRefs/refs; ref write-back branches; non-numeric ID tolerance; canonical `ext:` serialization
-tests/unit_tests/test_serializer/test_osw_normalizer.py | 56 | OSW normalizer: filters/normalizers for all feature types; tree/tree_row/wood support; leaf_cycle/leaf_type validation; crossing markings (incl. zebra inference); kerb/foot/surface validators; invalid branches raising; keep_key/default behaviors; width/incline/climb handling; ext tag passthrough; literal keep-key handling; natural-* guards; tactile paving/surface normalization
+tests/unit_tests/test_serializer/test_osw_normalizer.py | 63 | OSW normalizer: filters/normalizers for all feature types; tree/tree_row/wood support; leaf_cycle/leaf_type validation; crossing markings (incl. zebra inference); kerb/foot/surface validators; invalid branches raising; keep_key/default behaviors; width/incline/climb handling; ext tag passthrough and ext-based filter classification; literal keep-key handling; natural-* guards; tactile paving/surface normalization

 Detailed scenario highlights (what we explicitly exercise)
 - tests/unit_tests/helpers/test_osm.py: async counters on `wa.microsoft.osm.pbf` confirm expected counts for ways/points/nodes; `get_osm_graph` builds an `OSMGraph` then `simplify_og`/`construct_geometries` run without mutating return types; way/node/point/zone/polygon filters accept tagged inputs and return booleans.
@@ -25,11 +25,11 @@ Detailed scenario highlights (what we explicitly exercise)
 - tests/unit_tests/test_formatter.py: `Formatter.osm2osw` happy/failed paths surface `Response.status`; workdir is created idempotently whether or not it exists; cleanup removes tracked files and ignores missing ones; `Formatter.osw2osm` delegates to `OSW2OSM.convert` exactly once (mocked) and propagates its response.
 - tests/unit_tests/test_osm2osw/test_osm2osw.py: end-to-end conversion yields six outputs (nodes/points/edges/zones/polygons/lines) with string paths; GeoJSONs contain non-empty geometries with string `_id`s and no duplicates; width tags are numeric, incline tags remain numeric on edges, and invalid node tags lead to no files; `$schema` header equals 0.3 and carries through tree/tree_row/wood fixtures; ext:* properties are preserved; file naming matches expected entity types; failure path returns `status=False`.
 - tests/unit_tests/test_osm_compliance/test_osm_compliance.py: runs OSW→OSM→OSW through `python_osw_validation` to assert zero validation issues; checks that incline tags survive the full round-trip.
-- tests/unit_tests/test_osw2osm/test_osw2osm.py: converts OSW ZIPs to a single OSM XML, ensuring width tags are present and numeric; error path when ZIP is missing; incline tags are present but climb tags are stripped or shifted to `ext:incline` for invalid values; custom/non-compliant properties (dict/list) are promoted to ext:* JSON; 3D node coordinates emit `ext:elevation`; `_ensure_version_attribute` backfills version on visible elements; all generated paths are strings and end with `.xml`.
+- tests/unit_tests/test_osw2osm/test_osw2osm.py: converts OSW ZIPs to a single OSM XML, ensuring width tags are present and numeric; error path when ZIP is missing; incline tags are present but climb tags are stripped or shifted to `ext:incline` for invalid values; custom/non-compliant properties (dict/list) are promoted to ext:* JSON; 3D node coordinates emit `ext:elevation`; `_ensure_version_attribute` backfills version on visible elements; sequential ID remap rewrites ids/refs; all generated paths are strings and end with `.xml`.
 - tests/unit_tests/test_roundtrip/test_roundtrip.py: two smoke flows keep ext:* tags intact—(1) OSW ZIP → OSM XML → OSW → OSM, (2) raw OSM XML → OSW → OSM—comparing node/way ext:* sets for equality.
 - tests/unit_tests/test_serializer/test_osm_graph.py: graph metadata (directed/multigraph) and undirected copies retain node attrs; parsers handle missing nodes/invalid coordinates and multi-exterior polygons/zones; tagged-node parser only ingests OSW nodes; simplify/construct geometries rebuild missing geometries for points/lines with node refs; `to_geojson` preserves IDs, trims point prefixes, handles empty graphs, exports progress callbacks; `from_geojson` ingests features and respects mapping hooks and filter functions.
 - tests/unit_tests/test_serializer/test_osm_normalizer.py: width/incline/climb coercion removes NaN/invalid strings, retains valid ints/floats; climb removal rules when incline present, except steps keep climb/down; ext_osm_id assignment prefers tags but falls back to internal IDs and skips empty values; implied foot tags dropped where inappropriate.
 - tests/unit_tests/test_serializer/test_osm_osm_normalizer.py: `_stash_ext` normalizes JSON strings, skips None, and serializes unknown structures; filter_tags moves unknown keys or invalid datatypes to ext:* and adds area tags for zones; elevation extraction rejects NaN, falling back through z/ele tags; ID normalization covers negative IDs and writes back refs/nodeRefs/refs variants.
 - tests/unit_tests/test_serializer/test_osw_normalizer.py: validators classify sidewalks/crossings/traffic islands/stairs/living streets/powerpoles/trees/tree_rows/wood; invalid geometries raise where expected; stair normalization keeps/drops climb per validity and defaults highway/foot; width/incline/climb handling mirrors OSM normalizer; crossing markings inferred from zebra tags; keep_key/default behaviors honored; tactile paving/surface/leaf_cycle/leaf_type/kerb/foot rules validated; natural-* checks drop invalid feature types.

-Method: counted functions matching `def test_` under `tests/` and grouped scenario themes per file.
+Method: counted functions matching `def test_` and `async def test_` under `tests/` and grouped scenario themes per file.

docs/id_remapping.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+# ID Remapping (OSW → OSM)
+
+This document explains how IDs are generated and remapped when converting OSW (GeoJSON) to OSM XML.
+
+## Goal
+Produce collision-free OSM XML where all node/way/relation IDs are sequential per type (starting at 1) and all references are updated accordingly, while keeping each element's `_id` tag in sync with its new ID.
+
+## Process
+1. **Initial IDs from OSW content**
+   - Nodes/points/lines/zones/polygons parsed from OSW GeoJSON enter the OSM graph with their OSW `_id`/references.
+   - Extension/unknown properties are preserved under `ext:*`; elevation from 3D coordinates becomes `ext:elevation`.
+
+2. **OSW→OSM export**
+   - `OSW2OSM.convert()` runs the normal ogr2osm pipeline, writing an OSM XML file.
+   - `_ensure_version_attribute` ensures all elements have `version="1"` (visible elements get it if missing).
+
+3. **Sequential remap**
+   - After the XML is written, `_remap_ids_to_sequential` rewrites IDs and references:
+     - Nodes are renumbered `1..N` in document order; their `_id` tags are updated to the new ID.
+     - Ways are renumbered `1..M`; their `_id` tags are updated. All `<nd ref>` values are rewritten to the new node IDs.
+     - Relations are renumbered `1..K`; their `_id` tags are updated. All `<member ref>` values are rewritten based on member `type` (node/way/relation) using the new ID maps.
+   - The remap runs in-place on the XML so the final output has consistent, collision-free IDs and references.
+
+4. **What remains**
+   - Original OSW identifiers survive in other tags (e.g., `ext:osm_id` if provided, other `ext:*`), but `_id` reflects the new sequential OSM ID.
+
+## Notes / rationale
+- The remap ensures deterministic, collision-free IDs regardless of source naming schemes (e.g., OSW prefixes, extension data).
+- Reference integrity is maintained by rewriting all node refs in ways and member refs in relations.
+- Version attributes are normalized before remapping to satisfy OSM validators expecting `version`.
+
+## Minimal example (what the remap does)
+Input XML (simplified):
+```xml
+<osm>
+  <node id="10" lat="0" lon="0"><tag k="_id" v="10"/></node>
+  <node id="20" lat="1" lon="1"><tag k="_id" v="20"/></node>
+  <way id="30">
+    <nd ref="10"/><nd ref="20"/>
+    <tag k="_id" v="30"/>
+  </way>
+  <relation id="40">
+    <member type="node" ref="20"/>
+    <member type="way" ref="30"/>
+    <tag k="_id" v="40"/>
+  </relation>
+</osm>
+```
+
+After `_remap_ids_to_sequential`:
+```xml
+<osm>
+  <node id="1" ...><tag k="_id" v="1"/></node>
+  <node id="2" ...><tag k="_id" v="2"/></node>
+  <way id="1">
+    <nd ref="1"/><nd ref="2"/>
+    <tag k="_id" v="1"/>
+  </way>
+  <relation id="1">
+    <member type="node" ref="2"/>
+    <member type="way" ref="1"/>
+    <tag k="_id" v="1"/>
+  </relation>
+</osm>
+```
+All IDs now start at 1 per type, and every reference points to the remapped IDs.
+
+## Relevant code
+- Entry point: `OSW2OSM.convert()` (`src/osm_osw_reformatter/osw2osm/osw2osm.py`)
+  - Calls `_ensure_version_attribute`
+  - Calls `_remap_ids_to_sequential`
+- Remap implementation: `_remap_ids_to_sequential` in `osw2osm.py` rewrites element IDs and their refs in-place and updates `_id` tags.
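
Reviewer note: the minimal example in `docs/id_remapping.md` can be checked without running the full converter. The sketch below is a standalone approximation of the remap described in the document, written against an in-memory string rather than a file path. `remap_ids` and `SAMPLE` are hypothetical names introduced here for illustration and are not part of the package.

```python
import xml.etree.ElementTree as ET

# The "Input XML (simplified)" sample from the document.
SAMPLE = """
<osm>
  <node id="10" lat="0" lon="0"><tag k="_id" v="10"/></node>
  <node id="20" lat="1" lon="1"><tag k="_id" v="20"/></node>
  <way id="30">
    <nd ref="10"/><nd ref="20"/>
    <tag k="_id" v="30"/>
  </way>
  <relation id="40">
    <member type="node" ref="20"/>
    <member type="way" ref="30"/>
    <tag k="_id" v="40"/>
  </relation>
</osm>
"""


def remap_ids(xml_text: str) -> ET.Element:
    """Renumber nodes, ways and relations from 1 per type and rewrite refs."""
    root = ET.fromstring(xml_text)
    maps = {}
    for kind in ("node", "way", "relation"):
        mapping = {}
        for idx, elem in enumerate(root.findall(f".//{kind}"), start=1):
            old_id = elem.get("id")
            if old_id is None:
                continue
            mapping[old_id] = str(idx)
            elem.set("id", str(idx))
            # Keep the element's own _id tag in step with the new ID.
            for tag in elem.findall("./tag[@k='_id']"):
                tag.set("v", str(idx))
        maps[kind] = mapping

    # Way node references always point at nodes.
    for nd in root.findall(".//way/nd"):
        ref = nd.get("ref")
        if ref in maps["node"]:
            nd.set("ref", maps["node"][ref])

    # Relation members pick the map that matches their declared type.
    for member in root.findall(".//relation/member"):
        mapping = maps.get(member.get("type"), {})
        ref = member.get("ref")
        if ref in mapping:
            member.set("ref", mapping[ref])
    return root
```

Calling `remap_ids(SAMPLE)` and serializing the result with `ET.tostring(root, encoding="unicode")` should reproduce the "After" listing, with the `lat`/`lon` attributes left unchanged.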

src/osm_osw_reformatter/osw2osm/osw2osm.py

Lines changed: 49 additions & 0 deletions
@@ -34,6 +34,7 @@ def convert(self) -> Response:
             data_writer = ogr2osm.OsmDataWriter(output_file, suppress_empty_tags=True)
             osm_data.output(data_writer)
             self._ensure_version_attribute(output_file)
+            self._remap_ids_to_sequential(output_file)

             del translation_object
             del datasource
@@ -64,3 +65,51 @@ def _ensure_version_attribute(osm_xml_path: Path) -> None:
                 element.set('version', '1')

         tree.write(osm_xml_path, encoding='utf-8', xml_declaration=True)
+
+    @staticmethod
+    def _remap_ids_to_sequential(osm_xml_path: Path) -> None:
+        """Remap node/way/relation IDs to sequential values starting at 1 and update references."""
+        try:
+            tree = ET.parse(osm_xml_path)
+        except Exception:
+            return
+
+        root = tree.getroot()
+
+        def remap_elements(xpath: str):
+            mapping = {}
+            elems = root.findall(xpath)
+            for idx, elem in enumerate(elems, start=1):
+                old_id = elem.get('id')
+                if old_id is None:
+                    continue
+                mapping[old_id] = str(idx)
+                elem.set('id', str(idx))
+                for tag in elem.findall("./tag[@k='_id']"):
+                    tag.set('v', str(idx))
+            return mapping
+
+        node_map = remap_elements('.//node')
+        way_map = remap_elements('.//way')
+        rel_map = remap_elements('.//relation')
+
+        # Update way nd refs
+        for way in root.findall('.//way'):
+            for nd in way.findall('nd'):
+                ref = nd.get('ref')
+                if ref in node_map:
+                    nd.set('ref', node_map[ref])
+
+        # Update relation member refs
+        for rel in root.findall('.//relation'):
+            for member in rel.findall('member'):
+                ref = member.get('ref')
+                m_type = member.get('type')
+                if m_type == 'node' and ref in node_map:
+                    member.set('ref', node_map[ref])
+                elif m_type == 'way' and ref in way_map:
+                    member.set('ref', way_map[ref])
+                elif m_type == 'relation' and ref in rel_map:
+                    member.set('ref', rel_map[ref])
+
+        tree.write(osm_xml_path, encoding='utf-8', xml_declaration=True)
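
Reviewer note: `_remap_ids_to_sequential` leaves a `ref` untouched when it is missing from the relevant map, so a reference to an element that is absent from the file keeps its old value. A follow-up test could assert that no such references remain. The sketch below is one way to write that check, assuming the output has already been parsed with `xml.etree.ElementTree`. `find_dangling_refs` is a hypothetical helper, not part of this commit.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET


def find_dangling_refs(root: ET.Element) -> list[tuple[str, str | None, str | None]]:
    """Return (parent kind, parent id, ref) for refs that match no element ID."""
    ids = {
        kind: {elem.get("id") for elem in root.findall(f".//{kind}")}
        for kind in ("node", "way", "relation")
    }
    dangling = []
    # Every <nd ref> inside a way must name an existing node.
    for way in root.findall(".//way"):
        for nd in way.findall("nd"):
            if nd.get("ref") not in ids["node"]:
                dangling.append(("way", way.get("id"), nd.get("ref")))
    # Every <member ref> must name an existing element of its declared type.
    for rel in root.findall(".//relation"):
        for member in rel.findall("member"):
            known = ids.get(member.get("type"), set())
            if member.get("ref") not in known:
                dangling.append(("relation", rel.get("id"), member.get("ref")))
    return dangling
```

An empty list means every reference resolves. The check cannot detect a stale reference whose old value happens to equal a newly assigned sequential ID, so it complements the remap tests and does not replace them.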
