Commit 6b2d2aa

Merge pull request #42 from TaskarCenterAtUW/develop
Add 0.3 support
2 parents dbb1d3a + 63275ca commit 6b2d2aa

15 files changed: +1377 -80 lines changed

CHANGELOG.md

Lines changed: 16 additions & 1 deletion
@@ -1,5 +1,20 @@
 # Change log
 
+### 0.3.0
+- Update converters to emit OSW 0.3 schema id and support new vegetation features (trees, tree rows, woods).
+- Extend OSW normalizers to keep `leaf_cycle` and `leaf_type` where allowed for points, lines, and polygons.
+- Add unit coverage for OSW 0.3 natural feature handling.
+- Expand OSM normalizer coverage and robustness: preserve non-compliant/unknown tags as `ext:*`, canonicalize JSON ext values, normalize elevation from 3D geometries, tolerate string IDs, and harden edge-case handling with tests.
+- OSW→OSM improvements: promote invalid/unknown fields (incl. dict/list) to `ext:*`, set `version="1"` for visible elements, derive `ext:elevation` from Z coords, and keep invalid incline/climb values under `ext:` instead of dropping them.
+- OSM→OSW improvements: verify OSW 0.3 `$schema` headers, export tree/tree_row/wood features, treat `ext:` tags as valid identifiers in OSW normalizers for filtering, and add multi-exterior handling tests for zones/polygons plus line parsing guards.
+- Added extensive unit tests for osm/osw normalizers and graph serializers (filters, geojson import/export, zebra crossing mapping, kerb/foot validators, invalid line/polygon/zone branches, ref normalization, etc.).
+- Added fixtures for vegetation and 3D elevation scenarios (`tree-test.xml`) and custom-property round-trip checks.
+- Implemented collision-free ID handling: sequential remapping of nodes/ways/relations on OSW→OSM export with reference rewrites, plus tests confirming sequential IDs and schema/tag updates.
+
+### 0.2.13
+- Added default `version="1"` attribute to all nodes, ways, and relations generated during OSW→OSM conversion.
+- Introduced unit test coverage to verify version attributes are written for all OSM elements.
+
 ### 0.2.12
 - Updated OSMTaggedNodeParser to apply the OSW node and point filters with normalization before adding loose tagged nodes, ensuring non-compliant features like crossings are no longer emitted.
 - Extended serializer tests to cover the new tagged-node behavior, confirming that compliant kerb features are retained while schema-invalid crossings are skipped.
@@ -92,4 +107,4 @@
 - Added unit test cases
 - Added README.md file
 - Added CHANGELOG.md file
-- Added test.pypi pipeline
+- Added test.pypi pipeline
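
The `ext:*` promotion described in the 0.3.0 entries can be illustrated with a small standalone sketch. This is not the project's `_stash_ext` implementation: the helper name `promote_to_ext`, the `allowed` key set, and the use of sorted-key compact JSON as the canonical form are all assumptions made for illustration.

```python
import json


def promote_to_ext(props: dict, allowed: set) -> dict:
    """Return a copy of props with keys outside `allowed` moved under `ext:`.

    Hypothetical helper: the real normalizer's rules may differ.
    """
    out = {}
    for key, value in props.items():
        if value is None:
            continue  # nothing to preserve for empty values
        if key in allowed or key.startswith("ext:"):
            out[key] = value  # compliant or already-namespaced keys pass through
            continue
        if isinstance(value, (dict, list)):
            # Assumed canonical form: compact JSON with sorted keys,
            # so equal structures always serialize to the same string.
            value = json.dumps(value, sort_keys=True, separators=(",", ":"))
        out[f"ext:{key}"] = value
    return out


example = promote_to_ext(
    {"highway": "footway", "lighting": {"type": "led", "count": 2}, "notes": ["a", "b"]},
    allowed={"highway"},
)
```

Serializing with sorted keys is one way to make round-trip comparisons of `ext:*` values stable, which is presumably what "canonicalize JSON ext values" is meant to achieve.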

docs/TESTING_OVERVIEW.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+Below is a breakdown of all test scenarios (`def test_` and `async def test_`) across the suite and the main areas they cover.
+
+Total scenarios: 211
+Code coverage: 98% (`coverage report`)
+
+Test module | Scenarios | Focus / scenarios covered (key checks)
+--- | --- | ---
+tests/unit_tests/helpers/test_osm.py | 11 | Way/node/point/line/zone/polygon counters; entity counter dispatch; OSMGraph creation and simplification/geometry construction lifecycles; filter helpers return booleans for tagged inputs
+tests/unit_tests/helpers/test_osw.py | 22 | OSW filters (all geometries); unzip/merge optional files & missing-file handling; simplify/construct graph end-to-end; per-entity counters; ext tag retention; temp-file cleanup; optional files presence/absence; merge deletes intermediates
+tests/unit_tests/helpers/test_response.py | 6 | Response dataclass defaults, attribute access, repr/str formatting, mutation for lists/strings/error payloads
+tests/unit_tests/test_formatter.py | 7 | Formatter success/error paths; converter delegation; cleanup lifecycle; exception surfacing; mocking converter calls; workdir creation/idempotence; cleanup of existing vs missing files
+tests/unit_tests/test_osm2osw/test_osm2osw.py | 15 | OSM→OSW: file counts/types; width/incline validation/cleaning; `$schema`=0.3 headers; ext-tag passthrough; point geometry enforcement; duplicate vs unique IDs; schema header verification; bad-input path; type assertions; tree/tree_row/wood export coverage; invalid-node-tag skip logic
+tests/unit_tests/test_osm_compliance/test_osm_compliance.py | 2 | OSW validation of OSW→OSM→OSW roundtrip; incline tag preservation using official validator
+tests/unit_tests/test_osw2osm/test_osw2osm.py | 16 | OSW→OSM: version attrs (visible/non-visible); incline/climb handling; invalid incline to `ext:`; custom dict/list props to `ext:`; 3D elevation to `ext:elevation`; missing-zip error; invalid width to `ext:`; XML output type assertions; ext tags on invalid properties; climb suppression when incline present; sequential ID remap and ref rewrite assertions
+tests/unit_tests/test_roundtrip/test_roundtrip.py | 2 | Full roundtrip (OSW zip → OSM XML → OSW → OSM) smoke checks, ID preservation, schema continuity, ext:* tag parity for both OSW-zip and raw-OSM starting points
+tests/unit_tests/test_serializer/test_osm_graph.py | 37 | Parsers (ways/nodes/points/lines/zones/polygons) incl. invalid locations & multi-exteriors; tagged node parsing; simplify/construct geometries; to_geojson ID/export rules; to_undirected variants; filter_edges node-copy; from_geojson import/export; point ID prefix trimming; progress callbacks; invalid location skip logic; duplicate-id protection; empty-graph exports
+tests/unit_tests/test_serializer/test_osm_normalizer.py | 19 | OSM normalizer tag filtering: datatype coercion/NaN removal; incline/climb/foot handling; ext tag retention; zebra crossing mapping; kerb/foot validators; width/incline edge cases; implied foot removal; `_id` sourcing when tags absent/empty
+tests/unit_tests/test_serializer/test_osm_osm_normalizer.py | 11 | OSM normalizer edge cases: `_stash_ext` JSON canonicalization/errors/unknown keys; dict/list promotion to `ext:`; zone area tagging; elevation extraction fallbacks; ID normalization across nodes/ways/relations and refs/nodeRefs/refs; ref write-back branches; non-numeric ID tolerance; canonical `ext:` serialization
+tests/unit_tests/test_serializer/test_osw_normalizer.py | 63 | OSW normalizer: filters/normalizers for all feature types; tree/tree_row/wood support; leaf_cycle/leaf_type validation; crossing markings (incl. zebra inference); kerb/foot/surface validators; invalid branches raising; keep_key/default behaviors; width/incline/climb handling; ext tag passthrough and ext-based filter classification; literal keep-key handling; natural-* guards; tactile paving/surface normalization
+
+Detailed scenario highlights (what we explicitly exercise)
+- tests/unit_tests/helpers/test_osm.py: async counters on `wa.microsoft.osm.pbf` confirm expected counts for ways/points/nodes; `get_osm_graph` builds an `OSMGraph` then `simplify_og`/`construct_geometries` run without mutating return types; way/node/point/zone/polygon filters accept tagged inputs and return booleans.
+- tests/unit_tests/helpers/test_osw.py: counts for ways/nodes/points/zones/lines/polygons across the same PBF; unzip returns the expected nodes/edges/points artifacts and gracefully returns empty dict when files are missing; merge combines multiple GeoJSON FeatureCollections and deletes temp inputs; zone/polygon filters assert boolean output; temp cleanup covers both existing and already-removed files.
+- tests/unit_tests/helpers/test_response.py: default `Response` has `status=True` with `None` files/error; supports list or string `generated_files`; preserves custom error messages and `None` errors in success cases.
+- tests/unit_tests/test_formatter.py: `Formatter.osm2osw` happy/failed paths surface `Response.status`; workdir is created idempotently whether or not it exists; cleanup removes tracked files and ignores missing ones; `Formatter.osw2osm` delegates to `OSW2OSM.convert` exactly once (mocked) and propagates its response.
+- tests/unit_tests/test_osm2osw/test_osm2osw.py: end-to-end conversion yields six outputs (nodes/points/edges/zones/polygons/lines) with string paths; GeoJSONs contain non-empty geometries with string `_id`s and no duplicates; width tags are numeric, incline tags remain numeric on edges, and invalid node tags lead to no files; `$schema` header equals 0.3 and carries through tree/tree_row/wood fixtures; ext:* properties are preserved; file naming matches expected entity types; failure path returns `status=False`.
+- tests/unit_tests/test_osm_compliance/test_osm_compliance.py: runs OSW→OSM→OSW through `python_osw_validation` to assert zero validation issues; checks that incline tags survive the full round-trip.
+- tests/unit_tests/test_osw2osm/test_osw2osm.py: converts OSW ZIPs to a single OSM XML, ensuring width tags are present and numeric; error path when ZIP is missing; incline tags are present, climb tags are stripped when incline is present, and invalid incline values are moved to `ext:incline`; custom/non-compliant properties (dict/list) are promoted to ext:* JSON; 3D node coordinates emit `ext:elevation`; `_ensure_version_attribute` backfills version on visible elements; sequential ID remap rewrites ids/refs; all generated paths are strings and end with `.xml`.
+- tests/unit_tests/test_roundtrip/test_roundtrip.py: two smoke flows keep ext:* tags intact: (1) OSW ZIP → OSM XML → OSW → OSM, (2) raw OSM XML → OSW → OSM; both compare node/way ext:* sets for equality.
+- tests/unit_tests/test_serializer/test_osm_graph.py: graph metadata (directed/multigraph) and undirected copies retain node attrs; parsers handle missing nodes/invalid coordinates and multi-exterior polygons/zones; tagged-node parser only ingests OSW nodes; simplify/construct geometries rebuild missing geometries for points/lines with node refs; `to_geojson` preserves IDs, trims point prefixes, handles empty graphs, exports progress callbacks; `from_geojson` ingests features and respects mapping hooks and filter functions.
+- tests/unit_tests/test_serializer/test_osm_normalizer.py: width/incline/climb coercion removes NaN/invalid strings, retains valid ints/floats; climb removal rules when incline present, except steps keep climb/down; ext_osm_id assignment prefers tags but falls back to internal IDs and skips empty values; implied foot tags dropped where inappropriate.
+- tests/unit_tests/test_serializer/test_osm_osm_normalizer.py: `_stash_ext` normalizes JSON strings, skips None, and serializes unknown structures; filter_tags moves unknown keys or invalid datatypes to ext:* and adds area tags for zones; elevation extraction rejects NaN, falling back through z/ele tags; ID normalization covers negative IDs and writes back refs/nodeRefs/refs variants.
+- tests/unit_tests/test_serializer/test_osw_normalizer.py: validators classify sidewalks/crossings/traffic islands/stairs/living streets/powerpoles/trees/tree_rows/wood; invalid geometries raise where expected; stair normalization keeps/drops climb per validity and defaults highway/foot; width/incline/climb handling mirrors OSM normalizer; crossing markings inferred from zebra tags; keep_key/default behaviors honored; tactile paving/surface/leaf_cycle/leaf_type/kerb/foot rules validated; natural-* checks drop invalid feature types.
+
+Method: counted functions matching `def test_` and `async def test_` under `tests/` and grouped scenario themes per file.

docs/id_remapping.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+# ID Remapping (OSW → OSM)
+
+This document explains how IDs are generated and remapped when converting OSW (GeoJSON) to OSM XML.
+
+## Goal
+Produce collision-free OSM XML where all node/way/relation IDs are sequential per type (starting at 1) and all references are updated accordingly. `_id` tags are rewritten to the new sequential IDs; original OSW identifiers survive only in other tags (e.g., `ext:osm_id`).
+
+## Process
+1. **Initial IDs from OSW content**
+   - Nodes/points/lines/zones/polygons parsed from OSW GeoJSON enter the OSM graph with their OSW `_id`/references.
+   - Extension/unknown properties are preserved under `ext:*`; elevation from 3D coordinates becomes `ext:elevation`.
+
+2. **OSW→OSM export**
+   - `OSW2OSM.convert()` runs the normal ogr2osm pipeline, writing an OSM XML file.
+   - `_ensure_version_attribute` adds `version="1"` to every node, way, and relation that lacks a `version` attribute.
+
+3. **Sequential remap**
+   - After the XML is written, `_remap_ids_to_sequential` rewrites IDs and references:
+     - Nodes are renumbered `1..N` in document order; their `_id` tags are updated to the new ID.
+     - Ways are renumbered `1..M`; their `_id` tags are updated. All `<nd ref>` values are rewritten to the new node IDs.
+     - Relations are renumbered `1..K`; their `_id` tags are updated. All `<member ref>` values are rewritten based on member `type` (node/way/relation) using the new ID maps.
+   - The remap runs in-place on the XML so the final output has consistent, collision-free IDs and references.
+
+4. **What remains**
+   - Original OSW identifiers survive in other tags (e.g., `ext:osm_id` if provided, other `ext:*`), but `_id` reflects the new sequential OSM ID.
+
+## Notes / rationale
+- The remap ensures deterministic, collision-free IDs regardless of source naming schemes (e.g., OSW prefixes, extension data).
+- Reference integrity is maintained by rewriting all node refs in ways and member refs in relations.
+- Version attributes are normalized before remapping to satisfy OSM validators expecting `version`.
+
+## Minimal example (what the remap does)
+Input XML (simplified):
+```xml
+<osm>
+  <node id="10" lat="0" lon="0"><tag k="_id" v="10"/></node>
+  <node id="20" lat="1" lon="1"><tag k="_id" v="20"/></node>
+  <way id="30">
+    <nd ref="10"/><nd ref="20"/>
+    <tag k="_id" v="30"/>
+  </way>
+  <relation id="40">
+    <member type="node" ref="20"/>
+    <member type="way" ref="30"/>
+    <tag k="_id" v="40"/>
+  </relation>
+</osm>
+```
+
+After `_remap_ids_to_sequential`:
+```xml
+<osm>
+  <node id="1" ...><tag k="_id" v="1"/></node>
+  <node id="2" ...><tag k="_id" v="2"/></node>
+  <way id="1">
+    <nd ref="1"/><nd ref="2"/>
+    <tag k="_id" v="1"/>
+  </way>
+  <relation id="1">
+    <member type="node" ref="2"/>
+    <member type="way" ref="1"/>
+    <tag k="_id" v="1"/>
+  </relation>
+</osm>
+```
+All IDs now start at 1 per type, and every reference points to the remapped IDs.
+
+## Relevant code
+- Entry point: `OSW2OSM.convert()` (`src/osm_osw_reformatter/osw2osm/osw2osm.py`)
+  - Calls `_ensure_version_attribute`
+  - Calls `_remap_ids_to_sequential`
+- Remap implementation: `_remap_ids_to_sequential` in `osw2osm.py` rewrites element IDs and their refs in-place and updates `_id` tags.
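
The minimal example in `docs/id_remapping.md` can be checked with a standalone sketch that works on an XML string instead of a file. It follows the same per-type renumbering idea as `_remap_ids_to_sequential` in this commit, but `remap_ids` and `SAMPLE` are hypothetical names, and unlike the committed code it rewrites every `<nd>` and `<member>` in the document rather than only those nested under ways and relations.

```python
from xml.etree import ElementTree as ET

SAMPLE = """<osm>
  <node id="10" lat="0" lon="0"><tag k="_id" v="10"/></node>
  <node id="20" lat="1" lon="1"><tag k="_id" v="20"/></node>
  <way id="30"><nd ref="10"/><nd ref="20"/><tag k="_id" v="30"/></way>
  <relation id="40">
    <member type="node" ref="20"/>
    <member type="way" ref="30"/>
    <tag k="_id" v="40"/>
  </relation>
</osm>"""


def remap_ids(xml_text: str) -> ET.Element:
    """Renumber nodes, ways, and relations from 1 per type and fix references."""
    root = ET.fromstring(xml_text)
    maps = {}
    for kind in ("node", "way", "relation"):
        mapping = {}
        for new_id, elem in enumerate(root.iter(kind), start=1):
            old_id = elem.get("id")
            if old_id is None:
                continue  # elements without an id are left untouched
            mapping[old_id] = str(new_id)
            elem.set("id", str(new_id))
            for tag in elem.findall("./tag[@k='_id']"):
                tag.set("v", str(new_id))  # keep the `_id` tag in sync
        maps[kind] = mapping

    # Way node references always point at nodes.
    for nd in root.iter("nd"):
        ref = nd.get("ref")
        if ref in maps["node"]:
            nd.set("ref", maps["node"][ref])

    # Relation members pick the map that matches their declared type.
    for member in root.iter("member"):
        mapping = maps.get(member.get("type"), {})
        ref = member.get("ref")
        if ref in mapping:
            member.set("ref", mapping[ref])
    return root


remapped = remap_ids(SAMPLE)
```

Keeping one mapping per element type matters because node, way, and relation IDs live in separate namespaces: after the remap, node 1, way 1, and relation 1 coexist without conflict.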

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -5,4 +5,4 @@ shapely~=2.0.2
 pyproj~=3.6.1
 coverage~=7.5.1
 ogr2osm==1.2.0
-python-osw-validation==0.2.15
+python-osw-validation==0.3.1

src/osm_osw_reformatter/osw2osm/osw2osm.py

Lines changed: 67 additions & 0 deletions
@@ -1,5 +1,6 @@
 import gc
 import ogr2osm
+from xml.etree import ElementTree as ET
 from pathlib import Path
 from ..helpers.osw import OSWHelper
 from ..helpers.response import Response
@@ -32,6 +33,8 @@ def convert(self) -> Response:
             # Instantiate either ogr2osm.OsmDataWriter or ogr2osm.PbfDataWriter
             data_writer = ogr2osm.OsmDataWriter(output_file, suppress_empty_tags=True)
             osm_data.output(data_writer)
+            self._ensure_version_attribute(output_file)
+            self._remap_ids_to_sequential(output_file)
 
             del translation_object
             del datasource
@@ -46,3 +49,67 @@ def convert(self) -> Response:
         finally:
             gc.collect()
         return resp
+
+    @staticmethod
+    def _ensure_version_attribute(osm_xml_path: Path) -> None:
+        """Ensure nodes, ways, and relations include a version attribute."""
+        try:
+            tree = ET.parse(osm_xml_path)
+        except Exception:
+            return
+
+        root = tree.getroot()
+        for tag in ('node', 'way', 'relation'):
+            for element in root.findall(f'.//{tag}'):
+                if not element.get('version'):
+                    element.set('version', '1')
+
+        tree.write(osm_xml_path, encoding='utf-8', xml_declaration=True)
+
+    @staticmethod
+    def _remap_ids_to_sequential(osm_xml_path: Path) -> None:
+        """Remap node/way/relation IDs to sequential values starting at 1 and update references."""
+        try:
+            tree = ET.parse(osm_xml_path)
+        except Exception:
+            return
+
+        root = tree.getroot()
+
+        def remap_elements(xpath: str):
+            mapping = {}
+            elems = root.findall(xpath)
+            for idx, elem in enumerate(elems, start=1):
+                old_id = elem.get('id')
+                if old_id is None:
+                    continue
+                mapping[old_id] = str(idx)
+                elem.set('id', str(idx))
+                for tag in elem.findall("./tag[@k='_id']"):
+                    tag.set('v', str(idx))
+            return mapping
+
+        node_map = remap_elements('.//node')
+        way_map = remap_elements('.//way')
+        rel_map = remap_elements('.//relation')
+
+        # Update way nd refs
+        for way in root.findall('.//way'):
+            for nd in way.findall('nd'):
+                ref = nd.get('ref')
+                if ref in node_map:
+                    nd.set('ref', node_map[ref])
+
+        # Update relation member refs
+        for rel in root.findall('.//relation'):
+            for member in rel.findall('member'):
+                ref = member.get('ref')
+                m_type = member.get('type')
+                if m_type == 'node' and ref in node_map:
+                    member.set('ref', node_map[ref])
+                elif m_type == 'way' and ref in way_map:
+                    member.set('ref', way_map[ref])
+                elif m_type == 'relation' and ref in rel_map:
+                    member.set('ref', rel_map[ref])
+
+        tree.write(osm_xml_path, encoding='utf-8', xml_declaration=True)
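
The version backfill in `_ensure_version_attribute` can be exercised without touching the filesystem. The sketch below applies the same rule (set `version` only where it is missing or empty) to an in-memory tree; `ensure_version` and its `default` parameter are hypothetical and exist only for this illustration.

```python
from xml.etree import ElementTree as ET


def ensure_version(root: ET.Element, default: str = "1") -> int:
    """Set `version` on nodes, ways, and relations that lack it; return how many changed."""
    added = 0
    for kind in ("node", "way", "relation"):
        for elem in root.iter(kind):
            if not elem.get("version"):  # missing or empty attribute
                elem.set("version", default)
                added += 1
    return added


doc = ET.fromstring('<osm><node id="1" version="3"/><node id="2"/><way id="1"/></osm>')
added = ensure_version(doc)
```

Existing version values are left alone, so an element that already carries `version="3"` keeps it.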

src/osm_osw_reformatter/serializer/osm/osm_graph.py

Lines changed: 6 additions & 0 deletions
@@ -730,3 +730,9 @@ def from_geojson(cls, nodes_path, edges_path):
 
         for edge_feature in edges_fc['features']:
             props = edge_feature['properties']
+            u = props.pop('_u_id')
+            v = props.pop('_v_id')
+            props['geometry'] = shape(edge_feature['geometry'])
+            G.add_edges_from([(u, v, props)])
+
+        return osm_graph
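
The lines added to `from_geojson` turn each edge feature into a `(u, v, props)` tuple by popping `_u_id` and `_v_id` out of its properties. The following stdlib-only sketch shows that transformation in isolation. It makes two assumptions that differ from the commit: it copies the properties instead of mutating the feature, and it keeps the raw GeoJSON geometry rather than converting it with `shape(...)`. The name `edge_tuples` is hypothetical.

```python
def edge_tuples(edges_fc: dict) -> list:
    """Build (u, v, attributes) tuples from a GeoJSON edge FeatureCollection."""
    edges = []
    for feature in edges_fc["features"]:
        props = dict(feature["properties"])  # copy so the input is not mutated
        u = props.pop("_u_id")  # source node ID
        v = props.pop("_v_id")  # target node ID
        props["geometry"] = feature["geometry"]  # raw GeoJSON; the commit calls shape() here
        edges.append((u, v, props))
    return edges


edges_fc = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]},
            "properties": {"_id": "e1", "_u_id": "n1", "_v_id": "n2", "highway": "footway"},
        }
    ],
}
edges = edge_tuples(edges_fc)
```

Tuples in this shape are what a graph method such as the `G.add_edges_from` call in the diff expects: two endpoint keys followed by an attribute dictionary.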
