@@ -49,11 +60,13 @@ This document describes the HMM-based map matching algorithm for calculating a c
49
60
50
61
## Algorithm Phases
51
62
52
-
The path calculation consists of three main phases:
63
+
The path calculation consists of five main phases:
53
64
54
65
1. **Candidate Selection**: Identify potential track segments for each GNSS coordinate
55
66
2. **Emission Probability**: Calculate the likelihood that each GNSS position was on each candidate segment (HMM emission model)
56
-
3. **Viterbi Decoding & Path Reconstruction**: Decode the globally optimal netelement sequence using a log-space Viterbi algorithm with transition probabilities derived from shortest-path routing, then insert bridge netelements to produce the final continuous path
67
+
3. **Viterbi Decoding & Path Reconstruction**: Decode the globally optimal netelement sequence using a log-space Viterbi algorithm with transition probabilities derived from shortest-path routing, then insert bridge netelements to produce the initial continuous path
68
+
4. **Post-Viterbi Path Validation**: Three sanity passes (reachability check with bridge re-routing, oscillation collapse, and direction violation removal with cascade detection)
69
+
5. **Gap Filling**: Re-insert bridge netelements where consecutive segments lost direct connectivity after sanity removals, with U-turn detection to prevent direction reversals
This represents the geometric mean of per-state probabilities, clamped to [0, 1].
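
The geometric mean described above is convenient to compute from log-probabilities, which a log-space Viterbi decoder already has at hand. The following is a minimal sketch, assuming the per-state values are natural-log probabilities; the function name `path_probability` and the empty-path behaviour are illustrative assumptions, not taken from the codebase.

```rust
/// Geometric mean of per-state probabilities, computed from their natural
/// logarithms and clamped to [0, 1]. Hypothetical helper for illustration.
fn path_probability(log_probs: &[f64]) -> f64 {
    if log_probs.is_empty() {
        // Assumption: a path without states gets probability 0.
        return 0.0;
    }
    // The mean of the logs is the log of the geometric mean.
    let mean_log = log_probs.iter().sum::<f64>() / log_probs.len() as f64;
    mean_log.exp().clamp(0.0, 1.0)
}
```

For example, two states with probabilities 0.25 and 1.0 give a score of 0.5, and a single impossible state (log-probability of negative infinity) drives the score to 0.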
282
295
283
296
### Output
284
-
A single optimal `TrainPath` consisting of an ordered list of `AssociatedNetElement` segments with intrinsic coordinate ranges, plus an overall probability score.
297
+
An initial `TrainPath` consisting of an ordered list of `AssociatedNetElement` segments with intrinsic coordinate ranges, plus an overall probability score. This path is then refined by Phase 4 (validation) and Phase 5 (gap filling).
298
+
299
+
---
300
+
301
+
## Phase 4: Post-Viterbi Path Validation
302
+
303
+
### Objective
304
+
Refine the Viterbi-decoded path by removing topologically unreachable segments, collapsing oscillation artefacts, and resolving directional inconsistencies. The Viterbi algorithm operates at the candidate-lattice level and may produce locally optimal but globally inconsistent paths, for example when penalty carry-forward forces transitions through disconnected regions, or when GNSS noise causes the same netelement to appear multiple times with short intermediate detours.
305
+
306
+
Validation consists of three sequential passes, each producing structured `SanityDecision` records for debug output.
307
+
308
+
### Pass 1: Reachability Validation and Bridge Re-Routing
309
+
310
+
Walks the path segments sequentially, checking each consecutive pair for topological reachability via Dijkstra on the directed topology graph.
311
+
312
+
**For each consecutive pair (A, B):**
313
+
1. If A and B are the same netelement → always valid (kept)
314
+
2. If any (from_side, to_side) combination yields a Dijkstra path → reachable (kept)
315
+
3. Otherwise → **unreachable**:
316
+
   - Remove B from the path
317
+
   - Look ahead to the next segment C: attempt Dijkstra from A to C
318
+
   - If a route exists, insert bridge netelements between A and C
319
+
   - Record a warning and a `SanityDecision` with action `"removed"` or `"rerouted"`
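
The pair-by-pair walk above can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the topology is reduced to a plain adjacency map over netelement IDs (sides are ignored), a breadth-first search stands in for Dijkstra, and the names `Graph`, `Decision`, `route` and `reachability_pass` are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Simplified directed topology: netelement ID -> directly reachable IDs.
/// Assumption: netelement sides are ignored to keep the sketch short.
type Graph = HashMap<String, Vec<String>>;

/// One record per evaluated pair: (from, to, action).
type Decision = (String, String, &'static str);

/// Breadth-first search standing in for Dijkstra. Returns the intermediate
/// netelements between `from` and `to`, or `None` if `to` is unreachable.
fn route<'a>(graph: &'a Graph, from: &'a str, to: &str) -> Option<Vec<String>> {
    let mut prev: HashMap<&'a str, &'a str> = HashMap::new();
    let mut queue: VecDeque<&'a str> = VecDeque::from([from]);
    while let Some(node) = queue.pop_front() {
        if node == to {
            // Walk predecessors back to `from`, keeping intermediates only.
            let mut via = Vec::new();
            let mut cur = node;
            while let Some(&p) = prev.get(cur) {
                if p != from {
                    via.push(p.to_string());
                }
                cur = p;
            }
            via.reverse();
            return Some(via);
        }
        for next in graph.get(node).into_iter().flatten() {
            if next.as_str() != from && !prev.contains_key(next.as_str()) {
                prev.insert(next.as_str(), node);
                queue.push_back(next.as_str());
            }
        }
    }
    None
}

/// Pass 1 sketch: keep reachable pairs, drop unreachable targets, and
/// re-route to the following segment when a route exists.
fn reachability_pass(graph: &Graph, path: &[String]) -> (Vec<String>, Vec<Decision>) {
    let mut out: Vec<String> = path.first().cloned().into_iter().collect();
    let mut decisions = Vec::new();
    let mut i = 1;
    while i < path.len() {
        let a = out.last().expect("output starts with the first segment").clone();
        let b = &path[i];
        if a == *b || route(graph, &a, b).is_some() {
            decisions.push((a, b.clone(), "kept"));
            out.push(b.clone());
            i += 1;
            continue;
        }
        // B is unreachable: drop it and try to reach the next segment C instead.
        let lookahead = path
            .get(i + 1)
            .and_then(|c| route(graph, &a, c).map(|via| (c.clone(), via)));
        match lookahead {
            Some((c, via)) => {
                decisions.push((a, b.clone(), "rerouted"));
                out.extend(via); // bridge netelements between A and C
                out.push(c);
                i += 2;
            }
            None => {
                decisions.push((a, b.clone(), "removed"));
                i += 1;
            }
        }
    }
    (out, decisions)
}
```

With edges `A → X → C → D` and an isolated `B`, the path `[A, B, C, D]` becomes `[A, X, C, D]`: the pair (A, B) is recorded as `"rerouted"` and (C, D) as `"kept"`.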
320
+
321
+
### Pass 2: Oscillation Collapse
322
+
323
+
Detects and collapses oscillation patterns where the same netelement appears more than once with a short intermediate detour — e.g., `A → B → C → A` where B and C are noise.
324
+
325
+
**Detection criteria** — an oscillation is detected when the same netelement `NE` appears at positions `i` and `j` (with `j > i + 1`) and:
326
+
- The number of distinct intermediate netelements is ≤ `MAX_OSCILLATION_INTERMEDIATE_NES` (default: 3)
327
+
- The intermediate GNSS coverage is less than the first occurrence's coverage, or < 10 GNSS positions in absolute terms
328
+
329
+
**Collapse action:**
330
+
- Merge segments `i` and `j`: extend segment `i`'s GNSS range to cover segment `j`'s range
331
+
- Remove segments `i+1` through `j` (the intermediate segments and the second occurrence, which is now merged into segment `i`)
332
+
- Record a `SanityDecision` with action `"collapsed-oscillation"`
333
+
334
+
The process iterates until no more oscillations are found (fixed-point).
335
+
336
+
**Guard**: Sequences with more than `MAX_OSCILLATION_INTERMEDIATE_NES` distinct intermediate netelements are treated as genuine path segments (the train actually traversed them), even when their total GNSS coverage is small relative to the repeated netelement.
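
The detection criteria, collapse action and guard can be sketched together. This is a simplified illustration, assuming each segment carries a half-open GNSS index range `[start, end)`; the `Segment` type, the constant `MIN_ABSOLUTE_COVERAGE` (for the "< 10 GNSS positions" rule) and both function names are hypothetical.

```rust
use std::collections::HashSet;

const MAX_OSCILLATION_INTERMEDIATE_NES: usize = 3;
/// Hypothetical name for the absolute "< 10 GNSS positions" rule.
const MIN_ABSOLUTE_COVERAGE: usize = 10;

#[derive(Debug, Clone, PartialEq)]
struct Segment {
    id: String,
    start: usize, // first GNSS index covered (inclusive)
    end: usize,   // one past the last GNSS index covered
}

impl Segment {
    fn coverage(&self) -> usize {
        self.end - self.start
    }
}

/// Finds the first pair (i, j) with the same netelement at both positions
/// and a short, lightly covered detour in between.
fn find_oscillation(path: &[Segment]) -> Option<(usize, usize)> {
    for i in 0..path.len() {
        for j in (i + 2)..path.len() {
            if path[i].id != path[j].id {
                continue;
            }
            let between = &path[i + 1..j];
            let distinct: HashSet<&str> = between.iter().map(|s| s.id.as_str()).collect();
            let covered: usize = between.iter().map(Segment::coverage).sum();
            if distinct.len() <= MAX_OSCILLATION_INTERMEDIATE_NES
                && (covered < path[i].coverage() || covered < MIN_ABSOLUTE_COVERAGE)
            {
                return Some((i, j));
            }
        }
    }
    None
}

/// Collapses oscillations until none remain (fixed point).
fn collapse_oscillations(mut path: Vec<Segment>) -> Vec<Segment> {
    while let Some((i, j)) = find_oscillation(&path) {
        path[i].end = path[j].end; // segment i absorbs the range up to j's end
        path.drain(i + 1..=j); // drop the detour and the second occurrence
    }
    path
}
```

Each collapse removes at least two segments, so the loop terminates. A detour through four distinct netelements is left untouched, which mirrors the guard described above.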
337
+
338
+
### Pass 3: Direction Violation Removal
339
+
340
+
Detects and removes segments that create directional inconsistencies (U-turns). Walks the path checking each triple (A, B, C) for consistency using the topology graph.
341
+
342
+
**Consistency check** (`triple_is_consistent`): A triple (A, B, C) is consistent if there exists any combination of netelement sides such that A has a direct netrelation edge to B AND B has a direct netrelation edge to C, with the exit side from A→B being the same netelement side as the entry for B→C (i.e., the train can traverse B without reversing).
343
+
344
+
**Removal strategy for inconsistent triples:**
345
+
346
+
1. **Cascade detection**: If segment B has caused ≥ `MAX_DIRECTION_CASCADE_REMOVALS` (default: 3) neighbour removals (tracked separately as *anchor* removals and *protected* removals) **and** the next segment C would be removable, force-remove B as the likely source of path corruption. Two separate counters prevent conflating unrelated removal patterns:
347
+
   - *anchor*: incremented for A when B is removed and A→B was connected (A stays, successive Bs get eaten)
348
+
   - *protected*: incremented for B when C is removed as fallback because B was too significant (B stays, successive Cs get eaten)
349
+
350
+
2. **Oscillation remnant (A == C)**: Remove C (the second occurrence of A)
351
+
352
+
3. **Connected A→B (wrong exit)**: Target B for removal. If B exceeds the GNSS threshold, try removing C as a fallback instead
353
+
354
+
4. **Disconnected A→B (orphan)**: Prefer bridge segments; otherwise remove the smaller of {A, B}
355
+
356
+
**Removability**: A segment is automatically removable if it is a bridge (zero GNSS span) or has fewer than `DIRECTION_REMOVAL_GNSS_THRESHOLD` (default: 100) GNSS positions. Segments at or above this threshold are kept with a warning when no smaller alternative exists.
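
The removability rule and the two simplest removal strategies (oscillation remnant, connected A→B with wrong exit) can be sketched as a small decision function. This is a hedged illustration only: the `Segment` and `Target` types and the function names are hypothetical, and cascade detection and the orphan case are deliberately left out.

```rust
const DIRECTION_REMOVAL_GNSS_THRESHOLD: usize = 100;

#[derive(Debug, Clone, PartialEq)]
struct Segment {
    id: String,
    gnss_count: usize, // bridges have a zero GNSS span
}

/// Which member of an inconsistent triple (A, B, C) to remove.
#[derive(Debug, PartialEq)]
enum Target {
    B,
    C,
}

/// Bridges (zero positions) and short segments are removable.
fn is_removable(seg: &Segment) -> bool {
    seg.gnss_count < DIRECTION_REMOVAL_GNSS_THRESHOLD
}

/// Returns `None` when every candidate is too significant to remove,
/// in which case the triple would be kept with a warning.
fn choose_removal(a: &Segment, b: &Segment, c: &Segment) -> Option<Target> {
    if a.id == c.id {
        return Some(Target::C); // oscillation remnant: drop the second occurrence
    }
    if is_removable(b) {
        return Some(Target::B); // wrong exit: B is the default target
    }
    if is_removable(c) {
        return Some(Target::C); // fallback when B is too significant
    }
    None
}
```

Returning an `Option` keeps the "kept with a warning" outcome explicit for the caller instead of silently removing a well-supported segment.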
357
+
358
+
The process iterates until no more violations are found (fixed-point).
359
+
360
+
### Output
361
+
A validated path with unreachable, oscillating, and direction-violating segments removed. Structured `SanityDecision` records for each action taken, exported as `05_path_sanity_decisions.geojson` in debug mode.
362
+
363
+
---
364
+
365
+
## Phase 5: Gap Filling
366
+
367
+
### Objective
368
+
Sanity validation may remove segments, so consecutive pairs in the path may no longer be directly connected. Gap filling re-inserts bridge netelements to restore path continuity.
369
+
370
+
### Process
371
+
372
+
Walks the validated path and checks each consecutive pair for direct topological connectivity (a zero-weight netrelation edge in the topology graph).
373
+
374
+
**For each gap (no direct edge between consecutive segments A and B):**
375
+
1. Try all (from_side, to_side) combinations via Dijkstra to find the cheapest route
376
+
2. Trace intermediate netelements along the shortest path
377
+
3. Before inserting bridges, check for U-turns (see below)
378
+
4. Insert bridge netelements with zero GNSS span between A and B
379
+
5. Record a `GapFill` record for debug output
380
+
381
+
**When no Dijkstra route exists**: Record a warning and leave the gap as-is.
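
The gap-filling walk can be sketched with a small Dijkstra search. The sketch makes several assumptions that are not from the source: the topology is reduced to netelement IDs with non-negative edge costs, "directly connected" is simplified to "an edge exists", side combinations and the U-turn check are omitted, and the names `Graph`, `GapFillRecord`, `cheapest_route` and `fill_gaps` are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Simplified topology: netelement ID -> (neighbour ID, edge cost).
type Graph = HashMap<String, Vec<(String, u32)>>;

#[derive(Debug, PartialEq)]
struct GapFillRecord {
    from: String,
    to: String,
    route_found: bool,
    inserted: Vec<String>,
}

/// Dijkstra; returns the intermediate netelements on the cheapest route.
fn cheapest_route<'a>(graph: &'a Graph, from: &'a str, to: &str) -> Option<Vec<String>> {
    let mut dist: HashMap<&'a str, u32> = HashMap::new();
    let mut prev: HashMap<&'a str, &'a str> = HashMap::new();
    let mut heap = BinaryHeap::new();
    dist.insert(from, 0);
    heap.push(Reverse((0u32, from)));
    while let Some(Reverse((d, node))) = heap.pop() {
        if node == to {
            // Trace predecessors back to `from`, keeping intermediates only.
            let mut via = Vec::new();
            let mut cur = node;
            while let Some(&p) = prev.get(cur) {
                if p != from {
                    via.push(p.to_string());
                }
                cur = p;
            }
            via.reverse();
            return Some(via);
        }
        if d > dist[node] {
            continue; // stale heap entry
        }
        for (next, w) in graph.get(node).into_iter().flatten() {
            let nd = d + w;
            if dist.get(next.as_str()).map_or(true, |&old| nd < old) {
                dist.insert(next.as_str(), nd);
                prev.insert(next.as_str(), node);
                heap.push(Reverse((nd, next.as_str())));
            }
        }
    }
    None
}

fn has_direct_edge(graph: &Graph, a: &str, b: &str) -> bool {
    graph
        .get(a)
        .map_or(false, |edges| edges.iter().any(|(n, _)| n.as_str() == b))
}

/// Inserts bridge netelements wherever consecutive segments are not adjacent.
fn fill_gaps(graph: &Graph, path: &[String]) -> (Vec<String>, Vec<GapFillRecord>) {
    let mut out = Vec::new();
    let mut records = Vec::new();
    for (i, seg) in path.iter().enumerate() {
        if i > 0 && !has_direct_edge(graph, &path[i - 1], seg) {
            let route = cheapest_route(graph, &path[i - 1], seg);
            records.push(GapFillRecord {
                from: path[i - 1].clone(),
                to: seg.clone(),
                route_found: route.is_some(),
                inserted: route.clone().unwrap_or_default(),
            });
            // No route: nothing is inserted and the gap stays as-is.
            out.extend(route.unwrap_or_default());
        }
        out.push(seg.clone());
    }
    (out, records)
}
```

When two routes exist between A and B, the cheaper one supplies the bridges; a pair with no route produces a record with `route_found == false` and leaves the path unchanged at that point.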
382
+
383
+
### U-Turn Detection
384
+
385
+
Before inserting bridge netelements, the algorithm checks whether the route would create a directional reversal at the target segment. Specifically, if the last bridge netelement, the target segment, and the segment after the target form a directionally inconsistent triple, the gap-fill would force the path to enter the target and immediately reverse.
386
+
387
+
**When a U-turn is detected:**
388
+
- Skip the target segment entirely
389
+
- Absorb its GNSS range into the predecessor
390
+
- Re-evaluate the gap from the same predecessor to the next segment in the path
391
+
392
+
This prevents gap filling from introducing new direction violations that would require another round of sanity passes.
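
The control flow of the U-turn check can be sketched independently of the routing and topology details. In this illustration the router and the triple-consistency test are passed in as closures, so nothing is assumed about how they work internally; the `Segment` type (half-open GNSS range, bridges with `start == end`) and the function name are hypothetical, and `route` is assumed to return an empty list for directly connected pairs.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Segment {
    id: String,
    start: usize, // first GNSS index covered (inclusive)
    end: usize,   // one past the last GNSS index covered
}

/// Gap filling with U-turn detection. `route(a, b)` yields the bridge IDs
/// between a and b; `is_consistent(a, b, c)` is the triple check.
fn fill_with_uturn_check(
    path: &[Segment],
    route: impl Fn(&str, &str) -> Option<Vec<String>>,
    is_consistent: impl Fn(&str, &str, &str) -> bool,
) -> Vec<Segment> {
    let mut out: Vec<Segment> = Vec::new();
    for (t, target) in path.iter().enumerate() {
        if out.is_empty() {
            out.push(target.clone());
            continue;
        }
        let pred_id = out[out.len() - 1].id.clone();
        let bridges = route(pred_id.as_str(), target.id.as_str()).unwrap_or_default();
        // A U-turn needs at least one bridge and a segment after the target.
        let uturn = match (bridges.last(), path.get(t + 1)) {
            (Some(last_bridge), Some(next)) => {
                !is_consistent(last_bridge.as_str(), target.id.as_str(), next.id.as_str())
            }
            _ => false,
        };
        if uturn {
            // Skip the target; the predecessor absorbs its GNSS range and
            // stays the predecessor for the next iteration.
            let last = out.len() - 1;
            out[last].end = target.end;
            continue;
        }
        let at = target.start;
        out.extend(bridges.into_iter().map(|id| Segment { id, start: at, end: at }));
        out.push(target.clone());
    }
    out
}
```

Because a skipped target is never pushed, the next loop iteration automatically re-evaluates the gap from the same predecessor to the following segment.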
393
+
394
+
### Output
395
+
A gap-filled path with bridge netelements restoring continuity. Structured `GapFill` records for each action, exported as `06_filling_gaps.geojson` in debug mode.
285
396
286
397
---
287
398
@@ -347,6 +458,9 @@ The algorithm exposes several tunable parameters:
347
458
| Beta (β) | 50.0 meters | Transition probability scale (Newson & Krumm). Controls tolerance for mismatch between route distance and great-circle distance. Higher values are more forgiving of detours. |
348
459
| Edge-zone distance | 50.0 meters | Distance threshold from projected point to nearest netelement endpoint. Candidates farther than this from any endpoint are considered interior and cannot transition to a different netelement (transition probability = 0). |
349
460
| Turn-angle penalty scale | 30.0 degrees | Controls how aggressively sharp turns at netelement connections are penalised. `exp(-turn_angle / turn_scale)`: smaller values yield stronger penalty for the same angle. |
461
+
| Direction removal GNSS threshold | 100 positions | Minimum GNSS positions a segment must span to be protected from automatic removal during direction-violation processing. Segments below this threshold are considered artefacts eligible for removal. |
462
+
| Max oscillation intermediate NEs | 3 | Maximum number of distinct intermediate netelements that can be collapsed as an oscillation. Sequences with more intermediates are treated as genuine path segments. |
463
+
| Max direction cascade removals | 3 | Maximum number of neighbour removals a single netelement can cause during direction-violation processing before it is force-removed as the likely source of path corruption. |
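
The turn-angle penalty formula quoted in the table, `exp(-turn_angle / turn_scale)`, is easy to check numerically. A minimal sketch, assuming angles and scale are both in degrees; the function and constant names are illustrative.

```rust
/// Default scale from the parameter table, in degrees.
const TURN_ANGLE_PENALTY_SCALE_DEG: f64 = 30.0;

/// Multiplicative penalty in (0, 1]: 1.0 for a straight continuation,
/// decaying exponentially as the turn gets sharper.
fn turn_penalty(turn_angle_deg: f64, turn_scale_deg: f64) -> f64 {
    (-turn_angle_deg / turn_scale_deg).exp()
}
```

With the default scale, a 30° turn is weighted by `exp(-1)`, roughly 0.37. Halving the scale to 15° gives `exp(-2)` for the same turn, which illustrates why smaller values penalise more strongly.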
350
464
351
465
---
352
466
@@ -363,7 +477,7 @@ The algorithm exposes several tunable parameters:
363
477
364
478
### Limitations
365
479
366
-
- **Assumes Single Traversal:** Cannot handle loops where the same segment is traversed multiple times
480
+
- **Assumes Single Traversal:** Oscillation collapse (Phase 4, Pass 2) handles cases where the same segment appears multiple times due to GNSS noise, but the algorithm assumes the train does not intentionally traverse the same physical segment more than once in a journey
367
481
- **Offline Only:** Not designed for real-time streaming processing
368
482
- **Requires Quality Topology:** Network data must be accurate and complete
369
483
- **Parameter Sensitivity:** The β parameter and edge-zone distance require tuning for different network geometries
Records the outcome of each consecutive-segment-pair evaluation during post-Viterbi path validation (Phase 4). Used for debug output (`05_path_sanity_decisions.geojson`).
958
+
959
+
### Rust Structure
960
+
961
+
```rust
use serde::{Deserialize, Serialize};

/// Decision record for a single consecutive-segment pair during sanity validation.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SanityDecision {
    /// Index of this pair (0 = first consecutive pair)
    pub pair_index: usize,

    /// Netelement ID of the source segment
    pub from_netelement_id: String,

    /// Netelement ID of the target segment
    pub to_netelement_id: String,

    /// Whether the target was reachable from the source
    /// "removed-direction-violation", or "removed-direction-cascade"
    pub action: String,

    /// Netelement IDs inserted by Dijkstra re-routing (empty if not rerouted),
    /// or removed intermediate NEs (for oscillation collapse / direction removals)
    pub rerouted_via: Vec<String>,

    /// Warning message (empty if reachable and kept)
    pub warning: String,
}
```
989
+
990
+
### Validation Rules
991
+
992
+
| Field | Constraint |
|-------|-----------|
| `pair_index` | Sequential, starting from 0 |
| `from_netelement_id` | Non-empty, must reference an existing netelement |
| `to_netelement_id` | Non-empty, must reference an existing netelement |
| `action` | One of: `"kept"`, `"removed"`, `"rerouted"`, `"collapsed-oscillation"`, `"removed-direction-violation"`, `"removed-direction-cascade"` |
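
The rules in this table can be checked mechanically. The sketch below is illustrative only: it uses a reduced `Decision` struct holding just the four fields the table constrains, and the names `VALID_ACTIONS` and `validate` are hypothetical.

```rust
use std::collections::HashSet;

/// Action strings listed in the validation table.
const VALID_ACTIONS: [&str; 6] = [
    "kept",
    "removed",
    "rerouted",
    "collapsed-oscillation",
    "removed-direction-violation",
    "removed-direction-cascade",
];

/// Reduced stand-in for `SanityDecision` with only the constrained fields.
struct Decision {
    pair_index: usize,
    from_netelement_id: String,
    to_netelement_id: String,
    action: String,
}

/// Checks all records against the table; returns the first violation found.
fn validate(decisions: &[Decision], known: &HashSet<String>) -> Result<(), String> {
    for (expected, d) in decisions.iter().enumerate() {
        if d.pair_index != expected {
            return Err(format!(
                "pair_index {} out of sequence, expected {}",
                d.pair_index, expected
            ));
        }
        for id in [&d.from_netelement_id, &d.to_netelement_id] {
            if id.is_empty() || !known.contains(id) {
                return Err(format!("empty or unknown netelement id {:?}", id));
            }
        }
        if !VALID_ACTIONS.contains(&d.action.as_str()) {
            return Err(format!("invalid action {:?}", d.action));
        }
    }
    Ok(())
}
```

Since `action` is a plain `String` in the record, a check like this is the only thing standing between a typo and a silently unrecognised action in the debug output.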
998
+
999
+
---
1000
+
1001
+
## 7. GapFill (Gap-Fill Action Record)
1002
+
1003
+
### Purpose
1004
+
Records the outcome of each gap-fill evaluation during Phase 5 (gap filling after sanity validation). Used for debug output (`06_filling_gaps.geojson`).
1005
+
1006
+
### Rust Structure
1007
+
1008
+
```rust
use serde::{Deserialize, Serialize};

/// Record of a gap-fill action between two consecutive segments after sanity validation.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct GapFill {
    /// Index of this consecutive pair (0-based)
    pub pair_index: usize,

    /// Netelement ID of the segment before the gap
    pub from_netelement_id: String,

    /// Netelement ID of the segment after the gap
    pub to_netelement_id: String,

    /// Whether a Dijkstra route was found between the two segments
    pub route_found: bool,

    /// Netelement IDs inserted to bridge the gap (empty if no route)
    pub inserted_netelements: Vec<String>,

    /// Warning message (empty if directly connected or successfully filled)
    pub warning: String,
}
```
1031
+
1032
+
### Validation Rules
1033
+
1034
+
| Field | Constraint |
|-------|-----------|
| `pair_index` | Sequential, starting from 0 |
| `from_netelement_id` | Non-empty, must reference an existing netelement |
| `to_netelement_id` | Non-empty, must reference an existing netelement |
| `inserted_netelements` | Each element must reference an existing netelement |
1040
+
1041
+
---
1042
+
952
1043
## Entity Relationships
953
1044
954
1045
```
@@ -1042,6 +1133,8 @@ With metadata in separate file or header comments.
1042
1133
|`GnssNetElementLink`| Non-empty ID, distance ≥ 0, intrinsic in [0, 1], heading_diff in [0, 180°], probability in [0, 1]|
1043
1134
|`AssociatedNetElement`| Non-empty ID, probability in [0, 1], intrinsics in [0, 1], start_index ≤ end_index |
1044
1135
|`TrainPath`| Non-empty segments, probability in [0, 1], continuous GNSS indices |
1136
+
| `SanityDecision` | Sequential `pair_index`, non-empty netelement IDs, `action` is one of the defined action strings |