Commit 9932b70

spec: update with latest algorithm changes
1 parent 3ceed4d commit 9932b70

4 files changed

+262
-21
lines changed

specs/002-train-path-calculation/algorithm.md

Lines changed: 122 additions & 8 deletions
@@ -1,8 +1,8 @@
11
# Train Path Calculation Algorithm
22

33
**Feature**: Continuous Train Path Calculation with Network Topology
4-
**Document Version**: 2.0
5-
**Last Updated**: June 2025
4+
**Document Version**: 3.0
5+
**Last Updated**: July 2025
66

77
- [Train Path Calculation Algorithm](#train-path-calculation-algorithm)
88
- [Overview](#overview)
@@ -29,12 +29,23 @@
2929
- [Bridge Netelement Insertion](#bridge-netelement-insertion)
3030
- [Path Probability Calculation](#path-probability-calculation)
3131
- [Output](#output-2)
32+
- [Phase 4: Post-Viterbi Path Validation](#phase-4-post-viterbi-path-validation)
33+
- [Objective](#objective-3)
34+
- [Pass 1: Reachability Validation and Bridge Re-Routing](#pass-1-reachability-validation-and-bridge-re-routing)
35+
- [Pass 2: Oscillation Collapse](#pass-2-oscillation-collapse)
36+
- [Pass 3: Direction Violation Removal](#pass-3-direction-violation-removal)
37+
- [Output](#output-3)
38+
- [Phase 5: Gap Filling](#phase-5-gap-filling)
39+
- [Objective](#objective-4)
40+
- [Process](#process-2)
41+
- [U-Turn Detection](#u-turn-detection)
42+
- [Output](#output-4)
3243
- [Fallback Behavior](#fallback-behavior)
3344
- [Conditions for Fallback](#conditions-for-fallback)
3445
- [Fallback Strategy](#fallback-strategy)
3546
- [Performance Optimization: Resampling](#performance-optimization-resampling)
36-
- [Objective](#objective-3)
37-
- [Process](#process-1)
47+
- [Objective](#objective-5)
48+
- [Process](#process-3)
3849
- [Configuration Parameters](#configuration-parameters)
3950
- [Algorithm Properties](#algorithm-properties)
4051
- [Strengths](#strengths)
@@ -49,11 +60,13 @@ This document describes the HMM-based map matching algorithm for calculating a c
4960

5061
## Algorithm Phases
5162

52-
The path calculation consists of three main phases:
63+
The path calculation consists of five main phases:
5364

5465
1. **Candidate Selection** — Identify potential track segments for each GNSS coordinate
5566
2. **Emission Probability** — Calculate the likelihood that each GNSS position was on each candidate segment (HMM emission model)
56-
3. **Viterbi Decoding & Path Reconstruction** — Decode the globally optimal netelement sequence using a log-space Viterbi algorithm with transition probabilities derived from shortest-path routing, then insert bridge netelements to produce the final continuous path
67+
3. **Viterbi Decoding & Path Reconstruction** — Decode the globally optimal netelement sequence using a log-space Viterbi algorithm with transition probabilities derived from shortest-path routing, then insert bridge netelements to produce the initial continuous path
68+
4. **Post-Viterbi Path Validation** — Three-pass sanity validation: reachability check with bridge re-routing, oscillation collapse, and direction violation removal with cascade detection
69+
5. **Gap Filling** — Re-insert bridge netelements where consecutive segments lost direct connectivity after sanity removals, with U-turn detection to prevent direction reversals
5770

5871
---
5972

@@ -281,7 +294,105 @@ path_probability = min(exp(avg_log_prob), 1.0)
281294
This represents the geometric mean of per-state probabilities, clamped to [0, 1].
282295

283296
### Output
284-
A single optimal `TrainPath` consisting of an ordered list of `AssociatedNetElement` segments with intrinsic coordinate ranges, plus an overall probability score.
297+
An initial `TrainPath` consisting of an ordered list of `AssociatedNetElement` segments with intrinsic coordinate ranges, plus an overall probability score. This path is then refined by Phase 4 (validation) and Phase 5 (gap filling).
298+
299+
---
300+
301+
## Phase 4: Post-Viterbi Path Validation
302+
303+
### Objective
304+
Refine the Viterbi-decoded path by removing topologically unreachable segments, collapsing oscillation artefacts, and resolving directional inconsistencies. The Viterbi algorithm operates at the candidate-lattice level and may produce locally optimal but globally inconsistent paths, for example when penalty carry-forward forces transitions through disconnected regions, or when GNSS noise causes the same netelement to appear multiple times with short intermediate detours.
305+
306+
Validation consists of three sequential passes, each producing structured `SanityDecision` records for debug output.
307+
308+
### Pass 1: Reachability Validation and Bridge Re-Routing
309+
310+
Walks the path segments sequentially, checking each consecutive pair for topological reachability via Dijkstra on the directed topology graph.
311+
312+
**For each consecutive pair (A, B):**
313+
1. If A and B are the same netelement → always valid (kept)
314+
2. If any (from_side, to_side) combination yields a Dijkstra path → reachable (kept)
315+
3. Otherwise → **unreachable**:
316+
- Remove B from the path
317+
- Look ahead to the next segment C: attempt Dijkstra from A to C
318+
- If a route exists, insert bridge netelements between A and C
319+
- Record a warning and a `SanityDecision` with action `"removed"` or `"rerouted"`
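The pass above can be sketched as follows. This is a simplified illustration, not the spec's implementation: the topology is a hypothetical map from netelement ID to directly reachable successors, netelement sides are ignored, and a breadth-first search stands in for Dijkstra. The names `Topology`, `route`, and `validate_reachability` are assumptions.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical directed topology: netelement ID -> direct successors.
pub type Topology = HashMap<String, Vec<String>>;

/// Returns the intermediate netelements on a shortest hop-count route,
/// or `None` if `to` cannot be reached. BFS stands in for Dijkstra.
pub fn route<'a>(topo: &'a Topology, from: &'a str, to: &str) -> Option<Vec<String>> {
    if from == to {
        return Some(Vec::new());
    }
    let mut prev: HashMap<&'a str, &'a str> = HashMap::new();
    let mut queue: VecDeque<&'a str> = VecDeque::from([from]);
    while let Some(cur) = queue.pop_front() {
        for next in topo.get(cur).into_iter().flatten() {
            let next: &'a str = next.as_str();
            if next == from || prev.contains_key(next) {
                continue;
            }
            prev.insert(next, cur);
            if next == to {
                // Walk predecessors back to `from`, collecting bridges.
                let mut via = Vec::new();
                let mut node = cur;
                while node != from {
                    via.push(node.to_string());
                    node = prev[node];
                }
                via.reverse();
                return Some(via);
            }
            queue.push_back(next);
        }
    }
    None
}

/// Drops each unreachable segment B and, where possible, bridges A to C.
/// Returns the new path plus (removed netelement, action) notes.
pub fn validate_reachability(
    topo: &Topology,
    path: &[String],
) -> (Vec<String>, Vec<(String, &'static str)>) {
    let mut out: Vec<String> = Vec::new();
    let mut log = Vec::new();
    for (i, b) in path.iter().enumerate() {
        let Some(a) = out.last().cloned() else {
            out.push(b.clone());
            continue;
        };
        if a == *b || route(topo, &a, b).is_some() {
            out.push(b.clone()); // same netelement or reachable: kept
            continue;
        }
        // Unreachable: B is dropped; look ahead to C for a re-route.
        match path.get(i + 1).and_then(|c| route(topo, &a, c)) {
            Some(via) => {
                out.extend(via);
                log.push((b.clone(), "rerouted"));
            }
            None => log.push((b.clone(), "removed")),
        }
    }
    (out, log)
}
```

With edges `A → X → C` and the path `[A, B, C]`, B is unreachable from A, so it is dropped and X is inserted as a bridge, giving `[A, X, C]`.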
320+
321+
### Pass 2: Oscillation Collapse
322+
323+
Detects and collapses oscillation patterns where the same netelement appears more than once with a short intermediate detour — e.g., `A → B → C → A` where B and C are noise.
324+
325+
**Detection criteria** — an oscillation is detected when the same netelement `NE` appears at positions `i` and `j` (with `j > i + 1`) and:
326+
- The number of distinct intermediate netelements is ≤ `MAX_OSCILLATION_INTERMEDIATE_NES` (default: 3)
327+
- The total GNSS coverage of the intermediate segments is less than the first occurrence's coverage, or fewer than 10 GNSS positions in absolute terms
328+
329+
**Collapse action:**
330+
- Merge segments `i` and `j`: extend segment `i`'s GNSS range to cover segment `j`'s range
331+
- Remove all segments from `i+1` through `j` (the intermediates and the second occurrence at `j`, whose GNSS range segment `i` now covers)
332+
- Record a `SanityDecision` with action `"collapsed-oscillation"`
333+
334+
The process iterates until no more oscillations are found (fixed-point).
335+
336+
**Guard**: Sequences with more than `MAX_OSCILLATION_INTERMEDIATE_NES` distinct intermediate netelements are treated as genuine path segments (the train actually traversed them), even when their total GNSS coverage is small relative to the repeated netelement.
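The detection criteria and collapse action can be sketched as below. The `Segment` type with a half-open GNSS index range, the constant `MIN_ABSOLUTE_COVERAGE`, and both function names are illustrative assumptions. Only `MAX_OSCILLATION_INTERMEDIATE_NES` and the 10-position limit come from the text above.

```rust
use std::collections::HashSet;

pub const MAX_OSCILLATION_INTERMEDIATE_NES: usize = 3;
/// Hypothetical name for the absolute 10-position limit.
pub const MIN_ABSOLUTE_COVERAGE: usize = 10;

/// Hypothetical path segment covering GNSS indices `start..end` (half-open).
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub ne: String,
    pub start: usize,
    pub end: usize,
}

impl Segment {
    fn coverage(&self) -> usize {
        self.end - self.start
    }
}

/// Finds the first pair (i, j) that satisfies the detection criteria.
fn find_oscillation(path: &[Segment]) -> Option<(usize, usize)> {
    for i in 0..path.len() {
        for j in (i + 2)..path.len() {
            if path[j].ne != path[i].ne {
                continue;
            }
            let between = &path[i + 1..j];
            let distinct: HashSet<&str> = between.iter().map(|s| s.ne.as_str()).collect();
            let covered: usize = between.iter().map(Segment::coverage).sum();
            let few_intermediates = distinct.len() <= MAX_OSCILLATION_INTERMEDIATE_NES;
            let small_detour =
                covered < path[i].coverage() || covered < MIN_ABSOLUTE_COVERAGE;
            if few_intermediates && small_detour {
                return Some((i, j));
            }
        }
    }
    None
}

/// Collapses oscillations until none remain (fixed point).
pub fn collapse_oscillations(mut path: Vec<Segment>) -> Vec<Segment> {
    while let Some((i, j)) = find_oscillation(&path) {
        path[i].end = path[j].end; // extend i over j's GNSS range
        path.drain(i + 1..=j); // drop intermediates and the second occurrence
    }
    path
}
```

Each collapse removes at least two segments, so the loop always terminates. A path such as `A, B, C, A, D` with a short `B, C` detour collapses to `A, D`, whereas four distinct intermediates trip the guard and the path is left unchanged.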
337+
338+
### Pass 3: Direction Violation Removal
339+
340+
Detects and removes segments that create directional inconsistencies (U-turns). Walks the path checking each triple (A, B, C) for consistency using the topology graph.
341+
342+
**Consistency check** (`triple_is_consistent`): A triple (A, B, C) is consistent if there exists any combination of netelement sides such that A has a direct netrelation edge to B AND B has a direct netrelation edge to C, with the exit side from A→B being the same netelement side as the entry for B→C (i.e., the train can traverse B without reversing).
343+
344+
**Removal strategy for inconsistent triples:**
345+
346+
1. **Cascade detection**: If segment B has caused ≥ `MAX_DIRECTION_CASCADE_REMOVALS` (default: 3) neighbour removals (tracked separately as *anchor* removals and *protected* removals) **and** the next segment C would be removable, force-remove B as the likely source of path corruption. Two separate counters prevent conflating unrelated removal patterns:
347+
- *anchor*: incremented for A when B is removed and A→B was connected (A stays, successive Bs get eaten)
348+
- *protected*: incremented for B when C is removed as fallback because B was too significant (B stays, successive Cs get eaten)
349+
350+
2. **Oscillation remnant (A == C)**: Remove C (the second occurrence of A)
351+
352+
3. **Connected A→B (wrong exit)**: Target B for removal. If B is at or above the GNSS threshold, try removing C as a fallback instead
353+
354+
4. **Disconnected A→B (orphan)**: Prefer bridge segments; otherwise remove the smaller of {A, B}
355+
356+
**Removability**: A segment is automatically removable if it is a bridge (zero GNSS span) or has fewer than `DIRECTION_REMOVAL_GNSS_THRESHOLD` (default: 100) GNSS positions. Segments at or above this threshold are kept with a warning when no smaller alternative exists.
357+
358+
The process iterates until no more violations are found (fixed-point).
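One possible reading of removal strategies 2 to 4 and the removability rule is sketched below. Cascade detection (strategy 1) and the consistency check itself are left out. The `Segment` and `Removal` types, the `choose_removal` function, and the tie-break in the orphan case (remove B when A and B are equally small) are assumptions made for illustration.

```rust
pub const DIRECTION_REMOVAL_GNSS_THRESHOLD: usize = 100;

/// Hypothetical segment summary: netelement ID and number of GNSS positions.
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub ne: String,
    pub gnss_count: usize,
}

impl Segment {
    /// Bridges carry no GNSS positions.
    pub fn is_bridge(&self) -> bool {
        self.gnss_count == 0
    }
    /// Removable if a bridge or below the GNSS threshold.
    pub fn is_removable(&self) -> bool {
        self.is_bridge() || self.gnss_count < DIRECTION_REMOVAL_GNSS_THRESHOLD
    }
}

/// Which member of an inconsistent triple (A, B, C) to remove.
#[derive(Debug, PartialEq)]
pub enum Removal {
    A,
    B,
    C,
    KeepWithWarning,
}

pub fn choose_removal(a: &Segment, b: &Segment, c: &Segment, a_b_connected: bool) -> Removal {
    // Strategy 2: oscillation remnant, C repeats A.
    if a.ne == c.ne {
        return Removal::C;
    }
    if a_b_connected {
        // Strategy 3: wrong exit, target B and fall back to C.
        if b.is_removable() {
            Removal::B
        } else if c.is_removable() {
            Removal::C
        } else {
            Removal::KeepWithWarning
        }
    } else {
        // Strategy 4: orphan, prefer a bridge and otherwise the smaller of A and B.
        // Assumption: ties go to B.
        let pick_b = b.is_bridge() || (!a.is_bridge() && b.gnss_count <= a.gnss_count);
        let (which, seg) = if pick_b { (Removal::B, b) } else { (Removal::A, a) };
        if seg.is_removable() {
            which
        } else {
            Removal::KeepWithWarning
        }
    }
}
```

In this sketch, a connected triple whose middle segment has 150 GNSS positions is protected by the threshold, so a 30-position C is removed instead.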
359+
360+
### Output
361+
A validated path with unreachable, oscillating, and direction-violating segments removed. Structured `SanityDecision` records for each action taken, exported as `05_path_sanity_decisions.geojson` in debug mode.
362+
363+
---
364+
365+
## Phase 5: Gap Filling
366+
367+
### Objective
368+
Because sanity validation may have removed segments, consecutive pairs in the path may no longer be directly connected. Gap filling re-inserts bridge netelements to restore path continuity.
369+
370+
### Process
371+
372+
Walks the validated path and checks each consecutive pair for direct topological connectivity (a zero-weight netrelation edge in the topology graph).
373+
374+
**For each gap (no direct edge between consecutive segments A and B):**
375+
1. Try all (from_side, to_side) combinations via Dijkstra to find the cheapest route
376+
2. Trace intermediate netelements along the shortest path
377+
3. Before inserting bridges, check for U-turns (see below)
378+
4. Insert bridge netelements with zero GNSS span between A and B
379+
5. Record a `GapFill` record for debug output
380+
381+
**When no Dijkstra route exists**: Record a warning and leave the gap as-is.
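A minimal sketch of choosing the cheapest route across side combinations and recording the result. It assumes the Dijkstra results for each (from_side, to_side) combination have already been computed and are passed in as `side_routes`, each either `None` or a `(cost, intermediate netelements)` pair. The `fill_gap` function and that input shape are hypothetical, and the struct is a serde-free copy of the `GapFill` fields.

```rust
/// Serde-free copy of the `GapFill` record fields.
#[derive(Debug, Clone, PartialEq)]
pub struct GapFill {
    pub pair_index: usize,
    pub from_netelement_id: String,
    pub to_netelement_id: String,
    pub route_found: bool,
    pub inserted_netelements: Vec<String>,
    pub warning: String,
}

/// Picks the cheapest of the per-side-combination routes, if any exists.
pub fn fill_gap(
    pair_index: usize,
    from: &str,
    to: &str,
    side_routes: &[Option<(f64, Vec<String>)>],
) -> GapFill {
    let best = side_routes
        .iter()
        .flatten() // skip side combinations without a route
        .min_by(|x, y| x.0.total_cmp(&y.0));
    let (route_found, inserted_netelements, warning) = match best {
        Some((_, via)) => (true, via.clone(), String::new()),
        None => (
            false,
            Vec::new(),
            format!("no route from {from} to {to}; gap left as-is"),
        ),
    };
    GapFill {
        pair_index,
        from_netelement_id: from.to_string(),
        to_netelement_id: to.to_string(),
        route_found,
        inserted_netelements,
        warning,
    }
}
```

The U-turn check described in the next section would run between selecting the route and inserting the bridges. It is omitted here.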
382+
383+
### U-Turn Detection
384+
385+
Before inserting bridge netelements, the algorithm checks whether the route would create a directional reversal at the target segment. Specifically, if the last bridge netelement, the target segment, and the segment after the target form a directionally inconsistent triple, the gap-fill would force the path to enter the target and immediately reverse.
386+
387+
**When a U-turn is detected:**
388+
- Skip the target segment entirely
389+
- Absorb its GNSS range into the predecessor
390+
- Re-evaluate the gap from the same predecessor to the next segment in the path
391+
392+
This prevents gap filling from introducing new direction violations that would require another round of sanity passes.
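The skip-and-absorb step might look like the sketch below. The directional check is passed in as a closure because its real signature is not given here. `Segment`, `resolve_uturn`, and the `is_uturn` parameter are all illustrative assumptions.

```rust
/// Hypothetical path segment covering GNSS indices `start..end` (half-open).
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub ne: String,
    pub start: usize,
    pub end: usize,
}

/// If (last bridge, target, segment after target) forms a U-turn, removes the
/// target and extends its predecessor's GNSS range over it.
/// Returns `true` when the target was skipped, so the caller can re-evaluate
/// the gap from the same predecessor to the next segment.
pub fn resolve_uturn<F>(
    path: &mut Vec<Segment>,
    target: usize,
    last_bridge: &str,
    is_uturn: F,
) -> bool
where
    F: Fn(&str, &str, &str) -> bool,
{
    // A predecessor and a successor are both needed to form the triple.
    if target == 0 || target + 1 >= path.len() {
        return false;
    }
    let reverses = is_uturn(
        last_bridge,
        path[target].ne.as_str(),
        path[target + 1].ne.as_str(),
    );
    if !reverses {
        return false;
    }
    let skipped = path.remove(target);
    let pred = &mut path[target - 1];
    pred.end = pred.end.max(skipped.end);
    true
}
```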
393+
394+
### Output
395+
A gap-filled path with bridge netelements restoring continuity. Structured `GapFill` records for each action, exported as `06_filling_gaps.geojson` in debug mode.
285396

286397
---
287398

@@ -347,6 +458,9 @@ The algorithm exposes several tunable parameters:
347458
| Beta (β) | 50.0 meters | Transition probability scale (Newson & Krumm). Controls tolerance for mismatch between route distance and great-circle distance. Higher values are more forgiving of detours. |
348459
| Edge-zone distance | 50.0 meters | Distance threshold from projected point to nearest netelement endpoint. Candidates farther than this from any endpoint are considered interior and cannot transition to a different netelement (transition probability = 0). |
349460
| Turn-angle penalty scale | 30.0 degrees | Controls how aggressively sharp turns at netelement connections are penalised. `exp(-turn_angle / turn_scale)`: smaller values yield stronger penalty for the same angle. |
461+
| Direction removal GNSS threshold | 100 positions | Minimum GNSS positions a segment must span to be protected from automatic removal during direction-violation processing. Segments below this threshold are considered artefacts eligible for removal. |
462+
| Max oscillation intermediate NEs | 3 | Maximum number of distinct intermediate netelements that can be collapsed as an oscillation. Sequences with more intermediates are treated as genuine path segments. |
463+
| Max direction cascade removals | 3 | Maximum number of neighbour removals a single netelement can cause during direction-violation processing before it is force-removed as the likely source of path corruption. |
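To illustrate how the turn-angle penalty scale behaves, here is a small sketch of the `exp(-turn_angle / turn_scale)` expression from the table. The function and constant names are assumptions, and how the result is combined with the other transition terms is not shown.

```rust
/// Default turn-angle penalty scale from the parameter table, in degrees.
pub const TURN_SCALE_DEG: f64 = 30.0;

/// Multiplicative penalty in (0, 1] for a turn of `turn_angle_deg` degrees.
/// A smaller `turn_scale_deg` penalises the same angle more strongly.
pub fn turn_penalty(turn_angle_deg: f64, turn_scale_deg: f64) -> f64 {
    (-turn_angle_deg / turn_scale_deg).exp()
}
```

With the default scale, a straight continuation (0°) keeps a factor of 1.0, while a 30° turn is scaled by e⁻¹ ≈ 0.37. Halving the scale to 15° reduces the factor for the same 30° turn to e⁻² ≈ 0.14.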
350464

351465
---
352466

@@ -363,7 +477,7 @@ The algorithm exposes several tunable parameters:
363477

364478
### Limitations
365479

366-
- **Assumes Single Traversal:** Cannot handle loops where the same segment is traversed multiple times
480+
- **Assumes Single Traversal:** Oscillation collapse (Phase 4, Pass 2) handles cases where the same segment appears multiple times due to GNSS noise, but the algorithm assumes the train does not intentionally traverse the same physical segment more than once in a journey
367481
- **Offline Only:** Not designed for real-time streaming processing
368482
- **Requires Quality Topology:** Network data must be accurate and complete
369483
- **Parameter Sensitivity:** The β parameter and edge-zone distance require tuning for different network geometries

specs/002-train-path-calculation/data-model.md

Lines changed: 95 additions & 0 deletions
@@ -28,6 +28,8 @@
2828
- [Rust Structure](#rust-structure-3)
2929
- [GeoJSON Representation](#geojson-representation-1)
3030
- [CSV Representation](#csv-representation-1)
31+
- [6. SanityDecision (Post-Viterbi Validation Record)](#6-sanitydecision-post-viterbi-validation-record)
32+
- [7. GapFill (Gap-Fill Action Record)](#7-gapfill-gap-fill-action-record)
3133
- [Entity Relationships](#entity-relationships)
3234
- [Validation Summary](#validation-summary)
3335
- [Backward Compatibility](#backward-compatibility)
@@ -949,6 +951,95 @@ With metadata in separate file or header comments.
949951

950952
---
951953

954+
## 6. SanityDecision (Post-Viterbi Validation Record)
955+
956+
### Purpose
957+
Records the outcome of each consecutive-segment-pair evaluation during post-Viterbi path validation (Phase 4). Used for debug output (`05_path_sanity_decisions.geojson`).
958+
959+
### Rust Structure
960+
961+
```rust
962+
/// Decision record for a single consecutive-segment pair during sanity validation.
963+
#[derive(Debug, Clone, Serialize, Deserialize)]
964+
pub struct SanityDecision {
965+
/// Index of this pair (0 = first consecutive pair)
966+
pub pair_index: usize,
967+
968+
/// Netelement ID of the source segment
969+
pub from_netelement_id: String,
970+
971+
/// Netelement ID of the target segment
972+
pub to_netelement_id: String,
973+
974+
/// Whether the target was reachable from the source
975+
pub reachable: bool,
976+
977+
/// Action taken: "kept", "removed", "rerouted", "collapsed-oscillation",
978+
/// "removed-direction-violation", or "removed-direction-cascade"
979+
pub action: String,
980+
981+
/// Netelement IDs inserted by Dijkstra re-routing (empty if not rerouted),
982+
/// or removed intermediate NEs (for oscillation collapse / direction removals)
983+
pub rerouted_via: Vec<String>,
984+
985+
/// Warning message (empty if reachable and kept)
986+
pub warning: String,
987+
}
988+
```
989+
990+
### Validation Rules
991+
992+
| Field | Constraint |
993+
|-------|-----------|
994+
| `pair_index` | Sequential, starting from 0 |
995+
| `from_netelement_id` | Non-empty, must reference an existing netelement |
996+
| `to_netelement_id` | Non-empty, must reference an existing netelement |
997+
| `action` | One of: `"kept"`, `"removed"`, `"rerouted"`, `"collapsed-oscillation"`, `"removed-direction-violation"`, `"removed-direction-cascade"` |
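The rules in this table could be checked with a validator along the following lines. This is a sketch under stated assumptions: the struct is a trimmed, serde-free copy holding only the validated fields, and `ALLOWED_ACTIONS`, `validate_decisions`, and the `known` set of netelement IDs are hypothetical names.

```rust
use std::collections::HashSet;

/// The six action strings listed in the validation rules.
pub const ALLOWED_ACTIONS: [&str; 6] = [
    "kept",
    "removed",
    "rerouted",
    "collapsed-oscillation",
    "removed-direction-violation",
    "removed-direction-cascade",
];

/// Trimmed copy of `SanityDecision` with only the validated fields.
#[derive(Debug, Clone)]
pub struct SanityDecision {
    pub pair_index: usize,
    pub from_netelement_id: String,
    pub to_netelement_id: String,
    pub action: String,
}

/// Checks sequential indices, known non-empty IDs, and allowed actions.
pub fn validate_decisions(
    decisions: &[SanityDecision],
    known: &HashSet<String>,
) -> Result<(), String> {
    for (expected, d) in decisions.iter().enumerate() {
        if d.pair_index != expected {
            return Err(format!(
                "pair_index {} out of sequence (expected {expected})",
                d.pair_index
            ));
        }
        for id in [&d.from_netelement_id, &d.to_netelement_id] {
            if id.is_empty() || !known.contains(id) {
                return Err(format!("pair {expected}: unknown or empty netelement ID {id:?}"));
            }
        }
        if !ALLOWED_ACTIONS.iter().any(|a| *a == d.action) {
            return Err(format!("pair {expected}: unsupported action {:?}", d.action));
        }
    }
    Ok(())
}
```

Modelling `action` as a Rust enum would let the compiler enforce the allowed values. The string form shown in the spec keeps the debug GeoJSON output simple, at the cost of a runtime check like this one.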
998+
999+
---
1000+
1001+
## 7. GapFill (Gap-Fill Action Record)
1002+
1003+
### Purpose
1004+
Records the outcome of each gap-fill evaluation during Phase 5 (gap filling after sanity validation). Used for debug output (`06_filling_gaps.geojson`).
1005+
1006+
### Rust Structure
1007+
1008+
```rust
1009+
/// Record of a gap-fill action between two consecutive segments after sanity validation.
1010+
#[derive(Debug, Clone, Serialize, Deserialize)]
1011+
pub struct GapFill {
1012+
/// Index of this consecutive pair (0-based)
1013+
pub pair_index: usize,
1014+
1015+
/// Netelement ID of the segment before the gap
1016+
pub from_netelement_id: String,
1017+
1018+
/// Netelement ID of the segment after the gap
1019+
pub to_netelement_id: String,
1020+
1021+
/// Whether a Dijkstra route was found between the two segments
1022+
pub route_found: bool,
1023+
1024+
/// Netelement IDs inserted to bridge the gap (empty if no route)
1025+
pub inserted_netelements: Vec<String>,
1026+
1027+
/// Warning message (empty if directly connected or successfully filled)
1028+
pub warning: String,
1029+
}
1030+
```
1031+
1032+
### Validation Rules
1033+
1034+
| Field | Constraint |
1035+
|-------|-----------|
1036+
| `pair_index` | Sequential, starting from 0 |
1037+
| `from_netelement_id` | Non-empty, must reference an existing netelement |
1038+
| `to_netelement_id` | Non-empty, must reference an existing netelement |
1039+
| `inserted_netelements` | Each element must reference an existing netelement |
1040+
1041+
---
1042+
9521043
## Entity Relationships
9531044

9541045
```
@@ -1042,6 +1133,8 @@ With metadata in separate file or header comments.
10421133
| `GnssNetElementLink` | Non-empty ID, distance ≥ 0, intrinsic in [0, 1], heading_diff in [0, 180°], probability in [0, 1] |
10431134
| `AssociatedNetElement` | Non-empty ID, probability in [0, 1], intrinsics in [0, 1], start_index ≤ end_index |
10441135
| `TrainPath` | Non-empty segments, probability in [0, 1], continuous GNSS indices |
1136+
| `SanityDecision` | Sequential pair_index, non-empty netelement IDs, action is one of the six defined action strings |
1137+
| `GapFill` | Sequential pair_index, non-empty netelement IDs, inserted_netelements reference existing netelements |
10451138

10461139
---
10471140

@@ -1061,6 +1154,8 @@ With metadata in separate file or header comments.
10611154
- `NetRelation`: New model, no breaking changes to existing code
10621155
- `AssociatedNetElement`: New model for path representation
10631156
- `TrainPath`: New model for path representation
1157+
- `SanityDecision`: New model for post-Viterbi validation debug output
1158+
- `GapFill`: New model for gap-fill action debug output
10641159

10651160
---
10661161
