Commit abaf9d2

Improve doc page
1 parent e244f77 commit abaf9d2

1 file changed: +8 -4 lines changed

docs/src/benchmarks/dvsp.md

Lines changed: 8 additions & 4 deletions
@@ -35,7 +35,7 @@ The state also implicitly includes (constant over time):
 - Travel duration matrix ``d_{ij}``: time to travel from location ``i`` to location ``j``
 - Depot location
 
-**Action Space** ``\mathcal{A}``: The action at time step ``t`` is a set of vehicle routes:
+**Action Space** ``\mathcal{A}(s_t)``: The action at time step ``t`` is a set of vehicle routes:
 ```math
 a_t = \{r_1, r_2, \ldots, r_k\}
 ```
@@ -124,15 +124,19 @@ The greedy policy serves all pending customers as soon as they arrive, without c
 ```math
 \xrightarrow[\text{State}]{s_t}
 \fbox{Neural network $\varphi_w$}
-\xrightarrow[\text{Priorities}]{\theta}
+\xrightarrow[\text{Prizes}]{\theta}
 \fbox{Prize-collecting VSP}
 \xrightarrow[\text{Routes}]{a_t}
 ```
 
 **Components**:
 
-1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer priorities ``\theta = (\theta_1, \ldots, \theta_n)``
-2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted priorities
+1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer prizes ``\theta = (\theta_1, \ldots, \theta_n)``, one value per postponable customer.
+2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted prizes, by maximizing total collected prizes minus travel costs:
+```math
+\max_{a_t\in \mathcal{A}(s_t)} \sum_{r \in a_t} \left( \sum_{i \in r} \theta_i - \sum_{(i,j) \in r} d_{ij} \right)
+```
+This can be modeled as a flow linear program on a directed acyclic graph (DAG) and is solved using standard LP solvers.
 
 The neural network architecture adapts to the feature dimensionality:
 - **2D features**: `Dense(2 => 1)`, applied in parallel to each postponable customer
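
To make the objective added in this commit concrete, here is a minimal Python sketch that evaluates "collected prizes minus travel costs" for a candidate set of routes. It is not the package's implementation and does not solve the flow LP. It only scores given routes. The names `route_value` and `objective`, the toy duration matrix, and the prize values are all hypothetical. It also assumes that each route lists customer indices only, that the depot has index 0 and is visited at the start and end of every route, and that a customer without a predicted prize contributes 0.

```python
def route_value(route, theta, d, depot=0):
    """Prizes collected on one route minus its travel time.

    Assumption: the route is a list of customer indices, and the vehicle
    travels depot -> customers in order -> depot.
    """
    stops = [depot, *route, depot]
    prizes = sum(theta.get(i, 0.0) for i in route)
    travel = sum(d[i][j] for i, j in zip(stops, stops[1:]))
    return prizes - travel


def objective(routes, theta, d, depot=0):
    """Sum of route values over all routes in the action a_t."""
    return sum(route_value(r, theta, d, depot) for r in routes)


# Hypothetical travel durations d[i][j]; index 0 is the depot.
d = [
    [0, 2, 4, 3],
    [2, 0, 1, 5],
    [4, 1, 0, 2],
    [3, 5, 2, 0],
]
# Hypothetical predicted prizes, one per postponable customer.
theta = {1: 3.0, 2: 6.0, 3: 0.5}

serve_two = [[1, 2]]     # prizes 9.0, travel 2 + 1 + 4 = 7
serve_all = [[1, 2, 3]]  # prizes 9.5, travel 2 + 1 + 2 + 3 = 8

print(objective(serve_two, theta, d))  # 2.0
print(objective(serve_all, theta, d))  # 1.5
```

With these toy numbers, leaving customer 3 for a later time step scores higher than serving it now, which is the kind of trade-off the predicted prizes are meant to steer.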
