docs/src/benchmarks/dvsp.md
Lines changed: 8 additions & 4 deletions
@@ -35,7 +35,7 @@ The state also implicitly includes (constant over time):
 - Travel duration matrix ``d_{ij}``: time to travel from location ``i`` to location ``j``
 - Depot location

-**Action Space** ``\mathcal{A}``: The action at time step ``t`` is a set of vehicle routes:
+**Action Space** ``\mathcal{A}(s_t)``: The action at time step ``t`` is a set of vehicle routes:
 ```math
 a_t = \{r_1, r_2, \ldots, r_k\}
 ```
@@ -124,15 +124,19 @@ The greedy policy serves all pending customers as soon as they arrive, without c
 ```math
 \xrightarrow[\text{State}]{s_t}
 \fbox{Neural network $\varphi_w$}
-\xrightarrow[\text{Priorities}]{\theta}
+\xrightarrow[\text{Prizes}]{\theta}
 \fbox{Prize-collecting VSP}
 \xrightarrow[\text{Routes}]{a_t}
 ```

 **Components**:

-1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer priorities ``\theta = (\theta_1, \ldots, \theta_n)``
-2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted priorities
+1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer prizes ``\theta = (\theta_1, \ldots, \theta_n)``, one value per postponable customer.
+2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted prizes, by maximizing total collected prizes minus travel costs:
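
Reviewer note on the hunk above: the "collected prizes minus travel costs" objective can be illustrated with a toy sketch. This is not the package's solver and not part of the diff. It assumes a single vehicle, ignores time windows and any other scheduling constraints, and brute-forces every ordered subset of customers. The names `route_cost`, `objective`, and `best_single_route` are hypothetical, as are the toy matrix `d` and prizes `theta`.

```python
from itertools import permutations


def route_cost(route, d, depot=0):
    """Travel cost of depot -> route[0] -> ... -> route[-1] -> depot."""
    stops = [depot, *route, depot]
    return sum(d[i][j] for i, j in zip(stops, stops[1:]))


def objective(routes, theta, d, depot=0):
    """Total collected prizes minus total travel cost over all routes."""
    prizes = sum(theta[c] for r in routes for c in r)
    travel = sum(route_cost(r, d, depot) for r in routes)
    return prizes - travel


def best_single_route(theta, d, depot=0):
    """Brute force over ordered customer subsets (toy sizes only).

    Assumption: one vehicle and no time constraints, unlike a real VSP.
    """
    customers = list(theta)
    best_route, best_value = (), 0.0  # empty route: serve nobody
    for k in range(1, len(customers) + 1):
        for route in permutations(customers, k):
            value = objective([route], theta, d, depot)
            if value > best_value:
                best_route, best_value = route, value
    return best_route, best_value


# Hypothetical instance: location 0 is the depot, 1 and 2 are customers.
d = [
    [0, 2, 9],
    [2, 0, 4],
    [9, 4, 0],
]
theta = {1: 5.0, 2: 3.0}  # predicted prizes, one per postponable customer

route, value = best_single_route(theta, d)
print(route, value)
```

In this toy instance customer 2 is left unserved because its prize does not cover the detour, which is the postponement behaviour the prizes are meant to control.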