docs/src/benchmarks/dvsp.md
Lines changed: 39 additions & 26 deletions
@@ -6,7 +6,7 @@ The Dynamic Vehicle Scheduling Problem (DVSP) is a sequential decision-making pr

 ### Overview

-In the dynamic vehicle scheduling problem, a fleet operator must decide at each time step which customer requests to serve immediately and which to postpone to future time steps.
+In the dynamic vehicle scheduling problem, a fleet operator must decide at each time step which customers to serve immediately and which to postpone to future time steps.
 The goal is to serve all customers by the end of the planning horizon while minimizing total travel time.

 This is a simplified version of the more complex Dynamic Vehicle Routing Problem with Time Windows (DVRPTW), focusing on the core sequential decision-making aspects without capacity or time window constraints.
@@ -24,18 +24,18 @@ The dynamic vehicle scheduling problem can be formulated as a finite-horizon Mar
 s_t = (R_t, D_t, t)
 ```
 where:
-- ``R_t`` are the pending customer requests (not yet served), where each request ``r_i \in R_t`` contains:
+- ``R_t`` are the pending customers (not yet served), where each customer ``r_i \in R_t`` contains:
   - ``x_i, y_i``: 2D spatial coordinates of the customer location
   - ``\tau_i``: start time when the customer needs to be served
   - ``s_i``: service time required to serve the customer
-- ``D_t`` indicates which requests must be dispatched this time step (i.e. that cannot be postponed further, otherwise they will be infeasible at the next time step because of their start time)
+- ``D_t`` indicates which customers must be dispatched this time step (i.e. that cannot be postponed further, otherwise they will be infeasible at the next time step because of their start time)
 - ``t \in \{1, 2, \ldots, T\}`` is the current time step

 The state also implicitly includes (constant over time):
 - Travel duration matrix ``d_{ij}``: time to travel from location ``i`` to location ``j``
 - Depot location

-**Action Space** ``\mathcal{A}``: The action at time step ``t`` is a set of vehicle routes:
+**Action Space** ``\mathcal{A}(s_t)``: The action at time step ``t`` is a set of vehicle routes:
 ```math
 a_t = \{r_1, r_2, \ldots, r_k\}
 ```
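As a rough illustration of the state described above, the tuple ``s_t = (R_t, D_t, t)`` could be held in a small container like the one below. This is only a sketch in Python: the names `Customer`, `State`, and `postponable` are hypothetical and are not taken from the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical container for one pending customer r_i.
    x: float             # spatial coordinate x_i
    y: float             # spatial coordinate y_i
    start_time: float    # tau_i, time at which the customer needs to be served
    service_time: float  # s_i, time required to serve the customer


@dataclass
class State:
    # Hypothetical container for s_t = (R_t, D_t, t).
    pending: list[Customer]    # R_t, customers not yet served
    must_dispatch: list[bool]  # D_t, one flag per pending customer
    t: int                     # current time step

    def postponable(self) -> list[Customer]:
        # Customers that may still be postponed to a later time step.
        return [c for c, must in zip(self.pending, self.must_dispatch) if not must]


state = State(
    pending=[Customer(0.0, 1.0, 5.0, 1.0), Customer(2.0, 2.0, 9.0, 0.5)],
    must_dispatch=[True, False],
    t=1,
)
```

The travel duration matrix and the depot are left out because they are constant over time and shared by all states.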
@@ -47,7 +47,7 @@ A route is feasible if:

 **Transition Dynamics** ``\mathcal{P}(s_{t+1} | s_t, a_t)``: After executing routes ``a_t``:

-1. **Remove served customers** from the pending request set
+1. **Remove served customers** from the pending customer set
 2. **Generate new customer arrivals** according to the underlying exogenous distribution
 3. **Update must-dispatch set** based on postponement rules

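The three transition steps listed above can be sketched as a single function. This is a minimal Python illustration under stated assumptions: customers are reduced to an id and a start time, new arrivals are passed in explicitly instead of being sampled, and the postponement rule is supplied by the caller because the text does not spell it out. The name `transition` and the threshold `4.0` are hypothetical.

```python
def transition(pending, served_ids, arrivals, is_must_dispatch):
    """Apply one transition; pending and arrivals map customer id -> start time."""
    # 1. Remove served customers from the pending customer set.
    next_pending = {cid: tau for cid, tau in pending.items() if cid not in served_ids}
    # 2. Add the new customer arrivals (sampled elsewhere in a real environment).
    next_pending.update(arrivals)
    # 3. Update the must-dispatch set with the caller-supplied postponement rule.
    must_dispatch = {cid for cid, tau in next_pending.items() if is_must_dispatch(tau)}
    return next_pending, must_dispatch


pending = {1: 3.0, 2: 8.0}
next_pending, must = transition(
    pending,
    served_ids={1},
    arrivals={3: 2.5},
    # Hypothetical rule: a customer must go now if it starts before time 4.0.
    is_must_dispatch=lambda tau: tau < 4.0,
)
```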
@@ -70,7 +70,7 @@ where ``d_{ij}`` is the travel duration from location ``i`` to location ``j``, a

 The main benchmark configuration with the following parameters:

-- `max_requests_per_epoch`: Maximum number of new customer requests per time step (default: 10)
+- `max_requests_per_epoch`: Maximum number of new customers per time step (default: 10)
 - `Δ_dispatch`: Time delay between decision and vehicle dispatch (default: 1.0)
 - `epoch_duration`: Duration of each decision time step (default: 1.0)
 - `two_dimensional_features`: Whether to use simplified 2D features instead of full feature set (default: false)
@@ -82,51 +82,64 @@ Problem instances are generated from static vehicle routing datasets and include
 - **Customer locations**: Spatial coordinates for pickup/delivery points
 - **Depot location**: Central starting and ending point for all routes
 - **Travel times**: Distance/duration matrix between all location pairs
-- **Service requirements**: Time needed to serve each customer
+- **Service times**: Service time for each customer

-The dynamic version samples new customer arrivals from the static instance, drawing new customers by independently sampling their locations and service times.
+The dynamic version samples new customer arrivals from the static instance, drawing new customers by independently sampling:
+- their locations from the set of static customer locations
+- their service times, uniformly from the range of service times in the static instance

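The sampling rule described above (locations drawn from the static customer locations, service times drawn uniformly from the static range) might look roughly like this in Python. The function name `sample_new_customers` is hypothetical, the number of arrivals `n` is taken as an input because the text only gives its maximum, and start times are left out because the text does not say how they are drawn.

```python
import random


def sample_new_customers(static_locations, static_service_times, n, rng):
    """Draw n new customers as (location, service_time) pairs."""
    lo, hi = min(static_service_times), max(static_service_times)
    return [
        # Location and service time are sampled independently of each other.
        (rng.choice(static_locations), rng.uniform(lo, hi))
        for _ in range(n)
    ]


rng = random.Random(0)  # seeded only to make the sketch reproducible
locations = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
service_times = [0.5, 1.0, 2.0]
new_customers = sample_new_customers(locations, service_times, 4, rng)
```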
 ### Features

-The benchmark provides two feature representations:
-
-**Full Features** (14-dimensional):
-- Start times for postponable requests
-- End times (start + service time)
-- Travel time from depot to request
-- Travel time from request to depot
-- Slack time until next time step
-- Quantile-based travel times to other requests (9 quantiles)
+The benchmark provides two feature matrix representations, containing one column per postponable customer in the state:
+
+**Full Features** (27-dimensional):
+- Start times for postponable customers (1)
+- End times (start + service time) (2)
+- Travel time from depot to customer (3)
+- Travel time from customer to depot (4)
+- Slack time until next time step (5)
+- % of must-dispatch customers that can reach this customer on time (6)
+- % of customers reachable from this customer on time (7)
+- % of customers that can reach this customer on time (8)
+- % of customers reachable or that can reach this customer on time (9)
+- Quantile-based travel times to other customers (9 quantiles) (10-18)
+- Quantiles of % of reachable new customers (9 quantiles) (19-27)

 **2D Features** (simplified):
-- Travel time from depot to request
-- Mean travel time to other requests
+- Travel time from depot to customer (1)
+- Mean travel time to other customers (2)

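To make the simplified 2D representation concrete, the sketch below computes both features from a travel duration matrix. It is written in Python under assumptions: the function name `two_dimensional_features` is hypothetical, "other customers" is read as the other customers passed to the function, and a lone customer gets a mean of `0.0`.

```python
def two_dimensional_features(durations, depot, customers):
    """Return one (depot_to_customer, mean_to_others) pair per customer index."""
    features = []
    for i in customers:
        others = [j for j in customers if j != i]
        # Feature 2: mean travel time from customer i to the other customers.
        mean_to_others = (
            sum(durations[i][j] for j in others) / len(others) if others else 0.0
        )
        # Feature 1: travel time from the depot to customer i.
        features.append((durations[depot][i], mean_to_others))
    return features


# Index 0 is the depot, indices 1 to 3 are customers.
durations = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 0.0, 1.0, 3.0],
    [4.0, 1.0, 0.0, 5.0],
    [6.0, 3.0, 5.0, 0.0],
]
features = two_dimensional_features(durations, depot=0, customers=[1, 2, 3])
```

Each pair corresponds to one column of the feature matrix, so three customers give three columns of two values each.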
 ## Benchmark Policies

 ### Lazy Policy

-The lazy policy postpones all possible requests, serving only those that must be dispatched.
+The lazy policy postpones all possible customers, serving only those that must be dispatched.

 ### Greedy Policy

-The greedy policy serves all pending requests as soon as they arrive, without considering future consequences.
+The greedy policy serves all pending customers as soon as they arrive, without considering future consequences.

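The two baseline policies differ only in which customers they select at each time step. The Python sketch below shows that selection; the function names are hypothetical, and building actual routes for the selected customers is a separate step that is not shown.

```python
def lazy_policy(pending, must_dispatch):
    # Serve only the customers that cannot be postponed any further.
    return [c for c in pending if c in must_dispatch]


def greedy_policy(pending, must_dispatch):
    # Serve every pending customer right away, postponing nothing.
    return list(pending)


pending = ["a", "b", "c"]
must_dispatch = {"b"}
lazy_selection = lazy_policy(pending, must_dispatch)
greedy_selection = greedy_policy(pending, must_dispatch)
```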
 ## Decision-Focused Learning Policy

 ```math
 \xrightarrow[\text{State}]{s_t}
 \fbox{Neural network $\varphi_w$}
-\xrightarrow[\text{Priorities}]{\theta}
+\xrightarrow[\text{Prizes}]{\theta}
 \fbox{Prize-collecting VSP}
 \xrightarrow[\text{Routes}]{a_t}
 ```

 **Components**:

-1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer priorities ``\theta = (\theta_1, \ldots, \theta_n)``
-2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted priorities
+1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer prizes ``\theta = (\theta_1, \ldots, \theta_n)``, one value per postponable customer.
+2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted prizes, by maximizing total collected prizes minus travel costs: