Commit 97c310b

Create DATA_DICTIONARY.md
1 parent fc34446 commit 97c310b

1 file changed: 81 additions, 0 deletions
data/DATA_DICTIONARY.md

@@ -0,0 +1,81 @@
# ARGOS Dataset: Data Dictionary

This data dictionary describes the fields contained in the released CSV files.

---

## 1. `synthetic_long_horizon.csv`

| Column        | Type  | Range / Units       | Description                                               |
|---------------|-------|---------------------|-----------------------------------------------------------|
| `day`         | int   | 0–364 (index)       | Simulation day index from the start of the horizon.       |
| `occupancy`   | float | 0.0–1.0 (fraction)  | Normalized occupancy rate (1.0 = fully occupied).         |
| `fatigue`     | float | 0.0–1.0 (index)     | Staff fatigue index; higher values indicate more fatigue. |
| `staff_level` | float | 0.0–1.0 (fraction)  | Normalized staffing adequacy (1.0 = fully staffed).       |
| `revpar`      | float | currency units/room | Revenue per available room (RevPAR), in arbitrary units.  |

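As a rough illustration of how the table above could be used, the sketch below checks a CSV against the documented columns and ranges using only the Python standard library. The `SCHEMA` mapping and the `validate_rows` helper are hypothetical names introduced here, the closed-interval check is an assumption about how the ranges should be read, and the sample rows are made up for the example rather than taken from the dataset. `revpar` has no documented bounds, so only its type is checked.

```python
import csv
import io

# Expected columns and closed ranges, transcribed from the table above.
# revpar has no documented bounds, so only its type is checked.
SCHEMA = {
    "day": (int, 0, 364),
    "occupancy": (float, 0.0, 1.0),
    "fatigue": (float, 0.0, 1.0),
    "staff_level": (float, 0.0, 1.0),
    "revpar": (float, None, None),
}


def validate_rows(handle):
    """Return a list of (line_number, column, message) problems; empty if clean."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    missing = [name for name in SCHEMA if name not in fieldnames]
    if missing:
        return [(1, name, "missing column") for name in missing]
    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for name, (cast, low, high) in SCHEMA.items():
            try:
                value = cast(row[name])
            except (TypeError, ValueError):
                problems.append((line_number, name, f"not a valid {cast.__name__}"))
                continue
            if low is not None and not (low <= value <= high):
                problems.append((line_number, name, f"{value} outside [{low}, {high}]"))
    return problems


# Made-up sample rows for illustration only; they are not taken from the dataset.
sample = io.StringIO(
    "day,occupancy,fatigue,staff_level,revpar\n"
    "0,0.72,0.31,0.90,101.5\n"
    "1,1.20,0.35,0.88,99.0\n"
)
problems = validate_rows(sample)
print(problems)
```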
---
## 2. `scenario_high_volatility.csv`

| Column      | Type  | Range / Units       | Description                                              |
|-------------|-------|---------------------|----------------------------------------------------------|
| `day`       | int   | 0–179 (index)       | Simulation day index.                                    |
| `occupancy` | float | 0.0–1.0 (fraction)  | Normalized occupancy with elevated variance.             |
| `fatigue`   | float | 0.0–1.0 (index)     | Staff fatigue index with elevated variance.              |
| `revpar`    | float | currency units/room | Highly variable RevPAR under volatile market conditions. |

---
## 3. `scenario_staff_shortage.csv`

| Column        | Type  | Range / Units       | Description                                             |
|---------------|-------|---------------------|---------------------------------------------------------|
| `day`         | int   | 0–179 (index)       | Simulation day index.                                   |
| `occupancy`   | float | 0.0–1.0 (fraction)  | Normalized occupancy rate.                              |
| `fatigue`     | float | 0.0–1.0 (index)     | Staff fatigue index (typically higher on average).      |
| `staff_level` | float | 0.0–1.0 (fraction)  | Normalized staffing level (typically lower on average). |
| `revpar`      | float | currency units/room | RevPAR under staff-shortage stress conditions.          |

---
## 4. `hyperparam_sweep_results.csv`

| Column             | Type  | Range / Units        | Description                                                 |
|--------------------|-------|----------------------|-------------------------------------------------------------|
| `alpha`            | float | > 0 (e.g., 0.01–0.1) | Step size used in the ARGOS optimizer.                      |
| `cag_weight`       | float | 0.0–1.0              | Weighting factor for the CAG contour-based direction.       |
| `avg_revpar`       | float | currency units/room  | Average RevPAR across the experiment horizon.               |
| `violations_tier1` | int   | ≥ 0                  | Count of Tier-1 feasibility violations (e.g., overbooking). |
| `fatigue_mean`     | float | 0.0–1.0 (index)      | Mean staff fatigue index across the horizon.                |

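One way a reader might consume the sweep results is sketched below: parse each row into the documented types, then pick the configuration with the highest `avg_revpar` among those with zero Tier-1 violations. That selection rule is an assumption made for this example, not something the dataset prescribes; `best_feasible` is a hypothetical helper, and the sample rows are made up.

```python
import csv
import io


def best_feasible(handle):
    """Return the row with the highest avg_revpar among rows with zero
    Tier-1 violations, or None if no such row exists."""
    rows = [
        {
            "alpha": float(r["alpha"]),
            "cag_weight": float(r["cag_weight"]),
            "avg_revpar": float(r["avg_revpar"]),
            "violations_tier1": int(r["violations_tier1"]),
            "fatigue_mean": float(r["fatigue_mean"]),
        }
        for r in csv.DictReader(handle)
    ]
    feasible = [r for r in rows if r["violations_tier1"] == 0]
    if not feasible:
        return None
    return max(feasible, key=lambda r: r["avg_revpar"])


# Made-up sweep rows for illustration only; they are not taken from the dataset.
sample = io.StringIO(
    "alpha,cag_weight,avg_revpar,violations_tier1,fatigue_mean\n"
    "0.01,0.2,95.0,0,0.40\n"
    "0.05,0.5,110.0,3,0.55\n"
    "0.10,0.8,102.5,0,0.47\n"
)
best = best_feasible(sample)
print(best)
```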
---
## 5. `qubo_example_matrix.csv`

Each row of the file corresponds to one row of an 8×8 QUBO matrix.

| Column            | Type | Range / Units      | Description                                                                     |
|-------------------|------|--------------------|---------------------------------------------------------------------------------|
| `col_0`–`col_7`   | int  | typically -5 to +5 | QUBO coefficients $Q_{ij}$ for the binary decision vector, one column per $j$. |

(Actual column names may be generic numeric indices depending on CSV export; they represent the columns of the QUBO matrix.)

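To make the role of the coefficients concrete, the sketch below loads a square matrix from CSV and evaluates the standard QUBO objective $x^\top Q x$ for a binary vector $x$. It assumes the file has a single header row and no index column, which the note above leaves open. The helper names `load_square_matrix` and `qubo_energy` are hypothetical, and the 3×3 matrix is a made-up stand-in for the 8×8 file, kept small for readability.

```python
import csv
import io


def load_square_matrix(handle):
    """Read a CSV with one header row into a square list-of-lists of ints."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row (col_0, col_1, ... or numeric indices)
    matrix = [[int(cell) for cell in row] for row in reader if row]
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("matrix is not square")
    return matrix


def qubo_energy(matrix, x):
    """Evaluate x^T Q x for a binary vector x."""
    n = len(matrix)
    if len(x) != n or any(bit not in (0, 1) for bit in x):
        raise ValueError("x must be a binary vector matching the matrix size")
    return sum(matrix[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Made-up 3x3 matrix for illustration only; the released file is 8x8.
sample = io.StringIO(
    "col_0,col_1,col_2\n"
    "-2,1,0\n"
    "1,-3,2\n"
    "0,2,-1\n"
)
Q = load_square_matrix(sample)
energy = qubo_energy(Q, [1, 1, 0])
print(energy)
```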
---
## 6. `multiunit_traffic_sim.csv`

| Column            | Type | Range / Units  | Description                                      |
|-------------------|------|----------------|--------------------------------------------------|
| `day`             | int  | 0–99 (index)   | Simulation day index.                            |
| `hotel_0_traffic` | int  | ≥ 0 (counts)   | Approximate booking/traffic measure for hotel 0. |
| `hotel_1_traffic` | int  | ≥ 0 (counts)   | Same measure for hotel 1.                        |
| `hotel_2_traffic` | int  | ≥ 0 (counts)   | Same measure for hotel 2.                        |
| `hotel_3_traffic` | int  | ≥ 0 (counts)   | Same measure for hotel 3.                        |
| `hotel_4_traffic` | int  | ≥ 0 (counts)   | Same measure for hotel 4.                        |

Traffic values represent relative load and are not tied to any real booking system.

---

All fields are generated synthetically and do not map to any operational KPI, property, or organization. This makes the dataset suitable for open distribution and methodological benchmarking.
