| 1 | +# Requests Generator |
| 2 | + |
| 3 | +This document describes the design of the **requests generator**, which models a stream of user requests to a given endpoint over time. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Model Inputs and Output |
| 8 | + |
| 9 | +Following the FastSim philosophy, we accept a small set of input parameters to drive a “what-if” analysis in a pre-production environment. These inputs let you explore reliability and cost implications under different traffic scenarios. |
| 10 | + |
| 11 | +**Inputs** |
| 12 | + |
| 13 | +1. **Average concurrent users** – expected number of users (or sessions) simultaneously hitting the endpoint. |
| 14 | +2. **Average requests per minute per user** – average number of requests each user issues per minute. |
| 15 | +3. **Simulation time** – total duration of the simulation, in seconds. |
| 16 | + |
| 17 | +**Output** |
| 18 | +A sequence of continuous-valued timestamps (in seconds), one per request arrival. |
| 19 | + |
| 20 | +--- |
| 21 | + |
| 22 | +## Model Assumptions |
| 23 | + |
| 24 | +* *Concurrent users* and *requests per minute per user* are **random variables**. |
| 25 | +* *Simulation time* is **deterministic**. |
| 26 | + |
| 27 | +We model: |
| 28 | + |
| 29 | +* **Requests per minute per user** as Poisson($\lambda_r$). |
| 30 | +* **Concurrent users** as either Poisson($\lambda_u$) or truncated Normal. |
| 31 | +* The two variables as **independent** of each other. |
| 32 | + |
```python
from pydantic import BaseModel
from typing import Literal


class RVConfig(BaseModel):
    """Configure a random-variable parameter."""

    mean: float
    distribution: Literal["poisson", "normal", "gaussian"] = "poisson"
    variance: float | None = None  # required only for normal/gaussian


class SimulationInput(BaseModel):
    """Define simulation inputs."""

    avg_active_users: RVConfig
    avg_request_per_minute_per_user: RVConfig
    total_simulation_time: int | None = None  # seconds
```
| 49 | + |
| 50 | +--- |
| 51 | + |
| 52 | +## Aggregate Request Rate |
| 53 | + |
| 54 | +From the two random inputs we define the **per-second aggregate rate** $\Lambda$: |
| 55 | + |
| 56 | +$$ |
| 57 | +\Lambda |
| 58 | + = \text{concurrent\_users} |
| 59 | + \;\times\; |
| 60 | + \frac{\text{requests\_per\_minute\_per\_user}}{60} |
| 61 | + \quad[\text{requests/s}]. |
| 62 | +$$ |
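As a quick sanity check on the units, the conversion can be sketched as follows. The function name `aggregate_rate` is hypothetical and not part of FastSim; it simply applies the formula above to one realization of the two inputs.

```python
def aggregate_rate(concurrent_users: float, requests_per_minute_per_user: float) -> float:
    """Return the aggregate arrival rate Λ in requests per second."""
    return concurrent_users * requests_per_minute_per_user / 60.0


# 100 concurrent users, each issuing 30 requests per minute on average
rate = aggregate_rate(100, 30)
print(f"{rate:.1f} req/s")  # 50.0 req/s
```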
| 63 | + |
| 64 | +--- |
| 65 | + |
| 66 | +## 1. Poisson → Exponential Refresher |
| 67 | + |
| 68 | +### 1.1 Homogeneous Poisson process |
| 69 | + |
| 70 | +A Poisson process of rate $\lambda$ has |
| 71 | + |
| 72 | +$$ |
| 73 | +\Pr\{N(t)=k\} |
| 74 | + = e^{-\lambda t}\,\frac{(\lambda t)^{k}}{k!},\quad k=0,1,2,\dots |
| 75 | +$$ |
| 76 | + |
| 77 | +### 1.2 Waiting time to first event |
| 78 | + |
| 79 | +Define $T_1=\inf\{t>0:N(t)=1\}$. |
| 80 | +The survival function is |
| 81 | + |
| 82 | +$$ |
| 83 | +\Pr\{T_1>t\} |
| 84 | + = \Pr\{N(t)=0\} |
| 85 | + = e^{-\lambda t}, |
| 86 | +$$ |
| 87 | + |
| 88 | +so the CDF is |
| 89 | + |
| 90 | +$$ |
| 91 | +F_{T_1}(t) = 1 - e^{-\lambda t},\quad t\ge0, |
| 92 | +$$ |
| 93 | + |
| 94 | +and the density $f(t)=\lambda\,e^{-\lambda t}$. Thus |
| 95 | + |
| 96 | +$$ |
| 97 | +T_1 \sim \mathrm{Exp}(\lambda), |
| 98 | +$$ |
| 99 | + |
| 100 | +and, because a Poisson process has independent and stationary increments, every inter-arrival gap $\Delta t_i$ is i.i.d. $\mathrm{Exp}(\lambda)$. |
| 101 | + |
| 102 | +### 1.3 Inverse-CDF sampling |
| 103 | + |
| 104 | +To draw $\Delta t\sim\mathrm{Exp}(\lambda)$: |
| 105 | + |
| 106 | +1. Sample $U\sim\mathcal U(0,1)$. |
| 107 | +2. Solve $U=1-e^{-\lambda\,\Delta t}$ for $\Delta t$, which gives $\Delta t=-\ln(1-U)/\lambda$. |
| 108 | +3. Because $1-U$ is also uniform on $(0,1)$, an equivalent compact form is |
| 109 | + $\displaystyle \Delta t = -\,\ln(U)/\lambda$. |
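The recipe above can be sketched with the standard library alone. The helper name `sample_exponential` is an assumption made for illustration; the snippet uses the $-\ln(1-U)$ form because `random.Random.random()` returns values in $[0, 1)$, which keeps the argument of the logarithm strictly positive.

```python
import math
import random


def sample_exponential(rate: float, rng: random.Random) -> float:
    """Draw one Exp(rate) variate by inverse-CDF sampling."""
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    u = rng.random()  # uniform on [0, 1)
    # 1 - u lies in (0, 1], so the logarithm is always finite
    return -math.log(1.0 - u) / rate


rng = random.Random(42)
gaps = [sample_exponential(5.0, rng) for _ in range(100_000)]
mean_gap = sum(gaps) / len(gaps)  # should be close to 1 / 5 = 0.2
```

With 100,000 draws the sample mean should sit very close to the theoretical mean $1/\lambda$.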
| 110 | + |
| 111 | +--- |
| 112 | + |
| 113 | +## 2. Poisson × Poisson Workload |
| 114 | + |
| 115 | +### 2.1 Notation |
| 116 | + |
| 117 | +| Symbol | Meaning | Law | |
| 118 | +| --------------------------------- | --------------------------------------- | -------- | |
| 119 | +| $U\sim\mathrm{Pois}(\lambda_u)$ | active users in current 1-minute window | Poisson | |
| 120 | +| $R_i\sim\mathrm{Pois}(\lambda_r)$ | requests per minute by user *i* | Poisson | |
| 121 | +| $N=\sum_{i=1}^U R_i$ | total requests in that minute | compound | |
| 122 | +| $\Lambda=U\,\lambda_r/60$ | aggregate rate given $U$ (requests / second) | scaled Poisson | |
| 123 | + |
| 124 | +The procedure here relies heavily on the independence of our random variables. |
| 125 | + |
| 126 | +### 2.2 Conditional sum ⇒ Poisson |
| 127 | + |
| 128 | +Given $U=u$: |
| 129 | + |
| 130 | +$$ |
| 131 | +N\mid U=u |
| 132 | +=\sum_{i=1}^{u}R_i |
| 133 | +\;\sim\;\mathrm{Pois}(u\,\lambda_r). |
| 134 | +$$ |
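The closure of the Poisson family under convolution can be checked numerically. The sketch below (helper names `poisson_pmf` and `convolve` are illustrative, and the values $\lambda_r = 2$, $u = 3$ are arbitrary) convolves the PMF of a single $R_i$ with itself and compares the result with the closed-form $\mathrm{Pois}(u\,\lambda_r)$ PMF.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Pr{X = k} for X ~ Pois(lam), with lam > 0, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def convolve(p: list[float], q: list[float]) -> list[float]:
    """PMF of the sum of two independent counts, for the first len(p) values."""
    n = len(p)
    return [sum(p[j] * q[k - j] for j in range(k + 1)) for k in range(n)]


lam_r, u, support = 2.0, 3, 40
single = [poisson_pmf(k, lam_r) for k in range(support)]

# PMF of R_1 + R_2 + R_3 by repeated convolution
total = single
for _ in range(u - 1):
    total = convolve(total, single)

closed_form = [poisson_pmf(k, u * lam_r) for k in range(support)]
max_abs_error = max(abs(a - b) for a, b in zip(total, closed_form))
```

The two PMFs should agree up to floating-point rounding on every point of the evaluated support.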
| 135 | + |
| 136 | +### 2.3 Unconditional law of $N$ |
| 137 | + |
| 138 | +By the law of total probability: |
| 139 | + |
| 140 | +$$ |
| 141 | +\Pr\{N=n\} |
| 142 | +=\sum_{u=0}^{\infty} |
| 143 | +\Pr\{U=u\}\; |
| 144 | +\Pr\{N=n\mid U=u\} |
| 145 | +\;=\; |
| 146 | +e^{-\lambda_u}\,\frac1{n!} |
| 147 | +\sum_{u=0}^{\infty} |
| 148 | +\frac{\lambda_u^u}{u!}\, |
| 149 | +e^{-u\lambda_r}\,(u\lambda_r)^n. |
| 150 | +$$ |
| 151 | + |
| 152 | +This is the **Poisson–Poisson compound** distribution, also known as the Neyman Type A distribution. |
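The series can be evaluated directly by truncating the sum over $u$. The following is a minimal sketch, assuming illustrative values $\lambda_u = 3$ and $\lambda_r = 2$ and a hypothetical helper `compound_pmf`. By the laws of total expectation and total variance, the result should have mean $\lambda_u\lambda_r$ and variance $\lambda_u\lambda_r(1+\lambda_r)$.

```python
import math


def compound_pmf(n: int, lam_u: float, lam_r: float, max_users: int = 200) -> float:
    """Pr{N = n} from the series above, truncated at max_users terms."""
    # u = 0 contributes only to n = 0, because an empty sum is zero
    total = math.exp(-lam_u) if n == 0 else 0.0
    for u in range(1, max_users + 1):
        log_prob_u = -lam_u + u * math.log(lam_u) - math.lgamma(u + 1)
        log_prob_n_given_u = -u * lam_r + n * math.log(u * lam_r) - math.lgamma(n + 1)
        total += math.exp(log_prob_u + log_prob_n_given_u)
    return total


lam_u, lam_r = 3.0, 2.0
pmf = [compound_pmf(n, lam_u, lam_r) for n in range(120)]
mass = sum(pmf)
mean = sum(n * p for n, p in enumerate(pmf))
variance = sum(n * n * p for n, p in enumerate(pmf)) - mean**2
```

Working in log space avoids overflow in $\lambda_u^u$ and $(u\lambda_r)^n$ for large indices. The cost of the double loop is what the hierarchical sampler avoids.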
| 153 | + |
| 154 | +--- |
| 155 | + |
| 156 | +## 3. Exact Hierarchical Sampler |
| 157 | + |
| 158 | +Rather than invert the discrete CDF above, we exploit the conditional structure: |
| 159 | + |
```python
# Hierarchical sampler: body of a generator function.
# Assumes `import math`, an `rng` object, and the project's helper functions.
now = 0.0         # virtual clock (s)
window_end = 0.0  # end of the current user window
Lambda = 0.0      # aggregate rate Λ (req/s)

while now < simulation_time:
    # (Re)sample U at the start of each window
    if now >= window_end:
        window_end = now + float(sampling_window_s)
        users = poisson_variable_generator(mean_concurrent_user, rng)
        Lambda = users * mean_req_per_sec_per_user

    # No users → fast-forward to next window
    if Lambda <= 0.0:
        now = window_end
        continue

    # Exponential gap from a protected uniform value
    u_raw = max(uniform_variable_generator(rng), 1e-15)
    delta_t = -math.log(1.0 - u_raw) / Lambda

    # If the gap crosses the window boundary, discard it and jump to the
    # boundary; memorylessness makes the restart exact. This check must
    # precede the horizon check, or a long gap drawn in a low-rate window
    # would end the simulation before later windows are sampled.
    if now + delta_t >= window_end:
        now = window_end
        continue

    # End simulation if the next event exceeds the horizon
    if now + delta_t > simulation_time:
        break

    now += delta_t
    yield delta_t
```
| 194 | + |
| 195 | +Because each conditional step matches the exact Poisson→Exponential law, this two-stage algorithm reproduces the same joint distribution as analytically inverting the compound CDF, but with minimal computation. |
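For experimentation outside the project, the same idea can be sketched as a self-contained generator. This is not the FastSim implementation: the names `poisson_draw` and `arrival_times` are hypothetical, the Poisson draw uses Knuth's multiplication method (adequate for small means only), and the generator yields absolute timestamps, matching the output described at the top of this document, rather than gaps.

```python
import math
import random
from collections.abc import Iterator


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def arrival_times(
    mean_users: float,
    req_per_min_per_user: float,
    horizon_s: float,
    window_s: float = 60.0,
    seed: int = 0,
) -> Iterator[float]:
    """Yield absolute arrival timestamps (seconds) in [0, horizon_s]."""
    rng = random.Random(seed)
    rate_per_user = req_per_min_per_user / 60.0
    now, window_end, rate = 0.0, 0.0, 0.0
    while now < horizon_s:
        if now >= window_end:  # new window: redraw the user count
            window_end = now + window_s
            rate = poisson_draw(mean_users, rng) * rate_per_user
        if rate <= 0.0:  # nobody online: skip to the next window
            now = window_end
            continue
        gap = -math.log(1.0 - rng.random()) / rate
        if now + gap >= window_end:  # gap crosses the boundary: restart there
            now = window_end
            continue
        if now + gap > horizon_s:
            break
        now += gap
        yield now


# 10 users on average, 30 requests per minute each, over 10 minutes
times = list(arrival_times(mean_users=10, req_per_min_per_user=30, horizon_s=600))
```

With these parameters the expected rate is about 5 requests per second, so roughly 3,000 arrivals are expected over the 600-second horizon, with substantial run-to-run variation because the user count is redrawn every window.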
| 196 | + |
| 197 | +--- |
| 198 | + |
| 199 | +## 4. Validity of the Hierarchical Sampler |
| 200 | + |
| 201 | +The validity of the hierarchical sampler relies on a structural property of the model: |
| 202 | + |
| 203 | +$$ |
| 204 | +N \;=\; \sum_{i=1}^{U} R_i, |
| 205 | +$$ |
| 206 | + |
| 207 | +where each $R_i \sim \mathrm{Pois}(\lambda_r)$ is independent of the others and of $U$. Because the Poisson family is closed under convolution, |
| 208 | + |
| 209 | +$$ |
| 210 | +N \,\big|\, U=u \;\sim\; \mathrm{Pois}\!\bigl(u\,\lambda_r\bigr). |
| 211 | +$$ |
| 212 | + |
| 213 | +This result has two important consequences: |
| 214 | + |
| 215 | +1. **Deterministic conditional rate** – Given $U=u$, the aggregate request arrivals constitute a homogeneous Poisson process with the *deterministic* rate |
| 216 | + |
| 217 | + $$ |
| 218 | + \Lambda = \frac{u\,\lambda_r}{60}. |
| 219 | + $$ |
| 220 | + |
| 221 | + All inter-arrival gaps are therefore i.i.d. exponential with parameter $\Lambda$, allowing us to use the standard inverse–CDF formula for each gap. |
| 222 | + |
| 223 | +2. **Layered uncertainty handling** – The randomness associated with $U$ is handled in an outer step (sampling $U$ once per window), while the inner step leverages the well-known Poisson→Exponential correspondence. Given $U=u$, the number of arrivals in a one-minute window is $\mathrm{Pois}(u\,\lambda_r)$, so mixing over $U$ reproduces exactly the compound law of $N$ derived in Section 2.3, without ever evaluating that series. |
| 224 | + |
| 225 | +If the total count could **not** be written as a sum of independent Poisson variables, the conditional distribution of $N$ would no longer be Poisson and the exponential-gap shortcut would not apply. In that situation one would need to work directly with the (generally more complex) mixed distribution of $\Lambda$ or adopt another specialized sampling scheme. |
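The first consequence can be checked empirically: with a fixed user count, counting exponential-gap arrivals per window should give a Poisson count, whose mean and variance coincide. The sketch below assumes illustrative values ($u = 4$, $\lambda_r = 3$, 60-second windows) and a hypothetical helper `count_arrivals`.

```python
import math
import random


def count_arrivals(rate: float, window_s: float, rng: random.Random) -> int:
    """Count exponential-gap arrivals that land inside one window."""
    t, count = 0.0, 0
    while True:
        t += -math.log(1.0 - rng.random()) / rate
        if t > window_s:
            return count
        count += 1


u, lam_r = 4, 3.0  # 4 users, 3 requests per minute each
rate = u * lam_r / 60.0  # Λ in requests per second
rng = random.Random(7)
counts = [count_arrivals(rate, 60.0, rng) for _ in range(20_000)]

mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
# Both should be close to u * lam_r = 12, as expected for Pois(12)
```

A variance-to-mean ratio near 1 is consistent with the Poisson law; a ratio well above 1 would indicate the over-dispersion that appears once $U$ is also random.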
| 226 | + |
| 227 | +--- |
| 228 | + |
| 229 | +## 5. Equivalence to CDF Inversion |
| 230 | + |
| 231 | +By the law of total probability, for any event set $A$: |
| 232 | + |
| 233 | +$$ |
| 234 | +\Pr\{(\Lambda,\Delta t_1,\dots)\in A\} |
| 235 | +=\sum_{u=0}^\infty |
| 236 | +\Pr\{U=u\}\; |
| 237 | +\Pr\{(\Lambda,\Delta t_1,\dots)\in A\mid U=u\}. |
| 238 | +$$ |
| 239 | + |
| 240 | +The outer step samples $U$ from $\Pr\{U=u\}$; the inner step samples the exponential gaps conditional on $U=u$. Because these two factors exactly match the mixture definition of the compound CDF, the hierarchical sampler **is** an exact implementation of two-stage CDF inversion, avoiding any explicit inversion of an infinite series. |
| 241 | + |
| 242 | +--- |
| 243 | + |
| 244 | +## 6. Gaussian × Poisson Variant |
| 245 | + |
| 246 | +If concurrent users follow a truncated Normal, |
| 247 | + |
| 248 | +$$ |
| 249 | +U\sim \max\{0,\;\mathcal N(\mu_u,\sigma_u^2)\}, |
| 250 | +$$ |
| 251 | + |
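| 252 | +the inner gap-sampling step remains unchanged; only the outer step changes, drawing $U$ from a Normal law clamped at zero instead of a Poisson law. The mixing distribution is no longer Poisson, yet the hierarchical sampler remains exact, because the gaps are still exponential given $U=u$. |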
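A minimal sketch of the outer step for this variant is shown below. The function name `draw_users_gaussian` is hypothetical, and rounding to the nearest integer is an assumption about how a continuous draw becomes a user count.

```python
import random


def draw_users_gaussian(mu: float, sigma: float, rng: random.Random) -> int:
    """Draw a non-negative integer user count from N(mu, sigma^2), clamped at zero."""
    return max(0, round(rng.gauss(mu, sigma)))


rng = random.Random(1)
draws = [draw_users_gaussian(50.0, 10.0, rng) for _ in range(50_000)]
mean_users = sum(draws) / len(draws)  # close to 50, since negatives are very rare here
```

When $\mu_u$ is several standard deviations above zero, clamping almost never triggers and the sample mean stays close to $\mu_u$.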
| 253 | + |
| 254 | +--- |
| 255 | + |
| 256 | +## 7. Time Window |
| 257 | + |
| 258 | +The sampling window length governs how often we re-sample $U$. It should reflect the timescale over which user count fluctuations become significant. Our default is **60 s**, but you can adjust this parameter in your configuration before each simulation. |
| 259 | + |
| 260 | +--- |
| 261 | + |
| 262 | +## Limitations of the Requests Model |
| 263 | + |
| 264 | +1. **Independence assumption** |
| 265 | + Assumes per-user streams and $U$ are independent. Real traffic often exhibits user-behavior correlations (e.g., flash crowds). |
| 266 | + |
| 267 | +2. **Exponential inter-arrival times** |
| 268 | + Implies memorylessness; cannot capture self-throttling or long-range dependence found in real workloads. |
| 269 | + |
| 270 | +3. **No diurnal/trend component** |
| 271 | + User count $U$ is drawn i.i.d. across windows. To model seasonality or trends, you must vary $\lambda_u(t)$ externally. |
| 272 | + |
| 273 | +4. **No burst-control or rate-limiting** |
| 274 | + Does not simulate client-side throttling or server back-pressure. Any rate-limit logic must be added externally. |
| 275 | + |
| 276 | +5. **Gaussian truncation artifacts** |
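| 277 | + In the Gaussian–Poisson variant, clamping negative draws to zero and rounding to an integer distort the user-count distribution (the mean shifts upward when $\mu_u$ is small relative to $\sigma_u$), and the light Gaussian tails can under-estimate extreme user counts. |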
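As an illustration of limitation 3, a time-varying mean $\lambda_u(t)$ could be supplied from outside the generator, for example by recomputing the mean user count at the start of each window. The sinusoidal shape, the function name `diurnal_mean_users`, and all parameter values below are assumptions chosen for illustration, not part of the model.

```python
import math


def diurnal_mean_users(
    t_s: float, base: float, amplitude: float, period_s: float = 86_400.0
) -> float:
    """Hypothetical λ_u(t): a sinusoid around `base`, clamped at zero."""
    return max(0.0, base + amplitude * math.sin(2.0 * math.pi * t_s / period_s))


# Mean user count to use for the window starting at each hour of one day
hourly_means = [
    diurnal_mean_users(h * 3600.0, base=100.0, amplitude=40.0) for h in range(24)
]
```

Each value would replace the constant mean passed to the user-count draw for the corresponding window, leaving the gap-sampling step untouched.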
| 278 | + |
| 279 | + |
| 280 | +**Key takeaway:** By structuring the generator as |
| 281 | +$\Lambda = U\,\lambda_r/60$ with a two-stage Poisson→Exponential sampler, FastSim efficiently reproduces compound Poisson traffic dynamics without any complex CDF inversion. |