
Commit 5c6f5db

GioeleB00 and Copilot authored

Features/event generator documentation test improvements (#2)

* moving file to correct folder
* Documentation added
* Update requests_generator.md
* test Added, introduced constants for the sampling window
* Update tests/unit/sampler/test_poisson_posson.py

Co-authored-by: Copilot <[email protected]>
1 parent 4a7081e commit 5c6f5db

File tree

14 files changed (+784, -126 lines)


documentation/FASTSIM_VISION.md

Lines changed: 41 additions & 0 deletions
## 1 Why FastSim?

FastAPI + Uvicorn gives Python teams a lightning-fast async stack, yet sizing it for production still means guesswork, costly cloud load tests, or late surprises. **FastSim** fills that gap by becoming a **digital twin** of your actual service:

* It **replays** your FastAPI + Uvicorn event-loop behavior in SimPy, generating exactly the same kinds of asynchronous steps (parsing, CPU work, I/O, LLM calls) that happen in real code.
* It **models** your infrastructure primitives—CPU cores (via a SimPy `Resource`), database pools, rate-limiters, even GPU inference quotas—so you can see queue lengths, scheduling delays, resource utilization, and end-to-end latency.
* It **outputs** the very metrics you’d scrape in production (p50/p95/p99 latency, ready-queue lag, current & max concurrency, throughput, cost per LLM call), but entirely offline, in seconds.

With FastSim you can ask *“What happens if traffic doubles on Black Friday?”*, *“How many cores do we need to keep p95 < 100 ms?”*, or *“Is our LLM-driven endpoint ready for prime time?”*—and get quantitative answers **before** you deploy.

**Outcome:** data-driven capacity planning, early performance tuning, and far fewer “surprises” once you hit production.

---

## 2 Project Goals
| # | Goal | Practical Outcome |
| - | ------------------------- | ------------------------------------------------------------------------ |
| 1 | **Pre-production sizing** | Know the core count, pool size, and replica count needed to hit your SLA. |
| 2 | **Scenario lab** | Explore traffic models, endpoint mixes, latency distributions, RTT, etc. |
| 3 | **Twin metrics** | Produce the same metrics you’ll scrape in prod (latency, queue, CPU). |
| 4 | **Rapid iteration** | One YAML/JSON config or REST call → full report. |
| 5 | **Educational value** | Visualise how GIL lag, queue length, and concurrency react to load. |

---
## 3 Who benefits & why (detailed)
| Audience | Pain-point solved | FastSim value |
| ------------------------------ | --------------------------------------------------------- | ------------- |
| **Backend engineers** | Unsure whether a 4 vCPU container survives a marketing spike | Run *what-if* load, tweak CPU cores / pool size, get p95 & max-concurrency before merging. |
| **DevOps / SRE** | Guesswork in capacity planning; cost of over-provisioning | Simulate 1 → N replicas, autoscaler thresholds, DB-pool size; pick the cheapest config meeting SLA. |
| **ML / LLM product teams** | LLM inference cost & latency hard to predict | Model the LLM step with a price + latency distribution; estimate \$/req and GPU batch gains without a real GPU. |
| **Educators / Trainers** | Students struggle to “see” event-loop internals | Visualise GIL ready-queue lag, CPU vs I/O steps, and the effect of blocking code—perfect for live demos and labs. |
| **Consultants / Architects** | Need a quick PoC of new designs for clients | Drop endpoint definitions in YAML and demo throughput / latency under projected load in minutes. |
| **Open-source community** | Lacks a lightweight Python simulator for ASGI workloads | Extensible codebase; easy to plug in new resources (rate-limit, cache) or traffic models (spike, uniform ramp). |
| **System-design interviewees** | Hard to quantify trade-offs in whiteboard interviews | Prototype real-time metrics—queue lengths, concurrency, latency distributions—to demonstrate how your design scales and where bottlenecks lie. |

---
**Bottom-line:** FastSim turns abstract architecture diagrams into concrete numbers—*before* spinning up expensive cloud environments—so you can build, validate and discuss your designs with full confidence.
Lines changed: 281 additions & 0 deletions
# Requests Generator

This document describes the design of the **requests generator**, which models a stream of user requests to a given endpoint over time.

---

## Model Inputs and Output

Following the FastSim philosophy, we accept a small set of input parameters to drive a “what-if” analysis in a pre-production environment. These inputs let you explore reliability and cost implications under different traffic scenarios.

**Inputs**

1. **Average concurrent users** – expected number of users (or sessions) simultaneously hitting the endpoint.
2. **Average requests per minute per user** – average number of requests each user issues per minute.
3. **Simulation time** – total duration of the simulation, in seconds.

**Output**

A continuous sequence of timestamps (seconds) marking individual request arrivals.
---
## Model Assumptions

* *Concurrent users* and *requests per minute per user* are **random variables**.
* *Simulation time* is **deterministic**.

We model:

* **Requests per minute per user** as Poisson($\lambda_r$).
* **Concurrent users** as either Poisson($\lambda_u$) or a truncated Normal.
* The two random variables as **independent**.
```python
from pydantic import BaseModel
from typing import Literal


class RVConfig(BaseModel):
    """Configure a random-variable parameter."""

    mean: float
    distribution: Literal["poisson", "normal", "gaussian"] = "poisson"
    variance: float | None = None  # required only for normal/gaussian


class SimulationInput(BaseModel):
    """Define simulation inputs."""

    avg_active_users: RVConfig
    avg_request_per_minute_per_user: RVConfig
    total_simulation_time: int | None = None  # seconds; defaulted during validation
    user_sampling_window: int | None = None   # seconds; see "Time Window" below
```
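For illustration, a complete input could be built as follows. The numeric values are arbitrary example choices; per the sampler code, omitted time fields are filled in during validation from the `TimeDefaults` constants.

```python
# Hypothetical example values: ~30 concurrent users on average,
# each issuing ~12 requests per minute, simulated for one hour.
config = SimulationInput(
    avg_active_users=RVConfig(mean=30, distribution="poisson"),
    avg_request_per_minute_per_user=RVConfig(mean=12),
    total_simulation_time=3_600,   # seconds
    user_sampling_window=60,       # re-sample U every 60 s
)
```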
---

## Aggregate Request Rate

From the two random inputs we define the **per-second aggregate rate** $\Lambda$:

$$
\Lambda
= \text{concurrent\_users}
\;\times\;
\frac{\text{requests\_per\_minute\_per\_user}}{60}
\quad[\text{requests/s}].
$$

For example, 30 concurrent users each issuing 12 requests per minute give $\Lambda = 30 \times 12 / 60 = 6$ requests/s.

---
## 1. Poisson → Exponential Refresher

### 1.1 Homogeneous Poisson process

A Poisson process of rate $\lambda$ has

$$
\Pr\{N(t)=k\}
= e^{-\lambda t}\,\frac{(\lambda t)^{k}}{k!},\quad k=0,1,2,\dots
$$

### 1.2 Waiting time to first event

Define $T_1=\inf\{t>0:N(t)=1\}$.
The survival function is

$$
\Pr\{T_1>t\}
= \Pr\{N(t)=0\}
= e^{-\lambda t},
$$

so the CDF is

$$
F_{T_1}(t) = 1 - e^{-\lambda t},\quad t\ge0,
$$

and the density is $f(t)=\lambda\,e^{-\lambda t}$. Thus

$$
T_1 \sim \mathrm{Exp}(\lambda),
$$

and by memorylessness every inter-arrival gap $\Delta t_i$ is i.i.d. $\mathrm{Exp}(\lambda)$.

### 1.3 Inverse-CDF sampling

To draw $\Delta t\sim\mathrm{Exp}(\lambda)$:

1. Sample $U\sim\mathcal U(0,1)$.
2. Solve $U=1-e^{-\lambda\,\Delta t}$, which gives $\Delta t=-\ln(1-U)/\lambda$.
3. Equivalently, since $1-U$ is also uniform on $(0,1)$: $\displaystyle \Delta t = -\,\ln(U)/\lambda$.
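As a quick standalone illustration (not part of the FastSim codebase), the inverse-CDF step can be sanity-checked with NumPy; `lam` is an arbitrary example rate:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
lam = 5.0                        # example rate: 5 events per second

u = rng.uniform(size=100_000)    # U ~ Uniform(0, 1)
gaps = -np.log(1.0 - u) / lam    # inverse-CDF transform -> Exp(lam)

# The empirical mean of the gaps should be close to 1 / lam = 0.2 s.
print(gaps.mean())
```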
---
## 2. Poisson × Poisson Workload

### 2.1 Notation

| Symbol | Meaning | Law |
| --------------------------------- | --------------------------------------- | -------- |
| $U\sim\mathrm{Pois}(\lambda_u)$ | active users in the current 1-minute window | Poisson |
| $R_i\sim\mathrm{Pois}(\lambda_r)$ | requests per minute by user *i* | Poisson |
| $N=\sum_{i=1}^U R_i$ | total requests in that minute | compound |
| $\Lambda=N/60$ | aggregate rate (requests / second) | compound |

The procedure below relies heavily on the independence of these random variables.
### 2.2 Conditional sum ⇒ Poisson

Given $U=u$:

$$
N\mid U=u
=\sum_{i=1}^{u}R_i
\;\sim\;\mathrm{Pois}(u\,\lambda_r).
$$

### 2.3 Unconditional law of $N$

By the law of total probability:

$$
\Pr\{N=n\}
=\sum_{u=0}^{\infty}
\Pr\{U=u\}\;
\Pr\{N=n\mid U=u\}
\;=\;
e^{-\lambda_u}\,\frac{1}{n!}
\sum_{u=0}^{\infty}
\frac{\lambda_u^u}{u!}\,
e^{-u\lambda_r}\,(u\lambda_r)^n.
$$

This is the **Poisson–Poisson compound** distribution (also known as the Neyman Type A distribution).
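The compound pmf above can be evaluated numerically by truncating the series over $u$ and compared against a hierarchical Monte Carlo draw. The sketch below is illustrative only (arbitrary parameters, not FastSim code):

```python
import math
import numpy as np

lambda_u, lambda_r = 5.0, 3.0   # example parameters


def compound_pmf(n: int, terms: int = 60) -> float:
    """Pr{N = n} for the Poisson-Poisson compound, truncating the sum over u."""
    series = sum(
        (lambda_u ** u) / math.factorial(u)
        * math.exp(-u * lambda_r) * (u * lambda_r) ** n
        for u in range(terms)
    )
    return math.exp(-lambda_u) * series / math.factorial(n)


# Hierarchical Monte Carlo: draw U, then N | U ~ Pois(U * lambda_r).
rng = np.random.default_rng(seed=1)
users = rng.poisson(lambda_u, size=200_000)
samples = rng.poisson(users * lambda_r)

n = 10
print(compound_pmf(n), np.mean(samples == n))   # the two values should agree
```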
---
## 3. Exact Hierarchical Sampler

Rather than invert the discrete CDF above, we exploit the conditional structure:
```python
import math

# Hierarchical sampler snippet: the core loop of `poisson_poisson_sampling`.
# `simulation_time`, `user_sampling_window`, `mean_concurrent_user` (λ_u) and
# `mean_req_per_sec_per_user` (λ_r / 60) come from the validated
# SimulationInput; `poisson_variable_generator` and `uniform_variable_generator`
# are the project's sampler helpers and `rng` is a NumPy Generator.

now = 0.0          # virtual clock (s)
window_end = 0.0   # end of the current user window
Lambda = 0.0       # aggregate rate Λ (req/s)

while now < simulation_time:
    # (Re)sample U at the start of each window
    if now >= window_end:
        window_end = now + float(user_sampling_window)
        users = poisson_variable_generator(mean_concurrent_user, rng)
        Lambda = users * mean_req_per_sec_per_user

    # No users → fast-forward to the next window
    if Lambda <= 0.0:
        now = window_end
        continue

    # Exponential gap from a protected uniform value
    u_raw = max(uniform_variable_generator(rng), 1e-15)
    delta_t = -math.log(1.0 - u_raw) / Lambda

    # End the simulation if the next event exceeds the horizon
    if now + delta_t > simulation_time:
        break

    # If the gap crosses the window boundary, jump to it
    if now + delta_t >= window_end:
        now = window_end
        continue

    now += delta_t
    yield delta_t
```
Because each conditional step matches the exact Poisson→Exponential law, this two-stage algorithm reproduces the same joint distribution as analytically inverting the compound CDF, but with minimal computation.
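Since the generator yields inter-arrival gaps, producing the absolute arrival timestamps promised in the Output section is a one-line accumulation. A minimal usage sketch, assuming a `SimulationInput` named `config` as built earlier and an import path inferred from the repository layout (`src/app/...`):

```python
import itertools

from app.core.event_samplers.poisson_poisson import poisson_poisson_sampling

# Each yielded value is an inter-arrival gap; accumulate to get absolute
# arrival timestamps (seconds since the start of the simulation).
gaps = poisson_poisson_sampling(config)
arrival_times = list(itertools.accumulate(gaps))
print(arrival_times[:5])
```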
---
## 4. Validity of the hierarchical sampler
The validity of the hierarchical sampler relies on a structural property of the model:

$$
N \;=\; \sum_{i=1}^{U} R_i,
$$

where each $R_i \sim \mathrm{Pois}(\lambda_r)$ is independent of the others and of $U$. Because the Poisson family is closed under convolution,

$$
N \,\big|\, U=u \;\sim\; \mathrm{Pois}\!\bigl(u\,\lambda_r\bigr).
$$

This result has two important consequences:

1. **Deterministic conditional rate** – Given $U=u$, the aggregate request arrivals constitute a homogeneous Poisson process with the *deterministic* rate

   $$
   \Lambda = \frac{u\,\lambda_r}{60}.
   $$

   All inter-arrival gaps are therefore i.i.d. exponential with parameter $\Lambda$, allowing us to use the standard inverse-CDF formula for each gap.

2. **Layered uncertainty handling** – The randomness associated with $U$ is handled in an outer step (sampling $U$ once per window), while the inner step leverages the well-known Poisson→Exponential correspondence. This two-level construction reproduces exactly the joint distribution obtained by first drawing $\Lambda = N/60$ from the compound Poisson law and then drawing gaps conditional on $\Lambda$.

If the total count could **not** be written as a sum of independent Poisson variables, the conditional distribution of $N$ would no longer be Poisson and the exponential-gap shortcut would not apply. In that situation one would need to work directly with the (generally more complex) mixed distribution of $\Lambda$ or adopt another specialized sampling scheme.
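A small standalone experiment (illustrative only) that exercises the first consequence: with $U=u$ fixed, summing exponential gaps over a 60 s window should produce Poisson($u\lambda_r$) counts, so the empirical mean and variance should both be close to $u\lambda_r$:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
lambda_r = 3.0               # requests per minute per user (example value)
u = 5                        # fixed number of users in the window
rate = u * lambda_r / 60.0   # Λ in requests per second

# Count arrivals in many independent 60 s windows by summing exponential gaps.
counts = []
for _ in range(20_000):
    t, n = 0.0, 0
    while True:
        t += rng.exponential(1.0 / rate)
        if t > 60.0:
            break
        n += 1
    counts.append(n)

counts = np.asarray(counts)
# Conditional on U = u, the per-window count should be Poisson(u * lambda_r):
print(counts.mean(), counts.var(), u * lambda_r)   # all ≈ 15
```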
## 5. Equivalence to CDF Inversion
By the law of total probability, for any event set $A$:

$$
\Pr\{(\Lambda,\Delta t_1,\dots)\in A\}
=\sum_{u=0}^\infty
\Pr\{U=u\}\;
\Pr\{(\Lambda,\Delta t_1,\dots)\in A\mid U=u\}.
$$

Step 1 samples $U$ from $\Pr\{U=u\}$; steps 2–3 sample the conditional exponential gaps. Because these two factors exactly match the mixture definition of the compound CDF, the hierarchical sampler **is** an exact implementation of two-stage CDF inversion, avoiding any explicit inversion of an infinite series.
---
## 6. Gaussian × Poisson Variant

If concurrent users follow a truncated Normal,

$$
U\sim \max\{0,\;\mathcal N(\mu_u,\sigma_u^2)\},
$$

steps 2–3 remain unchanged; only step 1 draws $U$ from a continuous law. The resulting mixture is continuous, yet the hierarchical sampler remains exact.
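For illustration, step 1 of this variant can be sketched directly with NumPy. This is a simplification, not the project's `truncated_gaussian_generator` helper, whose exact behaviour may differ:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
mu_u, sigma_u = 30.0, 8.0   # example mean and standard deviation of U

# Draw U from N(mu_u, sigma_u^2) and truncate negative values to zero
# (see the "Gaussian truncation artifacts" limitation below).
users = max(0.0, rng.normal(mu_u, sigma_u))

# Steps 2-3 are unchanged: the window rate is Lambda = users * lambda_r / 60.
lambda_r = 12.0             # example requests per minute per user
Lambda = users * lambda_r / 60.0
```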
---
## 7. Time Window

The sampling window length governs how often we re-sample $U$. It should reflect the timescale over which user-count fluctuations become significant. Our default is **60 s** (`USER_SAMPLING_WINDOW` in `TimeDefaults`), but you can adjust this parameter in your configuration before each simulation.

---
## Limitations of the Requests Model

1. **Independence assumption**
   Assumes per-user streams and $U$ are independent. Real traffic often exhibits user-behavior correlations (e.g., flash crowds).

2. **Exponential inter-arrival times**
   Implies memorylessness; cannot capture self-throttling or long-range dependence found in real workloads.

3. **No diurnal/trend component**
   User count $U$ is i.i.d. per window. To model seasonality or trends, you must vary $\lambda_u(t)$ externally.

4. **No burst-control or rate-limiting**
   Does not simulate client-side throttling or server back-pressure. Any rate-limit logic must be added externally.

5. **Gaussian truncation artifacts**
   In the Gaussian–Poisson variant, truncating negatives to zero and rounding can under-estimate extreme user counts.

**Key takeaway:** By structuring the generator as $\Lambda = U\,\lambda_r/60$ with a two-stage Poisson→Exponential sampler, FastSim efficiently reproduces compound Poisson traffic dynamics without any complex CDF inversion.

documentation/tests_documentation/integration_tests/test_sampler_helper.md renamed to documentation/tests_documentation/unit_tests/test_sampler_helper.md

File renamed without changes.

documentation/tests_documentation/integration_tests/test_simulation_input.md renamed to documentation/tests_documentation/unit_tests/test_simulation_input.md

File renamed without changes.

src/app/config/constants.py

Lines changed: 6 additions & 3 deletions
```diff
@@ -6,6 +6,9 @@
 class TimeDefaults(IntEnum):
     """Default time-related constants (all in seconds)."""
 
-    MIN_TO_SEC = 60  # 1 minute → 60 s
-    SAMPLING_WINDOW = 60  # keep U(t) constant for 60 s
-    SIMULATION_HORIZON = 3_600  # run 1 h if user gives no other value
+    MIN_TO_SEC = 60  # 1 minute → 60 s
+    USER_SAMPLING_WINDOW = 60  # keep U(t) constant for 60 s, default
+    SIMULATION_TIME = 3_600  # run 1 h if user gives no other value
+    MIN_SIMULATION_TIME = 1800  # min simulation time
+    MIN_USER_SAMPLING_WINDOW = 1  # 1 second
+    MAX_USER_SAMPLING_WINDOW = 120  # 2 minutes
```

src/app/core/event_samplers/gaussian_poisson.py

Lines changed: 3 additions & 2 deletions
```diff
@@ -22,7 +22,6 @@
 def gaussian_poisson_sampling(
     input_data: SimulationInput,
     *,
-    sampling_window_s: int = TimeDefaults.SAMPLING_WINDOW.value,
     rng: np.random.Generator | None = None,
 ) -> Generator[float, None, None]:
     """
@@ -41,10 +40,12 @@ def gaussian_poisson_sampling(
     rng = rng or np.random.default_rng()
 
     simulation_time = input_data.total_simulation_time
+    user_sampling_window = input_data.user_sampling_window
     # pydantic in the validation assign a value and mypy is not
     # complaining because a None cannot be compared in the loop
     # to a float
     assert simulation_time is not None
+    assert user_sampling_window is not None
 
     # λ_u : mean concurrent users per window
     mean_concurrent_user = float(input_data.avg_active_users.mean)
@@ -68,7 +69,7 @@ def gaussian_poisson_sampling(
     while now < simulation_time:
         # (Re)sample U at the start of each window
         if now >= window_end:
-            window_end = now + float(sampling_window_s)
+            window_end = now + float(user_sampling_window)
             users = truncated_gaussian_generator(
                 mean_concurrent_user,
                 variance_concurrent_user,
```

src/app/core/event_samplers/poisson_poisson.py

Lines changed: 3 additions & 2 deletions
```diff
@@ -19,7 +19,6 @@
 def poisson_poisson_sampling(
     input_data: SimulationInput,
     *,
-    sampling_window_s: int = TimeDefaults.SAMPLING_WINDOW.value,
     rng: np.random.Generator | None = None,
 ) -> Generator[float, None, None]:
     """
@@ -38,10 +37,12 @@ def poisson_poisson_sampling(
     rng = rng or np.random.default_rng()
 
     simulation_time = input_data.total_simulation_time
+    user_sampling_window = input_data.user_sampling_window
     # pydantic in the validation assign a value and mypy is not
     # complaining because a None cannot be compared in the loop
     # to a float
     assert simulation_time is not None
+    assert user_sampling_window is not None
 
     # λ_u : mean concurrent users per window
     mean_concurrent_user = float(input_data.avg_active_users.mean)
@@ -60,7 +61,7 @@
     while now < simulation_time:
         # (Re)sample U at the start of each window
         if now >= window_end:
-            window_end = now + float(sampling_window_s)
+            window_end = now + float(user_sampling_window)
             users = poisson_variable_generator(mean_concurrent_user, rng)
             lam = users * mean_req_per_sec_per_user
 
```