514 changes: 215 additions & 299 deletions documentation/backend_documentation/input_structure_for_the_simulation.md

Large diffs are not rendered by default.

49 changes: 49 additions & 0 deletions documentation/backend_documentation/metrics_to_measure.md
@@ -0,0 +1,49 @@
### **FastSim: Simulation Metrics**

Metrics are the lifeblood of any simulation, transforming a series of abstract events into concrete, actionable insights about system performance, resource utilization, and potential bottlenecks. FastSim provides a flexible and robust metrics collection system designed to give you a multi-faceted view of your system's behavior under load.

To achieve this, FastSim categorizes metrics into three distinct types based on their collection methodology:

1. **Sampled Metrics (`SampledMetricName`):** These metrics provide a **time-series view** of the system's state. They are captured at fixed, regular intervals throughout the simulation's duration (e.g., every second). This methodology is ideal for understanding trends, observing oscillations, and measuring the continuous utilization of finite resources like CPU and RAM. Think of them as periodic snapshots of your system's health.

2. **Event-based Metrics (`EventMetricName`):** These metrics are recorded **only when a specific event occurs**. Their collection is asynchronous and irregular, triggered by discrete happenings within the simulation, such as the completion of a request. This methodology is perfect for measuring the properties of individual transactions, such as end-to-end latency, where an average value is less important than understanding the full distribution of outcomes.

3. **Aggregated Metrics (`AggregatedMetricName`):** These are not collected directly during the simulation but are **calculated after the simulation ends**. They provide high-level statistical summaries (like mean, median, and percentiles) derived from the raw data collected by Event-based metrics. They distill thousands of individual data points into a handful of key performance indicators (KPIs) that are easy to interpret.

The following sections provide a detailed breakdown of each metric within these categories, explaining what they measure and the rationale for their importance.
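
To make the selection concrete, here is a minimal sketch of how the first two categories are enabled through the `SimulationSettings` model introduced in this PR (aggregated metrics require no configuration). The specific values are illustrative, not recommended defaults:

```python
from app.config.constants import EventMetricName, SampledMetricName
from app.schemas.simulation_settings_input import SimulationSettings

# Sampled and event-based metrics are opted into explicitly;
# aggregated metrics are derived automatically after the run.
settings = SimulationSettings(
    total_simulation_time=300,  # simulation horizon in seconds
    enabled_sample_metrics={
        SampledMetricName.READY_QUEUE_LEN,
        SampledMetricName.CORE_BUSY,
        SampledMetricName.RAM_IN_USE,
    },
    enabled_event_metrics={EventMetricName.RQS_LATENCY},
)
```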

---

### **1. Sampled Metrics: A Time-Series Perspective**

Sampled metrics are configured in the `SimulationSettings` payload. Enabling them allows you to plot the evolution of system resources over time, which is crucial for identifying saturation points and transient performance issues.

| Metric Name (`SampledMetricName`) | Description & Rationale |
| :--- | :--- |
| **`READY_QUEUE_LEN`** | **What it is:** The number of tasks in the `asyncio` event loop's "ready" queue waiting for their turn to run on the CPU. <br><br> **Rationale:** This is arguably the most critical indicator of **CPU saturation**. In a single-threaded Python process, only one coroutine runs at a time (the event loop lives on one thread, and the GIL prevents Python bytecode from executing in parallel). If this queue length is consistently greater than zero, tasks are ready to do work but are forced to wait because the CPU is busy. A long or growing queue is a definitive sign that your application is CPU-bound and that the CPU is a primary bottleneck. |
| **`CORE_BUSY`** | **What it is:** The number of server CPU cores that are currently executing a task. <br><br> **Rationale:** This provides a direct measure of **CPU utilization**. When plotted over time, it shows how effectively you are using your provisioned processing power. If `CORE_BUSY` is consistently at its maximum value (equal to `server_resources.cpu_cores`), the system is CPU-saturated. Conversely, if it's consistently low while latency is high, the bottleneck is likely elsewhere (e.g., I/O). It perfectly complements `READY_QUEUE_LEN` to form a complete picture of CPU health. |
| **`EVENT_LOOP_IO_SLEEP`** | **What it is:** A measure of whether the event loop is idle, polling for pending I/O operations to complete. <br><br> **Rationale:** This metric helps you determine if your system is **I/O-bound**. If the event loop spends a significant amount of time in this state, the CPU is underutilized because it has no ready tasks to run and is instead waiting for external systems (like databases, caches, or downstream APIs) to respond. High values for this metric coupled with low CPU utilization are a clear signal to investigate and optimize the performance of your I/O operations. |
| **`RAM_IN_USE`** | **What it is:** The total amount of memory (in MB) currently allocated by all active requests within a server. <br><br> **Rationale:** Essential for **capacity planning and stability analysis**. This metric allows you to visualize your system's memory footprint under load. You can identify which endpoints cause memory spikes and ensure your provisioned RAM is sufficient. A steadily increasing `RAM_IN_USE` value that never returns to a baseline is the classic signature of a **memory leak**, a critical bug this metric helps you detect. |
| **`THROUGHPUT_RPS`** | **What it is:** The number of requests successfully completed per second, calculated over the last sampling window. <br><br> **Rationale:** This is a fundamental measure of **system performance and capacity**. It answers the question: "How much work is my system actually doing?" Plotting throughput against user load or other resource metrics is key to understanding your system's scaling characteristics. A drop in throughput often correlates with a spike in latency or resource saturation, helping you pinpoint the exact moment a bottleneck began to affect performance. |
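
The raw storage behind these series is deliberately simple: the `alloc_sample_metric` helper added in this PR pre-allocates one list per enabled metric plus a shared `"t"` list of sampling timestamps. A small sketch of that layout (the 10 ms interval and the values are only illustrative):

```python
from app.config.constants import SampledMetricName
from app.core.helpers import alloc_sample_metric

series = alloc_sample_metric({SampledMetricName.READY_QUEUE_LEN, SampledMetricName.CORE_BUSY})

# After three snapshots taken every 10 ms the collector might hold:
# series["t"]               -> [10, 20, 30]
# series["ready_queue_len"] -> [0, 2, 5]
# series["core_busy"]       -> [1, 4, 4]
```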

---

### **2. Event-based Metrics: A Per-Transaction Perspective**

Event-based metrics are also enabled in the `SimulationSettings` payload. They generate a collection of raw data points, one for each relevant event, which is ideal for statistical analysis of transactional performance.

| Metric Name (`EventMetricName`) | Description & Rationale |
| :--- | :--- |
| **`RQS_LATENCY`** | **What it is:** The total end-to-end duration, in seconds, for a single request to be fully processed. <br><br> **Rationale:** This is the **primary user-facing performance metric**. Users directly experience latency. While a simple average can be useful, it often hides critical problems. By collecting the latency for *every single request*, FastSim allows for the calculation of statistical distributions and, most importantly, **tail-latency percentiles (p95, p99)**. These percentiles represent the worst-case experience for your users and are crucial for evaluating Service Level Objectives (SLOs) and ensuring a consistent user experience. |
| **`LLM_COST`** | **What it is:** The estimated monetary cost (e.g., in USD) incurred by a single call to an external Large Language Model (LLM) API during a request. <br><br> **Rationale:** In modern AI-powered applications, API calls to third-party services like LLMs can be a major operational expense. This metric moves beyond technical performance to measure **financial performance**. By tracking cost on a per-event basis, you can attribute expenses to specific endpoints or user behaviors, identify unnecessarily costly operations, and make informed decisions to optimize your application's cost-effectiveness. |
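
Per-event storage mirrors the sampled case: the `alloc_event_metric` helper added in this PR creates one list per enabled metric, and the simulation appends one value whenever the corresponding event fires. A short sketch (the appended numbers are illustrative):

```python
from app.config.constants import EventMetricName
from app.core.helpers import alloc_event_metric

events = alloc_event_metric({EventMetricName.RQS_LATENCY, EventMetricName.LLM_COST})

# On completion of a request that took 120 ms and made a $0.002 LLM call,
# the engine would record one data point per enabled metric:
events[EventMetricName.RQS_LATENCY].append(0.120)
events[EventMetricName.LLM_COST].append(0.002)
```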

---

### **3. Aggregated Metrics: High-Level Summaries**

**Important:** Aggregated metrics are **not configured in the input payload**. They are automatically calculated by the FastSim engine at the end of a simulation run, based on the raw data collected from the enabled Event-based metrics.

| Metric Name (`AggregatedMetricName`) | Description & Rationale |
| :--- | :--- |
| **`LATENCY_STATS`** | **What it is:** A statistical summary of the entire collection of `RQS_LATENCY` data points. This typically includes the mean, median (p50), standard deviation, and high-end percentiles (p95, p99, p99.9). <br><br> **Rationale:** This provides a comprehensive and easily digestible summary of your system's latency profile. While the raw data is essential, these summary statistics answer high-level questions quickly. The mean tells you the average experience, the median is robust to outliers, and the p95/p99 values tell you the latency that 95% or 99% of requests stay below, a critical KPI for reliability and user satisfaction. |
| **`LLM_STATS`** | **What it is:** A statistical summary of the `LLM_COST` data points. This can include total cost over the simulation, average cost per request, and cost distribution. <br><br> **Rationale:** This gives you a bird's-eye view of the financial implications of your system's design. Instead of looking at individual transaction costs, `LLM_STATS` provides the bottom line: the total operational cost during the simulation period. This is invaluable for budgeting, forecasting, and validating the financial viability of new features. |
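
The aggregation itself is not part of this diff, but conceptually it reduces each raw event-metric list to a handful of summary statistics. A plausible sketch for `LATENCY_STATS` (the field names are assumptions, not the engine's actual output schema):

```python
import numpy as np

def summarize_latency(samples: list[float]) -> dict[str, float]:
    """Reduce raw RQS_LATENCY samples to the kind of summary LATENCY_STATS describes."""
    arr = np.asarray(samples, dtype=float)
    return {
        "mean": float(arr.mean()),
        "p50": float(np.percentile(arr, 50)),
        "p95": float(np.percentile(arr, 95)),
        "p99": float(np.percentile(arr, 99)),
        "std": float(arr.std()),
    }

print(summarize_latency([0.12, 0.35, 0.08, 1.45, 0.22]))  # illustrative values
```
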
4 changes: 2 additions & 2 deletions src/app/api/simulation.py
@@ -4,13 +4,13 @@
from fastapi import APIRouter

from app.core.simulation.simulation_run import run_simulation
from app.schemas.requests_generator_input import RqsGeneratorInput
from app.schemas.full_simulation_input import SimulationPayload
from app.schemas.simulation_output import SimulationOutput

router = APIRouter()

@router.post("/simulation")
async def event_loop_simulation(input_data: RqsGeneratorInput) -> SimulationOutput:
async def event_loop_simulation(input_data: SimulationPayload) -> SimulationOutput:
"""Run the simulation and return aggregate KPIs."""
rng = np.random.default_rng()
return run_simulation(input_data, rng=rng)
40 changes: 40 additions & 0 deletions src/app/config/constants.py
@@ -182,3 +182,43 @@ class SystemEdges(StrEnum):
"""

NETWORK_CONNECTION = "network_connection"

# ======================================================================
# CONSTANTS FOR SAMPLED METRICS
# ======================================================================

class SampledMetricName(StrEnum):
"""
define the metrics sampled every fixed amount of
time to create a time series
"""

    READY_QUEUE_LEN = "ready_queue_len"  # length of the event loop ready queue
CORE_BUSY = "core_busy"
EVENT_LOOP_IO_SLEEP = "event_loop_io_sleep"
RAM_IN_USE = "ram_in_use"
THROUGHPUT_RPS = "throughput_rps"

# ======================================================================
# CONSTANTS FOR EVENT METRICS
# ======================================================================

class EventMetricName(StrEnum):
"""
define the metrics triggered by event with no
time series
"""

RQS_LATENCY = "rqs_latency"
LLM_COST = "llm_cost"


# ======================================================================
# CONSTANTS FOR AGGREGATED METRICS
# ======================================================================

class AggregatedMetricName(StrEnum):
"""aggregated metrics to calculate at the end of simulation"""

LATENCY_STATS = "latency_stats"
LLM_STATS = "llm_stats"
6 changes: 4 additions & 2 deletions src/app/core/event_samplers/gaussian_poisson.py
@@ -17,10 +17,12 @@
uniform_variable_generator,
)
from app.schemas.requests_generator_input import RqsGeneratorInput
from app.schemas.simulation_settings_input import SimulationSettings


def gaussian_poisson_sampling(
input_data: RqsGeneratorInput,
sim_settings: SimulationSettings,
*,
rng: np.random.Generator | None = None,
) -> Generator[float, None, None]:
@@ -35,11 +35,11 @@ def gaussian_poisson_sampling(
Λ = U * (mean_req_per_minute_per_user / 60) [req/s].
3. While inside the current window, draw gaps
Δt ~ Exponential(Λ) using inverse-CDF.
4. Stop once the virtual clock exceeds *simulation_time*.
4. Stop once the virtual clock exceeds *total_simulation_time*.
"""
rng = rng or np.random.default_rng()

simulation_time = input_data.total_simulation_time
simulation_time = sim_settings.total_simulation_time
user_sampling_window = input_data.user_sampling_window

# λ_u : mean concurrent users per window
6 changes: 4 additions & 2 deletions src/app/core/event_samplers/poisson_poisson.py
@@ -14,10 +14,12 @@
uniform_variable_generator,
)
from app.schemas.requests_generator_input import RqsGeneratorInput
from app.schemas.simulation_settings_input import SimulationSettings


def poisson_poisson_sampling(
input_data: RqsGeneratorInput,
sim_settings: SimulationSettings,
*,
rng: np.random.Generator | None = None,
) -> Generator[float, None, None]:
@@ -32,11 +32,11 @@ def poisson_poisson_sampling(
Λ = U * (mean_req_per_minute_per_user / 60) [req/s].
3. While inside the current window, draw gaps
Δt ~ Exponential(Λ) using inverse-CDF.
4. Stop once the virtual clock exceeds *simulation_time*.
4. Stop once the virtual clock exceeds *total_simulation_time*.
"""
rng = rng or np.random.default_rng()

simulation_time = input_data.total_simulation_time
simulation_time = sim_settings.total_simulation_time
user_sampling_window = input_data.user_sampling_window

# λ_u : mean concurrent users per window
38 changes: 38 additions & 0 deletions src/app/core/helpers.py
@@ -0,0 +1,38 @@
"""helpers for the simulation"""

from collections.abc import Iterable

from app.config.constants import EventMetricName, SampledMetricName


def alloc_sample_metric(
enabled_sample_metrics: Iterable[SampledMetricName],
) -> dict[str, list[float | int]]:
"""
After the pydantic validation of the whole input we
instantiate a dictionary to collect the sampled metrics the
user want to measure
"""
    # "t" holds the sampling timestamps. For example, if snapshots are
    # taken every 10 ms, t = [10, 20, 30, 40, ...] and each enabled
    # metric stores one measurement per timestamp.

dict_sampled_metrics: dict[str, list[float | int]] = {"t": []}
for key in enabled_sample_metrics:
dict_sampled_metrics[key] = []
return dict_sampled_metrics


def alloc_event_metric(
enabled_event_metrics: Iterable[EventMetricName],
) -> dict[str, list[float | int]]:
"""
After the pydantic validation of the whole input we
instantiate a dictionary to collect the event metrics the
user want to measure
"""
dict_event_metrics: dict[str, list[float | int]] = {}
for key in enabled_event_metrics:
dict_event_metrics[key] = []
return dict_event_metrics
4 changes: 4 additions & 0 deletions src/app/core/simulation/requests_generator.py
@@ -17,10 +17,12 @@
import numpy as np

from app.schemas.requests_generator_input import RqsGeneratorInput
from app.schemas.simulation_settings_input import SimulationSettings


def requests_generator(
input_data: RqsGeneratorInput,
sim_settings: SimulationSettings,
*,
rng: np.random.Generator | None = None,
) -> Generator[float, None, None]:
@@ -41,12 +41,14 @@ def requests_generator(
#Gaussian-Poisson model
return gaussian_poisson_sampling(
input_data=input_data,
sim_settings=sim_settings,
rng=rng,

)

# Poisson + Poisson
return poisson_poisson_sampling(
input_data=input_data,
sim_settings=sim_settings,
rng=rng,
)
28 changes: 16 additions & 12 deletions src/app/core/simulation/simulation_run.py
@@ -14,28 +14,32 @@

import numpy as np

from app.schemas.requests_generator_input import RqsGeneratorInput
from app.schemas.full_simulation_input import SimulationPayload






def run_simulation(
input_data: RqsGeneratorInput,
input_data: SimulationPayload,
*,
rng: np.random.Generator,
) -> SimulationOutput:
"""Simulation executor in Simpy"""
gaps: Generator[float, None, None] = requests_generator(input_data, rng=rng)
sim_settings = input_data.sim_settings

requests_generator_input = input_data.rqs_input

gaps: Generator[float, None, None] = requests_generator(
requests_generator_input,
sim_settings,
rng=rng)
env = simpy.Environment()

simulation_time = input_data.total_simulation_time
# pydantic in the validation assign a value and mypy is not
# complaining because a None cannot be compared in the loop
# to a float
assert simulation_time is not None

total_request_per_time_period = {
"simulation_time": simulation_time,
"simulation_time": sim_settings.total_simulation_time,
"total_requests": 0,
}

@@ -47,10 +47,10 @@ def arrival_process(
total_request_per_time_period["total_requests"] += 1

env.process(arrival_process(env))
env.run(until=simulation_time)
env.run(until=sim_settings.total_simulation_time)

return SimulationOutput(
total_requests=total_request_per_time_period,
metric_2=str(input_data.avg_request_per_minute_per_user.mean),
metric_n=str(input_data.avg_active_users.mean),
metric_2=str(requests_generator_input.avg_request_per_minute_per_user.mean),
metric_n=str(requests_generator_input.avg_active_users.mean),
)
2 changes: 2 additions & 0 deletions src/app/schemas/full_simulation_input.py
@@ -3,6 +3,7 @@
from pydantic import BaseModel

from app.schemas.requests_generator_input import RqsGeneratorInput
from app.schemas.simulation_settings_input import SimulationSettings
from app.schemas.system_topology_schema.full_system_topology_schema import TopologyGraph


@@ -11,3 +12,4 @@ class SimulationPayload(BaseModel):

rqs_input: RqsGeneratorInput
topology_graph: TopologyGraph
sim_settings: SimulationSettings
7 changes: 0 additions & 7 deletions src/app/schemas/requests_generator_input.py
@@ -12,13 +12,6 @@ class RqsGeneratorInput(BaseModel):

avg_active_users: RVConfig
avg_request_per_minute_per_user: RVConfig
total_simulation_time: int = Field(
default=TimeDefaults.SIMULATION_TIME,
ge=TimeDefaults.MIN_SIMULATION_TIME,
description=(
f"Simulation time in seconds (>= {TimeDefaults.MIN_SIMULATION_TIME})."
),
)

user_sampling_window: int = Field(
default=TimeDefaults.USER_SAMPLING_WINDOW,
31 changes: 31 additions & 0 deletions src/app/schemas/simulation_settings_input.py
@@ -0,0 +1,31 @@
"""define a class with the global settings for the simulation"""

from pydantic import BaseModel, Field

from app.config.constants import EventMetricName, SampledMetricName, TimeDefaults


class SimulationSettings(BaseModel):
"""Global parameters that apply to the whole run."""

total_simulation_time: int = Field(
default=TimeDefaults.SIMULATION_TIME,
ge=TimeDefaults.MIN_SIMULATION_TIME,
description="Simulation horizon in seconds.",
)

enabled_sample_metrics: set[SampledMetricName] = Field(
default_factory=lambda: {
SampledMetricName.READY_QUEUE_LEN,
SampledMetricName.CORE_BUSY,
SampledMetricName.RAM_IN_USE,
},
description="Which time-series KPIs to collect by default.",
)
enabled_event_metrics: set[EventMetricName] = Field(
default_factory=lambda: {
EventMetricName.RQS_LATENCY,
},
description="Which per-event KPIs to collect by default.",
)
