diff --git a/documentation/backend_documentation/input_structure_for_the_simulation.md b/documentation/backend_documentation/input_structure_for_the_simulation.md new file mode 100644 index 0000000..e31c860 --- /dev/null +++ b/documentation/backend_documentation/input_structure_for_the_simulation.md @@ -0,0 +1,404 @@ +### **FastSim — Request-Generator Input Configuration** + +A **single, self-consistent contract** links three layers of the codebase: + +1. **Global Constants** – `TimeDefaults`, `Distribution` +2. **Random Variable Schema** – `RVConfig` +3. **Traffic-Generator Payload** – `RqsGeneratorInput` + +Understanding how these layers interact is key to crafting valid and predictable traffic profiles, preventing common configuration errors before the simulation begins. + +--- + +### 1. Global Constants + +| Constant Set | Purpose | Key Values | +| :--- | :--- | :--- | +| **`TimeDefaults`** (`IntEnum`) | Defines default values and validation bounds for time-based fields. | `SIMULATION_TIME = 3600 s`, `MIN_SIMULATION_TIME = 1800 s`, `USER_SAMPLING_WINDOW = 60 s`, `MIN_USER_SAMPLING_WINDOW = 1 s`, `MAX_USER_SAMPLING_WINDOW = 120 s` | +| **`Distribution`** (`StrEnum`) | Defines the canonical names of probability distributions supported by the generator. | `"poisson"`, `"normal"`, `"log_normal"`, `"exponential"` | + +***Why use constants?*** + +* **Consistency:** They are referenced by validators; changing a value in one place updates the entire validation tree. +* **Safety:** They guarantee that a typo, such as `"Poisson"`, raises an error instead of silently failing or switching to an unintended default. + +--- + +### 2. Random Variable Schema (`RVConfig`) + +```python +class RVConfig(BaseModel): + """class to configure random variables""" + + mean: float + distribution: Distribution = Distribution.POISSON + variance: float | None = None + + @field_validator("mean", mode="before") + def ensure_mean_is_numeric( + cls, # noqa: N805 + v: object, + ) -> float: + """Ensure `mean` is numeric, then coerce to float.""" + err_msg = "mean must be a number (int or float)" + if not isinstance(v, (float, int)): + raise ValueError(err_msg) # noqa: TRY004 + return float(v) + + @model_validator(mode="after") # type: ignore[arg-type] + def default_variance(cls, model: "RVConfig") -> "RVConfig": # noqa: N805 + """Set variance = mean when distribution == 'normal' and variance is missing.""" + if model.variance is None and model.distribution == Distribution.NORMAL: + model.variance = model.mean + return model + +``` + +#### Validation Logic + +| Check | Pydantic Hook | Rule | +| :--- | :--- | :--- | +| *Mean must be numeric* | `@field_validator("mean", before)` | Rejects strings and nulls; coerces `int` to `float`. | +| *Autofill variance* | `@model_validator(after)` | If `distribution == "normal"` **and** `variance` is not provided, sets `variance = mean`. | +| *Positivity enforcement* | `PositiveFloat` / `PositiveInt` | Pydantic's constrained types are used on fields like `mean` where negative values are invalid, rejecting them before business logic runs. | + +> **Self-Consistency:** Every random draw in the simulation engine relies on a validated `RVConfig` instance. This avoids redundant checks and defensive code downstream. + +--- + +### 3. Traffic-Generator Payload (`RqsGeneratorInput`) + +| Field | Type | Validation Tied to Constants | +| :--- | :--- | :--- | +| `avg_active_users` | `RVConfig` | No extra constraints needed; the inner schema guarantees correctness. 
| +| `avg_request_per_minute_per_user` | `RVConfig` | Same as above. | +| `total_simulation_time` | `int` | `ge=TimeDefaults.MIN_SIMULATION_TIME`
default=`TimeDefaults.SIMULATION_TIME` | +| `user_sampling_window` | `int` | `ge=TimeDefaults.MIN_USER_SAMPLING_WINDOW`
`le=TimeDefaults.MAX_USER_SAMPLING_WINDOW`
default=`TimeDefaults.USER_SAMPLING_WINDOW` | + +#### How the Generator Uses Each Field + +The simulation evolves based on a simple, powerful loop: + +1. **Timeline Partitioning** (`user_sampling_window`): The simulation timeline is divided into fixed-length windows. For each window: +2. **Active User Sampling** (`avg_active_users`): A single value is drawn to determine the concurrent user population, `U(t)`, for that window. +3. **Request Rate Calculation** (`avg_request_per_minute_per_user`): Each of the `U(t)` users contributes to the total request rate, yielding an aggregate load for the window. +4. **Termination** (`total_simulation_time`): The loop stops once the cumulative simulated time reaches this value. + +Because every numeric input is range-checked upfront, **the runtime engine never needs to defend itself** against invalid data like zero-length windows or negative rates, making the event-loop lean and predictable. + +--- + +### 4. End-to-End Example (Fully Explicit) + +```json +{ + "avg_active_users": { + "mean": 100, + "distribution": "poisson" + }, + "avg_request_per_minute_per_user": { + "mean": 4.0, + "distribution": "normal", + "variance": null + }, + "total_simulation_time": 5400, + "user_sampling_window": 45 +}``` + +#### What the Validators Do + +1. `mean` is numeric ✔️ +2. `distribution` string matches an enum member ✔️ +3. `total_simulation_time` ≥ 1800 ✔️ +4. `user_sampling_window` is in the range ✔️ +5. `variance` is `null` with a `normal` distribution ⇒ **auto-set to 4.0** ✔️ + +The payload is accepted. The simulator will run for $5400 / 45 = 120$ simulation windows. + +--- + +### 5. Common Error Example + +```json +{ + "avg_active_users": { "mean": "many" }, + "avg_request_per_minute_per_user": { "mean": -2 }, + "total_simulation_time": 600, + "user_sampling_window": 400 +} +``` + +| # | Fails On | Error Message (Abridged) | +| :- | :--- | :--- | +| 1 | Numeric check | `Input should be a valid number` | +| 2 | Positivity check | `Input should be greater than 0` | +| 3 | Minimum time check | `Input should be at least 1800` | +| 4 | Maximum window check | `Input should be at most 120` | + +--- + +### Takeaways + +* **Single Source of Truth:** Enums centralize all literal values, eliminating magic strings. +* **Layered Validation:** The `Constants → RVConfig → Request Payload` hierarchy ensures that only well-formed traffic profiles reach the simulation engine. +* **Safe Defaults:** Omitting optional fields never leads to undefined behavior; defaults are sourced directly from the `TimeDefaults` constants. + +This robust, layered approach allows you to configure the generator with confidence, knowing that any malformed scenario will be rejected early with explicit, actionable error messages. + + +### **FastSim Topology Input Schema** + +The topology schema is the blueprint of the digital twin, defining the structure, resources, behavior, and network connections of the system you wish to simulate. It describes: + +1. **What work** each request performs (`Endpoint` → `Step`). +2. **What components** exist in the system (`Server`, `Client`). +3. **Which resources** each component possesses (`ServerResources`). +4. **How** components are interconnected (`Edge`). + +To ensure simulation integrity and prevent runtime errors, FastSim uses Pydantic to rigorously validate the entire topology upfront. Every inconsistency is rejected at load-time. The following sections detail the schema's layered design, from the most granular operation to the complete system graph. 
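+
+Because the whole graph is validated when the payload is loaded, a malformed topology is rejected before a single SimPy process is created. The snippet below is a minimal sketch of that fail-fast behaviour; it assumes Pydantic v2's `model_validate` and the `TopologyGraph` model described in the sections that follow, and the node ids and the dangling edge are purely illustrative.
+
+```python
+from pydantic import ValidationError
+
+from app.schemas.system_topology_schema.full_system_topology_schema import (
+    TopologyGraph,
+)
+
+# Illustrative payload: the edge targets "ghost_server", which is never declared.
+broken_payload = {
+    "nodes": {
+        "client": {"id": "user_browser", "type": "client"},
+        "servers": [],
+    },
+    "edges": [
+        {
+            "source": "user_browser",
+            "target": "ghost_server",
+            "latency": {"mean": 0.05},
+        },
+    ],
+}
+
+try:
+    TopologyGraph.model_validate(broken_payload)
+except ValidationError as exc:
+    # The referential-integrity validator rejects the unknown node id,
+    # so the simulation engine never sees an inconsistent topology.
+    print(exc)
+```
+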
+ +--- +### **A Controlled Vocabulary: The Role of Constants** + +To ensure that input configurations are unambiguous and robust, the topology schema is built upon a controlled vocabulary defined by a series of Python `Enum` classes. Instead of relying on raw strings or "magic values" (e.g., `"cpu_bound_operation"`), which are prone to typos and inconsistencies, the schema uses these enumerations to define the finite set of legal values for categories like operation kinds, metrics, and node types. + +This design choice provides three critical benefits: + +1. **Strong Type-Safety:** By using `StrEnum` and `IntEnum`, Pydantic models can validate input payloads with absolute certainty. Any value not explicitly defined in the corresponding `Enum` is immediately rejected. This prevents subtle configuration errors that would be difficult to debug at simulation time. +2. **Developer Experience and Error Prevention:** This approach provides powerful auto-completion and static analysis. IDEs, `mypy`, and linters can catch invalid values during development, providing immediate feedback long before the code is executed. +3. **Single Source of Truth:** All valid categories are centralized in the `app.config.constants` module. This makes the system easier to maintain and extend. To add a new resource type or metric, a developer only needs to update the `Enum` definition, and the change propagates consistently to validation logic, the simulation engine, and any other component that uses it. + +The key enumerations that govern the topology schema include: + +| Constant Enum | Purpose | +| :--- | :--- | +| **`EndpointStepIO`, `EndpointStepCPU`, `EndpointStepRAM`** | Define the exhaustive list of valid `kind` values for a `Step`. | +| **`Metrics`** | Specify the legal dictionary keys within a `Step`'s `step_metrics`, enforcing the one-to-one link between a `kind` and its metric. | +| **`SystemNodes` and `SystemEdges`** | Enumerate the allowed categories for nodes and their connections in the high-level `TopologyGraph`. | + +### **Design Philosophy: A "Micro-to-Macro" Approach** + +The schema is built on a compositional, "micro-to-macro" principle. We start by defining the smallest indivisible units of work (`Step`) and progressively assemble them into larger, more complex structures (`Endpoint`, `Server`, and finally the `TopologyGraph`). + +This layered approach provides several key advantages: +* **Modularity and Reusability:** An `Endpoint` is just a sequence of `Steps`. You can reorder, add, or remove steps without redefining the core operations themselves. +* **Local Reasoning, Global Safety:** Each model is responsible for its own internal consistency (e.g., a `Step` ensures its metric is valid for its kind). Parent models then enforce the integrity of the connections *between* these components (e.g., the `TopologyGraph` ensures all `Edges` connect to valid `Nodes`). +* **Clarity and Maintainability:** The hierarchy makes the system description intuitive to read and write. It’s clear how atomic operations roll up into endpoints, which are hosted on servers connected by a network. +* **Robustness:** All structural and referential errors are caught before the simulation begins, guaranteeing that the SimPy engine operates on a valid, self-consistent model. + +--- + +### **1. The Atomic Unit: `Step`** + +A `Step` represents a single, indivisible operation executed by an asynchronous coroutine within an endpoint. It is the fundamental building block of all work in the simulation. 
+ +Each `Step` has a `kind` (the category of work) and `step_metrics` (the resources it consumes). + +```python +class Step(BaseModel): + """ + A single, indivisible operation. + It must be quantified by exactly ONE metric. + """ + kind: EndpointStepIO | EndpointStepCPU | EndpointStepRAM + step_metrics: dict[Metrics, PositiveFloat | PositiveInt] + + @model_validator(mode="after") + def ensure_coherence_kind_metrics(cls, model: "Step") -> "Step": + metrics_keys = set(model.step_metrics) + + # Enforce that a step performs one and only one type of work. + if len(metrics_keys) != 1: + raise ValueError("step_metrics must contain exactly one entry") + + # Enforce that the metric is appropriate for the kind of work. + if isinstance(model.kind, EndpointStepCPU): + if metrics_keys != {Metrics.CPU_TIME}: + raise ValueError(f"CPU step requires metric '{Metrics.CPU_TIME}'") + + elif isinstance(model.kind, EndpointStepRAM): + if metrics_keys != {Metrics.NECESSARY_RAM}: + raise ValueError(f"RAM step requires metric '{Metrics.NECESSARY_RAM}'") + + elif isinstance(model.kind, EndpointStepIO): + if metrics_keys != {Metrics.IO_WAITING_TIME}: + raise ValueError(f"I/O step requires metric '{Metrics.IO_WAITING_TIME}'") + + return model +``` + +> **Design Rationale:** The strict one-to-one mapping between a `Step` and a single metric is a core design choice. It simplifies the simulation engine immensely, as each `Step` can be deterministically routed to a request on a single SimPy resource (a CPU queue, a RAM container, or an I/O event). This avoids the complexity of modeling operations that simultaneously consume multiple resource types. + +--- + +### **2. Composing Workflows: `Endpoint`** + +An `Endpoint` defines a complete, user-facing operation (e.g., an API call like `/predict`) as an ordered sequence of `Steps`. + +```python +class Endpoint(BaseModel): + """A higher-level API call, executed as a strict sequence of steps.""" + endpoint_name: str + steps: list[Step] + + @field_validator("endpoint_name", mode="before") + def name_to_lower(cls, v: str) -> str: + """Standardize endpoint name to be lowercase for consistency.""" + return v.lower() +``` + +> **Design Rationale:** The simulation processes the `steps` list in the exact order provided. The total latency and resource consumption of an endpoint call is the sequential sum of its individual `Step` delays. This directly models the execution flow of a typical web request handler. + +--- + +### **3. Defining Components: System Nodes** + +Nodes are the macro-components of your architecture where work is performed and resources are located. + +#### **`ServerResources` and `Server`** +A `Server` node hosts endpoints and owns a set of physical resources. These resources are mapped directly to specific SimPy primitives, which govern how requests queue and contend for service. + +```python +class ServerResources(BaseModel): + """Quantifiable resources available on a server node.""" + cpu_cores: PositiveInt = Field(ge=ServerResourcesDefaults.MINIMUM_CPU_CORES) + ram_mb: PositiveInt = Field(ge=ServerResourcesDefaults.MINIMUM_RAM_MB) + db_connection_pool: PositiveInt | None = None + +class Server(BaseModel): + """A node that hosts endpoints and owns resources.""" + id: str + type: SystemNodes = SystemNodes.SERVER + server_resources: ServerResources + endpoints: list[Endpoint] +``` + +> **Design Rationale: Mapping to SimPy Primitives** +> * `cpu_cores` maps to a `simpy.Resource`. 
This models a classic semaphore where only `N` processes can execute concurrently, and others must wait in a queue. It perfectly represents CPU-bound tasks competing for a limited number of cores. +> * `ram_mb` maps to a `simpy.Container`. A container models a divisible resource where processes can request and return variable amounts. This is ideal for memory, as multiple requests can simultaneously hold different amounts of RAM without exclusively locking the entire memory pool. + +#### **`Client`** +The `Client` is a special, resource-less node that serves as the origin point for all requests generated during the simulation. + +#### **Node Aggregation and Validation (`TopologyNodes`)** +All `Server` and `Client` nodes are collected in the `TopologyNodes` model, which performs a critical validation check: ensuring all component IDs are unique across the entire system. + +--- + +### **4. Connecting the Components: `Edge`** + +An `Edge` represents a directed network link between two nodes, defining how requests flow through the system. + +```python +class Edge(BaseModel): + """A directed connection in the topology graph.""" + source: str + target: str + latency: RVConfig + probability: float = Field(1.0, ge=0.0, le=1.0) + edge_type: SystemEdges = SystemEdges.NETWORK_CONNECTION +``` + +> **Design Rationale:** +> * **Stochastic Latency:** Latency is not a fixed number but an `RVConfig` object. This allows you to model realistic network conditions using various probability distributions (e.g., log-normal for internet RTTs, exponential for failure retries), making the simulation far more accurate. +> * **Probabilistic Routing:** The `probability` field enables modeling of simple load balancing or A/B testing scenarios where traffic from a single `source` can be split across multiple `target` nodes. + +--- + +### **5. The Complete System: `TopologyGraph`** + +The `TopologyGraph` is the root of the configuration. It aggregates all `nodes` and `edges` and performs the final, most critical validation: ensuring referential integrity. + +```python +class TopologyGraph(BaseModel): + """The complete system definition, uniting all nodes and edges.""" + nodes: TopologyNodes + edges: list[Edge] + + @model_validator(mode="after") + def edge_refs_valid(cls, model: "TopologyGraph") -> "TopologyGraph": + """Ensure every edge connects two valid, existing nodes.""" + valid_ids = {s.id for s in model.nodes.servers} | {model.nodes.client.id} + for e in model.edges: + if e.source not in valid_ids or e.target not in valid_ids: + raise ValueError(f"Edge '{e.source}->{e.target}' references an unknown node.") + return model +``` +> **Design Rationale:** This final check guarantees that the topology is a valid, connected graph. By confirming that every `edge.source` and `edge.target` corresponds to a defined node `id`, it prevents the simulation from starting with a broken or nonsensical configuration, embodying the "fail-fast" principle. + +--- + +### **End-to-End Example** + +Here is a minimal, complete JSON configuration that defines a single client and a single API server. + +```jsonc +{ + "nodes": { + // The client node is the source of all generated requests. + "client": { + "id": "user_browser", + "type": "client" + }, + // A list of all server nodes in the system. 
+ "servers": [ + { + "id": "api_server_node", + "type": "server", + "server_resources": { + "cpu_cores": 2, + "ram_mb": 2048 + }, + "endpoints": [ + { + "endpoint_name": "/predict", + "steps": [ + { + "kind": "initial_parsing", + "step_metrics": { "cpu_time": 0.005 } + }, + { + "kind": "io_db", + "step_metrics": { "io_waiting_time": 0.050 } + }, + { + "kind": "cpu_bound_operation", + "step_metrics": { "cpu_time": 0.015 } + } + ] + } + ] + } + ] + }, + "edges": [ + // A network link from the client to the API server. + { + "source": "user_browser", + "target": "api_server_node", + "latency": { + "distribution": "log_normal", + "mean": 0.05, + "std_dev": 0.01 + }, + "probability": 1.0 + } + ] +}``` + + + + +> **YAML friendly:** +> The topology schema is 100 % agnostic to the wire format. +> You can encode the same structure in **YAML** with identical field +> names and value types—Pydantic will parse either JSON or YAML as long +> as the keys and data types respect the schema. +> No additional changes or converters are required. +``` + + + +### **Key Takeaway** + +This rigorously validated, compositional schema is the foundation of FastSim's reliability. By defining a clear vocabulary of constants (`Metrics`, `SystemNodes`) and enforcing relationships with Pydantic validators, the schema guarantees that every simulation run starts from a **complete and self-consistent** system description. This allows you to refactor simulation logic or extend the model with new resources (e.g., GPU memory) with full confidence that existing configurations remain valid and robust. \ No newline at end of file diff --git a/documentation/backend_documentation/requests_generator.md b/documentation/backend_documentation/requests_generator.md index 556c4f4..95190fa 100644 --- a/documentation/backend_documentation/requests_generator.md +++ b/documentation/backend_documentation/requests_generator.md @@ -40,7 +40,7 @@ class RVConfig(BaseModel): distribution: Literal["poisson", "normal", "gaussian"] = "poisson" variance: float | None = None # required only for normal/gaussian -class SimulationInput(BaseModel): +class RqsGeneratorInput(BaseModel): """Define simulation inputs.""" avg_active_users: RVConfig avg_request_per_minute_per_user: RVConfig diff --git a/src/app/api/simulation.py b/src/app/api/simulation.py index 55081c6..e025ad6 100644 --- a/src/app/api/simulation.py +++ b/src/app/api/simulation.py @@ -4,13 +4,13 @@ from fastapi import APIRouter from app.core.simulation.simulation_run import run_simulation -from app.schemas.simulation_input import SimulationInput +from app.schemas.requests_generator_input import RqsGeneratorInput from app.schemas.simulation_output import SimulationOutput router = APIRouter() @router.post("/simulation") -async def event_loop_simulation(input_data: SimulationInput) -> SimulationOutput: +async def event_loop_simulation(input_data: RqsGeneratorInput) -> SimulationOutput: """Run the simulation and return aggregate KPIs.""" rng = np.random.default_rng() return run_simulation(input_data, rng=rng) diff --git a/src/app/config/constants.py b/src/app/config/constants.py index 03448de..23c7936 100644 --- a/src/app/config/constants.py +++ b/src/app/config/constants.py @@ -1,23 +1,184 @@ -"""Application constants and configuration values.""" +""" +Application-wide constants and configuration values. + +This module groups all the *static* enumerations used by the FastSim backend +so that: + +* JSON / YAML payloads can be strictly validated with Pydantic. 
+* Front-end and simulation engine share a single source of truth. +* Ruff, mypy and IDEs can leverage the strong typing provided by Enum classes. + +**IMPORTANT:** Changing any enum *value* is a breaking-change for every +stored configuration file. Add new members whenever possible instead of +renaming existing ones. +""" from enum import IntEnum, StrEnum +# ====================================================================== +# CONSTANTS FOR THE REQUEST-GENERATOR COMPONENT +# ====================================================================== + class TimeDefaults(IntEnum): - """Default time-related constants (all in seconds).""" + """ + Default time-related constants (expressed in **seconds**). + + These values are used when the user omits an explicit parameter. They also + serve as lower / upper bounds for validation for the requests generator. + """ - MIN_TO_SEC = 60 # 1 minute → 60 s - USER_SAMPLING_WINDOW = 60 # keep U(t) constant for 60 s, default - SIMULATION_TIME = 3_600 # run 1 h if user gives no other value - MIN_SIMULATION_TIME = 1800 # min simulation time - MIN_USER_SAMPLING_WINDOW = 1 # 1 second - MAX_USER_SAMPLING_WINDOW = 120 # 2 minutes + MIN_TO_SEC = 60 # 1 minute → 60 s + USER_SAMPLING_WINDOW = 60 # keep U(t) constant for 60 s + SIMULATION_TIME = 3_600 # run 1 h if user gives no value + MIN_SIMULATION_TIME = 1_800 # enforce at least 30 min + MIN_USER_SAMPLING_WINDOW = 1 # 1 s minimum + MAX_USER_SAMPLING_WINDOW = 120 # 2 min maximum class Distribution(StrEnum): - """Allowed probability distributions for an RVConfig.""" + """ + Probability distributions accepted by :class:`~app.schemas.RVConfig`. + + The *string value* is exactly the identifier that must appear in JSON + payloads. The simulation engine will map each name to the corresponding + random sampler (e.g. ``numpy.random.poisson``). + """ POISSON = "poisson" NORMAL = "normal" + LOG_NORMAL = "log_normal" + EXPONENTIAL = "exponential" + +# ====================================================================== +# CONSTANTS FOR ENDPOINT STEP DEFINITION (REQUEST-HANDLER) +# ====================================================================== + +# The JSON received by the API for an endpoint step is expected to look like: +# +# { +# "endpoint_name": "/predict", +# "kind": "io_llm", +# "metrics": { +# "cpu_time": 0.150, +# "necessary_ram": 256 +# } +# } +# +# The Enum classes below guarantee that only valid *kind* and *metric* keys +# are accepted by the Pydantic schema. + + +class EndpointStepIO(StrEnum): + """ + I/O-bound operation categories that can occur inside an endpoint *step*. + + .. list-table:: + :header-rows: 1 + + * - Constant + - Meaning (executed by coroutine) + * - ``TASK_SPAWN`` + - Spawns an additional ``asyncio.Task`` and returns immediately. + * - ``LLM`` + - Performs a remote Large-Language-Model inference call. + * - ``WAIT`` + - Passive, *non-blocking* wait for I/O completion; no new task spawned. + * - ``DB`` + - Round-trip to a relational / NoSQL database. + * - ``CACHE`` + - Access to a local or distributed cache layer. + + The *value* of each member (``"io_llm"``, ``"io_db"``, …) is the exact + identifier expected in external JSON. + """ + + TASK_SPAWN = "io_task_spawn" + LLM = "io_llm" + WAIT = "io_wait" + DB = "io_db" + CACHE = "io_cache" + + +class EndpointStepCPU(StrEnum): + """ + CPU-bound operation categories inside an endpoint step. + + Use these when the coroutine keeps the Python interpreter busy + (GIL-bound or compute-heavy code) rather than waiting for I/O. 
+ """ + + INITIAL_PARSING = "initial_parsing" + CPU_BOUND_OPERATION = "cpu_bound_operation" + + +class EndpointStepRAM(StrEnum): + """ + Memory-related operations inside a step. + + Currently limited to a single category, but kept as an Enum so that future + resource types (e.g. GPU memory) can be added without schema changes. + """ + + RAM = "ram" + + +class Metrics(StrEnum): + """ + Keys used inside the ``metrics`` dictionary of a *step*. + + * ``NETWORK_LATENCY`` - Mean latency (seconds) incurred on a network edge + *outside* the service (used mainly for validation when steps model + short in-service hops). + * ``CPU_TIME`` - Service time (seconds) during which the coroutine occupies + the CPU / GIL. + * ``NECESSARY_RAM`` - Peak memory (MB) required by the step. + """ + + NETWORK_LATENCY = "network_latency" + CPU_TIME = "cpu_time" + IO_WAITING_TIME = "io_waiting_time" + NECESSARY_RAM = "necessary_ram" + +# ====================================================================== +# CONSTANTS FOR THE RESOURCES OF A SERVER +# ====================================================================== + +class ServerResourcesDefaults: + """Resources available for a single server""" + + CPU_CORES = 1 + MINIMUM_CPU_CORES = 1 + RAM_MB = 1024 + MINIMUM_RAM_MB = 256 + DB_CONNECTION_POOL = None + +# ====================================================================== +# CONSTANTS FOR THE MACRO-TOPOLOGY GRAPH +# ====================================================================== + +class SystemNodes(StrEnum): + """ + High-level node categories of the system topology graph. + + Each member represents a *macro-component* that may have its own SimPy + resources (CPU cores, DB pool, etc.). + """ + + SERVER = "server" + CLIENT = "client" + LOAD_BALANCER = "load_balancer" + API_GATEWAY = "api_gateway" + DATABASE = "database" + CACHE = "cache" + + +class SystemEdges(StrEnum): + """ + Edge categories connecting different :class:`SystemNodes`. + Currently only network links are modeled; new types (IPC queue, message + bus, stream) can be added without impacting existing payloads. 
+ """ + NETWORK_CONNECTION = "network_connection" diff --git a/src/app/core/event_samplers/gaussian_poisson.py b/src/app/core/event_samplers/gaussian_poisson.py index 98f9121..9c83b4f 100644 --- a/src/app/core/event_samplers/gaussian_poisson.py +++ b/src/app/core/event_samplers/gaussian_poisson.py @@ -16,11 +16,11 @@ truncated_gaussian_generator, uniform_variable_generator, ) -from app.schemas.requests_generator_input import SimulationInput +from app.schemas.requests_generator_input import RqsGeneratorInput def gaussian_poisson_sampling( - input_data: SimulationInput, + input_data: RqsGeneratorInput, *, rng: np.random.Generator | None = None, ) -> Generator[float, None, None]: diff --git a/src/app/core/event_samplers/poisson_poisson.py b/src/app/core/event_samplers/poisson_poisson.py index d25c4dd..ebb1970 100644 --- a/src/app/core/event_samplers/poisson_poisson.py +++ b/src/app/core/event_samplers/poisson_poisson.py @@ -13,11 +13,11 @@ poisson_variable_generator, uniform_variable_generator, ) -from app.schemas.requests_generator_input import SimulationInput +from app.schemas.requests_generator_input import RqsGeneratorInput def poisson_poisson_sampling( - input_data: SimulationInput, + input_data: RqsGeneratorInput, *, rng: np.random.Generator | None = None, ) -> Generator[float, None, None]: diff --git a/src/app/core/simulation/requests_generator.py b/src/app/core/simulation/requests_generator.py index e19d8f0..810218f 100644 --- a/src/app/core/simulation/requests_generator.py +++ b/src/app/core/simulation/requests_generator.py @@ -16,11 +16,11 @@ import numpy as np - from app.schemas.requests_generator_input import SimulationInput + from app.schemas.requests_generator_input import RqsGeneratorInput def requests_generator( - input_data: SimulationInput, + input_data: RqsGeneratorInput, *, rng: np.random.Generator | None = None, ) -> Generator[float, None, None]: diff --git a/src/app/core/simulation/simulation_run.py b/src/app/core/simulation/simulation_run.py index d3f52f6..3e1672a 100644 --- a/src/app/core/simulation/simulation_run.py +++ b/src/app/core/simulation/simulation_run.py @@ -14,13 +14,13 @@ import numpy as np - from app.schemas.simulation_input import SimulationInput + from app.schemas.requests_generator_input import RqsGeneratorInput def run_simulation( - input_data: SimulationInput, + input_data: RqsGeneratorInput, *, rng: np.random.Generator, ) -> SimulationOutput: diff --git a/src/app/schemas/full_simulation_input.py b/src/app/schemas/full_simulation_input.py new file mode 100644 index 0000000..0e1d7a9 --- /dev/null +++ b/src/app/schemas/full_simulation_input.py @@ -0,0 +1,13 @@ +"""Definition of the full input for the simulation""" + +from pydantic import BaseModel + +from app.schemas.requests_generator_input import RqsGeneratorInput +from app.schemas.system_topology_schema.full_system_topology_schema import TopologyGraph + + +class SimulationPayload(BaseModel): + """Full input structure to perform a simulation""" + + rqs_input: RqsGeneratorInput + topology_graph: TopologyGraph diff --git a/src/app/schemas/random_variables_config.py b/src/app/schemas/random_variables_config.py new file mode 100644 index 0000000..8d17c5f --- /dev/null +++ b/src/app/schemas/random_variables_config.py @@ -0,0 +1,31 @@ +"""Definition of the schema for a Random variable""" + +from pydantic import BaseModel, field_validator, model_validator + +from app.config.constants import Distribution + + +class RVConfig(BaseModel): + """class to configure random variables""" + + mean: float + 
distribution: Distribution = Distribution.POISSON + variance: float | None = None + + @field_validator("mean", mode="before") + def ensure_mean_is_numeric( + cls, # noqa: N805 + v: object, + ) -> float: + """Ensure `mean` is numeric, then coerce to float.""" + err_msg = "mean must be a number (int or float)" + if not isinstance(v, (float, int)): + raise ValueError(err_msg) # noqa: TRY004 + return float(v) + + @model_validator(mode="after") # type: ignore[arg-type] + def default_variance(cls, model: "RVConfig") -> "RVConfig": # noqa: N805 + """Set variance = mean when distribution == 'normal' and variance is missing.""" + if model.variance is None and model.distribution == Distribution.NORMAL: + model.variance = model.mean + return model diff --git a/src/app/schemas/requests_generator_input.py b/src/app/schemas/requests_generator_input.py index 9090fed..88812e0 100644 --- a/src/app/schemas/requests_generator_input.py +++ b/src/app/schemas/requests_generator_input.py @@ -1,37 +1,13 @@ """Define the schemas for the simulator""" -from pydantic import BaseModel, Field, field_validator, model_validator +from pydantic import BaseModel, Field -from app.config.constants import Distribution, TimeDefaults +from app.config.constants import TimeDefaults +from app.schemas.random_variables_config import RVConfig -class RVConfig(BaseModel): - """class to configure random variables""" - - mean: float - distribution: Distribution = Distribution.POISSON - variance: float | None = None - - @field_validator("mean", mode="before") - def ensure_mean_is_numeric( - cls, # noqa: N805 - v: object, - ) -> float: - """Ensure `mean` is numeric, then coerce to float.""" - err_msg = "mean must be a number (int or float)" - if not isinstance(v, (float, int)): - raise ValueError(err_msg) # noqa: TRY004 - return float(v) - - @model_validator(mode="after") # type: ignore[arg-type] - def default_variance(cls, model: "RVConfig") -> "RVConfig": # noqa: N805 - """Set variance = mean when distribution == 'normal' and variance is missing.""" - if model.variance is None and model.distribution == Distribution.NORMAL: - model.variance = model.mean - return model - -class SimulationInput(BaseModel): +class RqsGeneratorInput(BaseModel): """Define the expected variables for the simulation""" avg_active_users: RVConfig diff --git a/src/app/schemas/simulation_output.py b/src/app/schemas/simulation_output.py index b02c172..26d1adb 100644 --- a/src/app/schemas/simulation_output.py +++ b/src/app/schemas/simulation_output.py @@ -7,6 +7,7 @@ class SimulationOutput(BaseModel): """Define the output of the simulation""" total_requests: dict[str, int | float] + # TO DEFINE metric_2: str #...... 
metric_n: str diff --git a/src/app/schemas/system_topology_schema/endpoint_schema.py b/src/app/schemas/system_topology_schema/endpoint_schema.py new file mode 100644 index 0000000..abe53da --- /dev/null +++ b/src/app/schemas/system_topology_schema/endpoint_schema.py @@ -0,0 +1,97 @@ +"""Defining the input schema for the requests handler""" + +from pydantic import ( + BaseModel, + PositiveFloat, + PositiveInt, + field_validator, + model_validator, +) + +from app.config.constants import ( + EndpointStepCPU, + EndpointStepIO, + EndpointStepRAM, + Metrics, +) + + +class Step(BaseModel): + """ + Steps to be executed inside an endpoint in terms of + the resources needed to accomplish the single step + """ + + kind: EndpointStepIO | EndpointStepCPU | EndpointStepRAM + step_metrics: dict[Metrics, PositiveFloat | PositiveInt] + + @field_validator("step_metrics", mode="before") + def ensure_non_empty( + cls, # noqa: N805 + v: dict[Metrics, PositiveFloat | PositiveInt], + ) -> dict[Metrics, PositiveFloat | PositiveInt]: + """Ensure the dict step metrics exist""" + if not v: + msg = "step_metrics cannot be empty" + raise ValueError(msg) + return v + + @model_validator(mode="after") # type: ignore[arg-type] + def ensure_coherence_kind_metrics( + cls, # noqa: N805 + model: "Step", + ) -> "Step": + """ + Validation to couple kind and metrics only when they are + valid for example ram cannot have associated a cpu time + """ + metrics_keys = set(model.step_metrics) + + # Control of the length of the set to be sure only on key is passed + if len(metrics_keys) != 1: + msg = "step_metrics must contain exactly one entry" + raise ValueError(msg) + + # Coherence CPU bound operation and metric + if isinstance(model.kind, EndpointStepCPU): + if metrics_keys != {Metrics.CPU_TIME}: + msg = ( + "The metric to quantify a CPU BOUND step" + f"must be {Metrics.CPU_TIME}" + ) + raise ValueError(msg) + + # Coherence RAM operation and metric + elif isinstance(model.kind, EndpointStepRAM): + if metrics_keys != {Metrics.NECESSARY_RAM}: + msg = ( + "The metric to quantify a RAM step" + f"must be {Metrics.NECESSARY_RAM}" + ) + raise ValueError(msg) + + # Coherence I/O operation and metric + elif metrics_keys != {Metrics.IO_WAITING_TIME}: + msg = ( + "The metric to quantify an I/O step" + f"must be {Metrics.IO_WAITING_TIME}" + ) + raise ValueError(msg) + + return model + + + + +class Endpoint(BaseModel): + """full endpoint structure to be validated with pydantic""" + + endpoint_name: str + steps: list[Step] + + @field_validator("endpoint_name", mode="before") + def name_to_lower(cls, v: str) -> str: # noqa: N805 + """Standardize endpoint name to be lowercase""" + return v.lower() + + diff --git a/src/app/schemas/system_topology_schema/full_system_topology_schema.py b/src/app/schemas/system_topology_schema/full_system_topology_schema.py new file mode 100644 index 0000000..b53f08e --- /dev/null +++ b/src/app/schemas/system_topology_schema/full_system_topology_schema.py @@ -0,0 +1,217 @@ +""" +Define the topology of the system as a directed graph +where nodes represents macro structure (server, client ecc ecc) +and edges how these strcutures are connected and the network +latency necessary for the requests generated to move from +one structure to another +""" + +from pydantic import ( + BaseModel, + ConfigDict, + Field, + PositiveInt, + field_validator, + model_validator, +) + +from app.config.constants import ( + ServerResourcesDefaults, + SystemEdges, + SystemNodes, +) +from app.schemas.random_variables_config import RVConfig 
+from app.schemas.system_topology_schema.endpoint_schema import Endpoint
+
+#-------------------------------------------------------------
+# Definition of the nodes structure for the graph representing
+# the topology of the system defined for the simulation
+#-------------------------------------------------------------
+
+# -------------------------------------------------------------
+# CLIENT
+# -------------------------------------------------------------
+
+class Client(BaseModel):
+    """Definition of the client class"""
+
+    id: str
+    type: SystemNodes = SystemNodes.CLIENT
+
+    @field_validator("type", mode="after")
+    def ensure_type_is_standard(cls, v: SystemNodes) -> SystemNodes:  # noqa: N805
+        """Ensure the type of the client is standard"""
+        if v != SystemNodes.CLIENT:
+            msg = f"The type should have a standard value: {SystemNodes.CLIENT}"
+            raise ValueError(msg)
+        return v
+
+# -------------------------------------------------------------
+# SERVER RESOURCES EXAMPLE
+# -------------------------------------------------------------
+
+class ServerResources(BaseModel):
+    """
+    Defines the quantifiable resources available on a server node.
+    Each attribute maps directly to a SimPy resource primitive.
+    """
+
+    cpu_cores: PositiveInt = Field(
+        ServerResourcesDefaults.CPU_CORES,
+        ge=ServerResourcesDefaults.MINIMUM_CPU_CORES,
+        description="Number of CPU cores available for processing.",
+    )
+    db_connection_pool: PositiveInt | None = Field(
+        ServerResourcesDefaults.DB_CONNECTION_POOL,
+        description="Size of the database connection pool, if applicable.",
+    )
+
+    # Resources modelled as a simpy.Container (level)
+    ram_mb: PositiveInt = Field(
+        ServerResourcesDefaults.RAM_MB,
+        ge=ServerResourcesDefaults.MINIMUM_RAM_MB,
+        description="Total available RAM in Megabytes.",
+    )
+
+    # for the future
+    # disk_iops_limit: PositiveInt | None = None
+    # network_throughput_mbps: PositiveInt | None = None
+
+# -------------------------------------------------------------
+# SERVER
+# -------------------------------------------------------------
+
+class Server(BaseModel):
+    """
+    Definition of the server class:
+    - id: the server identifier
+    - type: the type of node in the topology
+    - server_resources: the resources of the machine hosting the server
+    - endpoints: the list of all endpoints exposed by the server
+    """
+
+    id: str
+    type: SystemNodes = SystemNodes.SERVER
+    # Later define a valid structure for the keys of server resources
+    server_resources: ServerResources
+    endpoints: list[Endpoint]
+
+    @field_validator("type", mode="after")
+    def ensure_type_is_standard(cls, v: SystemNodes) -> SystemNodes:  # noqa: N805
+        """Ensure the type of the server is standard"""
+        if v != SystemNodes.SERVER:
+            msg = f"The type should have a standard value: {SystemNodes.SERVER}"
+            raise ValueError(msg)
+        return v
+
+# -------------------------------------------------------------
+# NODES CLASS WITH ALL POSSIBLE OBJECTS REPRESENTED BY A NODE
+# -------------------------------------------------------------
+
+class TopologyNodes(BaseModel):
+    """
+    Definition of the nodes class:
+    - servers: all servers implemented in the system
+    - client: a simple object with just a name, representing
+      the origin of the graph
+    """
+
+    servers: list[Server]
+    client: Client
+
+    @model_validator(mode="after")  # type: ignore[arg-type]
+    def unique_ids(
+        cls,  # noqa: N805
+        model: "TopologyNodes",
+    ) -> "TopologyNodes":
+        """Check that all node ids are unique"""
+        ids = [server.id for server in model.servers] + [model.client.id]
+        if len(ids) != len(set(ids)):
+            msg = "Node ids must be unique"
+            raise ValueError(msg)
+        return model
+
+    model_config = ConfigDict(extra="forbid")
+
+
+#-------------------------------------------------------------
+# Definition of the edges structure for the graph representing
+# the topology of the system defined for the simulation
+#-------------------------------------------------------------
+
+class Edge(BaseModel):
+    """
+    A directed connection in the topology graph.
+
+    Attributes
+    ----------
+    source : str
+        Identifier of the source node (where the request comes from).
+    target : str
+        Identifier of the destination node (where the request goes to).
+    latency : RVConfig
+        Random-variable configuration for network latency on this link.
+    probability : float
+        Probability of taking this edge when there are multiple outgoing links.
+        Must be in [0.0, 1.0]. Defaults to 1.0 (always taken).
+    edge_type : SystemEdges
+        Category of the link (e.g. network, queue, stream).
+
+    """
+
+    source: str
+    target: str
+    latency: RVConfig
+    probability: float = Field(1.0, ge=0.0, le=1.0)
+    edge_type: SystemEdges = SystemEdges.NETWORK_CONNECTION
+
+    @model_validator(mode="after")  # type: ignore[arg-type]
+    def check_src_trgt_different(cls, model: "Edge") -> "Edge":  # noqa: N805
+        """Ensure source is different from target"""
+        if model.source == model.target:
+            msg = "source and target must be different nodes"
+            raise ValueError(msg)
+        return model
+
+
+#-------------------------------------------------------------
+# Definition of the Graph structure representing
+# the topology of the system defined for the simulation
+#-------------------------------------------------------------
+
+class TopologyGraph(BaseModel):
+    """
+    Data collection for the whole graph representing
+    the full system
+    """
+
+    nodes: TopologyNodes
+    edges: list[Edge]
+
+    @model_validator(mode="after")  # type: ignore[arg-type]
+    def edge_refs_valid(cls, model: "TopologyGraph") -> "TopologyGraph":  # noqa: N805
+        """
+        Ensure that **every** edge points to valid nodes.
+
+        The validator is executed *after* the entire ``TopologyGraph`` model has
+        been built, so all servers and the client already exist in ``model.nodes``.
+
+        Steps
+        -----
+        1. Build the set ``valid_ids`` containing:
+           * all ``Server.id`` values, **plus**
+           * the single ``Client.id``.
+        2. Iterate through each ``Edge`` in ``model.edges`` and raise
+           :class:`ValueError` if either ``edge.source`` or ``edge.target`` is
+           **not** present in ``valid_ids``.
+
+        Returning the (unchanged) model signals that the integrity check passed.
+ """ + valid_ids = {s.id for s in model.nodes.servers} | {model.nodes.client.id} + for e in model.edges: + if e.source not in valid_ids or e.target not in valid_ids: + msg = f"Edge {e.source}->{e.target} references unknown node" + raise ValueError(msg) + return model + + diff --git a/tests/unit/input_sructure/test_endpoint_input.py b/tests/unit/input_sructure/test_endpoint_input.py new file mode 100644 index 0000000..7e166dd --- /dev/null +++ b/tests/unit/input_sructure/test_endpoint_input.py @@ -0,0 +1,125 @@ +"""Unit tests for the Endpoint and Step Pydantic schemas.""" + +from __future__ import annotations + +import pytest +from pydantic import ValidationError + +from app.config.constants import ( + EndpointStepCPU, + EndpointStepIO, + EndpointStepRAM, + Metrics, +) +from app.schemas.system_topology_schema.endpoint_schema import Endpoint, Step + + +# --------------------------------------------------------------------------- # +# Helper functions to build minimal valid Step objects +# --------------------------------------------------------------------------- # +def cpu_step(value: float = 0.1) -> Step: + """Return a minimal valid CPU-bound Step.""" + return Step( + kind=EndpointStepCPU.CPU_BOUND_OPERATION, + step_metrics={Metrics.CPU_TIME: value}, + ) + + +def ram_step(value: int = 128) -> Step: + """Return a minimal valid RAM Step.""" + return Step( + kind=EndpointStepRAM.RAM, + step_metrics={Metrics.NECESSARY_RAM: value}, + ) + + +def io_step(value: float = 0.05) -> Step: + """Return a minimal valid I/O Step.""" + return Step( + kind=EndpointStepIO.WAIT, + step_metrics={Metrics.IO_WAITING_TIME: value}, + ) + + +# --------------------------------------------------------------------------- # +# Positive test cases +# --------------------------------------------------------------------------- # +def test_valid_cpu_step() -> None: + """Test that a CPU step with correct 'cpu_time' metric passes validation.""" + step = cpu_step() + # The metric value must match the input + assert step.step_metrics[Metrics.CPU_TIME] == 0.1 + + +def test_valid_ram_step() -> None: + """Test that a RAM step with correct 'necessary_ram' metric passes validation.""" + step = ram_step() + assert step.step_metrics[Metrics.NECESSARY_RAM] == 128 + + +def test_valid_io_step() -> None: + """Test that an I/O step with correct 'io_waiting_time' metric passes validation.""" + step = io_step() + assert step.step_metrics[Metrics.IO_WAITING_TIME] == 0.05 + + +def test_endpoint_with_mixed_steps() -> None: + """Test that an Endpoint with multiple valid Step instances normalizes the name.""" + ep = Endpoint( + endpoint_name="/Predict", + steps=[cpu_step(), ram_step(), io_step()], + ) + # endpoint_name should be lowercased by the validator + assert ep.endpoint_name == "/predict" + # All steps should be present in the list + assert len(ep.steps) == 3 + + +# --------------------------------------------------------------------------- # +# Negative test cases +# --------------------------------------------------------------------------- # +@pytest.mark.parametrize( + ("kind", "bad_metrics"), + [ + # CPU step with RAM metric + (EndpointStepCPU.CPU_BOUND_OPERATION, {Metrics.NECESSARY_RAM: 64}), + # RAM step with CPU metric + (EndpointStepRAM.RAM, {Metrics.CPU_TIME: 0.2}), + # I/O step with CPU metric + (EndpointStepIO.DB, {Metrics.CPU_TIME: 0.05}), + ], +) +def test_incoherent_kind_metric_pair( + kind: EndpointStepCPU | EndpointStepRAM | EndpointStepIO, + bad_metrics: dict[Metrics, float | int], +) -> None: + """Test that mismatched 
kind and metric combinations raise ValidationError.""" + with pytest.raises(ValidationError): + Step(kind=kind, step_metrics=bad_metrics) + + +def test_multiple_metrics_not_allowed() -> None: + """Test that providing multiple metrics in a single Step raises ValidationError.""" + with pytest.raises(ValidationError): + Step( + kind=EndpointStepCPU.CPU_BOUND_OPERATION, + step_metrics={ + Metrics.CPU_TIME: 0.1, + Metrics.NECESSARY_RAM: 64, + }, + ) + + +def test_empty_metrics_rejected() -> None: + """Test that an empty metrics dict is rejected by the validator.""" + with pytest.raises(ValidationError): + Step(kind=EndpointStepCPU.CPU_BOUND_OPERATION, step_metrics={}) + + +def test_wrong_metric_name_for_io() -> None: + """Test that an I/O step with a non-I/O metric key is rejected.""" + with pytest.raises(ValidationError): + Step( + kind=EndpointStepIO.CACHE, + step_metrics={Metrics.NECESSARY_RAM: 64}, + ) diff --git a/tests/unit/input_sructure/test_full_topology_input.py b/tests/unit/input_sructure/test_full_topology_input.py new file mode 100644 index 0000000..9e17573 --- /dev/null +++ b/tests/unit/input_sructure/test_full_topology_input.py @@ -0,0 +1,170 @@ +"""Unit-tests for the **topology schemas** (Client, ServerResources, …). + +Every section below is grouped by the object under test, separated by +clear comment banners so that long files remain navigable. + +The tests aim for: +* 100 % branch-coverage on custom validators. +* mypy strict-compatibility (full type hints, no Any). +* ruff compliance (imports ordered, no unused vars, ≤ 88-char lines). +""" + +from __future__ import annotations + +import pytest +from pydantic import ValidationError + +from app.config.constants import ( + EndpointStepCPU, + Metrics, + ServerResourcesDefaults, + SystemEdges, + SystemNodes, +) +from app.schemas.random_variables_config import RVConfig +from app.schemas.system_topology_schema.endpoint_schema import Endpoint, Step +from app.schemas.system_topology_schema.full_system_topology_schema import ( + Client, + Edge, + Server, + ServerResources, + TopologyGraph, + TopologyNodes, +) + + +# --------------------------------------------------------------------------- # +# Client +# --------------------------------------------------------------------------- # +def test_valid_client() -> None: + """A client with correct `type` should validate.""" + cli = Client(id="frontend", type=SystemNodes.CLIENT) + assert cli.type is SystemNodes.CLIENT + + +def test_invalid_client_type() -> None: + """Wrong `type` enum on Client must raise ValidationError.""" + with pytest.raises(ValidationError): + Client(id="wrong", type=SystemNodes.SERVER) + + +# --------------------------------------------------------------------------- # +# ServerResources +# --------------------------------------------------------------------------- # +def test_server_resources_defaults() -> None: + """Default values must match the constant table.""" + res = ServerResources() # all defaults + assert res.cpu_cores == ServerResourcesDefaults.CPU_CORES + assert res.ram_mb == ServerResourcesDefaults.RAM_MB + assert res.db_connection_pool is ServerResourcesDefaults.DB_CONNECTION_POOL + + +def test_server_resources_min_constraints() -> None: + """cpu_cores and ram_mb < minimum should fail validation.""" + with pytest.raises(ValidationError): + ServerResources(cpu_cores=0, ram_mb=128) # too small + + +# --------------------------------------------------------------------------- # +# Server +# 
--------------------------------------------------------------------------- #
+def _dummy_endpoint() -> Endpoint:
+    """Return a minimal valid Endpoint needed to build a Server."""
+    step = Step(
+        kind=EndpointStepCPU.CPU_BOUND_OPERATION,
+        step_metrics={Metrics.CPU_TIME: 0.1},
+    )
+    return Endpoint(endpoint_name="/ping", steps=[step])
+
+
+def test_valid_server() -> None:
+    """Server with correct type, resources and endpoint list."""
+    srv = Server(
+        id="api-1",
+        type=SystemNodes.SERVER,
+        server_resources=ServerResources(cpu_cores=2, ram_mb=1024),
+        endpoints=[_dummy_endpoint()],
+    )
+    assert srv.id == "api-1"
+
+
+def test_invalid_server_type() -> None:
+    """Server with wrong `type` enum must be rejected."""
+    with pytest.raises(ValidationError):
+        Server(
+            id="oops",
+            type=SystemNodes.CLIENT,
+            server_resources=ServerResources(),
+            endpoints=[_dummy_endpoint()],
+        )
+
+
+# --------------------------------------------------------------------------- #
+# TopologyNodes
+# --------------------------------------------------------------------------- #
+def _single_node_topology() -> TopologyNodes:
+    """Helper that returns a valid TopologyNodes with one server and one client."""
+    srv = Server(
+        id="svc-A",
+        server_resources=ServerResources(),
+        endpoints=[_dummy_endpoint()],
+    )
+    cli = Client(id="browser")
+    return TopologyNodes(servers=[srv], client=cli)
+
+
+def test_unique_ids_validator() -> None:
+    """Duplicate node IDs should trigger the unique_ids validator."""
+    nodes = _single_node_topology()
+    # duplicate client ID
+    dup_srv = nodes.servers[0].model_copy(update={"id": "browser"})
+    with pytest.raises(ValidationError):
+        TopologyNodes(servers=[dup_srv], client=nodes.client)
+
+
+# --------------------------------------------------------------------------- #
+# Edge
+# --------------------------------------------------------------------------- #
+def test_edge_source_equals_target_fails() -> None:
+    """Edge with identical source/target must raise."""
+    latency_cfg = RVConfig(mean=0.05)
+    with pytest.raises(ValidationError):
+        Edge(
+            source="same",
+            target="same",
+            latency=latency_cfg,
+            edge_type=SystemEdges.NETWORK_CONNECTION,
+        )
+
+
+# --------------------------------------------------------------------------- #
+# TopologyGraph
+# --------------------------------------------------------------------------- #
+def _latency() -> RVConfig:
+    """A tiny helper for RVConfig latency objects."""
+    return RVConfig(mean=0.02)
+
+
+def test_valid_topology_graph() -> None:
+    """End-to-end happy-path graph passes validation."""
+    nodes = _single_node_topology()
+    edge = Edge(
+        source="browser",
+        target="svc-A",
+        latency=_latency(),
+        probability=1.0,
+    )
+    graph = TopologyGraph(nodes=nodes, edges=[edge])
+    assert len(graph.edges) == 1
+
+
+def test_edge_refers_unknown_node() -> None:
+    """Edge pointing to a non-existent node ID must fail."""
+    nodes = _single_node_topology()
+    bad_edge = Edge(
+        source="browser",
+        target="ghost-srv",
+        latency=_latency(),
+    )
+    with pytest.raises(ValidationError):
+        TopologyGraph(nodes=nodes, edges=[bad_edge])
diff --git a/tests/unit/simulation/test_requests_generator_input.py b/tests/unit/input_sructure/test_requests_generator_input.py
similarity index 95%
rename from tests/unit/simulation/test_requests_generator_input.py
rename to tests/unit/input_sructure/test_requests_generator_input.py
index bbd84ae..9fb49a5 100644
--- a/tests/unit/simulation/test_requests_generator_input.py
+++ b/tests/unit/input_sructure/test_requests_generator_input.py
@@ -2,7 +2,8 @@
 from pydantic import ValidationError
 
 from app.config.constants import Distribution, TimeDefaults
-from app.schemas.requests_generator_input import RVConfig, SimulationInput
+from app.schemas.random_variables_config import RVConfig
+from app.schemas.requests_generator_input import RqsGeneratorInput
 
 # --------------------------------------------------------------------------
 # TEST RANDOM VARIABLE CONFIGURATION
@@ -84,7 +85,7 @@ def test_invalid_distribution_raises() -> None:
 
 def test_default_user_sampling_window() -> None:
     """When user_sampling_window is omitted, it defaults to USER_SAMPLING_WINDOW."""
-    inp = SimulationInput(
+    inp = RqsGeneratorInput(
         avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
         avg_request_per_minute_per_user={
             "mean": 1.0,
@@ -97,7 +98,7 @@
 def test_explicit_user_sampling_window_kept() -> None:
     """An explicit user_sampling_window value is preserved unchanged."""
     custom_window = 30
-    inp = SimulationInput(
+    inp = RqsGeneratorInput(
         avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
         avg_request_per_minute_per_user={
             "mean": 1.0,
@@ -112,7 +113,7 @@
 def test_user_sampling_window_not_int_raises() -> None:
     """A non-integer user_sampling_window raises a ValidationError."""
     with pytest.raises(ValidationError) as excinfo:
-        SimulationInput(
+        RqsGeneratorInput(
             avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
             avg_request_per_minute_per_user={
                 "mean": 1.0,
@@ -136,7 +137,7 @@ def test_user_sampling_window_above_max_raises() -> None:
     """
     too_large = TimeDefaults.MAX_USER_SAMPLING_WINDOW + 1
     with pytest.raises(ValidationError) as excinfo:
-        SimulationInput(
+        RqsGeneratorInput(
             avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
             avg_request_per_minute_per_user={
                 "mean": 1.0,
@@ -161,7 +162,7 @@ def test_user_sampling_window_above_max_raises() -> None:
 
 def test_default_total_simulation_time() -> None:
     """When total_simulation_time is omitted, it defaults to SIMULATION_TIME."""
-    inp = SimulationInput(
+    inp = RqsGeneratorInput(
         avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
         avg_request_per_minute_per_user={
             "mean": 1.0,
@@ -174,7 +175,7 @@
 def test_explicit_total_simulation_time_kept() -> None:
     """An explicit total_simulation_time value is preserved unchanged."""
     custom_time = 3_000
-    inp = SimulationInput(
+    inp = RqsGeneratorInput(
         avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
         avg_request_per_minute_per_user={
             "mean": 1.0,
@@ -189,7 +190,7 @@
 def test_total_simulation_time_not_int_raises() -> None:
     """A non-integer total_simulation_time raises a ValidationError."""
     with pytest.raises(ValidationError) as excinfo:
-        SimulationInput(
+        RqsGeneratorInput(
             avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
             avg_request_per_minute_per_user={
                 "mean": 1.0,
@@ -213,7 +214,7 @@ def test_total_simulation_time_below_minimum_raises() -> None:
     """
     too_small = TimeDefaults.MIN_SIMULATION_TIME - 1
     with pytest.raises(ValidationError) as excinfo:
-        SimulationInput(
+        RqsGeneratorInput(
             avg_active_users={"mean": 1.0, "distribution": Distribution.POISSON},
             avg_request_per_minute_per_user={
                 "mean": 1.0,
diff --git a/tests/unit/sampler/test_gaussian_poisson.py b/tests/unit/sampler/test_gaussian_poisson.py
index f0abbf8..b464007 100644
--- a/tests/unit/sampler/test_gaussian_poisson.py
+++ b/tests/unit/sampler/test_gaussian_poisson.py
@@ -10,7 +10,8 @@
 from app.config.constants import TimeDefaults
 from app.core.event_samplers.gaussian_poisson import gaussian_poisson_sampling
-from app.schemas.requests_generator_input import RVConfig, SimulationInput
+from app.schemas.random_variables_config import RVConfig
+from app.schemas.requests_generator_input import RqsGeneratorInput
 
 
 # ---------------------------------------------------------------------------
 # Fixture
@@ -18,9 +19,9 @@
 
 
 @pytest.fixture
-def base_input() -> SimulationInput:
-    """Return a minimal, valid SimulationInput for the Gaussian-Poisson sampler."""
-    return SimulationInput(
+def base_input() -> RqsGeneratorInput:
+    """Return a minimal, valid RqsGeneratorInput for the Gaussian-Poisson sampler."""
+    return RqsGeneratorInput(
         avg_active_users=RVConfig(
             mean=10.0, variance=4.0, distribution="normal",
         ),
@@ -35,14 +36,14 @@ def base_input() -> SimulationInput:
 
 
 # ---------------------------------------------------------------------------
-def test_returns_generator_type(base_input: SimulationInput) -> None:
+def test_returns_generator_type(base_input: RqsGeneratorInput) -> None:
     """The function must return a generator object."""
     rng = np.random.default_rng(0)
     gen = gaussian_poisson_sampling(base_input, rng=rng)
     assert isinstance(gen, GeneratorType)
 
 
-def test_generates_positive_gaps(base_input: SimulationInput) -> None:
+def test_generates_positive_gaps(base_input: RqsGeneratorInput) -> None:
     """
     With nominal parameters the sampler should emit at least
     a few positive gaps and no gap must be non-positive.
@@ -67,7 +68,7 @@ def test_generates_positive_gaps(base_input: SimulationInput) -> None:
 
 def test_zero_users_produces_no_events(
     monkeypatch: pytest.MonkeyPatch,
-    base_input: SimulationInput,
+    base_input: RqsGeneratorInput,
 ) -> None:
     """
     If every Gaussian draw returns 0 users, Λ == 0,
diff --git a/tests/unit/sampler/test_poisson_posson.py b/tests/unit/sampler/test_poisson_posson.py
index 5d89686..48d4f9a 100644
--- a/tests/unit/sampler/test_poisson_posson.py
+++ b/tests/unit/sampler/test_poisson_posson.py
@@ -11,13 +11,14 @@
 
 from app.config.constants import TimeDefaults
 from app.core.event_samplers.poisson_poisson import poisson_poisson_sampling
-from app.schemas.requests_generator_input import RVConfig, SimulationInput
+from app.schemas.random_variables_config import RVConfig
+from app.schemas.requests_generator_input import RqsGeneratorInput
 
 
 @pytest.fixture
-def base_input() -> SimulationInput:
-    """Return a minimal-valid SimulationInput for the sampler tests."""
-    return SimulationInput(
+def base_input() -> RqsGeneratorInput:
+    """Return a minimal-valid RqsGeneratorInput for the sampler tests."""
+    return RqsGeneratorInput(
         # 1 average concurrent user …
         avg_active_users={"mean": 1.0, "distribution": "poisson"},
         # … sending on average 60 req/min → 1 req/s
@@ -32,7 +33,7 @@ def base_input() -> SimulationInput:
 
 
 # ---------------------------------------------------------------------
-def test_sampler_returns_generator(base_input: SimulationInput) -> None:
+def test_sampler_returns_generator(base_input: RqsGeneratorInput) -> None:
     """The function must return a real generator object."""
     rng = np.random.default_rng(0)
     gen = poisson_poisson_sampling(base_input, rng=rng)
@@ -40,7 +41,7 @@ def test_sampler_returns_generator(base_input: SimulationInput) -> None:
     assert isinstance(gen, GeneratorType)
 
 
-def test_all_gaps_are_positive(base_input: SimulationInput) -> None:
+def test_all_gaps_are_positive(base_input: RqsGeneratorInput) -> None:
     """Every yielded inter-arrival gap Δt must be > 0."""
     rng = np.random.default_rng(1)
     gaps: list[float] = list(
@@ -56,7 +57,7 @@ def test_all_gaps_are_positive(base_input: SimulationInput) -> None:
 
 
 # ---------------------------------------------------------------------
-def test_sampler_is_reproducible_with_fixed_seed(base_input: SimulationInput) -> None:
+def test_sampler_is_reproducible_with_fixed_seed(base_input: RqsGeneratorInput) -> None:
     """Same seed ⇒ identical first N gaps."""
     seed = 42
     n_samples = 15
@@ -86,12 +87,12 @@ def test_sampler_is_reproducible_with_fixed_seed(base_input: SimulationInput) ->
 
 
 # ---------------------------------------------------------------------
-def test_zero_users_produces_no_events(base_input: SimulationInput) -> None:
+def test_zero_users_produces_no_events(base_input: RqsGeneratorInput) -> None:
     """
     With mean concurrent users == 0 the Poisson draw is almost surely 0,
     so Λ = 0 and the generator should yield no events.
     """
-    input_data = SimulationInput(
+    input_data = RqsGeneratorInput(
         avg_active_users=RVConfig(mean=0.0, distribution="poisson"),
         avg_request_per_minute_per_user=RVConfig(mean=60.0, distribution="poisson"),
         total_simulation_time=TimeDefaults.MIN_SIMULATION_TIME,
@@ -108,7 +109,7 @@ def test_zero_users_produces_no_events(base_input: SimulationInput) -> None:
 
 
 # ---------------------------------------------------------------------
-def test_cumulative_time_never_exceeds_horizon(base_input: SimulationInput) -> None:
+def test_cumulative_time_never_exceeds_horizon(base_input: RqsGeneratorInput) -> None:
     """ΣΔt (virtual clock) must stay strictly below total_simulation_time."""
     rng = np.random.default_rng(7)
     gaps = list(poisson_poisson_sampling(base_input, rng=rng))
diff --git a/tests/unit/simulation/test_requests_generator.py b/tests/unit/simulation/test_requests_generator.py
index 5164413..9b77baf 100644
--- a/tests/unit/simulation/test_requests_generator.py
+++ b/tests/unit/simulation/test_requests_generator.py
@@ -11,7 +11,7 @@
 from app.config.constants import TimeDefaults
 from app.core.simulation.requests_generator import requests_generator
 from app.core.simulation.simulation_run import run_simulation
-from app.schemas.requests_generator_input import SimulationInput
+from app.schemas.requests_generator_input import RqsGeneratorInput
 
 
 if TYPE_CHECKING:
@@ -24,9 +24,9 @@
 
 # --------------------------------------------------------------
 @pytest.fixture
-def base_input() -> SimulationInput:
-    """Return a SimulationInput with a 120-second simulation horizon."""
-    return SimulationInput(
+def base_input() -> RqsGeneratorInput:
+    """Return a RqsGeneratorInput with a 120-second simulation horizon."""
+    return RqsGeneratorInput(
         avg_active_users={"mean": 1.0},
         avg_request_per_minute_per_user={"mean": 2.0},
         total_simulation_time=TimeDefaults.MIN_SIMULATION_TIME,
@@ -37,7 +37,7 @@ def base_input() -> SimulationInput:
 
 # --------------------------------------------------------------
 def test_default_requests_generator_uses_poisson_poisson_sampling(
-    base_input: SimulationInput,
+    base_input: RqsGeneratorInput,
 ) -> None:
     """
     Verify that when avg_active_users.distribution is the default 'poisson',
@@ -69,7 +69,7 @@ def test_requests_generator_dispatches_to_correct_sampler(
     - 'poisson' → poisson_poisson_sampling
     - 'normal' → gaussian_poisson_sampling
     """
-    input_data = SimulationInput(
+    input_data = RqsGeneratorInput(
         avg_active_users={"mean": 1.0, "distribution": dist},
         avg_request_per_minute_per_user={"mean": 1.0},
         total_simulation_time=TimeDefaults.MIN_SIMULATION_TIME,
@@ -87,7 +87,7 @@ def test_requests_generator_dispatches_to_correct_sampler(
 
 # --------------------------------------------------------------
 def test_run_simulation_counts_events_up_to_horizon(
-    monkeypatch: pytest.MonkeyPatch, base_input: SimulationInput,
+    monkeypatch: pytest.MonkeyPatch, base_input: RqsGeneratorInput,
 ) -> None:
     """
     Verify that all events whose cumulative inter-arrival times
@@ -96,7 +96,7 @@ def test_run_simulation_counts_events_up_to_horizon(
     yield 4 events by t=10.
     """
     def fake_requests_generator_fixed(
-        data: SimulationInput, *, rng: np.random.Generator,
+        data: RqsGeneratorInput, *, rng: np.random.Generator,
     ) -> Iterator[float]:
         # Replace the complex Poisson-Poisson sampler with a deterministic sequence.
         yield from [1.0, 2.0, 3.0, 4.0]
@@ -118,14 +118,14 @@ def fake_requests_generator_fixed(
 
 
 def test_run_simulation_includes_event_at_exact_horizon(
-    monkeypatch: pytest.MonkeyPatch, base_input: SimulationInput,
+    monkeypatch: pytest.MonkeyPatch, base_input: RqsGeneratorInput,
 ) -> None:
     """
     Confirm that an event scheduled exactly at the simulation horizon
     is not processed, since SimPy stops at t == horizon.
     """
     def fake_generator_at_horizon(
-        data: SimulationInput, *, rng: np.random.Generator,
+        data: RqsGeneratorInput, *, rng: np.random.Generator,
     ) -> Iterator[float]:
         # mypy assertion, pydantic guaranteed
@@ -146,14 +146,14 @@ def fake_generator_at_horizon(
 
 
 def test_run_simulation_excludes_event_beyond_horizon(
-    monkeypatch: pytest.MonkeyPatch, base_input: SimulationInput,
+    monkeypatch: pytest.MonkeyPatch, base_input: RqsGeneratorInput,
 ) -> None:
     """
     Ensure that events scheduled after the simulation horizon
     are not counted.
     """
     def fake_generator_beyond_horizon(
-        data: SimulationInput, *, rng: np.random.Generator,
+        data: RqsGeneratorInput, *, rng: np.random.Generator,
     ) -> Iterator[float]:
         # mypy assertion, pydantic guaranteed
@@ -173,14 +173,14 @@ def fake_generator_beyond_horizon(
 
 
 def test_run_simulation_zero_events_when_generator_empty(
-    monkeypatch: pytest.MonkeyPatch, base_input: SimulationInput,
+    monkeypatch: pytest.MonkeyPatch, base_input: RqsGeneratorInput,
 ) -> None:
     """
     Check that run_simulation reports zero requests when no
     inter-arrival times are yielded.
     """
     def fake_generator_empty(
-        data: SimulationInput, *, rng: np.random.Generator,
+        data: RqsGeneratorInput, *, rng: np.random.Generator,
     ) -> Iterator[float]:
         # Empty generator yields nothing.
         if False: