
Commit 230bc56

committed
code refactor + test added for runtime + documentation
1 parent 2d529aa commit 230bc56

40 files changed (+1182 −675 lines)

README.md

Lines changed: 158 additions & 68 deletions
@@ -1,86 +1,176 @@
1-
## How to Start the Backend with Docker (Development)
22

3-
To spin up the backend and its supporting services in development mode:
4-
5-
1. **Install & run Docker** on your machine.
6-
2. **Clone** the repository and `cd` into its root.
7-
3. Execute:
8-
9-
```bash
10-
bash ./scripts/init-docker-dev.sh
11-
```
12-
13-
This will launch:
14-
15-
* A **PostgreSQL** container
16-
* A **Backend** container that mounts your local `src/` folder with live-reload
17-
18-
---
19-
20-
## Development Architecture & Philosophy
21-
22-
We split responsibilities between Docker-managed services and local workflows:
3+
-----
4+
5+
# **FastSim Project Overview**
6+
7+
## **1. Why FastSim?**
8+
9+
FastAPI + Uvicorn gives Python teams a lightning-fast async stack, yet sizing it for production still means guesswork, costly cloud load-tests, or late surprises. **FastSim** fills that gap by becoming a **digital twin** of your actual service:
10+
11+
* It **replicates** your FastAPI + Uvicorn event-loop behavior in SimPy, generating the same kinds of asynchronous steps (parsing, CPU work, I/O, LLM calls) that happen in real code.
12+
* It **models** your infrastructure primitives—CPU cores (via a SimPy `Resource`), database pools, rate-limiters, and even GPU inference quotas—so you can see queue lengths, scheduling delays, resource utilization, and end-to-end latency.
13+
* It **outputs** the very metrics you would scrape in production (p50/p95/p99 latency, ready-queue lag, concurrency, throughput, cost per LLM call), but entirely offline, in seconds.
14+
15+
With FastSim you can ask, *“What happens if traffic doubles on Black Friday?”*, *“How many cores are needed to keep p95 latency below 100 ms?”*, or *“Is our LLM-driven endpoint ready for prime time?”*—and get quantitative answers **before** you deploy.
16+
17+
**Outcome:** Data-driven capacity planning, early performance tuning, and far fewer surprises in production.
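
To make the bullets above concrete, here is a minimal sketch of the kind of model FastSim builds internally: one CPU core represented as a SimPy `Resource`, requests arriving as a Poisson stream, each alternating a CPU-bound step with an I/O wait. All names, rates, and durations are illustrative assumptions, not FastSim's actual API.

```python
import random
import simpy

def handle_request(env, name, cpu):
    """One simulated request: queue for the core, burn CPU, then await I/O."""
    arrival = env.now
    with cpu.request() as slot:                        # queueing delay if the core is busy
        yield slot
        yield env.timeout(0.005)                       # ~5 ms of CPU-bound work (parsing, logic)
    yield env.timeout(random.expovariate(1 / 0.020))   # ~20 ms mean I/O / DB wait
    print(f"{name}: end-to-end latency {(env.now - arrival) * 1000:.1f} ms")

def traffic(env, cpu, rate_per_s=200):
    """Open-loop Poisson arrivals at `rate_per_s` requests per second."""
    i = 0
    while True:
        yield env.timeout(random.expovariate(rate_per_s))
        i += 1
        env.process(handle_request(env, f"req-{i}", cpu))

env = simpy.Environment()
cpu = simpy.Resource(env, capacity=1)   # a single event-loop worker pinned to one core
env.process(traffic(env, cpu))
env.run(until=1.0)                      # simulate one second of traffic
```

Adding more cores, connection pools, or rate-limiters amounts to adding more SimPy resources to the same kind of model, which is the direction the real topology takes.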
18+
19+
## **2. Project Goals**
20+
21+
| # | Goal | Practical Outcome |
22+
| :--- | :--- | :--- |
23+
| 1 | **Pre-production sizing** | Know the required core count, pool size, and replica count to meet your SLA. |
24+
| 2 | **Scenario analysis** | Explore various traffic models, endpoint mixes, latency distributions, and RTT. |
25+
| 3 | **Twin metrics** | Produce the same metrics you’ll scrape in production (latency, queue length, CPU utilization). |
26+
| 4 | **Rapid iteration** | A single YAML/JSON configuration or REST call generates a full performance report. |
27+
| 5 | **Educational value** | Visualize how GIL contention, queue length, and concurrency react to load. |
28+
29+
## **3. Who Benefits & Why**
30+
31+
| Audience | Pain-Point Solved | FastSim Value |
32+
| :--- | :--- | :--- |
33+
| **Backend Engineers** | Unsure if a 4-vCPU container can survive a marketing traffic spike. | Run *what-if* scenarios, tweak CPU cores or pool sizes, and get p95 latency and max-concurrency metrics before merging code. |
34+
| **DevOps / SRE** | Guesswork in capacity planning; high cost of over-provisioning. | Simulate 1 to N replicas, autoscaler thresholds, and database pool sizes to find the most cost-effective configuration that meets the SLA. |
35+
| **ML / LLM Product Teams** | LLM inference cost and latency are difficult to predict. | Model the LLM step with a price and latency distribution to estimate cost-per-request and the benefits of GPU batching without needing real GPUs. |
36+
| **Educators / Trainers** | Students struggle to visualize event-loop internals. | Visualize GIL ready-queue lag, CPU vs. I/O steps, and the effect of blocking code—perfect for live demos and labs. |
37+
| **Consultants / Architects** | Need a quick proof-of-concept for new client designs. | Define endpoints in YAML and demonstrate throughput and latency under projected load in minutes. |
38+
| **Open-Source Community** | Lacks a lightweight Python simulator for ASGI workloads. | An extensible codebase makes it easy to plug in new resources (e.g., rate-limiters, caches) or traffic models (e.g., spike, uniform ramp). |
39+
| **System-Design Interviewees** | Hard to quantify trade-offs in whiteboard interviews. | Prototype real-time metrics—queue lengths, concurrency, latency distributions—to demonstrate how your design scales and where bottlenecks lie. |
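
As a back-of-the-envelope illustration of the LLM row above, the cost-per-request question can be answered by sampling a latency and token-count distribution. Everything below (distribution parameters, price) is a made-up assumption for demonstration only, not FastSim output.

```python
import random
import statistics

random.seed(42)

PRICE_PER_1K_TOKENS = 0.002   # illustrative price in $ per 1k generated tokens
N = 10_000

latencies_ms, costs = [], []
for _ in range(N):
    latencies_ms.append(random.lognormvariate(mu=6.0, sigma=0.5))  # median ≈ 400 ms
    tokens = random.randint(200, 800)                               # assumed output length
    costs.append(tokens / 1000 * PRICE_PER_1K_TOKENS)

latencies_ms.sort()
print(f"p95 LLM latency ≈ {latencies_ms[int(0.95 * N)]:.0f} ms")
print(f"mean cost per request ≈ ${statistics.mean(costs):.5f}")
```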
40+
41+
## **4. About This Documentation**
42+
43+
This project contains extensive documentation covering its vision, architecture, and technical implementation. The documents are designed to be read in sequence to build a comprehensive understanding of the project.
44+
45+
### **How to Read This Documentation**
46+
47+
For the best understanding of FastSim, we recommend reading the documentation in the following order:
48+
49+
1. **README.md (This Document)**: Start here for a high-level overview of the project's purpose, goals, target audience, and development workflow. It provides the essential context for all other documents.
50+
2. **dev_worflow_guide**: This document details the GitHub workflow used for development.
51+
3. **simulation_input**: This document details the technical contract for configuring a simulation. It explains the `SimulationPayload` and its components (`rqs_input`, `topology_graph`, `sim_settings`). This is essential reading for anyone who will be creating or modifying simulation configurations.
52+
4. **runtime_and_resources**: A deep dive into the simulation's internal engine. It explains how the validated input is transformed into live SimPy processes (Actors, Resources, State). This is intended for advanced users or contributors who want to understand *how* the simulation works under the hood.
53+
5. **requests_generator**: This document covers the mathematical and algorithmic details behind the traffic generation model. It is for those interested in the statistical foundations of the simulator.
54+
6. **Simulation Metrics**: A comprehensive guide to all output metrics. It explains what each metric measures, how it's collected, and why it's important for performance analysis.
55+
56+
Optional: **fastsim_vision**, a more detailed document about the project vision.
57+
58+
You can find all of these documents in the `documentation/` folder at the root of the project.
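
For orientation before diving into `simulation_input`, the skeleton below shows only the three top-level `SimulationPayload` keys named above; every nested field, and the REST path in the comment, is a placeholder assumption rather than the real schema.

```python
# Hedged sketch: only the three top-level keys come from the documentation.
simulation_payload = {
    "rqs_input": {
        # traffic model: arrival process, rates, duration (see requests_generator)
    },
    "topology_graph": {
        # servers, endpoints, and the resources/steps they traverse (see simulation_input)
    },
    "sim_settings": {
        # global knobs: simulated time horizon, random seed, enabled metrics
    },
}

# A REST call could then submit it; the path below is an assumption, not the real route:
# import httpx
# report = httpx.post("http://localhost:8000/simulations", json=simulation_payload).json()
```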
59+
60+
## **5. Development Workflow & Architecture Guide**
61+
62+
This section outlines the standardized development workflow, repository architecture, and branching strategy for the FastSim backend.
63+
64+
### **Technology Stack**
65+
66+
* **Backend**: FastAPI
67+
* **Backend Package Manager**: Poetry
68+
* **Frontend**: React + JavaScript
69+
* **Database**: PostgreSQL
70+
* **Caching**: Redis
71+
* **Containerization**: Docker
72+
73+
### **Backend Service (`FastSim-backend`)**
74+
75+
The repository hosts the entire FastAPI backend, which exposes the REST API, runs the discrete-event simulation, communicates with the database, and provides metrics.
76+
77+
```
78+
fastsim-backend/
79+
├── Dockerfile
80+
├── docker_fs/
81+
│ ├── docker-compose.dev.yml
82+
│ └── docker-compose.prod.yml
83+
├── scripts/
84+
│ ├── init-docker-dev.sh
85+
│ └── quality-check.sh
86+
├── alembic/
87+
│ ├── env.py
88+
│ └── versions/
89+
├── documentation/
90+
│ └── backend_documentation/
91+
├── tests/
92+
│ ├── unit/
93+
│ └── integration/
94+
├── src/
95+
│ └── app/
96+
│ ├── api/
97+
│ ├── config/
98+
│ ├── db/
99+
│ ├── metrics/
100+
│ ├── resources/
101+
│ ├── runtime/
102+
│ │ ├── rqs_state.py
103+
│ │ └── actors/
104+
│ ├── samplers/
105+
│ ├── schemas/
106+
│ ├── main.py
107+
│ └── simulation_run.py
108+
├── poetry.lock
109+
├── pyproject.toml
110+
└── README.md
111+
```
112+
113+
### **How to Start the Backend with Docker (Development)**
23114

24-
### 🐳 Docker-Compose Dev
25-
26-
* **Containers** host external services (PostgreSQL) and run the FastAPI app.
27-
* Your **local `src/` directory** is mounted into the backend container for hot-reload.
28-
* **No tests, migrations, linting, or type checks** run inside these containers during development.
29-
30-
**Why?**
31-
32-
* **Faster feedback** on code changes
33-
* **Full IDE support** (debugging, autocomplete, refactoring)
34-
* **Speed**—no rebuilding images on every change
35-
36-
---
37-
38-
### Local Quality & Testing Workflow
39-
40-
All code quality tools, migrations, and tests execute on your host machine:
41-
42-
| Task | Command | Notes |
43-
| --------------------- | ---------------------------------------- | ------------------------------------------------- |
44-
| **Lint & format** | `poetry run ruff check src tests` | Style and best-practice validations |
45-
| **Type checking** | `poetry run mypy src tests` | Static type enforcement |
46-
| **Unit tests** | `poetry run pytest -m "not integration"` | Fast, isolated tests—no DB required |
47-
| **Integration tests** | `poetry run pytest -m integration` | Real-DB tests against Docker’s PostgreSQL |
48-
| **DB migrations** | `poetry run alembic upgrade head` | Applies migrations to your local Docker-hosted DB |
115+
To spin up the backend and its supporting services in development mode:
49116

50-
> **Rationale:**
51-
> Running tests or Alembic migrations inside Docker images would force you to mount the full source tree, install dev dependencies in each build, and copy over configs—**slowing down** your feedback loop and **limiting** IDE features.
117+
1. **Install & run Docker** on your machine.
118+
2. **Clone** the repository and `cd` into its root.
119+
3. Execute:
120+
```bash
121+
bash ./scripts/init-docker-dev.sh
122+
```
123+
This will launch a **PostgreSQL** container and a **Backend** container that mounts your local `src/` folder with live-reload enabled.
52124

53-
---
125+
### **Development Architecture & Philosophy**
54126

55-
## CI/CD with GitHub Actions
127+
We split responsibilities between Docker-managed services and local workflows.
56128

57-
We maintain two jobs on the `develop` branch:
129+
* **Docker-Compose for Development**: Containers host external services (PostgreSQL) and run the FastAPI app. Your local `src/` directory is mounted into the backend container for hot-reloading. No tests, migrations, or linting run inside these containers during development.
130+
* **Local Quality & Testing Workflow**: All code quality tools, migrations, and tests are executed on your host machine for faster feedback and full IDE support.
58131

59-
### 🔍 Quick (on Pull Requests)
132+
| Task | Command | Notes |
133+
| :--- | :--- | :--- |
134+
| **Lint & format** | `poetry run ruff check src tests` | Style and best-practice validations |
135+
| **Type checking** | `poetry run mypy src tests` | Static type enforcement |
136+
| **Unit tests** | `poetry run pytest -m "not integration"` | Fast, isolated tests—no DB required |
137+
| **Integration tests** | `poetry run pytest -m integration` | Real-DB tests against Docker’s PostgreSQL |
138+
| **DB migrations** | `poetry run alembic upgrade head` | Applies migrations to your local Docker-hosted DB |
60139

61-
* Ruff & MyPy
62-
* Unit tests only
63-
* **No database**
140+
**Rationale**: Running tests or Alembic migrations inside Docker images would slow down your feedback loop and limit IDE features by requiring you to mount the full source tree and install dev dependencies in each build.
64141

65-
### 🛠️ Full (on pushes to `develop`)
142+
## **6. CI/CD with GitHub Actions**
66143

67-
* All **Quick** checks
68-
* Start a **PostgreSQL** service container
69-
* Run **Alembic** migrations
70-
* Execute **unit + integration** tests
71-
* Build the **Docker** image
72-
* **Smoke-test** the `/health` endpoint
144+
We maintain two jobs on the `develop` branch to ensure code quality and stability.
73145

74-
> **Guarantee:** Every commit in `develop` is style-checked, type-safe, DB-tested, and Docker-ready.
146+
### **Quick (on Pull Requests)**
75147

76-
---
148+
* Ruff & MyPy checks
149+
* Unit tests only
150+
* **No database required**
77151

78-
## Summary
152+
### **Full (on pushes to `develop`)**
79153

80-
1. **Docker-Compose** for services & hot-reload of the app code
81-
2. **Local** execution of migrations, tests, and QA for speed and IDE integration
82-
3. **CI pipeline** split into quick PR checks and full develop-branch validation
154+
* All checks from the "Quick" suite
155+
* Starts a **PostgreSQL** service container
156+
* Runs **Alembic** migrations
157+
* Executes the **full test suite** (unit + integration)
158+
* Builds the **Docker** image
159+
* **Smoke-tests** the `/health` endpoint of the built container
83160

161+
**Guarantee**: Every commit in `develop` is style-checked, type-safe, database-tested, and Docker-ready.
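
For reference, the smoke-test step in the full job boils down to something like the sketch below, assuming the freshly built container is published on `localhost:8000`; the actual workflow step may differ.

```python
import sys
import urllib.request

# Minimal /health smoke test against the container started by the CI job.
try:
    with urllib.request.urlopen("http://localhost:8000/health", timeout=5) as resp:
        ok = resp.status == 200
except OSError:
    ok = False

print("health check passed" if ok else "health check FAILED")
sys.exit(0 if ok else 1)
```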
84162

163+
## **7. Limitations – v0.1 (First Public Release)**
85164

165+
1. **Network Delay Model**
166+
* Only pure transport latency is simulated.
167+
* Bandwidth-related effects (e.g., payload size, link speed, congestion) are NOT accounted for.
168+
2. **Concurrency Model**
169+
* The service exposes **async-only endpoints**.
170+
* Execution runs on a single `asyncio` event-loop thread.
171+
* No thread-pool workers or multi-process setups are supported yet; therefore, concurrency is limited to coroutine scheduling (cooperative, single-thread).
172+
3. **CPU Core Allocation**
173+
* Every server instance is pinned to **one physical CPU core**.
174+
* Horizontal scaling must be achieved via multiple containers/VMs, not via multi-core utilization inside a single process.
86175

176+
These constraints will be revisited in future milestones once kernel-level context-switching costs, I/O bandwidth modeling, and multi-process orchestration are integrated.
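
To see why the async-only, single-thread constraint in point 2 matters, the plain-`asyncio` sketch below (not FastSim code) shows how one blocking call stalls every in-flight coroutine, which is the kind of behavior the simulator's cooperative, single-thread model is meant to capture.

```python
import asyncio
import time

t0 = time.perf_counter()

async def async_handler(i):
    await asyncio.sleep(0.1)     # non-blocking I/O: the loop keeps serving other coroutines
    print(f"async handler {i} finished at {time.perf_counter() - t0:.2f}s")

async def blocking_handler():
    time.sleep(0.5)              # blocking call: freezes the single event-loop thread
    print(f"blocking handler finished at {time.perf_counter() - t0:.2f}s")

async def main():
    await asyncio.gather(blocking_handler(), *(async_handler(i) for i in range(3)))

# The async handlers finish ~0.6 s in, not 0.1 s, because the blocking call
# monopolized the only thread -- the cooperative, single-core limitation above.
asyncio.run(main())
```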
