|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +## Project Overview |
| 4 | + |
| 5 | +ROCK (Reinforcement Open Construction Kit) is a sandbox environment management framework for agentic AI / reinforcement learning. It provides multi-protocol sandbox lifecycle management with Docker, Ray, and Kubernetes deployment backends. |
| 6 | + |
| 7 | +Package name: `rl-rock`, version 1.3.0, Python 3.10–3.12. |
| 8 | + |
| 9 | +## Quick Reference |
| 10 | + |
| 11 | +```bash |
| 12 | +# Setup |
| 13 | +make init # Create venv, install deps, hooks, preflight checks |
| 14 | +uv sync --all-extras --all-groups # Install all dependencies |
| 15 | + |
| 16 | +# Test |
| 17 | +uv run pytest -m "not need_ray and not need_admin and not need_admin_and_network" --reruns 1 # Fast tests |
| 18 | +uv run pytest -m "need_ray" --reruns 1 # Ray tests |
| 19 | +uv run pytest -m "need_admin" --reruns 1 # Admin tests |
| 20 | + |
| 21 | +# Lint / Format |
| 22 | +uv run ruff check --fix . # Lint with autofix |
| 23 | +uv run ruff format . # Format |
| 24 | +``` |
| 25 | + |
| 26 | +## Architecture |
| 27 | + |
| 28 | +### Services (Entry Points) |
| 29 | + |
| 30 | +| Command | Module | Role | |
| 31 | +|-----------|---------------------------|----------------------------------------------------| |
| 32 | +| `rock` | `rock.cli.main` | CLI tool (admin start, sandbox build/push/run) | |
| 33 | +| `admin` | `rock.admin.main` | FastAPI orchestrator — sandbox lifecycle via API | |
| 34 | +| `rocklet` | `rock.rocklet.server` | FastAPI proxy — runs inside containers, executes commands | |
| 35 | +| `envhub` | `rock.envhub.server` | Environment repository CRUD server | |
| 36 | + |
| 37 | +### Core Module Map |
| 38 | + |
| 39 | +``` |
| 40 | +rock/ |
| 41 | +├── admin/ # Admin service: API routers, Ray service, scheduler, metrics |
| 42 | +├── sandbox/ # SandboxManager, Operators (Ray/K8s), SandboxActor |
| 43 | +├── deployments/ # AbstractDeployment → Docker/Ray/Local/Remote, configs, validator |
| 44 | +├── rocklet/ # Lightweight sandbox runtime server |
| 45 | +├── sdk/ # Client SDK: Sandbox client, agent integrations, EnvHub client |
| 46 | +├── envhub/ # Environment hub service with SQLModel database |
| 47 | +├── actions/ # Request/response models for sandbox and env actions |
| 48 | +├── config.py # Dataclass-based config (RayConfig, K8sConfig, RuntimeConfig, ...) |
| 49 | +├── env_vars.py # Environment variables with lazy defaults via __getattr__ |
| 50 | +├── cli/ # CLI commands (argparse) |
| 51 | +└── utils/ # Docker wrapper, Redis/Nacos providers, HTTP, retry, crypto, etc. |
| 52 | +``` |
| 53 | + |
| 54 | +### Key Patterns |
| 55 | + |
| 56 | +- **Operator pattern**: `AbstractOperator` → `RayOperator` / `K8sOperator` — decouples scheduling from execution |
| 57 | +- **Deployment hierarchy**: `AbstractDeployment` → `DockerDeployment` → `RayDeployment`, plus `LocalDeployment`, `RemoteDeployment` |
| 58 | +- **Actor pattern (Ray)**: `SandboxActor` (remote, detached) wraps a `DockerDeployment` instance |
| 59 | +- **Config flow**: `SandboxManager` → `DeploymentManager.init_config()` (normalize config) → `Operator.submit()` (orchestrate) |
| 60 | +- **Validation**: `SandboxValidator` / `DockerSandboxValidator` → `DockerUtil` (shell out to docker CLI) |
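
The operator pattern listed above can be illustrated with a minimal sketch. Only the class names `AbstractOperator`, `RayOperator`, `K8sOperator` and the method name `submit` come from this document; `SandboxSpec`, `make_operator`, the synchronous signature, and the returned strings are hypothetical stand-ins, not the project's actual code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SandboxSpec:
    """Hypothetical normalized config handed to an operator."""
    sandbox_id: str
    image: str


class AbstractOperator(ABC):
    """Scheduling interface; callers depend on this, not on a backend."""

    @abstractmethod
    def submit(self, spec: SandboxSpec) -> str:
        """Schedule a sandbox and return a handle (assumed shape)."""


class RayOperator(AbstractOperator):
    def submit(self, spec: SandboxSpec) -> str:
        # A real implementation would presumably start a detached SandboxActor.
        return f"ray:{spec.sandbox_id}"


class K8sOperator(AbstractOperator):
    def submit(self, spec: SandboxSpec) -> str:
        # A real implementation would presumably create Kubernetes resources.
        return f"k8s:{spec.sandbox_id}"


# Hypothetical lookup keyed by an operator type such as the `operator_type` setting.
OPERATORS = {"ray": RayOperator, "k8s": K8sOperator}


def make_operator(operator_type: str) -> AbstractOperator:
    return OPERATORS[operator_type]()
```

Because the caller only sees `AbstractOperator`, swapping the backend is a one-line config change rather than a code change, which is the decoupling the pattern aims for.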
| 61 | + |
| 62 | +## Code Conventions |
| 63 | + |
| 64 | +### Style |
| 65 | + |
| 66 | +- Line length: 120 (`ruff`) |
| 67 | +- Lint rules: `E, F, I, W, UP` (pycodestyle errors, pyflakes, isort, pycodestyle warnings, pyupgrade) |
| 68 | +- Ignored: `E501` (line length), `F811` (redefinition), `E741` (ambiguous names) |
| 69 | +- Import order: stdlib → third-party → local, managed by `ruff` isort |
| 70 | + |
| 71 | +### Naming |
| 72 | + |
| 73 | +- Classes: `PascalCase` (`SandboxManager`, `RayOperator`) |
| 74 | +- Functions/methods: `snake_case` |
| 75 | +- Constants: `UPPER_SNAKE_CASE` |
| 76 | +- Private: `_leading_underscore` |
| 77 | + |
| 78 | +### Logging |
| 79 | + |
| 80 | +- Always use `from rock.logger import init_logger; logger = init_logger(__name__)` |
| 81 | +- Context vars for distributed tracing: `sandbox_id_ctx_var`, `trace_id_ctx_var` |
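
As a rough illustration of how context variables can carry tracing IDs into log records, the sketch below uses only the standard library. The variable names `sandbox_id_ctx_var` and `trace_id_ctx_var` come from this document; the `ContextFilter` class and the `"-"` defaults are assumptions and do not describe what `rock.logger` actually does.

```python
import contextvars
import logging

# Names taken from the conventions above; the "-" defaults are assumed.
sandbox_id_ctx_var = contextvars.ContextVar("sandbox_id", default="-")
trace_id_ctx_var = contextvars.ContextVar("trace_id", default="-")


class ContextFilter(logging.Filter):
    """Hypothetical filter that copies the current context onto each record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.sandbox_id = sandbox_id_ctx_var.get()
        record.trace_id = trace_id_ctx_var.get()
        return True  # never drop the record, only annotate it


def build_logger(name: str) -> logging.Logger:
    """Attach the filter and a format string that prints both IDs."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(levelname)s [%(sandbox_id)s %(trace_id)s] %(message)s")
    )
    handler.addFilter(ContextFilter())
    logger.addHandler(handler)
    return logger
```

Each asyncio task gets its own copy of the context, so a value set inside one request handler does not leak into log lines from another.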
| 82 | + |
| 83 | +### Error Handling |
| 84 | + |
| 85 | +- Custom exceptions: `BadRequestRockError` (4xxx), `InternalServerRockError` (5xxx), `CommandRockError` (6xxx) |
| 86 | +- Status codes defined in `rock._codes` |
| 87 | + |
| 88 | +### Data Models |
| 89 | + |
| 90 | +- Pydantic v2 for API request/response models and deployment configs |
| 91 | +- `dataclass` for internal configs (`RockConfig`, `RayConfig`, etc.) |
| 92 | +- SQLModel for database ORM (`envhub`) |
| 93 | + |
| 94 | +### Async |
| 95 | + |
| 96 | +- `asyncio_mode = "auto"` in pytest — all async tests run automatically |
| 97 | +- FastAPI async handlers throughout |
| 98 | +- Ray async operations via `async_ray_get()`, `async_ray_get_actor()` |
| 99 | + |
| 100 | +## Testing |
| 101 | + |
| 102 | +### Structure |
| 103 | + |
| 104 | +``` |
| 105 | +tests/ |
| 106 | +├── unit/ # Fast isolated tests |
| 107 | +│ ├── conftest.py # Fixtures: rock_config, redis_provider (FakeRedis), sandbox_manager, ray_* |
| 108 | +│ ├── sandbox/ # SandboxManager tests |
| 109 | +│ ├── rocklet/ # Rocklet tests |
| 110 | +│ └── admin/ # Admin tests |
| 111 | +├── integration/ # Tests needing external services (Docker, network) |
| 112 | +│ └── conftest.py # SKIP_IF_NO_DOCKER, rocklet/admin remote servers |
| 113 | +└── conftest.py # Global config |
| 114 | +``` |
| 115 | + |
| 116 | +### Markers |
| 117 | + |
| 118 | +| Marker | Purpose | |
| 119 | +|----------------------------|--------------------------------------| |
| 120 | +| `@pytest.mark.need_ray` | Requires running Ray cluster | |
| 121 | +| `@pytest.mark.need_admin` | Requires admin service | |
| 122 | +| `@pytest.mark.need_admin_and_network` | Requires admin + network | |
| 123 | +| `@pytest.mark.slow` | Long-running tests | |
| 124 | +| `@pytest.mark.integration` | Integration tests | |
| 125 | + |
| 126 | +Strict markers enabled (`--strict-markers`). All markers must be registered in `pyproject.toml`. |
| 127 | + |
| 128 | +### CI Pipeline (`.github/workflows/python-ci.yml`) |
| 129 | + |
| 130 | +Runs in four phases on a `self-hosted` runner: |
| 131 | +1. Fast tests (no external deps) |
| 132 | +2. Ray-dependent tests |
| 133 | +3. Admin-dependent tests |
| 134 | +4. Network-dependent tests |
| 135 | + |
| 136 | +## Configuration |
| 137 | + |
| 138 | +### Environment Variables |
| 139 | + |
| 140 | +All defined in `rock/env_vars.py` with lazy evaluation via module `__getattr__`. Key variables: |
| 141 | + |
| 142 | +| Variable | Default | Purpose | |
| 143 | +|-------------------------|------------------------|-----------------------------| |
| 144 | +| `ROCK_ADMIN_ENV` | `dev` | Environment: local/dev/test | |
| 145 | +| `ROCK_WORKER_ENV_TYPE` | `local` | Worker type: local/docker/uv/pip | |
| 146 | +| `ROCK_CONFIG` | (none) | Path to YAML config file | |
| 147 | +| `ROCK_BASE_URL` | `http://localhost:8080` | Admin service URL | |
| 148 | +| `ROCK_LOGGING_LEVEL` | `INFO` | Log level | |
| 149 | +| `ROCK_TIME_ZONE` | `Asia/Shanghai` | Timezone | |
| 150 | +| `ROCK_RAY_NAMESPACE` | `xrl-sandbox` | Ray namespace | |
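
The lazy-default mechanism mentioned above relies on module-level `__getattr__` (PEP 562). The sketch below shows the general technique, not the contents of `rock/env_vars.py`: the `_LAZY_DEFAULTS` registry and the throwaway module object are assumptions made so the example runs as a single script. The two defaults are taken from the table above.

```python
import os
import types

# Hypothetical registry: variable name -> zero-argument callable producing its value.
_LAZY_DEFAULTS = {
    "ROCK_LOGGING_LEVEL": lambda: os.getenv("ROCK_LOGGING_LEVEL", "INFO"),
    "ROCK_TIME_ZONE": lambda: os.getenv("ROCK_TIME_ZONE", "Asia/Shanghai"),
}


def _lazy_getattr(name: str):
    """Module-level __getattr__: invoked only when normal attribute lookup fails."""
    try:
        return _LAZY_DEFAULTS[name]()
    except KeyError:
        raise AttributeError(f"module has no attribute {name!r}") from None


# In a real file this would simply be `def __getattr__(name)` at module top level.
# A module object is built by hand here so the sketch is self-contained.
env_vars = types.ModuleType("env_vars_sketch")
env_vars.__getattr__ = _lazy_getattr
```

Since the environment is read at access time rather than import time, a test can set `os.environ[...]` after importing the module and still see the new value.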
| 151 | + |
| 152 | +### YAML Config (`rock-conf/`) |
| 153 | + |
| 154 | +Loaded by `RockConfig.from_env()`. Files: `rock-local.yml`, `rock-dev.yml`, `rock-test.yml`. |
| 155 | + |
| 156 | +Key sections: `ray`, `k8s`, `runtime` (operator_type, standard_spec, max_allowed_spec), `redis`, `proxy_service`, `scheduler`. |
| 157 | + |
| 158 | +## Git Workflow |
| 159 | + |
| 160 | +### Branch & PR Rules |
| 161 | + |
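| 162 | +1. **Create an Issue first**: every code change must start with a GitHub Issue describing the problem or requirement |
| 163 | +2. **Create a branch**: branch off `master` for development |
| 164 | +3. **PRs must link an Issue**: CI enforces this via `pr-issue-link-check.yml`, and PRs without a linked Issue are blocked |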
| 165 | + |
| 166 | +Ways to link a PR to an Issue (either one works): |
| 167 | +- Use a keyword in the PR body: `fixes #123`, `closes #123`, `resolves #123`, `refs #123` |
| 168 | +- Include the issue number in the PR title: `[FEATURE] Add new feature (#123)` |
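
The two linking rules above can be approximated locally before pushing. This is a sketch under stated assumptions: the actual logic in `pr-issue-link-check.yml` is not shown in this document, so the regular expressions, the case-insensitive matching, and the function name `linked_issue` are all hypothetical.

```python
import re
from typing import Optional

# Keywords from the rules above, followed by an issue reference such as "#123".
# Case-insensitive matching is an assumption.
BODY_PATTERN = re.compile(r"\b(?:fixes|closes|resolves|refs)\s+#(\d+)", re.IGNORECASE)

# An issue number in parentheses in the title, e.g. "(#123)".
TITLE_PATTERN = re.compile(r"\(#(\d+)\)")


def linked_issue(title: str, body: str) -> Optional[int]:
    """Return the linked issue number, or None if neither rule matches."""
    match = BODY_PATTERN.search(body) or TITLE_PATTERN.search(title)
    return int(match.group(1)) if match else None
```

A call such as `linked_issue("feat: add my feature", "fixes #123")` yields the issue number, while a PR with no keyword in the body and no number in the title yields `None`.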
| 169 | + |
| 170 | +```bash |
| 171 | +# Typical workflow |
| 172 | +# 1. Create an Issue on GitHub |
| 173 | +# 2. Create a local branch |
| 174 | +git checkout -b feat/my-feature master |
| 175 | + |
| 176 | +# 3. Develop and commit |
| 177 | +git add <files> |
| 178 | +git commit -m "feat: add my feature" |
| 179 | + |
| 180 | +# 4. Push and open a PR (link the issue in the body) |
| 181 | +git push -u origin feat/my-feature |
| 182 | +gh pr create --title "feat: add my feature" --body "fixes #123" |
| 183 | +``` |
| 184 | + |
| 185 | +### Pre-commit Hooks |
| 186 | + |
| 187 | +- `ruff --fix` (lint) + `ruff format` (format) |
| 188 | +- Custom hook: prevents mixing internal (`xrl/`, `intetest/`) and external files in one commit |
| 189 | + |
| 190 | +## Dependencies |
| 191 | + |
| 192 | +Core: FastAPI, Pydantic v2, Ray 2.43.0, Redis, Docker CLI, Kubernetes client, APScheduler, OpenTelemetry, httpx. |
| 193 | + |
| 194 | +Package manager: `uv` (Rust-based). Build: setuptools + wheel. |