- Prioritize a TDD mindset even when authoring specs: capture expected behaviors, tests, and coverage goals before describing implementation details.
- When delivering work, outline the tests that must be written and ensure they are executed first; only proceed to implementation once failing tests exist.
- Check lint status first with `_flake8.ps1`; if issues appear, run `_autopep8.ps1`, apply any remaining manual fixes, then rerun `_flake8.ps1` to confirm a clean result before handoff.
- Maintain code coverage above 95%. If new work threatens this threshold, expand tests until the target is met.
- Create empty `__init__.py` files in new Python packages/directories to ensure proper module recognition.
- Avoid `__all__` or explicit export statements at the end of Python files; rely on natural module structure.
- Use descriptive variable names that clearly indicate their purpose and data type (e.g., `user_id` instead of `uid`, `is_authenticated` instead of `auth`).
- Document any assumptions, verification steps, and follow-up actions so future agents can continue seamlessly.
- Prefer the centralized config in `src/config/config.py` over direct `os.getenv` calls.
- Source code should follow SOLID principles.
- The project uses a modular architecture with clear separation: `src/config/` for configuration logic, `src/telemetry/` for observability, and core modules (`main.py`, `proxy.py`, `cli.py`, `utils.py`) for orchestration.
- Configuration flows through: environment loading → CLI parsing → YAML generation → proxy startup. Always respect this pipeline when making changes.
- The proxy wraps LiteLLM and exposes an OpenAI-compatible API. Changes to model handling should consider both LiteLLM's capabilities and OpenAI client expectations.
- Telemetry uses FastAPI middleware patterns. When adding instrumentation, follow the existing middleware structure in `src/telemetry/middleware.py`.
- Use context managers for temporary files and resources (see `utils.py` for examples).
- Model specifications are defined in `src/config/models.py` as `ModelSpec` dataclasses. Always validate new model additions against this structure.
- Environment variable parsing follows the pattern `MODEL_<KEY>_<PROPERTY>` for multi-model configs. Maintain this convention for consistency.
- Import style: Use `from __future__ import annotations` for type hints, prefer explicit imports over wildcards, use relative imports within `src/`.
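
As a rough illustration of the `MODEL_<KEY>_<PROPERTY>` convention, the sketch below groups matching environment variables by key. The `ModelSpec` shown here is a hypothetical stand-in (the real dataclass in `src/config/models.py` may have different fields), `parse_model_specs` is an illustrative name, and the sketch assumes `<KEY>` contains no underscore so the split point is unambiguous.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Mapping


@dataclass
class ModelSpec:
    # Hypothetical stand-in; the project's real ModelSpec may differ.
    key: str
    properties: Dict[str, str] = field(default_factory=dict)


# Assumption: KEY has no underscore, so the first underscore after
# "MODEL_" separates KEY from PROPERTY.
_MODEL_VAR = re.compile(r"^MODEL_(?P<key>[A-Za-z0-9]+)_(?P<prop>[A-Za-z0-9_]+)$")


def parse_model_specs(environ: Mapping[str, str]) -> Dict[str, ModelSpec]:
    """Group MODEL_<KEY>_<PROPERTY> variables into one spec per key."""
    specs: Dict[str, ModelSpec] = {}
    for name, value in environ.items():
        match = _MODEL_VAR.match(name)
        if match is None:
            continue  # not a model variable; leave it alone
        key = match["key"].lower()
        spec = specs.setdefault(key, ModelSpec(key=key))
        spec.properties[match["prop"].lower()] = value
    return specs


example_env = {
    "MODEL_GPT4_NAME": "gpt-4",
    "MODEL_GPT4_API_BASE": "https://example.invalid",
    "PATH": "/usr/bin",
}
example_specs = parse_model_specs(example_env)
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the parser easy to unit test without patching global state.
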
- Unit tests go in `tests/unit/`, integration tests in `tests/integration/`.
- Mock external dependencies in unit tests (LiteLLM, HTTP clients, file I/O).
- Integration tests should check for required environment variables and skip gracefully if missing.
- Use `pytest-mock` for mocking, `pytest-cov` for coverage reporting.
- Run full test suite with `pytest` before considering work complete.
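
One possible way to make integration tests skip gracefully is a reusable `pytest.mark.skipif` marker, sketched below. The variable names in `REQUIRED_ENV_VARS`, the helper `missing_env_vars`, and the test itself are hypothetical placeholders, not names taken from this project.

```python
import os
from typing import Iterable, List, Mapping, Optional

import pytest

# Hypothetical names; substitute the variables the integration test needs.
REQUIRED_ENV_VARS = ("EXAMPLE_API_KEY", "EXAMPLE_API_BASE")


def missing_env_vars(
    required: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


_missing = missing_env_vars(REQUIRED_ENV_VARS)

# Used as a decorator, this reports the test as skipped rather than failed.
requires_env = pytest.mark.skipif(
    bool(_missing),
    reason="missing environment variables: " + ", ".join(_missing),
)


@requires_env
def test_proxy_round_trip() -> None:
    # Placeholder body; a real test would exercise the running proxy.
    assert not missing_env_vars(REQUIRED_ENV_VARS)
```

Because the skip reason lists the missing variables, the pytest summary shows what to set in order to enable the test.
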
- The Docker setup uses volume mounts for live reloading during development (`docker-compose.yml`).
- Production image uses editable install (`pip install -e .`) to support hot-reloading.
- Environment variables can be passed via `.env` files or `--env-file` flag.
- The entrypoint script (`entrypoint.sh`) handles signal forwarding and graceful shutdown.
- Check `htmlcov/index.html` for coverage reports after running tests.
- Use `--print-config` flag to debug configuration generation without starting the proxy.
- Telemetry logs are structured JSON - parse them for debugging request flows.
- When proxy startup fails, check: environment variables, model API keys, LiteLLM version compatibility.
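
To follow a request through the structured telemetry logs, something like the sketch below may help. It assumes one JSON object per line and a correlating field called `request_id`; both are assumptions, since the actual log schema is not described here, and `group_by_request` is an illustrative name.

```python
import json
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_request(
    lines: Iterable[str],
    id_field: str = "request_id",  # assumed field name; adjust to the real schema
) -> Dict[str, List[Dict[str, Any]]]:
    """Group JSON log records by request identifier, preserving order."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip plain-text lines mixed into the log
        if isinstance(record, dict) and id_field in record:
            grouped[str(record[id_field])].append(record)
    return dict(grouped)


sample_log = [
    '{"request_id": "a1", "event": "received"}',
    "plain text startup banner",
    '{"request_id": "b2", "event": "received"}',
    '{"request_id": "a1", "event": "completed"}',
]
flows = group_by_request(sample_log)
```

Reading the records for a single identifier in order gives a per-request timeline, which is usually easier to scan than the interleaved raw log.
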