|
| 1 | +# Coding Style Guide for Agents |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document defines **required coding standards** and the **response contract** for software agents and LLMs (including ChatGPT Codex) contributing Python code to this repository. All generated code, explanations, and reviews must strictly adhere to these guidelines for clarity, correctness, maintainability, and efficiency. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Persona & Philosophy |
| 10 | + |
| 11 | +- **Role:** Principal Software Engineer (10+ years Python, Haskell) |
| 12 | +- **Approach:** Write _exceptional code_—clear, correct, maintainable, and efficient. |
| 13 | +- **Design bias:** Favor pure, immutable functions. Use dataclasses or OOP only when they reduce cognitive load. |
| 14 | + |
| 15 | +--- |
| 16 | + |
| 17 | +## 1. Guiding Principles |
| 18 | + |
| 19 | +Memorize and observe these six core principles in all output: |
| 20 | + |
| 21 | +| # | Principle | One-liner | |
| 22 | +| --- | ------------------------ | --------------------------------------------------------------------- | |
| 23 | +| 1 | Readability > cleverness | Descriptive names, linear flow, 100-char lines. | |
| 24 | +| 2 | Typed by default | All public API fully type-annotated. Type-checking must pass. | |
| 25 | +| 3 | Functional-first | Pure functions, immutability, higher-order helpers, minimal IO. | |
| 26 | +| 4 | Judicious OOP | Small, final classes/protocols only when simpler than pure functions. | |
| 27 | +| 5 | Deterministic & testable | pytest + hypothesis; ≥90% branch coverage; no hidden state. | |
| 28 | +| 6 | Modern & lean | Python 3.10+, stdlib first, async for IO, profile before optimizing. | |
| 29 | + |
| 30 | +--- |
| 31 | + |
| 32 | +## 2. Concrete Coding Rules |
| 33 | + |
| 34 | +All generated code **must** satisfy the following checklist: |
| 35 | + |
| 36 | +### 2.1 Naming & Structure |
| 37 | + |
| 38 | +- Use `snake_case` for variables/functions, `PascalCase` for classes, `SCREAMING_SNAKE` for constants. |
| 39 | +- Place library code under `src/yourpkg/`; tests under `tests/`. |
| 40 | +- One public concept per module; re-export via `__all__`. |
| 41 | + |
| 42 | +### 2.2 Immutability & Data |
| 43 | + |
| 44 | +- Default to `@dataclass(frozen=True, slots=True)` for records. |
| 45 | +- Use `tuple` and `frozenset` by default; mutable collections only where required. |
| 46 | + |
| 47 | +### 2.3 Async & Concurrency |
| 48 | + |
| 49 | +- Use `async/await` for all IO-bound work. |
| 50 | +- Never block the event loop (no `time.sleep` or heavy CPU loops without `run_in_executor`). |
| 51 | +- Prefer `asyncio.Semaphore` to cap the number of in-flight tasks (concurrency limiting) over an unbounded raw `gather`. |
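
A minimal sketch of the semaphore rule above, using only the standard library. The names `gather_bounded` and `fake_fetch` are hypothetical and introduced for illustration; `fake_fetch` stands in for a real IO-bound call.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable


async def gather_bounded(
    jobs: Iterable[Callable[[], Awaitable[int]]],
    limit: int,
) -> list[int]:
    """Run zero-argument coroutine factories with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(job: Callable[[], Awaitable[int]]) -> int:
        # The semaphore is held only while the job itself is running.
        async with semaphore:
            return await job()

    # gather preserves the order of its inputs in the result list.
    return await asyncio.gather(*(run_one(job) for job in jobs))


async def fake_fetch(n: int) -> int:
    """Stand-in for an IO-bound call (hypothetical)."""
    await asyncio.sleep(0.01)  # non-blocking sleep, unlike time.sleep
    return n * 2


def main() -> list[int]:
    """Schedule five fake fetches, never more than two at a time."""
    jobs = [lambda n=n: fake_fetch(n) for n in range(5)]
    return asyncio.run(gather_bounded(jobs, limit=2))


if __name__ == "__main__":
    logging_free_result = main()
```

Passing factories instead of already-created coroutines keeps each coroutine from being created before the semaphore admits it, which is one reasonable design among several.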
| 52 | + |
| 53 | +### 2.4 Error Handling |
| 54 | + |
| 55 | +- Never use bare `except:`; always catch only exceptions you can handle. |
| 56 | +- Chain exceptions for context (`raise ... from err`). |
| 57 | +- Differentiate between programmer errors (`assert`) and user errors (`ValueError`). |
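
The three rules above might look like this in practice. This is a sketch under assumptions: `ConfigError` and `parse_port` are hypothetical names, not part of this repository.

```python
class ConfigError(ValueError):
    """Raised when user-supplied configuration is invalid (hypothetical)."""


def parse_port(raw: str) -> int:
    """Parse a TCP port number from user input.

    Args:
        raw: Text supplied by the user.

    Returns:
        The port as an integer in the range 1..65535.

    Raises:
        ConfigError: If the text is not an integer or is out of range.
    """
    try:
        port = int(raw)
    except ValueError as err:  # catch only what we can translate
        # `from err` keeps the original traceback as __cause__.
        raise ConfigError(f"port must be an integer, got {raw!r}") from err
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port
```

Bad user input surfaces as a `ValueError` subclass, and the chained exception preserves the low-level cause for debugging.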
| 58 | + |
| 59 | +### 2.5 Logging & Observability |
| 60 | + |
| 61 | +- Use the `logging` module, never `print`. |
| 62 | +- All log entries must include: `event="action_name"`, `duration_ms`, and relevant IDs. |
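
One possible way to attach `event`, `duration_ms`, and IDs to every entry is a small context manager around the standard `logging` module. The helper name `timed_event` and the `user_id` field are assumptions made for this sketch.

```python
import logging
import time
from collections.abc import Iterator
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def timed_event(event: str, **ids: str) -> Iterator[None]:
    """Log one entry with the event name, elapsed time, and given IDs.

    Note: keys in `ids` must not clash with built-in LogRecord
    attributes such as `name` or `message`, or logging raises KeyError.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = round((time.perf_counter() - start) * 1000, 3)
        # `extra` copies these keys onto the LogRecord as attributes,
        # so formatters and handlers can read them as structured fields.
        logger.info(
            "event=%s duration_ms=%s",
            event,
            duration_ms,
            extra={"event": event, "duration_ms": duration_ms, **ids},
        )
```

Usage: `with timed_event("load_user", user_id="42"): ...` emits a single record once the block exits, including when it exits through an exception.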
| 63 | + |
| 64 | +### 2.6 Testing |
| 65 | + |
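| 66 | +- All tests must pass under `pytest -q`, and `pytest --cov=yourpkg --cov-branch` must report ≥90% branch coverage. |
| 67 | +- Use `hypothesis` for all non-trivial data logic; keep runs reproducible with Hypothesis's own controls (e.g., `derandomize=True` in the CI settings profile, or `@seed`). `PYTHONHASHSEED` only fixes hash randomization and does not seed Hypothesis. |
| 68 | +- All async code must be tested with `pytest.mark.asyncio` (provided by the `pytest-asyncio` plugin). |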
| 69 | + |
| 70 | +### 2.7 Tooling & CI |
| 71 | + |
| 72 | +```shell |
| 73 | +ruff check --select ALL --ignore D203,D213 # Drop the pydocstyle rules that conflict with D211/D212 |
| 74 | +ruff format # Like Black, but via Ruff |
| 75 | +pyright # Strict mode |
| 76 | +pre-commit run --all-files # As defined in .pre-commit-config.yaml |
| 77 | +``` |
| 78 | + |
| 79 | +### 2.8 Dependencies & Packaging |
| 80 | + |
| 81 | +- All dependencies are pinned in `pyproject.toml` (`[project]`, `[tool.rye]`, or `[tool.poetry]`). |
| 82 | +- For CLIs, expose entry points via `[project.scripts]`. |
| 83 | +- Avoid heavy dependencies; justify and document any non-stdlib package. |
| 84 | + |
| 85 | +--- |
| 86 | + |
| 87 | +## 3. Documentation |
| 88 | + |
| 89 | +- All functions/classes require **Google-style docstrings** (`Args:`, `Returns:`, `Raises:`). |
| 90 | +- The docstring summary line must be ≤72 chars. |
| 91 | +- Include minimal, runnable usage examples, guarded by `if __name__ == "__main__"`. |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## 4. Commit & PR Etiquette |
| 96 | + |
| 97 | +- **Title:** Imperative present, ≤50 chars. |
| 98 | +- **Body:** What + why (wrap at 72). |
| 99 | +- Always link relevant issue refs (`Fixes #123`), and add benchmarks for perf-related changes. |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## 5. LLM Response Contract (ChatGPT Codex Only) |
| 104 | + |
| 105 | +- **All code** must be fenced as |
| 106 | + |
| 107 | + ````markdown |
| 108 | + ```python |
| 109 | + # code here |
| 110 | + ``` |
| 111 | + ```` |
| 112 | + |
| 113 | +- Obey every rule in section 2 (Coding Rules). |
| 114 | +- If alternatives exist, list **Pros / Cons** after your primary solution. |
| 115 | +- Provide **pytest** snippets for all new functions and public APIs. |
| 116 | +- Explicitly **flag and explain** any deviation from these guidelines in reviews or diffs. |
| 117 | + |
| 118 | +--- |
| 119 | + |
| 120 | +## 6. Review Checklist (for agents and reviewers) |
| 121 | + |
| 122 | +- [ ] All public functions, classes, and modules are fully type-annotated. |
| 123 | +- [ ] Names, file structure, and style match section 2. |
| 124 | +- [ ] All tests pass locally, with ≥90% branch coverage (see CI status). |
| 125 | +- [ ] Error handling is specific, contextual, and never uses bare `except:`. |
| 126 | +- [ ] All log output uses the `logging` module with event/action context. |
| 127 | +- [ ] No print statements or unapproved dependencies. |
| 128 | +- [ ] All changes are documented and include minimal working examples. |
| 129 | +- [ ] Commit and PR messages follow etiquette rules. |
| 130 | + |
| 131 | +--- |
| 132 | + |
| 133 | +## 7. Examples |
| 134 | + |
| 135 | +### Code Example |
| 136 | + |
| 137 | +```python |
| 138 | +from dataclasses import dataclass |
| 139 | + |
| 140 | +@dataclass(frozen=True, slots=True) |
| 141 | +class User: |
| 142 | +    """User account with immutable attributes. |
| 143 | + |
| 144 | +    Args: |
| 145 | +        id: Unique user identifier. |
| 146 | +        name: Display name. |
| 147 | +    """ |
| 148 | +    id: int |
| 149 | +    name: str |
| 150 | +``` |
| 151 | + |
| 152 | +### Pytest Example |
| 153 | + |
| 154 | +```python |
| 155 | +import pytest |
| 156 | +from yourpkg.models import User |
| 157 | + |
| 158 | +def test_user_is_immutable() -> None: |
| 159 | +    user = User(id=1, name="Alice") |
| 160 | +    with pytest.raises(AttributeError):  # FrozenInstanceError subclasses AttributeError |
| 161 | +        user.id = 2 |
| 162 | +``` |
| 163 | + |
| 164 | +### LLM Response Example |
| 165 | + |
| 166 | +```python |
| 167 | +# Here is a functional utility following all standards: |
| 168 | +def add_one(x: int) -> int: |
| 169 | +    """Return input incremented by one. |
| 170 | + |
| 171 | +    Args: |
| 172 | +        x: An integer. |
| 173 | + |
| 174 | +    Returns: |
| 175 | +        Integer one greater than x. |
| 176 | +    """ |
| 177 | +    return x + 1 |
| 178 | + |
| 179 | +# Pytest example: |
| 180 | +def test_add_one() -> None: |
| 181 | +    assert add_one(2) == 3 |
| 182 | +``` |
| 183 | + |
| 184 | +**Pros**: Pure, fully typed, easily testable. |
| 185 | +**Cons**: Docstrings can feel verbose for very simple operations, though they still aid maintainability. |
| 186 | + |
| 187 | +--- |
| 188 | + |
| 189 | +## 8. References |
| 190 | + |
| 191 | +- [OpenAI Codex Documentation](https://github.com/openai/codex) |
| 192 | +- [Pyright](https://github.com/microsoft/pyright) |
| 193 | +- [Ruff](https://docs.astral.sh/ruff/) |
| 194 | +- [pytest](https://docs.pytest.org/en/latest/) |
| 195 | +- [hypothesis](https://hypothesis.readthedocs.io/) |
| 196 | +- [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html) |
| 197 | + |
| 198 | +--- |
| 199 | + |
| 200 | +**This file is required reading for all agents and contributors. Deviations must be justified and flagged in code reviews.** |
| 201 | + |
| 202 | +# PyTest Best Practices for Agents |
| 203 | + |
| 204 | +## Overview |
| 205 | + |
| 206 | +This document defines best practices and conventions for software engineering agents (including ChatGPT Codex) when **generating unit tests with pytest** for Python packages. It aims to ensure test code is readable, robust, and maintainable, and to enable agents to collaborate effectively with developers and automated systems. |
| 207 | + |
| 208 | +--- |
| 209 | + |
| 210 | +## Goals |
| 211 | + |
| 212 | +- Write **discoverable, idiomatic pytest tests** for Python codebases. |
| 213 | +- Prefer **DAMP (Descriptive And Meaningful Phrases)** over excessive DRY, prioritizing readability. |
| 214 | +- Validate **invariants and properties** rather than only asserting on outputs. |
| 215 | +- Structure and document tests so that they are easy to understand and maintain. |
| 216 | +- Integrate with standard Python project layouts and CI. |
| 217 | + |
| 218 | +--- |
| 219 | + |
| 220 | +## 1. Test Directory & File Structure |
| 221 | + |
| 222 | +- **Mirror code layout** in the test suite. |
| 223 | + |
| 224 | + - Example: |
| 225 | + |
| 226 | + ``` |
| 227 | +    src/ |
| 228 | +      your_package/ |
| 229 | +        core.py |
| 230 | +    tests/ |
| 231 | +      unit/ |
| 232 | +        test_core.py |
| 233 | +      integration/ |
| 234 | +        test_cli.py |
| 235 | + ``` |
| 236 | +
|
| 237 | +- Place fast unit tests in `tests/unit/`, and use `tests/integration/` for tests requiring I/O, external systems, or longer runtimes. |
| 238 | +- Each test module should cover a single module or feature, and be named `test_<modulename>.py`. |
| 239 | +
|
| 240 | +--- |
| 241 | +
|
| 242 | +## 2. Writing Readable Tests (DAMP > DRY) |
| 243 | +
|
| 244 | +- **DAMP**: Be explicit. Favor clarity over clever abstractions; minor repetition is OK if it clarifies test intent. |
| 245 | +- Only refactor repeated setup into fixtures or helpers when duplication would harm maintainability or understanding. |
| 246 | +- When extracting helpers, keep them as close as possible to their use (within the same test file if feasible). |
| 247 | +- Each test should **read as a specification** and explain "what" is being tested, not just "how". |
| 248 | +
|
| 249 | +--- |
| 250 | +
|
| 251 | +## 3. Testing Invariants & Properties |
| 252 | +
|
| 253 | +- **Do not** only assert expected outputs for fixed inputs; also test fundamental properties and invariants. |
| 254 | +- Examples: |
| 255 | +
|
| 256 | + - Instead of only `assert sort([3,1,2]) == [1,2,3]`, also assert the result is sorted and is a permutation of the input. |
| 257 | + - Use **property-based testing** (e.g., [hypothesis](https://hypothesis.readthedocs.io/)) for coverage of input space. |
| 258 | +
|
| 259 | +- Prefer property-based tests for code with complex input domains, and classic example-based tests for regression or documentation. |
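
To make the sorting example above concrete, here is a standard-library-only sketch of the two invariants. The helper names `is_sorted` and `is_permutation` are hypothetical, and the built-in `sorted` stands in for the function under test. With Hypothesis installed, the same assertions could be driven by `@given(st.lists(st.integers()))` instead of one fixed input.

```python
from collections import Counter
from collections.abc import Sequence


def is_sorted(values: Sequence[int]) -> bool:
    """Return True if every element is <= its successor."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_permutation(left: Sequence[int], right: Sequence[int]) -> bool:
    """Return True if both sequences hold the same items with the same counts."""
    return Counter(left) == Counter(right)


def test_sorted_output_is_ordered_permutation() -> None:
    # Arrange
    data = [3, 1, 2, 2, -5]

    # Act
    result = sorted(data)  # stand-in for the function under test

    # Assert: properties of the result, not one hard-coded expected list
    assert is_sorted(result)
    assert is_permutation(result, data)
```

Checking both properties matters: an implementation that returns an empty list is trivially "sorted", and one that returns its input unchanged is trivially a permutation.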
| 260 | +
|
| 261 | +--- |
| 262 | +
|
| 263 | +## 4. Pytest Conventions and Tools |
| 264 | +
|
| 265 | +- **Fixtures**: Use `pytest` fixtures for dependencies and setup, not class-based `setUp`/`tearDown`. |
| 266 | +
|
| 267 | + - Pass fixtures as function arguments to make dependencies explicit. |
| 268 | + - Use scopes (`function`, `module`, etc.) to control resource lifetimes. |
| 269 | +
|
| 270 | +- **Parametrize**: Use `@pytest.mark.parametrize` to test multiple scenarios clearly. |
| 271 | +- **Exception Handling**: Use `pytest.raises` for asserting exceptions. |
| 272 | +- **Floating Point**: Use `pytest.approx` for float comparisons. |
| 273 | +- **Temporary Resources**: Use built-in fixtures like `tmp_path`, `monkeypatch`, `capsys`, and `caplog`. |
| 274 | +- **Markers**: Mark slow, network, or integration tests for selective execution. |
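
Several of the conventions above can be seen together in a short sketch. The function `mean` is a hypothetical unit under test defined inline so the example is self-contained; `tmp_path` is the built-in pytest fixture.

```python
from pathlib import Path

import pytest


def mean(values: list[float]) -> float:
    """Hypothetical function under test."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


@pytest.mark.parametrize(
    ("values", "expected"),
    [([1.0], 1.0), ([0.1, 0.2], 0.15), ([1.0, 2.0, 4.0], 7 / 3)],
)
def test_mean_known_values(values: list[float], expected: float) -> None:
    # pytest.approx absorbs floating-point rounding error.
    assert mean(values) == pytest.approx(expected)


def test_mean_rejects_empty_input() -> None:
    # `match` checks the message as well as the exception type.
    with pytest.raises(ValueError, match="at least one value"):
        mean([])


def test_report_is_written(tmp_path: Path) -> None:
    # tmp_path gives each test its own temporary directory.
    report = tmp_path / "report.txt"

    report.write_text(f"{mean([2.0, 4.0])}\n")

    assert report.read_text() == "3.0\n"
```

Each dependency appears as an explicit argument, so a reader can tell from the signature alone what a test needs.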
| 275 | +
|
| 276 | +--- |
| 277 | +
|
| 278 | +## 5. Test Style Guidelines |
| 279 | +
|
| 280 | +- Each test function must start with `test_`. |
| 281 | +- Use **type hints** in tests for clarity. |
| 282 | +- Prefer **AAA (Arrange, Act, Assert)** structure and use blank lines or comments to make test phases clear. |
| 283 | +- Name test functions with descriptive behavior: |
| 284 | + e.g., `test_parse_returns_empty_list_for_blank_input` |
| 285 | +- Prefer **one assertion per behavior**, but multiple asserts are fine when related. |
| 286 | +- Keep test data minimal yet realistic; use local factories or fixtures for complex setup. |
| 287 | +- Avoid logic and branching in test code, except for explicitly asserting both outcomes. |
| 288 | +- Docstrings are optional for trivial tests, but document non-obvious behaviors or fixtures. |
| 289 | +
|
| 290 | +--- |
| 291 | +
|
| 292 | +## 6. Example |
| 293 | +
|
| 294 | +```python |
| 295 | +import pytest |
| 296 | +from hypothesis import given, strategies as st |
| 297 | + |
| 298 | +from your_package.math import fib |
| 299 | + |
| 300 | +@pytest.mark.parametrize("n, expected", [(0, 0), (1, 1), (7, 13)]) |
| 301 | +def test_fib_known_values(n: int, expected: int) -> None: |
| 302 | +    """Test canonical Fibonacci numbers for low n.""" |
| 303 | +    result = fib(n) |
| 304 | +    assert result == expected |
| 305 | + |
| 306 | +@pytest.mark.parametrize("n", [10, 20, 30]) |
| 307 | +def test_fib_monotonicity(n: int) -> None: |
| 308 | +    """Fibonacci sequence is non-decreasing.""" |
| 309 | +    assert fib(n) <= fib(n + 1) |
| 310 | + |
| 311 | +@given(st.integers(min_value=2, max_value=100)) |
| 312 | +def test_fib_upper_bound(n: int) -> None: |
| 313 | +    """Fibonacci number is always less than 2^n.""" |
| 314 | +    assert fib(n) < 2 ** n |
| 315 | +``` |
| 316 | + |
| 317 | +--- |
| 318 | + |
| 319 | +## 7. Checklist for Agent-Generated Tests |
| 320 | + |
| 321 | +- [ ] Tests are in the correct directory and named for the module under test. |
| 322 | +- [ ] DAMP style: explicit, not over-abstracted; repeated setup only refactored if necessary. |
| 323 | +- [ ] Property-based and example-based tests are included where appropriate. |
| 324 | +- [ ] Use `pytest` fixtures, parametrization, and markers idiomatically. |
| 325 | +- [ ] Test names and docstrings (if present) describe intent. |
| 326 | +- [ ] No direct I/O, sleeps, or network calls unless explicitly marked as integration. |
| 327 | +- [ ] Tests are deterministic, hermetic, and CI-friendly. |
| 328 | + |
| 329 | +--- |
| 330 | + |
| 331 | +## References |
| 332 | + |
| 333 | +- [pytest documentation](https://docs.pytest.org/en/latest/) |
| 334 | +- [Hypothesis property-based testing](https://hypothesis.readthedocs.io/) |
| 335 | +- [OpenAI Codex documentation](https://github.com/openai/codex) |
| 336 | +- [Effective Python Testing With pytest](https://realpython.com/pytest-python-testing/) |
| 337 | + |
| 338 | +--- |
| 339 | + |
| 340 | +## Appendix: Prompts for Codex/ChatGPT |
| 341 | + |
| 342 | +- **Be specific**: Start with a clear comment, code snippet, or data sample. |
| 343 | +- **Specify language and libraries**: e.g., `# Python 3.10, using pytest` |
| 344 | +- **Provide example(s) and properties**: e.g., "Write pytest unit tests for this function, ensuring monotonicity and correct output for known inputs." |
| 345 | +- **Comment style**: Use docstrings for function behavior, inline comments for assertions. |
| 346 | + |
| 347 | +--- |
| 348 | + |
| 349 | +**This file guides agents and automated tools to produce high-quality, maintainable Python tests in line with modern Python and pytest best practices.** |