What This Is
A set of developer experience improvements for the salesagent project, focused on making quality standards explicit, enforceable, and discoverable — for both human and AI-assisted development.
The changes are on a branch and I'd love feedback before turning this into a PR. Some of these choices are opinionated and I want to make sure they work for the team.
Branch: feature/agentic-coding-improvements
Context
The project already has strong foundations — 11 pre-commit hooks, 281+ tests, 7 critical architecture patterns in CLAUDE.md, and an 8-job CI pipeline. But after auditing against some agentic coding best practices, I found a few gaps that affect both AI agents and human contributors. I wrote up the full audit in cm-docs/agentic-coding-plan.md on the branch.
The thinking behind these changes draws on patterns from a series on agentic development workflows:
- Let the Agent Check Its Work — why a single quality gate command matters, complexity thresholds, and lint suppression policies
- Stop Using Claude Code Like a Copilot — why a 500-line CLAUDE.md hurts more than it helps, and the case for progressive disclosure
- Teach Your Agent to Search Before It Invents — doc-first rules to prevent hallucinated API usage
- Requirements Before Implementation — the research phase that prevents requirements from emerging mid-implementation
- The Diagnosis — diagnosing gaps in agentic coding setups
These aren't prescriptive — they're patterns that worked well on other projects and seemed like a good fit here. Happy to adjust based on what the team prefers.
What Changed (and Why)
Makefile — unified quality gate
Currently you have to remember separate commands for format, lint, typecheck, and tests. This adds:
- make quality: format check + lint + mypy + unit tests (one command before every commit)
- make quality-full: the above + integration/e2e tests with PostgreSQL
- make lint-fix / make test-fast: convenience targets
Question for the team: Is a Makefile the right choice here, or would you prefer a script or pyproject.toml scripts section?
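For comparison, here is a minimal sketch of what the script alternative could look like. It is not the Makefile on the branch: the step list, the arguments, and the paths (src, tests/unit) are assumptions for illustration, and run_gate is a hypothetical helper name.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical step list. The tools are the ones named above, but the exact
# arguments and paths are assumptions, not copied from the branch's Makefile.
QUALITY_STEPS: list[tuple[str, list[str]]] = [
    ("format check", ["ruff", "format", "--check", "."]),
    ("lint", ["ruff", "check", "."]),
    ("typecheck", ["mypy", "src"]),
    ("unit tests", ["pytest", "tests/unit", "-q"]),
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_gate(
    steps: Sequence[tuple[str, Sequence[str]]],
    runner: Callable[[Sequence[str]], int] = _run,
) -> int:
    """Run each step in order; stop at the first failure and return its exit code."""
    for name, cmd in steps:
        print(f"==> {name}: {' '.join(cmd)}")
        code = runner(cmd)
        if code != 0:
            print(f"FAILED: {name} (exit {code})")
            return code
    print("All quality checks passed.")
    return 0
```

A script entry point would call sys.exit(run_gate(QUALITY_STEPS)). The injectable runner is only there so the ordering and fail-fast behaviour can be tested without the tools installed.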
pyproject.toml — ruff complexity rules
Added C90 (cyclomatic complexity, max=10) and PLR (pylint-refactor) with max-args=5, max-branches=12, max-statements=50. These catch god functions before they're committed.
All 166 existing files with violations get per-file-ignores, so nothing breaks today. Because those ignores apply to whole files, the stricter standard applies to new files and to any existing file not on the ignore list. I also auto-fixed 12 safe violations (collapsible else-if, useless return) and added a justification comment to every existing ruff ignore entry.
Question for the team: Are these thresholds reasonable? The per-file-ignores approach means existing code isn't blocked, but I want to make sure the thresholds feel right for new code.
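To make the max-args=5 threshold concrete, here is a small sketch of the kind of signature it would flag (ruff reports this as PLR0913, too many arguments) and one common way to bring it back under the limit. Every name below is hypothetical and not taken from the salesagent codebase.

```python
from dataclasses import dataclass


# Hypothetical example. Six parameters: with max-args=5, ruff would report
# this signature as PLR0913 (too many arguments).
def create_campaign(name, budget, start_date, end_date, currency, targeting):
    return {
        "name": name,
        "budget": f"{budget:.2f} {currency}",
        "flight": (start_date, end_date),
        "targeting": targeting,
    }


# One possible refactor: group the related values into a single object,
# so the function takes one argument instead of six.
@dataclass(frozen=True)
class CampaignSpec:
    name: str
    budget: float
    start_date: str
    end_date: str
    currency: str = "USD"
    targeting: str = "all"


def create_campaign_from_spec(spec: CampaignSpec) -> dict:
    return {
        "name": spec.name,
        "budget": f"{spec.budget:.2f} {spec.currency}",
        "flight": (spec.start_date, spec.end_date),
        "targeting": spec.targeting,
    }
```

Both functions build the same result; the second simply stays under the argument limit, which is the sort of change the threshold is meant to nudge new code toward.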
CLAUDE.md — 500 → 149 lines
The original CLAUDE.md loaded 500 lines into every AI session whether relevant or not. The refactored version keeps the 7 critical patterns, rules, and decision tree — and extracts everything else into on-demand files:
- .claude/rules/patterns/code-patterns.md: SQLAlchemy, imports, type checking
- .claude/rules/patterns/testing-patterns.md: fixtures, quality rules
- .claude/rules/patterns/mcp-patterns.md: MCP client, CLI, A2A patterns
Sections that duplicated existing docs/ content (adapters, deployment, configuration) were removed and replaced with pointers to those docs.
Also added: doc-first rule, noqa justification policy, self-improvement rule.
Question for the team: Does the 149-line version still cover what you need at a glance? Is anything missing from the core file that should stay always-loaded?
.claude/rules/workflows/ — structured workflow docs
Seven workflow guides covering quality gates, TDD, bug reporting, research, subagents, session completion, and issue tracking. These are referenced from CLAUDE.md's reference table and loaded on demand.
.claude/agents/qc-validator.md and .claude/commands/research.md
A QC validation agent and a /research command for exploring tasks before implementation. These are optional tooling — they don't affect the main codebase.
Other
- src/adapters/test_scenario_parser.py: added __test__ = False to suppress PytestCollectionWarning
- 60 files reformatted by ruff after auto-fix of PLR violations
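For anyone unfamiliar with the __test__ = False marker, here is a minimal sketch of the pattern. The class name and fields are hypothetical stand-ins, not the actual contents of test_scenario_parser.py.

```python
# Hypothetical stand-in for a helper class whose name starts with "Test".
# pytest typically warns about such a class (PytestCollectionWarning) when it
# appears in a collected module, because it looks like a test class but
# defines an __init__ constructor.
class TestScenario:
    """Plain data holder, not a test case."""

    # Tells pytest to skip this class during collection, which silences the warning.
    __test__ = False

    def __init__(self, delay_seconds: int = 0, should_fail: bool = False) -> None:
        self.delay_seconds = delay_seconds
        self.should_fail = should_fail
```

The attribute has no effect at runtime; it is only read by pytest during collection.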
What I'd Like Feedback On
- Makefile vs alternatives: is this the right ergonomics for this project?
- Complexity thresholds: are max-complexity=10, max-args=5, max-branches=12, max-statements=50 reasonable?
- CLAUDE.md split: does the 149-line core cover the essentials? Should anything move back?
- Workflow docs: are these useful, or too much process overhead?
- Anything else: things I missed, things that don't fit the project culture, concerns
Diff Stats
85 files changed, 1563 insertions(+), 966 deletions(-)
Most of the file count is from ruff auto-fixes and reformatting. The substantive changes are in CLAUDE.md, pyproject.toml, Makefile, and the .claude/ directory.