chore: Agentic coding infrastructure — quality gates, CLAUDE.md refactoring, workflow rules #1021

@KonstantinMirin

What This Is

A set of developer experience improvements for the salesagent project, focused on making quality standards explicit, enforceable, and discoverable — for both human and AI-assisted development.

The changes are on a branch and I'd love feedback before turning this into a PR. Some of these choices are opinionated and I want to make sure they work for the team.

Branch: feature/agentic-coding-improvements

Context

The project already has strong foundations — 11 pre-commit hooks, 281+ tests, 7 critical architecture patterns in CLAUDE.md, and an 8-job CI pipeline. But after auditing against some agentic coding best practices, I found a few gaps that affect both AI agents and human contributors. I wrote up the full audit in cm-docs/agentic-coding-plan.md on the branch.

The thinking behind these changes draws on patterns from a series on agentic development workflows:

These aren't prescriptive — they're patterns that worked well on other projects and seemed like a good fit here. Happy to adjust based on what the team prefers.

What Changed (and Why)

Makefile — unified quality gate

Currently you have to remember separate commands for format, lint, typecheck, and tests. This adds:

  • make quality — format check + lint + mypy + unit tests (one command before every commit)
  • make quality-full — above + integration/e2e with PostgreSQL
  • make lint-fix / make test-fast — convenience targets

Question for the team: Is a Makefile the right choice here, or would you prefer a script or pyproject.toml scripts section?
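For comparison, here is a minimal sketch of what the script alternative could look like. It is not what is on the branch: the command lines (`ruff format --check .`, `ruff check .`, `mypy src`, `pytest tests/unit`), the paths, and the names `QUALITY_GATES`, `run_command`, and `run_gates` are all assumptions made for illustration.

```python
"""Hypothetical script equivalent of `make quality` (a sketch, not the branch's Makefile)."""
import subprocess
import sys
from typing import Callable, Sequence

# Assumed commands and paths; the real Makefile targets may run something different.
QUALITY_GATES: list[tuple[str, list[str]]] = [
    ("format check", ["ruff", "format", "--check", "."]),
    ("lint", ["ruff", "check", "."]),
    ("typecheck", ["mypy", "src"]),
    ("unit tests", ["pytest", "tests/unit"]),
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_gates(
    gates: Sequence[tuple[str, list[str]]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run gates in order; stop at the first failure and return its exit code."""
    for name, cmd in gates:
        print(f"==> {name}: {' '.join(cmd)}")
        code = runner(cmd)
        if code != 0:
            print(f"FAILED: {name}", file=sys.stderr)
            return code
    return 0
```

Ending the script with `sys.exit(run_gates(QUALITY_GATES))` would give the same fail-fast behavior as a Makefile target whose recipe lines run in sequence. The trade-off is roughly discoverability (`make quality`) versus having one less tool to install on platforms where `make` is not available by default.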

pyproject.toml — ruff complexity rules

Added C90 (cyclomatic complexity, max-complexity=10) and PLR (pylint-refactor) with max-args=5, max-branches=12, max-statements=50. These catch god functions before they're committed.
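To make the max-args=5 threshold concrete, here is a small illustrative sketch. The function and class names (`create_campaign_flat`, `CampaignSpec`, `create_campaign`) are hypothetical and do not come from the salesagent codebase. Grouping parameters into a dataclass is just one possible way to get under the limit, not a pattern the branch prescribes.

```python
from dataclasses import asdict, dataclass


# Six parameters: one over max-args=5, so new code shaped like this
# would be reported by the PLR "too many arguments" check.
def create_campaign_flat(name, budget, start_date, end_date, currency, targeting):
    return {
        "name": name,
        "budget": budget,
        "start_date": start_date,
        "end_date": end_date,
        "currency": currency,
        "targeting": targeting,
    }


# One possible alternative: bundle the related fields into a single value object.
@dataclass(frozen=True)
class CampaignSpec:
    name: str
    budget: float
    start_date: str
    end_date: str
    currency: str = "USD"


# Two parameters, and the grouping documents which fields belong together.
def create_campaign(spec: CampaignSpec, targeting: dict) -> dict:
    return {**asdict(spec), "targeting": targeting}
```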

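All 166 existing files with violations get per-file-ignores so nothing breaks. The stricter standard applies to new files and to existing files that were already clean; code added to an ignored file stays exempt from the rules ignored there until its entry is removed. Also auto-fixed 12 safe violations (collapsible else-if, useless return) and added justification comments to every existing ruff ignore entry.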
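For readers unfamiliar with the two auto-fixed rule types, the sketch below shows the general before/after shape. The functions are made up for illustration and are not taken from the diff. In both cases the rewrite leaves behavior unchanged, which is why these fixes are safe to apply automatically.

```python
# Before: an `else` block whose only content is another `if` (collapsible else-if).
def classify_before(status_code: int) -> str:
    if status_code < 400:
        result = "ok"
    else:
        if status_code < 500:
            result = "client_error"
        else:
            result = "server_error"
    return result


# After: the nested `if` is folded into `elif`, removing one indentation level.
def classify_after(status_code: int) -> str:
    if status_code < 400:
        result = "ok"
    elif status_code < 500:
        result = "client_error"
    else:
        result = "server_error"
    return result


# Before: a bare `return` as the last statement of a function (useless return).
def log_before(messages: list, text: str) -> None:
    messages.append(text)
    return


# After: the function falls off the end and returns None implicitly.
def log_after(messages: list, text: str) -> None:
    messages.append(text)
```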

Question for the team: Are these thresholds reasonable? The per-file-ignores approach means existing code isn't blocked, but I want to make sure the thresholds feel right for new code.

CLAUDE.md — 500 → 149 lines

The original CLAUDE.md loaded 500 lines into every AI session whether relevant or not. The refactored version keeps the 7 critical patterns, rules, and decision tree — and extracts everything else into on-demand files:

  • .claude/rules/patterns/code-patterns.md — SQLAlchemy, imports, type checking
  • .claude/rules/patterns/testing-patterns.md — fixtures, quality rules
  • .claude/rules/patterns/mcp-patterns.md — MCP client, CLI, A2A patterns

Sections that duplicated existing docs/ content (adapters, deployment, configuration) were simply removed with pointers to the existing docs.

Also added: doc-first rule, noqa justification policy, self-improvement rule.

Question for the team: Does the 149-line version still cover what you need at a glance? Is anything missing from the core file that should stay always-loaded?

.claude/rules/workflows/ — structured workflow docs

Seven workflow guides covering quality gates, TDD, bug reporting, research, subagents, session completion, and issue tracking. These are referenced from CLAUDE.md's reference table and loaded on demand.

.claude/agents/qc-validator.md and .claude/commands/research.md

A QC validation agent and a /research command for exploring tasks before implementation. These are optional tooling — they don't affect the main codebase.

Other

  • src/adapters/test_scenario_parser.py — Added __test__ = False to suppress PytestCollectionWarning
  • 60 files reformatted by ruff after auto-fix of PLR violations
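For context on the `__test__ = False` change: by default pytest tries to collect classes whose names start with `Test`, and warns when such a class has an `__init__`. The sketch below shows the general mechanism on a hypothetical class. The name `TestScenario` and its fields are assumptions, and whether the branch sets the attribute on a class or at module level is not stated here.

```python
from dataclasses import dataclass


@dataclass
class TestScenario:
    """A plain data holder whose name happens to match pytest's `Test*` pattern."""

    # Tell pytest this is not a test class, so it skips collection
    # instead of emitting PytestCollectionWarning about the __init__.
    # No annotation here, so the dataclass does not treat it as a field.
    __test__ = False

    name: str
    should_fail: bool = False


scenario = TestScenario(name="delay_10s")
```

The class keeps working as ordinary application code; only pytest's collection behavior changes.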

What I'd Like Feedback On

  1. Makefile vs alternatives: are these the right ergonomics for this project?
  2. Complexity thresholds — are max-complexity=10, max-args=5, max-branches=12, max-statements=50 reasonable?
  3. CLAUDE.md split — does the 149-line core cover the essentials? Should anything move back?
  4. Workflow docs — are these useful, or too much process overhead?
  5. Anything else — things I missed, things that don't fit the project culture, concerns

Diff Stats

85 files changed, 1563 insertions(+), 966 deletions(-)

Most of the file count is from ruff auto-fixes and reformatting. The substantive changes are in CLAUDE.md, pyproject.toml, Makefile, and the .claude/ directory.
