
build: Bootstrap agent-ready infrastructure#453

Open
dahlem wants to merge 1 commit into main from agentready-bootstrap

Conversation


@dahlem dahlem commented Feb 12, 2026

Summary

  • Bootstrap agent-ready infrastructure via agentready bootstrap
  • Add assessment report from agentready assess
  • Add CI workflows, issue/PR templates, pre-commit config, and other repo hygiene files

Files added/modified

  • .agentready/ — assessment reports and configuration
  • .github/workflows/ — CI workflows (agentready assessment, security, tests)
  • .github/ISSUE_TEMPLATE/ — issue templates
  • .github/PULL_REQUEST_TEMPLATE.md — PR template
  • .github/CODEOWNERS — code ownership
  • .github/dependabot.yml — dependency update config
  • .pre-commit-config.yaml — pre-commit hooks
  • CODE_OF_CONDUCT.md — code of conduct (if added)

Test plan

  • Verify CI workflows pass on the PR
  • Review agentready assessment report
  • Confirm no unintended file changes

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

🤖 AgentReady Assessment Report

Repository: lm-evaluation-harness
Path: /home/runner/work/lm-evaluation-harness/lm-evaluation-harness
Branch: HEAD | Commit: cb12b2e2
Assessed: February 12, 2026 at 4:10 PM
AgentReady Version: 2.27.0
Run by: runner@runnervmwffz4


📊 Summary

| Metric | Value |
| --- | --- |
| Overall Score | 56.7/100 🥉 Bronze (Tier Definitions) |
| Attributes Assessed | 23/25 |
| Attributes Not Assessed | 2 |
| Assessment Duration | 3.2s |

Languages Detected

  • YAML: 6586 files
  • Python: 507 files
  • JSON: 348 files
  • Markdown: 226 files
  • Shell: 5 files

Repository Stats

  • Total Files: 8,288
  • Total Lines: 153,577

🎯 Priority Improvements

Focus on these high-impact fixes first:

  1. CLAUDE.md Configuration Files (Tier 1) - +10.0 points potential
    • Create CLAUDE.md or AGENTS.md with project-specific configuration for AI coding assistants
  2. Type Annotations (Tier 1) - +10.0 points potential
    • Add type annotations to function signatures
  3. Standard Project Layouts (Tier 1) - +10.0 points potential
    • Organize code into standard directories (src/, tests/, docs/)
  4. Conventional Commit Messages (Tier 2) - +3.0 points potential
    • Configure conventional commits with commitlint
  5. Inline Documentation (Tier 2) - +3.0 points potential
    • Add docstrings to public functions and classes

📋 Detailed Findings

Findings sorted by priority (Tier 1 failures first, then Tier 2, etc.)

T1 CLAUDE.md Configuration Files ❌ 0/100

📝 Remediation Steps

Measured: missing (Threshold: present)

Evidence:

  • CLAUDE.md not found in repository root
  • AGENTS.md not found (alternative)

Create CLAUDE.md or AGENTS.md with project-specific configuration for AI coding assistants

  1. Choose one of three approaches:
  2. Option 1: Create standalone CLAUDE.md (>50 bytes) with project context
  3. Option 2: Create AGENTS.md and symlink CLAUDE.md to it (cross-tool compatibility)
  4. Option 3: Create AGENTS.md and reference it with @AGENTS.md in minimal CLAUDE.md
  5. Add project overview and purpose
  6. Document key architectural patterns
  7. Specify coding standards and conventions
  8. Include build/test/deployment commands
  9. Add any project-specific context that helps AI assistants

Commands:

# Option 1: Standalone CLAUDE.md
touch CLAUDE.md
# Add content describing your project

# Option 2: Symlink CLAUDE.md to AGENTS.md
touch AGENTS.md
# Add content to AGENTS.md
ln -s AGENTS.md CLAUDE.md

# Option 3: @ reference in CLAUDE.md
echo '@AGENTS.md' > CLAUDE.md
touch AGENTS.md
# Add content to AGENTS.md

Examples:

# Standalone CLAUDE.md (Option 1)

## Overview
Brief description of what this project does.

## Architecture
Key patterns and structure.

## Development

```bash
# Install dependencies
npm install

# Run tests
npm test

# Build
npm run build
```

## Coding Standards

  • Use TypeScript strict mode
  • Follow ESLint configuration
  • Write tests for new features

# CLAUDE.md with @ reference (Option 3)

@AGENTS.md

# AGENTS.md (shared by multiple tools)

## Project Overview

This project implements a REST API for user management.

## Architecture

  • Layered architecture: controllers, services, repositories
  • PostgreSQL database with SQLAlchemy ORM
  • FastAPI web framework

## Development Workflow

```bash
# Setup
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Run tests
pytest

# Start server
uvicorn app.main:app --reload
```

## Code Conventions

  • Use type hints for all functions
  • Follow PEP 8 style guide
  • Write docstrings for public APIs
  • Maintain >80% test coverage

T1 Type Annotations ❌ 40/100

📝 Remediation Steps

Measured: 32.3% (Threshold: ≥80%)

Evidence:

  • Typed functions: 804/2492
  • Coverage: 32.3%

Add type annotations to function signatures

  1. For Python: Add type hints to function parameters and return types
  2. For TypeScript: Enable strict mode in tsconfig.json
  3. Use mypy or pyright for Python type checking
  4. Use tsc --strict for TypeScript
  5. Add type annotations gradually to existing code

Commands:

# Python
pip install mypy
mypy --strict src/

# TypeScript
npm install --save-dev typescript
echo '{"compilerOptions": {"strict": true}}' > tsconfig.json

Examples:

# Python - Before
def calculate(x, y):
    return x + y

# Python - After
def calculate(x: float, y: float) -> float:
    return x + y

// TypeScript - tsconfig.json
{
  "compilerOptions": {
    "strict": true,
    "noImplicitAny": true,
    "strictNullChecks": true
  }
}

T1 Standard Project Layouts ❌ 50/100

📝 Remediation Steps

Measured: 1/2 directories (Threshold: 2/2 directories)

Evidence:

  • Found 1/2 standard directories
  • src/: ✗
  • tests/: ✓

Organize code into standard directories (src/, tests/, docs/)

  1. Create src/ directory for source code
  2. Create tests/ directory for test files
  3. Create docs/ directory for documentation
  4. Move source code into src/
  5. Move tests into tests/

Commands:

mkdir -p src tests docs
# Move source files to src/
# Move test files to tests/

T1 Dependency Security & Vulnerability Scanning ✅ 35/100

T1 README Structure ✅ 100/100

T1 Dependency Pinning for Reproducibility ✅ 100/100

T2 Conventional Commit Messages ❌ 0/100

📝 Remediation Steps

Measured: not configured (Threshold: configured)

Evidence:

  • No commitlint or husky configuration

Configure conventional commits with commitlint

  1. Install commitlint
  2. Configure husky for commit-msg hook

Commands:

npm install --save-dev @commitlint/cli @commitlint/config-conventional husky
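For repositories that would rather not take on a Node toolchain, the core header rule can be approximated in a pre-commit hook. This is a simplified sketch of the "type(scope)!: subject" shape only; commitlint's full rule set (case, length, footers) is much richer.

```python
import re

# Standard Conventional Commits types; commitlint's config-conventional
# preset uses this same list.
_HEADER = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\([\w./-]+\))?!?: .+"
)


def is_conventional(header: str) -> bool:
    """Return True if the first commit-message line looks conventional."""
    return bool(_HEADER.match(header))
```

This PR's own title, `build: Bootstrap agent-ready infrastructure`, passes the check.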

T2 Inline Documentation ❌ 34/100

📝 Remediation Steps

Measured: 27.1% (Threshold: ≥80%)

Evidence:

  • Documented items: 732/2705
  • Coverage: 27.1%
  • Many public functions/classes lack docstrings

Add docstrings to public functions and classes

  1. Identify functions/classes without docstrings
  2. Add PEP 257 compliant docstrings for Python
  3. Add JSDoc comments for JavaScript/TypeScript
  4. Include: description, parameters, return values, exceptions
  5. Add examples for complex functions
  6. Run pydocstyle to validate docstring format

Commands:

# Install pydocstyle
pip install pydocstyle

# Check docstring coverage
pydocstyle src/

# Generate documentation
pip install sphinx
sphinx-apidoc -o docs/ src/

Examples:

# Python - Good docstring
def calculate_discount(price: float, discount_percent: float) -> float:
    """Calculate discounted price.

    Args:
        price: Original price in USD
        discount_percent: Discount percentage (0-100)

    Returns:
        Discounted price

    Raises:
        ValueError: If discount_percent not in 0-100 range

    Example:
        >>> calculate_discount(100.0, 20.0)
        80.0
    """
    if not 0 <= discount_percent <= 100:
        raise ValueError("Discount must be 0-100")
    return price * (1 - discount_percent / 100)

// JavaScript - Good JSDoc
/**
 * Calculate discounted price
 *
 * @param {number} price - Original price in USD
 * @param {number} discountPercent - Discount percentage (0-100)
 * @returns {number} Discounted price
 * @throws {Error} If discountPercent not in 0-100 range
 * @example
 * calculateDiscount(100.0, 20.0)
 * // Returns: 80.0
 */
function calculateDiscount(price, discountPercent) {
    if (discountPercent < 0 || discountPercent > 100) {
        throw new Error("Discount must be 0-100");
    }
    return price * (1 - discountPercent / 100);
}

T2 .gitignore Completeness ❌ 42/100

📝 Remediation Steps

Measured: 5/12 patterns (Threshold: ≥70% of language-specific patterns)

Evidence:

  • .gitignore found (313 bytes)
  • Pattern coverage: 5/12 (42%)
  • Missing 7 recommended patterns

Add missing language-specific ignore patterns

  1. Review GitHub's gitignore templates for your language
  2. Add the 7 missing patterns
  3. Ensure editor/IDE patterns are included

Examples:

# Missing patterns:
.pytest_cache/
.venv/
*.swo
*.py[cod]
.env

T2 File Size Limits ❌ 56/100

📝 Remediation Steps

Measured: 7 huge, 14 large out of 508 (Threshold: <5% files >500 lines, 0 files >1000 lines)

Evidence:

  • Found 7 files >1000 lines (1.4% of 508 files)
  • Largest: lm_eval/api/task.py (1785 lines)

Refactor large files into smaller, focused modules

  1. Identify files >1000 lines
  2. Split into logical submodules
  3. Extract classes/functions into separate files
  4. Maintain single responsibility principle

Examples:

# Split large file:
# models.py (1500 lines) → models/user.py, models/product.py, models/order.py

T2 Separation of Concerns ❌ 69/100

📝 Remediation Steps

Measured: organization:100, cohesion:96, naming:0 (Threshold: ≥75 overall)

Evidence:

  • Good directory organization (feature-based or flat)
  • File cohesion: 21/507 files >500 lines
  • Anti-pattern files found: utils.py, utils.py, utils.py

Refactor code to improve separation of concerns

  1. Avoid layer-based directories (models/, views/, controllers/)
  2. Organize by feature/domain instead (auth/, users/, billing/)
  3. Break large files (>500 lines) into focused modules
  4. Eliminate catch-all modules (utils.py, helpers.py)
  5. Each module should have single, well-defined responsibility
  6. Group related functions/classes by domain, not technical layer

Examples:

# Good: Feature-based organization
project/
├── auth/
│   ├── login.py
│   ├── signup.py
│   └── tokens.py
├── users/
│   ├── profile.py
│   └── preferences.py
└── billing/
    ├── invoices.py
    └── payments.py

# Bad: Layer-based organization
project/
├── models/
│   ├── user.py
│   ├── invoice.py
├── views/
│   ├── user_view.py
│   ├── invoice_view.py
└── controllers/
    ├── user_controller.py
    ├── invoice_controller.py

T2 Concise Documentation ✅ 82/100

T2 Test Coverage Requirements ✅ 100/100

T2 Pre-commit Hooks & CI/CD Linting ✅ 100/100

T2 One-Command Build/Setup ✅ 100/100

T3 Architecture Decision Records (ADRs) ❌ 0/100

📝 Remediation Steps

Measured: no ADR directory (Threshold: ADR directory with decisions)

Evidence:

  • No ADR directory found (checked docs/adr/, .adr/, adr/, docs/decisions/)

Create Architecture Decision Records (ADRs) directory and document key decisions

  1. Create docs/adr/ directory in repository root
  2. Use Michael Nygard ADR template or MADR format
  3. Document each significant architectural decision
  4. Number ADRs sequentially (0001-*.md, 0002-*.md)
  5. Include Status, Context, Decision, and Consequences sections
  6. Update ADR status when decisions are revised (Superseded, Deprecated)

Commands:

# Create ADR directory
mkdir -p docs/adr

# Create first ADR using template
cat > docs/adr/0001-use-architecture-decision-records.md << 'EOF'
# 1. Use Architecture Decision Records

Date: 2025-11-22

## Status
Accepted

## Context
We need to record architectural decisions made in this project.

## Decision
We will use Architecture Decision Records (ADRs) as described by Michael Nygard.

## Consequences
- Decisions are documented with context
- Future contributors understand rationale
- ADRs are lightweight and version-controlled
EOF

Examples:

# Example ADR Structure

```markdown
# 2. Use PostgreSQL for Database

Date: 2025-11-22

## Status
Accepted

## Context
We need a relational database for complex queries and ACID transactions.
Team has PostgreSQL experience. Need full-text search capabilities.

## Decision
Use PostgreSQL 15+ as primary database.

## Consequences
- Positive: Robust ACID, full-text search, team familiarity
- Negative: Higher resource usage than SQLite
- Neutral: Need to manage migrations, backups

```

T3 Structured Logging ❌ 0/100

📝 Remediation Steps

Measured: not configured (Threshold: structured logging library)

Evidence:

  • No structured logging library found
  • Checked files: pyproject.toml, requirements.txt, setup.py
  • Using built-in logging module (unstructured)

Add structured logging library for machine-parseable logs

  1. Choose a structured logging library (structlog for Python, winston for Node.js)
  2. Install the library and configure a JSON formatter
  3. Add standard fields: timestamp, level, message, context
  4. Include request context: request_id, user_id, session_id
  5. Use consistent field naming (snake_case for Python)
  6. Never log sensitive data (passwords, tokens, PII)
  7. Configure different formats for dev (pretty) and prod (JSON)

Commands:

# Install structlog
pip install structlog

# Configure structlog (see examples for configuration)

Examples:

# Python with structlog
import structlog

# Configure structlog
structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer()
    ]
)

logger = structlog.get_logger()

# Good: Structured logging
logger.info(
    "user_login",
    user_id="123",
    email="user@example.com",
    ip_address="192.168.1.1"
)

# Bad: Unstructured logging
logger.info(f"User {user_id} logged in from {ip}")

T3 OpenAPI/Swagger Specifications ❌ 0/100

📝 Remediation Steps

Measured: no OpenAPI spec (Threshold: OpenAPI 3.x spec present)

Evidence:

  • No OpenAPI specification found
  • Searched recursively for: openapi.yaml, openapi.yml, openapi.json, swagger.yaml, swagger.yml, swagger.json

Create OpenAPI specification for API endpoints

  1. Create openapi.yaml in repository root
  2. Define OpenAPI version 3.x
  3. Document all API endpoints with full schemas
  4. Add request/response examples
  5. Define security schemes (API keys, OAuth, etc.)
  6. Validate spec with Swagger Editor or Spectral
  7. Generate API documentation with Swagger UI or ReDoc

Commands:

# Install OpenAPI validator
npm install -g @stoplight/spectral-cli

# Validate spec
spectral lint openapi.yaml

# Generate client SDK
npx @openapitools/openapi-generator-cli generate \
  -i openapi.yaml \
  -g python \
  -o client/

Examples:

# openapi.yaml - Minimal example
openapi: 3.1.0
info:
  title: My API
  version: 1.0.0
  description: API for managing users

servers:
  - url: https://api.example.com/v1

paths:
  /users/{userId}:
    get:
      summary: Get user by ID
      parameters:
        - name: userId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: User found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/User'
        '404':
          description: User not found

components:
  schemas:
    User:
      type: object
      required:
        - id
        - email
      properties:
        id:
          type: string
          example: "user_123"
        email:
          type: string
          format: email
          example: "user@example.com"
        name:
          type: string
          example: "John Doe"

T3 CI/CD Pipeline Visibility ✅ 80/100

T3 Semantic Naming ✅ 95/100

T3 Cyclomatic Complexity Thresholds ✅ 100/100

T3 Issue & Pull Request Templates ✅ 100/100

T4 Code Smell Elimination ✅ 67/100

T4 Branch Protection Rules (not assessed)

T4 Container/Virtualization Setup (not assessed)


📝 Assessment Metadata

  • AgentReady Version: v2.27.0
  • Research Version: v1.0.1
  • Repository Snapshot: cb12b2e
  • Assessment Duration: 3.2s
  • Assessed By: runner@runnervmwffz4
  • Assessment Date: February 12, 2026 at 4:10 PM

🤖 Generated with Claude Code
