
feat: Property Based Testing Extension #119

Merged
raj-jain-aws merged 1 commit into main from feat/property-based-testing-extension on Mar 24, 2026

@raj-jain-aws raj-jain-aws commented Mar 15, 2026

Property-Based Testing (PBT) Extension for AI-DLC

Summary

This PR adds a Property-Based Testing (PBT) extension to the AI-DLC (AI-Driven Development Life Cycle) workflow. The extension introduces cross-cutting PBT rules that are enforced across applicable lifecycle stages, from functional design through code generation and build/test. It was validated end-to-end by building a simple AWS-hosted calculator application using the full AI-DLC workflow with PBT enabled in Partial enforcement mode.


What the PBT Extension Does

The PBT extension is an optional, opt-in module that plugs into the AI-DLC methodology. It ensures that code with identifiable mathematical or logical properties is tested using property-based techniques, complementing (not replacing) traditional example-based tests.

Extension Structure

The extension consists of two files:

| File | Purpose |
| --- | --- |
| `extensions/testing/property-based/property-based-testing.opt-in.md` | Lightweight opt-in prompt injected into Requirements Analysis questions |
| `extensions/testing/property-based/property-based-testing.md` | Full rule definitions (loaded only when user opts in) |

Opt-In Mechanism

During Requirements Analysis, the extension automatically injects a question into the verification questionnaire:

Question 10: Property-Based Testing Extension
Should property-based testing (PBT) rules be enforced for this project?

A) Yes — enforce all PBT rules as blocking constraints
B) Partial — enforce PBT rules only for pure functions and serialization round-trips
C) No — skip all PBT rules

This follows the AI-DLC extension pattern: the *.opt-in.md file is loaded at startup (lightweight), and the full rules file is loaded only if the user opts in — saving context window space when PBT is declined.
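The lazy-loading behavior described above can be sketched as a small function. This is an illustration only: `files_to_load` is a hypothetical name and the actual AI-DLC loader is not shown in this PR. Only the two file paths and the A/B/C answers come from the description.

```python
from typing import List, Optional

# Paths taken from the Extension Structure table.
OPT_IN_FILE = "extensions/testing/property-based/property-based-testing.opt-in.md"
RULES_FILE = "extensions/testing/property-based/property-based-testing.md"


def files_to_load(answer: Optional[str] = None) -> List[str]:
    """Return the extension files that would be in context (hypothetical helper).

    The opt-in prompt is always loaded; the full rules file is added only
    when the user answers A (Full) or B (Partial).
    """
    loaded = [OPT_IN_FILE]
    if answer in ("A", "B"):
        loaded.append(RULES_FILE)
    return loaded
```

With answer C (or no answer yet), only the lightweight opt-in file is returned, which is the context saving the description refers to.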

Enforcement Modes

| Mode | Blocking Rules | Advisory Rules |
| --- | --- | --- |
| Full (A) | All PBT rules (PBT-01 through PBT-10) | None |
| Partial (B) | PBT-02, PBT-03, PBT-07, PBT-08, PBT-09 | PBT-01, PBT-04, PBT-05, PBT-06, PBT-10 |
| Disabled (C) | None | None |
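As a rough sketch, the mode table can be expressed as a lookup that splits the ten rule IDs into blocking and advisory sets. The names `Mode` and `classify_rules` are assumptions made for illustration; the extension itself defines these modes in Markdown rules, not in Python.

```python
from enum import Enum
from typing import FrozenSet, Tuple

ALL_RULES = frozenset(f"PBT-{n:02d}" for n in range(1, 11))
PARTIAL_BLOCKING = frozenset({"PBT-02", "PBT-03", "PBT-07", "PBT-08", "PBT-09"})


class Mode(Enum):
    FULL = "A"
    PARTIAL = "B"
    DISABLED = "C"


def classify_rules(mode: Mode) -> Tuple[FrozenSet[str], FrozenSet[str]]:
    """Return (blocking, advisory) rule IDs for an enforcement mode."""
    if mode is Mode.FULL:
        return ALL_RULES, frozenset()
    if mode is Mode.PARTIAL:
        # Everything that is not blocking in Partial mode is advisory.
        return PARTIAL_BLOCKING, ALL_RULES - PARTIAL_BLOCKING
    return frozenset(), frozenset()
```

For example, `classify_rules(Mode("B"))` yields the five blocking rules and the five advisory rules listed in the Partial row.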

PBT Rules Overview

| Rule | Name | What It Enforces |
| --- | --- | --- |
| PBT-01 | Property Identification During Design | Functional design must identify testable properties per component |
| PBT-02 | Round-Trip Properties | Inverse operations must have PBT (e.g., serialize/deserialize) |
| PBT-03 | Invariant Properties | Documented invariants must have PBT (commutativity, identity, etc.) |
| PBT-04 | Idempotency Properties | Idempotent operations must prove f(f(x)) == f(x) |
| PBT-05 | Oracle/Model-Based Testing | Reference implementations must be compared via PBT |
| PBT-06 | Stateful Property Testing | Stateful components must have stateful PBT |
| PBT-07 | Generator Quality | PBT must use domain-specific generators, not raw primitives |
| PBT-08 | Shrinking and Reproducibility | PBT must support shrinking and seed-based replay |
| PBT-09 | Framework Selection | A PBT framework must be selected and added to dependencies |
| PBT-10 | Complementary Testing Strategy | PBT must complement, not replace, example-based tests |
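To make two of these rule categories concrete, here is a minimal standard-library sketch of a round-trip property (PBT-02) and an idempotency property (PBT-04). It deliberately uses `random` with a fixed seed instead of a PBT framework, so it has seed-based replay but no shrinking. `random_record`, `normalize_whitespace`, and `check_properties` are hypothetical names and do not come from the calculator project.

```python
import json
import random


def random_record(rng: random.Random) -> dict:
    """Hypothetical domain-specific generator: a small JSON-compatible record."""
    return {
        "id": rng.randint(0, 10**6),
        "name": "".join(rng.choice("abc xyz") for _ in range(rng.randint(0, 12))),
        "tags": [rng.choice(["a", "b", "c"]) for _ in range(rng.randint(0, 4))],
        "active": rng.random() < 0.5,
    }


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace; applying it twice should change nothing."""
    return " ".join(text.split())


def check_properties(seed: int, examples: int = 200) -> int:
    rng = random.Random(seed)  # a fixed seed makes every run replayable
    for _ in range(examples):
        record = random_record(rng)
        # Round-trip: deserialize(serialize(x)) == x
        assert json.loads(json.dumps(record)) == record, (seed, record)
        # Idempotency: f(f(x)) == f(x)
        once = normalize_whitespace(record["name"])
        assert normalize_whitespace(once) == once, (seed, record)
    return examples
```

A framework such as Hypothesis adds what this sketch lacks, most notably automatic shrinking of a failing input to a minimal example.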

Cross-Cutting Enforcement Points

The rules are enforced at specific AI-DLC stages:

| AI-DLC Stage | Applicable Rules |
| --- | --- |
| Functional Design | PBT-01 (property identification in design artifacts) |
| NFR Requirements | PBT-09 (framework in tech stack) |
| Code Generation Planning | PBT-01 through PBT-10 (plan must include PBT steps) |
| Code Generation | PBT-02 through PBT-08, PBT-10 (generated tests must include PBT) |
| Build and Test | PBT-08 (test instructions must include seed logging) |
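Combining the stage table with an enforcement mode gives the set of rules that can actually block a given stage. The sketch below is one way to compute that intersection; `STAGE_RULES` and `blocking_at` are illustrative names, not part of the extension.

```python
from typing import List, Set

# Stage-to-rule mapping transcribed from the table above.
STAGE_RULES = {
    "Functional Design": {"PBT-01"},
    "NFR Requirements": {"PBT-09"},
    "Code Generation Planning": {f"PBT-{n:02d}" for n in range(1, 11)},
    "Code Generation": {f"PBT-{n:02d}" for n in range(2, 9)} | {"PBT-10"},
    "Build and Test": {"PBT-08"},
}

# Blocking rules in Partial mode, from the Enforcement Modes table.
PARTIAL_BLOCKING = {"PBT-02", "PBT-03", "PBT-07", "PBT-08", "PBT-09"}


def blocking_at(stage: str, blocking: Set[str]) -> List[str]:
    """Rules that apply at this stage and are blocking in the chosen mode."""
    return sorted(STAGE_RULES[stage] & blocking)
```

One consequence worth noting: in Partial mode, `blocking_at("Functional Design", PARTIAL_BLOCKING)` is empty, because PBT-01 is advisory there.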

How PBT Was Tested in a Greenfield Project

The extension was tested by running the full AI-DLC workflow to build a calculator application (Python Lambda + HTML/JS frontend on AWS). The user selected Partial enforcement (option B) during Requirements Analysis.

1. Requirements Analysis — Opt-In Question Injected

The PBT opt-in question was automatically added as Question 10 in the requirements verification questionnaire:

## Question 10: Property-Based Testing Extension
Should property-based testing (PBT) rules be enforced for this project?

A) Yes — enforce all PBT rules as blocking constraints
B) Partial — enforce PBT rules only for pure functions and serialization round-trips
C) No — skip all PBT rules

[Answer]: B

The user's choice was recorded in aidlc-state.md:

Extension Configuration

| Extension | Enabled | Decided At |
| --- | --- | --- |
| Property-Based Testing | Partial | Requirements Analysis |

And the audit log captured the extension loading:

## INCEPTION - Requirements Analysis (Extension Loading)
Property-Based Testing extension: user opted IN as Partial (B) — loaded full rules
from property-based-testing.md. Partial enforcement mode: only PBT-02, PBT-03,
PBT-07, PBT-08, PBT-09 are blocking. All other PBT rules are advisory.

2. Functional Design — Testable Properties Identified (PBT-01)

The functional design's business-logic-model.md included a dedicated Testable Properties section identifying four properties for the calculate() function:

### Testable Properties (PBT)
- **Commutativity**: add(a, b) == add(b, a) for all valid floats
- **Identity**: add(a, 0) == a; subtract(a, 0) == a
- **Inverse**: subtract(add(a, b), b) == a (within float tolerance)
- **Invariant**: calculate() always returns a float for valid inputs

Each property maps to a PBT rule category:

  • Commutativity → PBT-03 (Invariant)
  • Identity → PBT-03 (Invariant)
  • Inverse → PBT-02 (Round-trip)
  • Invariant (type preservation) → PBT-03 (Invariant)

3. Code Generation Plan — Dedicated PBT Step

The code generation plan included a dedicated Step 3 for property-based tests, explicitly referencing the applicable PBT rules:

### Step 3: Backend — Property-Based Tests
- [ ] Create `tests/test_calculate_pbt.py` with:
  - PBT-02 (Round-trip/Inverse): subtract(add(a, b), b) ≈ a
  - PBT-03 (Invariant): calculate() always returns float; add commutativity
  - PBT-03 (Invariant): identity property add(a, 0) == a, subtract(a, 0) == a
  - PBT-07: Domain-specific generators for realistic numeric operands
  - PBT-08: Seed-based reproducibility via Hypothesis settings
  - PBT-09: Hypothesis framework as PBT dependency

4. Generated Code — PBT Test File

The workflow generated tests/test_calculate_pbt.py with 5 property-based tests using the Hypothesis framework:

import math

from hypothesis import given, settings, strategies as st

# calculate() is the function under test; its import from the Lambda handler
# module is omitted in this excerpt.

# PBT-07: Domain-specific generator for realistic calculator operands.
realistic_operands = st.floats(
    min_value=-1e15, max_value=1e15,
    allow_nan=False, allow_infinity=False,
)

# PBT-08: Reproducible settings; derandomize=True makes every run
# generate the same examples.
pbt_settings = settings(max_examples=200, derandomize=True)

# PBT-02: Round-trip / Inverse property
@given(a=realistic_operands, b=realistic_operands)
@pbt_settings
def test_add_subtract_inverse(a, b):
    result = calculate("subtract", calculate("add", a, b), b)
    assert math.isclose(result, a, rel_tol=1e-9, abs_tol=1e-9)

# PBT-03: Invariant — commutativity
@given(a=realistic_operands, b=realistic_operands)
@pbt_settings
def test_add_commutativity(a, b):
    assert calculate("add", a, b) == calculate("add", b, a)

# PBT-03: Invariant — identity (add and subtract)
@given(a=realistic_operands)
@pbt_settings
def test_add_identity(a):
    assert calculate("add", a, 0) == a

@given(a=realistic_operands)
@pbt_settings
def test_subtract_identity(a):
    assert calculate("subtract", a, 0) == a

# PBT-03: Invariant — type preservation
@given(a=realistic_operands, b=realistic_operands)
@pbt_settings
def test_calculate_returns_float(a, b):
    assert isinstance(calculate("add", a, b), float)
    assert isinstance(calculate("subtract", a, b), float)
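The inverse property only holds within a float tolerance, and the tolerance needed depends on operand magnitude. The sketch below uses plain float arithmetic rather than the project's calculate(), and `scaled_close` is a hypothetical helper, not something from this PR. It shows that when b is much larger than a, the rounding error of (a + b) - b is governed by the size of b, so a tolerance tied only to a can be too strict. Whether a generator actually produces such a pair within 200 examples is not something this sketch establishes.

```python
import math
import sys


def add_then_subtract(a: float, b: float) -> float:
    """Plain-float stand-in for subtract(add(a, b), b)."""
    return (a + b) - b


def scaled_close(result: float, a: float, b: float) -> bool:
    """Hypothetical check: allow a few units of rounding error at the
    magnitude of the larger operand."""
    scale = max(abs(a), abs(b))
    return abs(result - a) <= 4 * sys.float_info.epsilon * scale


a, b = 0.1, 1e15
result = add_then_subtract(a, b)  # low-order bits of a are lost when added to b
fixed_ok = math.isclose(result, a, rel_tol=1e-9, abs_tol=1e-9)
scaled_ok = scaled_close(result, a, b)
print(result, fixed_ok, scaled_ok)
```

Near 1e15 adjacent doubles are 0.125 apart, so the sum cannot retain 0.1 exactly; the fixed-tolerance check rejects the result while the magnitude-scaled check accepts it.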

5. Dependencies — Hypothesis Added (PBT-09)

The root requirements.txt includes Hypothesis as a project dependency:

pytest>=7.0
hypothesis>=6.0

6. Build and Test Instructions — PBT Seed Reproducibility (PBT-08)

The unit test instructions include a dedicated PBT section:

## PBT Seed Reproducibility (PBT-08)
- Tests use `derandomize=True` so they are deterministic
- If a PBT fails, Hypothesis prints the minimal failing example
- The failing example can be added as a permanent regression test in test_lambda.py
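Pinning a reported failing example as a permanent regression test could look like the sketch below. The `calculate` function here is a stand-in written for illustration (the real handler is not shown in this PR), and the operand values are hypothetical, as if copied from a Hypothesis "Falsifying example" report.

```python
import math


def calculate(operation: str, a: float, b: float) -> float:
    """Stand-in for the project's calculate(); an assumption for illustration."""
    if operation == "add":
        return float(a + b)
    if operation == "subtract":
        return float(a - b)
    raise ValueError(f"unsupported operation: {operation}")


def test_regression_add_subtract_inverse_pinned():
    # Hypothetical values standing in for a minimal failing example.
    a, b = 3.5, -2.25
    result = calculate("subtract", calculate("add", a, b), b)
    assert math.isclose(result, a, rel_tol=1e-9, abs_tol=1e-9)
```

Hypothesis also provides an `@example(...)` decorator that pins explicit inputs on the property test itself, which is an alternative to a separate example-based test.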

And the build-and-test summary documents PBT coverage:

### Property-Based Tests
- **Total Tests**: 5 (200 examples each)
- **File**: tests/test_calculate_pbt.py
- **Framework**: Hypothesis (PBT-09)
- **Properties tested**: commutativity, identity, inverse round-trip, type invariant
- **Reproducibility**: derandomize=True (PBT-08)

7. Complementary Testing Strategy (PBT-10)

The project maintains clear separation between example-based and property-based tests:

| File | Type | Count | Purpose |
| --- | --- | --- | --- |
| tests/test_lambda.py | Example-based | 22 tests | Specific known scenarios, regression cases, validation errors |
| tests/test_calculate_pbt.py | Property-based | 5 tests (200 examples each) | General invariants across wide input space |

Both files test the same calculate() function but from different angles: example-based tests pin specific expected values, while PBT tests check that properties hold across 200 generated inputs per test (the same inputs on every run, since derandomize=True is set).


Results

The PBT extension successfully:

  1. Injected the opt-in question into Requirements Analysis without loading full rules upfront (context-efficient)
  2. Loaded rules on-demand only after the user opted in (Partial mode)
  3. Influenced Functional Design by requiring testable property identification in design artifacts
  4. Shaped the Code Generation Plan with explicit PBT steps referencing rule IDs
  5. Generated compliant PBT tests using domain-specific generators (PBT-07), deterministic reproducibility (PBT-08), and the Hypothesis framework (PBT-09)
  6. Maintained complementary testing with separate example-based and property-based test files
  7. Included PBT in Build and Test instructions with seed reproducibility guidance

PBT Compliance Summary (Partial Mode)

| Rule | Status | Notes |
| --- | --- | --- |
| PBT-01 | Advisory ✓ | Testable Properties section in business-logic-model.md |
| PBT-02 | Blocking ✅ | Round-trip test: subtract(add(a,b), b) ≈ a |
| PBT-03 | Blocking ✅ | Invariant tests: commutativity, identity, type preservation |
| PBT-04 | Advisory (N/A) | No idempotent operations in calculator |
| PBT-05 | Advisory (N/A) | No reference/oracle implementation |
| PBT-06 | Advisory (N/A) | No stateful components (stateless design) |
| PBT-07 | Blocking ✅ | realistic_operands generator with constrained floats |
| PBT-08 | Blocking ✅ | derandomize=True, seed logging documented |
| PBT-09 | Blocking ✅ | Hypothesis in requirements.txt |
| PBT-10 | Advisory ✓ | Separate test files for example-based vs PBT |

All blocking rules passed. All advisory rules either complied or were correctly marked N/A.

How PBT Was Tested in a Brownfield Project

The project built earlier as a greenfield application was then enhanced using the AI-DLC brownfield workflow pathway, starting from this prompt:

I want to do AI-DLC workflow brownfield enhancements in this project to introduce multiply and divide methods

The model recognized the previous extension selections and presented the following question in requirements-verification-questions.md:

## Question 3
**Should the existing extension configuration be carried forward from the previous workflow?**
- Security Baseline: disabled
- Property-Based Testing: Partial (PBT-02, 03, 07, 08, 09 enforced)

A) Yes — keep the same extension configuration
B) No — I want to change the extension settings (please describe after [Answer]: tag below)

[Answer]: A

The PBT extension was applied in Partial enforcement mode throughout this brownfield workflow. Here's how it played out at each stage:

Partial mode means only rules PBT-02, PBT-03, PBT-07, PBT-08, and PBT-09 are blocking. All other PBT rules (PBT-01, PBT-04, PBT-05, PBT-06, PBT-10) are advisory/non-blocking.

Functional Design: I identified testable properties for the new multiply and divide operations, following the PBT property categories:

  • Commutativity: multiply(a, b) == multiply(b, a)
  • Identity: multiply(a, 1) == a, divide(a, 1) == a
  • Zero property: multiply(a, 0) == 0
  • Round-trip/inverse: divide(multiply(a, b), b) ≈ a
  • Type invariant: all operations return float

This feeds PBT-01 (advisory in partial mode, but still useful to do).

Code Generation Planning: The plan explicitly included a step for PBT tests, referencing the specific PBT rules each test would satisfy (PBT-02 for round-trip, PBT-03 for invariants).

Code Generation: The enforced rules were applied as follows:

| Rule | How Applied |
| --- | --- |
| PBT-02 (Round-trip) | test_multiply_divide_inverse: verifies divide(multiply(a, b), b) ≈ a with float tolerance |
| PBT-03 (Invariants) | test_multiply_commutativity, test_multiply_identity, test_multiply_zero, test_divide_identity, updated test_calculate_returns_float to cover all 4 operations |
| PBT-07 (Generator quality) | Reused realistic_operands (constrained floats, no NaN/Inf) and added nonzero_operands (filtered to exclude near-zero values) for divide tests |
| PBT-08 (Shrinking/reproducibility) | All new tests use derandomize=True via the shared pbt_settings, ensuring deterministic reproducible runs |
| PBT-09 (Framework) | Hypothesis already in requirements.txt from the original workflow; no change needed |

The key thing is that the extension configuration was carried forward from the greenfield workflow (you confirmed this in requirements Q3). So I didn't re-ask the opt-in question; I just continued enforcing the same partial ruleset against the new code. Every stage completion was checked against the 5 blocking rules before presenting it to you.

Example test case generated:

# --- PBT-02: Multiply-divide inverse round-trip ---

# Non-zero operands for divide round-trip
nonzero_operands = st.floats(
    min_value=-1e15,
    max_value=1e15,
    allow_nan=False,
    allow_infinity=False,
).filter(lambda x: abs(x) > 1e-10)


@given(a=realistic_operands, b=nonzero_operands)
@pbt_settings
def test_multiply_divide_inverse(a, b):
    """divide(multiply(a, b), b) should approximately equal a.

    Tolerance accounts for IEEE 754 float precision loss.
    """
    result = calculate("divide", calculate("multiply", a, b), b)
    assert math.isclose(result, a, rel_tol=1e-9) or abs(result - a) <= abs(b) * 2e-16 + 1e-9

Unsupported PBT language coverage:

If a developer's language lacks a PBT framework, PBT-09 would surface as a blocking finding at the NFR Requirements stage. This is intentional — the opt-in mechanism handles this case. Developers should select option C (skip PBT) or option X (describe limitation) during Requirements Analysis if their language/stack doesn't support property-based testing. The recommended frameworks table in PBT-09 is non-exhaustive and covers the most common ecosystems; for niche languages without PBT support, the opt-in question is the designed exit path.
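The PBT-09 behavior described here can be sketched as a simple lookup. The framework names below are real libraries, but the mapping is an illustrative assumption and is not the recommended-frameworks table from the rules file; `check_pbt09` is likewise a hypothetical name.

```python
from typing import Optional

# Illustrative mapping only; NOT the table from property-based-testing.md.
KNOWN_FRAMEWORKS = {
    "python": "Hypothesis",
    "typescript": "fast-check",
    "java": "jqwik",
    "rust": "proptest",
    "haskell": "QuickCheck",
}


def check_pbt09(language: str, opt_in_answer: str) -> Optional[str]:
    """Return a blocking finding for PBT-09, or None if there is nothing to report.

    PBT-09 is blocking in both Full (A) and Partial (B) modes; answering C
    skips all PBT rules, which is the exit path for unsupported stacks.
    """
    if opt_in_answer == "C":
        return None
    if language.lower() not in KNOWN_FRAMEWORKS:
        return f"PBT-09 blocking finding: no PBT framework known for {language!r}"
    return None
```

Under this sketch, a project in a language missing from the mapping gets a finding in modes A and B, and none after choosing option C.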

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

@raj-jain-aws raj-jain-aws requested review from a team as code owners March 15, 2026 14:43

@harmjeff harmjeff left a comment

This is a great extension. I look forward to using it

@scottschreckengaust scottschreckengaust left a comment

LGTM

scottschreckengaust commented Mar 23, 2026

@raj-jain-aws, please update the pull request title (coordinate to conventional commits) and the contributor statement (see the "Check Merge Status" for details)

@scottschreckengaust scottschreckengaust added the enhancement New feature or request label Mar 23, 2026
@raj-jain-aws raj-jain-aws changed the title Property Based Testing Extension feat: Property Based Testing Extension Mar 24, 2026
@raj-jain-aws raj-jain-aws added this pull request to the merge queue Mar 24, 2026
Merged via the queue into main with commit 273bff6 Mar 24, 2026
22 of 27 checks passed
@raj-jain-aws raj-jain-aws deleted the feat/property-based-testing-extension branch March 24, 2026 11:55