Legacy Lockpicks: Xlator (Translator) project

Goal: Represent and translate the given input (federal, state, and local government policy documents and code from legacy systems) into specs (intermediate representations of rulesets), following a Rules-as-Code (RaC) approach. To create the output, the specs are used to build the modernized system, quickly building much of the UI workflows to gather the input needed to run the ruleset.

Key Principles

Incremental Approach: Work on one program or rule set topic at a time, building confidence before expanding scope.
Version Control: Commit specs after each logical milestone to track evolution and enable rollback.
AI Collaboration: Use AI to accelerate translation, but human review ensures accuracy and policy compliance.
Testing Focus: Comprehensive tests prevent regressions and document expected behavior.

Ruleset as the codebase

This project treats the extracted ruleset (policies, workflows, decision logic, definitions) as a first-class code artifact rather than static documentation. Doing so provides the same rigor, traceability, and reliability we expect from software code.

Version Control & Traceability

Every change is tracked and reviewable.
Historical versions can be compared or restored.
Rule evolution is transparent and auditable.

Code Review for Rule Changes

Updates go through a PR review process.
Logic changes are visible and discussable.
Reduces silent drift and unintended consequences.

Tests Prevent Regressions

Tests encode expected behavior and edge cases.
Regressions are caught automatically.
Behavior becomes executable documentation.
Refactors are safer and verifiable.

Incremental Improvement

New test cases are added as rules are added.
Gaps surface via failing tests.
Accuracy improves through small, controlled updates.

Automation & Product Readiness

Supports validation, transpilation, and deployment.
Potential integration with CI/CD.

Desired Outcomes

Progress in Terms of Outcomes, Not Outputs (Outputs are activity. Outcomes are value.)

This rules extractor initiative:

Outcome: Save meaningful time for project teams.
Leading indicator: Accuracy of automatically generated rules.
Validation loop: Early and frequent feedback from the AK team about what matters most.

FAQ

Is this testing functional feasibility?
- Yes. This prototype tests if Claude Code can be used to extract and maintain the specs of a system (ruleset, workflows, documentation, etc.).
- It treats the specs as code, so the specs are version-controlled and there are tests for the rulesets (and other parts of the specs) to ensure they behave as intended when run in a rules engine.
Is this testing experiential value?
- Yes. From this prototype, we'll learn can and can't be done with Claude Code, and how well an iterative approach works. The hypothesis is: incrementally building up the ruleset produces a more accurate and verifiable ruleset than generating and iterating on an entire ruleset.
Is this testing look and feel?
- No. User interface and visual design are not the focus of this prototype. However, the experience of interacting with the AI and the IDE (e.g., navigating files, hover-tiggered DSL documentation) will inform ideas about user experience and AI interaction patterns.
Is this testing performance or scale?
- Yes, it is testing extracted-ruleset accuracy and it will test performance as the policy/ruleset size grows.
What is the primary (and secondary) purpose of this prototype?
- Primary purpose: explore iterative approach of building and maintaining the specs. A desired outcome is that the specs are easier to build incrementally.
- Secondary purpose: explore capabilities and limitations of Claude Code on a codebase of specs containing files in an atypical language.
Are we building a foundation we can develop into a product?
- Yes. The lessons learned (e.g., capabilities, incremental approach, test-driven validation of rules, transpilation/conversion to target languages) can inform the requirements and design of a product.
Are we delivering a win to a client / project team?
- TBD. We're testing initially with the Alaska team.
Are we trying to prove or disprove a specific piece of the approach?
- Yes, we are explicitly testing the hypothesis that incrementally building a ruleset results in a more accurate and verifiable outcome than generating a full ruleset and refining it afterward.
Are we trying to demonstrate our approach for a potential / current client?
- Yes, the prototype aims to demonstrate that an AI-assisted workflow can produce rulesets that are accurate, verifiable, and maintainable.
Why a custom DSL?
- Easier to customize for our evolving purposes
- Understand the degress of expressiveness needed for ruleset generation, which will help decide on a rules engine
- Versatile output: can be converted/transpiled to different output languages like Rego for OPA, OpenFisca, and natural language for humans
- Easier to adapt for input to other tools (e.g., ruleset comparison against ruleset-from-code, subset-of-rules extraction for a subtopic)

Step-by-Step Process

This project takes an incremental approach where each iteration involves the user (an SME on the policies) to perform the following:

The user adds input docs and code in manageable-sized amounts of policy docs
- The codebase contains the input and there is no context window (and hence no limit).
- The AI searches the codebase for the data it needs (similar to RAG but without a vector DB).
The user interacts with an AI to update the specs. Once satisfied, the specs are committed into git for version control.
- The user interacts with the AI to create/update the specs (ruleset, workflows, etc.) in manageable amounts.
- The specs are in a DSL format that will evolve over time.
- The specs are machine-readable so that it can be used to build the UI workflows.
Tests for updated specs are added by an AI and verified by the user to ensure future changes do not cause a regression.
Once a logical set of rules are captured, the user guides the AI to generate output, including the ruleset and code to get end-user input and run the ruleset on a given rules engine.
- A transpiler or converter may be needed to create the output ruleset so that it is usable by the modern system.

Each policy domain is a self-contained unit under domains/<name>/:

domains/
  snap/                              ← example: SNAP federal income eligibility
    input/policy_docs/               ← source policy documents (Markdown, PDF, etc.)
    specs/
      eligibility.civil.yaml         ← CIVIL DSL ruleset (human + AI authored)
      tests/eligibility_tests.yaml   ← test cases
    output/
      eligibility.rego               ← generated OPA/Rego (do not edit manually)
      demo/
        main.py                      ← FastAPI backend
        static/index.html            ← browser form UI
        start.sh                     ← starts OPA + FastAPI
  <next-domain>/                     ← add new domains here
    input/ specs/ output/

core/
  ruleset_schema.yaml                ← shared CIVIL DSL schema reference
tools/
  validate_civil.py                  ← CIVIL validator
  transpile_to_opa.py                ← CIVIL → OPA/Rego transpiler
  run_tests.py                       ← test runner
x.py                                 ← CLI entry point (all pipeline actions)
x                                    ← shell shim: exec python x.py "$@"

1. Input Collection

Add policy documents to domains/<name>/input/policy_docs/
Use the extract-ruleset command (.claude/commands/extract-ruleset.md) to extract CIVIL specs with AI assistance

2. Spec Creation (AI-Assisted)

Work with AI to create domains/<name>/specs/<module>.civil.yaml
Follow the CIVIL DSL schema at core/ruleset_schema.yaml
Commit completed specs to version control

3. Test Definition (AI-Assisted)

AI generates domains/<name>/specs/tests/<module>_tests.yaml
Review and verify test scenarios; add edge cases and boundary conditions

4. Output Generation

./x transpile <domain> <module> generates domains/<name>/output/<module>.rego

5. Validation & Iteration

./x test <domain> <module> starts OPA automatically, runs tests, and stops OPA
./x demo <domain> <module> starts the demo (OPA + FastAPI)
Iterate on specs as needed

Example (SNAP)

./x pipeline snap eligibility        # validate + transpile + test (OPA managed automatically)
./x demo snap eligibility            # start OPA + FastAPI demo at http://localhost:8000
./x list                             # show all available domain/module pairs

First-time setup

python x.py setup    # install uv, create .venv, install deps, install OPA

After setup, use ./x directly (the shell shim activates the right Python).

Vision diagram

flowchart TD

subgraph input
    policy_docs@{shape: docs} --> legacy_code
    legacy_code@{shape: procs}
    verified_artifacts
end

policy_docs --> Extractor1[[Extractor1]] --> ruleset
legacy_code --> Extractor2[[Extractor2]] --> specs

subgraph specs
    ruleset & workflows & artifacts
end

specs <--correct?--> verify[/verify/]

subgraph specs_testing
    tester_rule_engine[[Rule Engine]]
    ruleset --transpile?--> tester_rule_engine
    test_cases([test_cases]) --> tester_rule_engine --> expected_results([expected_results])
    tester_rule_engine --> explanation([explanation])
end

ruleset ---> Transpiler[[Transpiler]] --> ruleset2[ruleset]
workflows & artifacts ---> Coder[[Coder]] --> webforms

subgraph output["output (modern_system)"]
    ruleset2 --> rule_engine[[Rule Engine]] <--> code <--> webforms
end

One incarnation of Extractor1 is the Policy Extraction (doc-to-logic) prototype
Extractor2 will likely use AWS Transform, which also produces documentation, which would be included as part of the specs and can be used as input to the Coder.
- Another option is to include verified output from AWS Transform (noted as verified_artifacts) as part of the input.

Not yet in the diagram:

There can be multiple specs that can be compared to identify differences between systems (legacy vs legacy; modern vs modern; legacy vs modern).
Validating the modern_system against the legacy_system

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
.claude		.claude
.entire		.entire
.vscode		.vscode
core		core
docs		docs
domains		domains
tools		tools
.gitignore		.gitignore
README.md		README.md
mise.toml		mise.toml
project_status.md		project_status.md
requirements.txt		requirements.txt
x		x
x.py		x.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Legacy Lockpicks: Xlator (Translator) project

Key Principles

Ruleset as the codebase

Version Control & Traceability

Code Review for Rule Changes

Tests Prevent Regressions

Incremental Improvement

Automation & Product Readiness

Desired Outcomes

FAQ

Step-by-Step Process

1. Input Collection

2. Spec Creation (AI-Assisted)

3. Test Definition (AI-Assisted)

4. Output Generation

5. Validation & Iteration

Example (SNAP)

First-time setup

Vision diagram

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Legacy Lockpicks: Xlator (Translator) project

Key Principles

Ruleset as the codebase

Version Control & Traceability

Code Review for Rule Changes

Tests Prevent Regressions

Incremental Improvement

Automation & Product Readiness

Desired Outcomes

FAQ

Step-by-Step Process

1. Input Collection

2. Spec Creation (AI-Assisted)

3. Test Definition (AI-Assisted)

4. Output Generation

5. Validation & Iteration

Example (SNAP)

First-time setup

Vision diagram

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages