Axiomatic-AI
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 1 deletion
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 41 additions & 26 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 2 additions & 2 deletions
diff --git a/README.md b/README.md
Lines changed: 31 additions & 24 deletions
diff --git a/configs/claude_ablations/claude4-5-fast-memory.yaml b/configs/claude_ablations/claude4-5-fast-memory.yaml
Lines changed: 2 additions & 2 deletions
diff --git a/configs/claude_ablations/claude4-5.yaml b/configs/claude_ablations/claude4-5.yaml
Lines changed: 2 additions & 2 deletions
diff --git a/configs/claude_ablations/claude4-6-fast-memory.yaml b/configs/claude_ablations/claude4-6-fast-memory.yaml
Lines changed: 2 additions & 2 deletions
diff --git a/configs/claude_ablations/claude4-6.yaml b/configs/claude_ablations/claude4-6.yaml
Lines changed: 2 additions & 2 deletions
diff --git a/configs/default.yaml b/configs/default.yaml
Lines changed: 4 additions & 4 deletions
diff --git a/configs/default_base_llm.yaml b/configs/default_base_llm.yaml
Lines changed: 2 additions & 2 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -164,4 +164,4 @@ outputs/
 outputs/experiment_analysis/
 
 # Generated version file (setuptools-scm)
-src/ax_agent/_version.py
+src/ax_prover/_version.py
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-ax-agent is a minimal LangGraph-based agent for automated Lean 4 theorem proving. It uses off-the-shelf LLMs (no fine-tuning) with iterative proof refinement, a memory system, and library search tools to prove theorems.
+ax-prover is a minimal LangGraph-based agent for automated Lean 4 theorem proving. It uses off-the-shelf LLMs (no fine-tuning) with iterative proof refinement, a memory system, and library search tools to prove theorems.
 
 The agent runs a 4-node loop: Proposer → Compiler → Reviewer → Memory, iterating until the proof is complete or the iteration budget is exhausted.
 
@@ -47,80 +47,81 @@ ruff check --fix .
 
 ```bash
 # Prove a specific theorem by location (module path)
-ax-agent prove MyModule.Path:theorem_name
+ax-prover prove MyModule.Path:theorem_name
 
 # Prove a specific theorem by file path
-ax-agent prove MyProject/Algebra/Ring.lean:theorem_name
+ax-prover prove MyProject/Algebra/Ring.lean:theorem_name
 
 # Prove the theorem at a specific line
-ax-agent prove MyProject/Algebra/Ring.lean#L42
+ax-prover prove MyProject/Algebra/Ring.lean#L42
 
 # Prove all unproven theorems in a file
-ax-agent prove MyProject/Algebra/Ring.lean
+ax-prover prove MyProject/Algebra/Ring.lean
 
 # Skip lake build (if repo is already built)
-ax-agent prove MyModule:theorem_name --skip-build
+ax-prover prove MyModule:theorem_name --skip-build
 
 # Force re-proving
-ax-agent prove MyModule:theorem_name --overwrite
+ax-prover prove MyModule:theorem_name --overwrite
 
 # Run batch experiments on a LangSmith dataset
-ax-agent experiment dataset_name --max-concurrency 8
+ax-prover experiment dataset_name --max-concurrency 8
 ```
 
 **Note on `--skip-build` flag:**
 - By default, `prove` runs `lake exe cache get && lake build` before starting
 - Use `--skip-build` when the repo is already built and up-to-date
-- The build step is defined in `src/ax_agent/utils/build.py:build_lean_repo()`
+- The build step is defined in `src/ax_prover/utils/build.py:build_lean_repo()`
 
 ## Architecture
 
-### Prover Agent Loop (`src/ax_agent/prover/agent.py`)
+### Prover Agent Loop (`src/ax_prover/prover/agent.py`)
 
 The agent uses a 4-node iterative LangGraph workflow:
 
 1. **Proposer** — A ReAct-style LLM agent that writes Lean 4 proof code. Can optionally use tools (LeanSearch, web search) to find relevant Mathlib lemmas before proposing.
 2. **Compiler (Builder)** — Applies the proposed code via `TemporaryProposal`, builds with `lake env lean`, and extracts goal states at `sorry` locations using `lean_interact`. Returns `BuildSuccessFeedback` or `BuildFailedFeedback`.
 3. **Reviewer** — Verifies statement preservation and proof validity (no `sorry`, no cheating tactics like `native_decide`). Returns `ReviewApprovedFeedback` or `ReviewRejectedFeedback`.
-4. **Memory** (`src/ax_agent/prover/memory.py`) — Summarizes lessons from failed attempts into a concise context ("lab notebook") to prevent repeating mistakes. Default strategy: `ExperienceProcessor` (self-reflection).
+4. **Memory** (`src/ax_prover/prover/memory.py`) — Summarizes lessons from failed attempts into a concise context ("lab notebook") to prevent repeating mistakes. Default strategy: `ExperienceProcessor` (self-reflection).
 
 Loop: Proposer → Builder → (Reviewer if build succeeds) → Memory → back to Proposer. Terminates on review approval, max iterations, or build timeout.
 
 ### Key Abstractions
 
-**State Models** (`src/ax_agent/models/`):
+**State Models** (`src/ax_prover/models/`):
 - `ProverAgentState` (`proving.py`): Main state for the prover workflow — messages, item, metrics, iteration tracking
 - `TargetItem` (`proving.py`): A theorem to prove — title, location, proven status
 - `Location` (`files.py`): Where code lives — `Module.Path:function_name` or `path/to/file.lean:function_name`
 - `Declaration` (`declaration.py`): A parsed Lean declaration with name, type, body, and line info
 
-**Messages** (`src/ax_agent/models/messages.py`):
+**Messages** (`src/ax_prover/models/messages.py`):
 - `ProposalMessage`: Code proposals with reasoning, imports, opens, and updated theorem
 - `FeedbackMessage`: Base class for feedback — `BuildSuccessFeedback`, `BuildFailedFeedback`, `ReviewApprovedFeedback`, `ReviewRejectedFeedback`, `SorriesGoalStateFeedback`, etc.
 
-**Configuration** (`src/ax_agent/config.py`):
+**Configuration** (`src/ax_prover/config.py`):
 - `Config`: Root config with `ProverConfig` and `ToolsConfig`
 - `ProverConfig`: LLM config, tools list, max iterations, memory config
 - OmegaConf-based: supports YAML files, CLI overrides, config merging
 
 ### Tools
 
-**Lean Search** (`src/ax_agent/tools/lean_search.py`):
+**Lean Search** (`src/ax_prover/tools/lean_search.py`):
 - Searches Lean 4/Mathlib theorems via LeanSearch API
 - Default: `https://leansearch.net` (public, no setup)
 
-**Web Search** (`src/ax_agent/tools/web_search.py`):
+**Web Search** (`src/ax_prover/tools/web_search.py`):
 - Tavily API for finding proof strategies online
 
-**Lean Build** (`src/ax_agent/utils/build.py`):
+**Lean Build** (`src/ax_prover/utils/build.py`):
 - `build_lean_repo()`: Runs `lake exe cache get && lake build`
 - `check_lean_file()`: Compiles a single file with `lake env lean`
 - `TemporaryProposal`: Context manager that applies code changes to a temp file, tests compilation, and can commit permanently
 
 ### Commands
 
-- `prove` (`src/ax_agent/commands/prove.py`): Prove theorems by location or all unproven in a file
-- `experiment` (`src/ax_agent/commands/experiment.py`): Run batch experiments on LangSmith datasets with evaluation metrics
+- `prove` (`src/ax_prover/commands/prove.py`): Prove theorems by location or all unproven in a file
+- `experiment` (`src/ax_prover/commands/experiment.py`): Run batch experiments on LangSmith datasets with evaluation metrics
+- `configure` (`src/ax_prover/commands/configure.py`): Interactive setup for API keys (writes to platform config dir via `platformdirs`)
 
 ### LangSmith Integration
 
@@ -151,7 +152,7 @@ with TemporaryProposal(base_folder, location, proposal) as applier:
 Locations support both file paths and module paths:
 
 ```python
-from ax_agent.models.files import Location
+from ax_prover.models.files import Location
 
 # Both formats work
 loc = Location.from_string("MyProject.Path:theorem")
@@ -248,18 +249,31 @@ counter += 1
 
 ## Configuration
 
-Environment variables (in `.env.secrets`):
-- `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY` (at least one required)
-- `TAVILY_API_KEY` (optional, for web search)
-- `LANGSMITH_API_KEY`, `LANGSMITH_TRACING` (optional, for observability)
+### API Keys
+
+Set up API keys interactively with `ax-prover configure`, or via environment variables.
+
+Secrets cascade (first found wins, shell env always takes priority):
+1. CWD `.env.secrets`
+2. `--folder` `.env.secrets`
+3. `<platformdirs config>/.env.secrets` (written by `ax-prover configure`)
+4. Package root `.env.secrets` (editable installs)
+
+Required: at least one of `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`
+Optional: `TAVILY_API_KEY` (web search), `LANGSMITH_API_KEY` (tracing)
+
+### YAML Config
 
 LLM model is configured via YAML config files (no hardcoded default).
+Default config is bundled in the package at `src/ax_prover/configs/default.yaml`.
+Config resolution: CWD > `--folder` > bundled package configs > package root `configs/`.
 
 ## Repository Structure
 
 ```
-src/ax_agent/
-├── commands/        # CLI command implementations (prove, experiment)
+src/ax_prover/
+├── commands/        # CLI command implementations (prove, experiment, configure)
+├── configs/         # Bundled default YAML configs and secrets template
 ├── models/          # Pydantic state models (proving, messages, files, declaration)
 ├── prover/          # Prover agent (agent, memory, prompts)
 ├── tools/           # LangChain tools (lean_search, web_search)
@@ -268,6 +282,7 @@ src/ax_agent/
 ├── evaluators.py    # LangSmith experiment evaluators
 └── main.py          # CLI entry point
 
+configs/             # Experimental/ablation configs (not shipped in package)
 tests/               # Pytest tests mirroring src structure
 ```
 
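The target formats that `prove` accepts in the CLAUDE.md hunks above (`Module.Path:name`, `path/to/file.lean:name`, `file.lean#L42`, and a bare `.lean` file) can be told apart with a few string checks. The sketch below is illustrative only: `ParsedTarget` and `parse_target` are hypothetical names, and this is not the repository's `Location.from_string`, whose actual behavior the diff does not show.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedTarget:
    """Hypothetical container; not the project's Location model."""
    path: str                   # module path or file path, as given
    is_file: bool               # True when the path ends in .lean
    name: Optional[str] = None  # theorem name, if one was given
    line: Optional[int] = None  # line number from a '#L<n>' suffix


# Matches 'some/file.lean#L42'.
_LINE_RE = re.compile(r"^(?P<path>.+\.lean)#L(?P<line>\d+)$")


def parse_target(spec: str) -> ParsedTarget:
    """Split a target spec into path, optional name, optional line."""
    match = _LINE_RE.match(spec)
    if match:
        return ParsedTarget(match["path"], True, line=int(match["line"]))

    path, sep, name = spec.rpartition(":")
    if not sep:
        # No ':' present: the whole spec is a path (e.g. an entire file).
        return ParsedTarget(spec, spec.endswith(".lean"))
    if not path or not name:
        raise ValueError(f"malformed target: {spec!r}")
    return ParsedTarget(path, path.endswith(".lean"), name=name)
```

For example, `parse_target("MyProject/Algebra/Ring.lean#L42")` yields a file target with `line=42` and no name, while `parse_target("MyModule.Path:theorem_name")` yields a module target with `name="theorem_name"`.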
 
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,4 +1,4 @@
-# Contributing to ax-agent
+# Contributing to ax-prover
 
 Thanks for your interest in contributing! Whether it's a bug report, a feature idea, or a pull request — all contributions are welcome.
 
@@ -14,7 +14,7 @@ Thanks for your interest in contributing! Whether it's a bug report, a feature i
 ```bash
 # Clone the repository
 git clone https://github.com/Axiomatic-AI/ax-prover-base.git
-cd ax-agent
+cd ax-prover
 
 # Create and activate a virtual environment
 python -m venv .venv
 
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
-# ax-agent
+# ax-prover
 
 **A minimal agent for automated theorem proving in Lean 4**
 
-[![CI](https://github.com/Axiomatic-AI/ax-prover-base/actions/workflows/unit_tests.yml/badge.svg)](https://github.com/Axiomatic-AI/ax-prover-base/actions/workflows/unit_tests.yml)
+[![CI](https://github.com/Axiomatic-AI/ax-prover/actions/workflows/unit_tests.yml/badge.svg)](https://github.com/Axiomatic-AI/ax-prover/actions/workflows/unit_tests.yml)
 [![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue)](https://www.python.org/downloads/)
 [![License: AGPL-3.0](https://img.shields.io/badge/License-AGPL--3.0-blue.svg)](LICENSE)
-[![PyPI version](https://img.shields.io/pypi/v/ax-agent)](https://pypi.org/project/ax-agent/)
+[![PyPI version](https://img.shields.io/pypi/v/ax-prover)](https://pypi.org/project/ax-prover/)
 
 A simple, modular agent that proves Lean 4 theorems through iterative refinement.
 It uses off-the-shelf LLMs (no fine-tuning) with a feedback loop, a memory system, and library search tools to achieve competitive results against highly-engineered systems that rely on specialized training and orders of magnitude more compute.
@@ -25,7 +25,7 @@ All results with Claude Opus 4.5, 50 iterations, pass@1. See our [paper](#citati
 ## How It Works
 
 <p align="center">
-  <img src="assets/figure1.png" alt="ax-agent architecture" width="500">
+  <img src="assets/figure1.png" alt="ax-prover architecture" width="500">
 </p>
 
 The agent runs an iterative loop:
@@ -40,27 +40,26 @@ The loop continues until the proof is complete or the iteration budget is exhaus
 ## Quick Start
 
 ```bash
-pip install ax-agent
+pip install ax-prover
 ```
 
 ```bash
 # Configure your API keys
-cp .env.secrets.example .env.secrets
-# Edit .env.secrets and add at least one LLM key (Anthropic recommended)
+ax-prover configure
 
 # Navigate to a Lean 4 project
 cd /path/to/lean4-project
 
 # Prove a theorem
-ax-agent prove MyModule:my_theorem
+ax-prover prove MyModule:my_theorem
 ```
 
 ## Installation
 
 ```bash
-pip install ax-agent
+pip install ax-prover
 # or
-uv add ax-agent
+uv add ax-prover
 
 # For development (includes ruff, pytest, pre-commit)
 pip install -e ".[dev]"
@@ -77,11 +76,16 @@ pip install -e ".[dev]"
   - `GOOGLE_API_KEY`
 - **Tavily API key** (optional, for web search) — `TAVILY_API_KEY`
 
-Create a `.env.secrets` file or export the keys in your shell:
+Set up your API keys interactively:
 
 ```bash
-cp .env.secrets.example .env.secrets
-# Edit .env.secrets with your keys
+ax-prover configure
+```
+
+Or export them directly in your shell:
+
+```bash
+export ANTHROPIC_API_KEY=sk-ant-...
 ```
 
 </details>
@@ -92,22 +96,22 @@ cp .env.secrets.example .env.secrets
 
 ```bash
 # Prove a specific theorem by module path
-ax-agent prove MyModule.Path:theorem_name
+ax-prover prove MyModule.Path:theorem_name
 
 # Prove a specific theorem by file path
-ax-agent prove MyProject/Algebra/Ring.lean:theorem_name
+ax-prover prove MyProject/Algebra/Ring.lean:theorem_name
 
 # Prove the theorem at a specific line
-ax-agent prove MyProject/Algebra/Ring.lean#L42
+ax-prover prove MyProject/Algebra/Ring.lean#L42
 
 # Prove all unproven theorems in a file
-ax-agent prove MyProject/Algebra/Ring.lean
+ax-prover prove MyProject/Algebra/Ring.lean
 
 # Skip lake build (if repo is already built)
-ax-agent prove MyModule:theorem_name --skip-build
+ax-prover prove MyModule:theorem_name --skip-build
 
 # Save JSON output to file (for scripting/automation)
-ax-agent prove MyModule:theorem_name -o result.json
+ax-prover prove MyModule:theorem_name -o result.json
 ```
 
 ### Running experiments
@@ -116,10 +120,10 @@ Run batch evaluations on [LangSmith](https://smith.langchain.com) datasets:
 
 ```bash
 # Run experiment on a dataset
-ax-agent experiment dataset_name
+ax-prover experiment dataset_name
 
 # With custom concurrency
-ax-agent experiment dataset_name --max-concurrency 8
+ax-prover experiment dataset_name --max-concurrency 8
 ```
 
 <details>
@@ -141,10 +145,13 @@ prover:
 
 ```bash
 # Use a config file
-ax-agent --config my_config.yaml prove MyModule:theorem
+ax-prover --config my_config.yaml prove MyModule:theorem
 
 # Override values from the CLI
-ax-agent prove MyModule:theorem prover.max_iterations=100
+ax-prover prove MyModule:theorem prover.max_iterations=100
+
+# Save your current configuration for later reuse
+ax-prover --save-config my_setup prove MyModule:theorem
 ```
 
 </details>
@@ -160,7 +167,7 @@ This project is licensed under the [AGPL-3.0](LICENSE).
 
 ## Citation
 
-If you use ax-agent in your research, please cite:
+If you use ax-prover in your research, please cite:
 
 ```bibtex
 @article{axproverbase2026,
 
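--- a/configs/claude_ablations/claude4-5-fast-memory.yaml
+++ b/configs/claude_ablations/claude4-5-fast-memory.yaml
@@ -1,6 +1,6 @@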
 import:
-  - configs/llms.yaml
-  - configs/default.yaml
+  - ../llms.yaml
+  - ../default.yaml
 
 prover:
   prover_llm: ${llm_configs.claude_opus_4_5}
 
--- a/configs/claude_ablations/claude4-5.yaml
+++ b/configs/claude_ablations/claude4-5.yaml
@@ -1,6 +1,6 @@
 import:
-  - configs/llms.yaml
-  - configs/default.yaml
+  - ../llms.yaml
+  - ../default.yaml
 
 prover:
   prover_llm: ${llm_configs.claude_opus_4_5}
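--- a/configs/claude_ablations/claude4-6-fast-memory.yaml
+++ b/configs/claude_ablations/claude4-6-fast-memory.yaml
@@ -1,6 +1,6 @@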
 import:
-  - configs/llms.yaml
-  - configs/default.yaml
+  - ../llms.yaml
+  - ../default.yaml
 
 prover:
   prover_llm: ${llm_configs.claude_opus_4_6}
 
--- a/configs/claude_ablations/claude4-6.yaml
+++ b/configs/claude_ablations/claude4-6.yaml
@@ -1,6 +1,6 @@
 import:
-  - configs/llms.yaml
-  - configs/default.yaml
+  - ../llms.yaml
+  - ../default.yaml
 
 prover:
   prover_llm: ${llm_configs.claude_opus_4_6}
--- a/configs/default.yaml
+++ b/configs/default.yaml
@@ -1,12 +1,12 @@
 import:
-  - configs/llms.yaml
-  - configs/tools.yaml
+  - llms.yaml
+  - tools.yaml
 
 prover:
   prover_llm: ${llm_configs.claude_opus_4_5}
   proposer_tools:
-    - ${tool_configs.search_lean_search}
-    - ${tool_configs.search_web}
+    search_lean: ${tool_configs.search_lean_search}
+    search_web: ${tool_configs.search_web}
   max_iterations: 50
   memory_config:
     class_name: ExperienceProcessor
 
--- a/configs/default_base_llm.yaml
+++ b/configs/default_base_llm.yaml
@@ -1,9 +1,9 @@
 # Remove iterations, memory and tools
 import:
-  - configs/default.yaml
+  - default.yaml
 
 prover:
-  proposer_tools: []
+  proposer_tools: {}
   max_iterations: 1
   memory_config:
     class_name: MemorylessProcessor