
Commit 7668eeb

Initial commit: Enhanced ARC solver with neural guidance, episodic retrieval, and TTT

Features:
- Neuroscience-inspired architecture with MD network analog
- Neural guidance for operation prediction
- Episodic retrieval for analogical reasoning
- Program sketch mining and macro-operations
- Test-time training and adaptation
- Complete Kaggle-ready submission pipeline
- Comprehensive benchmarking and evaluation tools

Implements the full blueprint from the ARC Prize 2025 research document.

21 files changed: +3260 -0 lines changed

README.md

Lines changed: 251 additions & 0 deletions

# Enhanced ARC-AGI-2 Solver

This repository contains an advanced solver for the **ARC Prize 2025** competition (ARC‑AGI‑2), implementing the complete blueprint from neuroscience-inspired research. It combines symbolic reasoning with neural guidance, episodic retrieval, program sketches, and test-time training to improve performance on abstract reasoning tasks.

## Key Features

### 🧠 Neuroscience-Inspired Architecture
- **Neural guidance**: Predicts relevant DSL operations from task features
- **Episodic retrieval**: Maintains a database of solved tasks for analogical reasoning
- **Program sketches**: Mines common operation sequences as macro-operators
- **Test-time training**: Adapts scoring functions to each specific task
- **Multi-demand network analog**: Prioritizes candidate programs using learned heuristics

### 🔧 Enhanced Capabilities
- **Object-centric parsing** with connected component analysis
- **Compact DSL** with composable primitives (rotate, flip, translate, recolor, etc.)
- **Two-attempt diversity** as required by ARC Prize 2025 rules
- **Fallback resilience** with graceful degradation to baseline methods
- **Performance monitoring** with detailed statistics and benchmarking

## Directory Structure

```
arc_solver_project/
├── arc_solver/                # Core solver package
│   ├── grid.py                # Grid operations and utilities
│   ├── objects.py             # Connected component extraction
│   ├── dsl.py                 # Domain-specific language primitives
│   ├── heuristics.py          # Heuristic rule inference
│   ├── search.py              # Basic brute-force search
│   ├── solver.py              # Main solver interface (enhanced)
│   ├── enhanced_solver.py     # Enhanced solver with neural components
│   ├── enhanced_search.py     # Neural-guided program synthesis
│   ├── io_utils.py            # JSON loading and submission helpers
│   └── neural/                # Neural guidance components
│       ├── features.py        # Task feature extraction
│       ├── guidance.py        # Neural operation prediction
│       ├── sketches.py        # Program sketch mining
│       ├── episodic.py        # Episodic retrieval system
│       └── ttt.py             # Test-time training
├── arc_submit.py              # Command-line submission script
├── train_neural_guidance.py   # Training script for neural components
├── benchmark.py               # Benchmarking and evaluation tools
└── README.md                  # This file
```

## Quick Start

### 1. Basic Usage (Kaggle-ready)

```bash
# Generate submission file (uses enhanced solver by default)
python arc_submit.py

# Use baseline solver only (if needed)
ARC_USE_BASELINE=1 python arc_submit.py
```

### 2. Training Neural Components

```bash
# Train neural guidance (requires training data)
python train_neural_guidance.py

# Or run the benchmark with default settings
python benchmark.py
```

### 3. Python API

```python
from arc_solver.enhanced_solver import solve_task_enhanced

# Solve a single task with full enhancements
result = solve_task_enhanced(task)

# Configure solver behavior
from arc_solver.enhanced_solver import ARCSolver
solver = ARCSolver(use_enhancements=True)
result = solver.solve_task(task)
```

## How It Works

### Enhanced Pipeline

The solver proceeds through six stages (a minimal runnable skeleton follows the list):

1. **Feature Extraction**: Extract task-level features (colors, objects, transformations)
2. **Neural Guidance**: Predict which DSL operations are likely relevant
3. **Episodic Retrieval**: Query the database for similar previously solved tasks
4. **Sketch-Based Search**: Use mined program templates with parameter filling
5. **Test-Time Adaptation**: Fine-tune the scoring function on task demonstrations
6. **Program Selection**: Rank and select the top 2 diverse candidate programs
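
The sketch below shows how those six stages might compose. Every function in it is a trivial, hypothetical stand-in for the real components in `arc_solver/neural/` and `enhanced_search.py`; it is not the solver's actual internal API:

```python
import numpy as np

def extract_features(pairs):                     # 1. feature extraction (stub)
    grids = [np.asarray(g) for p in pairs for g in (p["input"], p["output"])]
    return np.array([np.mean([g.size for g in grids]),
                     np.mean([len(np.unique(g)) for g in grids])])

def predict_ops(feats):                          # 2. neural guidance (stub)
    return {"identity": 1.0, "transpose": 0.5}

def retrieve_similar(feats, k=5):                # 3. episodic retrieval (stub)
    return []

def instantiate_sketches(op_scores, neighbors):  # 4. sketch-based search (stub)
    return [[("identity", {})], [("transpose", {})]]

def adapt_scorer(pairs):                         # 5. test-time training (stub)
    return lambda prog: -len(prog)

def solve_task_sketch(task):
    feats = extract_features(task["train"])
    op_scores = predict_ops(feats)
    neighbors = retrieve_similar(feats)
    candidates = instantiate_sketches(op_scores, neighbors)
    score = adapt_scorer(task["train"])
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:2]                            # 6. two diverse attempts
```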

### Fallback Strategy

If the enhanced components fail, the solver gracefully falls back, in order, to the following (see the sketch after this list):

- Heuristic single-step transformations
- Brute-force enumeration of 2-step programs
- Identity transformation as last resort
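
A hedged sketch of that cascade, with trivial stand-ins for the heuristic and search routines (the real versions live in `heuristics.py` and `search.py`):

```python
def try_heuristics(task):
    # Stand-in: the real version infers single-step rules from the demos.
    return None

def brute_force_search(task, max_depth=2):
    # Stand-in: the real version enumerates programs up to max_depth ops.
    return None

def solve_with_fallback(task, enhanced_solve):
    try:
        result = enhanced_solve(task)
        if result is not None:
            return result
    except Exception:
        pass                                   # enhanced path failed; degrade
    prog = try_heuristics(task)                # 1. heuristic single-step rules
    if prog is None:
        prog = brute_force_search(task)        # 2. brute-force 2-step search
    if prog is None:
        prog = [("identity", {})]              # 3. identity as last resort
    return prog
```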

## Configuration

The solver can be configured through environment variables and a JSON config file:

### Environment Variables
- `ARC_USE_BASELINE=1`: Force baseline solver only
- `ARC_DISABLE_ENHANCEMENTS=1`: Disable enhanced features

### Configuration File
```json
{
  "use_neural_guidance": true,
  "use_episodic_retrieval": true,
  "use_program_sketches": true,
  "use_test_time_training": true,
  "max_programs": 256,
  "timeout_per_task": 30.0
}
```
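
One way such a file might be read, merging user values over the defaults (a sketch: the filename `solver_config.json` and the merge behavior are assumptions, not a documented interface):

```python
import json

DEFAULTS = {
    "use_neural_guidance": True,
    "use_episodic_retrieval": True,
    "use_program_sketches": True,
    "use_test_time_training": True,
    "max_programs": 256,
    "timeout_per_task": 30.0,
}

def load_config(path="solver_config.json"):
    cfg = dict(DEFAULTS)
    try:
        with open(path) as f:
            cfg.update(json.load(f))   # user values override the defaults
    except FileNotFoundError:
        pass                           # no file: fall back to the defaults
    return cfg
```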

## Neural Components

### Neural Guidance
- **Purpose**: Predict which DSL operations are relevant for a given task
- **Architecture**: Simple MLP with task-level features
- **Training**: Uses extracted features from training demonstrations
- **Output**: Operation relevance scores to guide search
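
A minimal NumPy sketch of such a model (layer sizes, initialization, and the `GuidanceMLP` name are illustrative, not the actual code in `neural/guidance.py`); the op names mirror the `OPS` registry in `dsl.py`:

```python
import numpy as np

OP_NAMES = ["identity", "rotate", "flip", "transpose",
            "translate", "recolor", "crop", "pad"]

class GuidanceMLP:
    def __init__(self, n_features, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_features, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, len(OP_NAMES)))
        self.b2 = np.zeros(len(OP_NAMES))

    def predict(self, feats):
        h = np.maximum(0.0, feats @ self.W1 + self.b1)   # ReLU hidden layer
        logits = h @ self.W2 + self.b2
        scores = 1.0 / (1.0 + np.exp(-logits))           # sigmoid per op
        return dict(zip(OP_NAMES, scores))               # op -> relevance

# e.g. GuidanceMLP(n_features=3).predict(np.array([9.0, 4.0, 3.0]))
```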

### Episodic Retrieval
- **Purpose**: Reuse solutions from similar previously solved tasks
- **Method**: Task signature matching with feature-based similarity
- **Storage**: JSON-based database of solved programs with metadata
- **Retrieval**: Cosine similarity on numerical features + boolean feature matching
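
A sketch of the numerical-feature half of that retrieval (the `EpisodicDB` name and record layout are assumptions for illustration; boolean-feature matching is omitted):

```python
import json
import numpy as np

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

class EpisodicDB:
    def __init__(self, path=None):
        # Each record: {"features": [...], "program": [(op_name, params), ...]}
        self.records = []
        if path:
            with open(path) as f:
                self.records = json.load(f)

    def add(self, features, program):
        self.records.append({"features": list(features), "program": program})

    def retrieve(self, features, k=5):
        q = np.asarray(features, dtype=float)
        scored = [(cosine(q, np.asarray(r["features"], dtype=float)), r)
                  for r in self.records]
        scored.sort(key=lambda t: t[0], reverse=True)   # most similar first
        return [r for _, r in scored[:k]]
```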

### Program Sketches
- **Purpose**: Capture common operation sequences as reusable templates
- **Mining**: Extract frequent 1-step and 2-step operation patterns
- **Usage**: Instantiate sketches with different parameter combinations
- **Adaptation**: Learn from successful programs during solving
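
The mining step might look like this sketch: drop the parameters from solved programs and count frequent 1- and 2-op name sequences (the `mine_sketches` name and threshold are illustrative):

```python
from collections import Counter

def mine_sketches(programs, min_count=2):
    counts = Counter()
    for prog in programs:
        names = [name for name, _params in prog]   # drop parameters
        for n in (1, 2):                           # 1-step and 2-step patterns
            for i in range(len(names) - n + 1):
                counts[tuple(names[i:i + n])] += 1
    return [sk for sk, c in counts.most_common() if c >= min_count]

# Two solved programs sharing a rotate -> recolor pattern:
solved = [
    [("rotate", {"k": 1}), ("recolor", {"mapping": {1: 2}})],
    [("rotate", {"k": 2}), ("recolor", {"mapping": {3: 4}})],
]
print(mine_sketches(solved))  # [('rotate',), ('recolor',), ('rotate', 'recolor')]
```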

### Test-Time Training
- **Purpose**: Adapt the scoring function to each specific task
- **Method**: Fine-tune a lightweight scorer on task demonstrations
- **Features**: Program length, operation types, success rate, complexity
- **Augmentation**: Generate synthetic training examples via transformations
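
A sketch of such a lightweight scorer: featurize each candidate program (length plus op-type counts, a subset of the features listed above) and take logistic-regression steps toward candidates that reproduce the demonstrations. The class name, feature set, and update rule are all illustrative:

```python
import numpy as np

OP_NAMES = ["identity", "rotate", "flip", "transpose",
            "translate", "recolor", "crop", "pad"]

def program_features(prog):
    x = np.zeros(len(OP_NAMES) + 1)
    x[0] = len(prog)                         # program length
    for name, _params in prog:
        x[1 + OP_NAMES.index(name)] += 1     # op-type counts
    return x

class TTTScorer:
    def __init__(self, lr=0.1):
        self.w = np.zeros(len(OP_NAMES) + 1)
        self.lr = lr

    def score(self, prog):
        return float(self.w @ program_features(prog))

    def adapt(self, candidates, labels, steps=50):
        # labels[i] = 1.0 if candidates[i] reproduces all demos, else 0.0
        for _ in range(steps):
            for prog, y in zip(candidates, labels):
                x = program_features(prog)
                p = 1.0 / (1.0 + np.exp(-self.w @ x))   # logistic prediction
                self.w += self.lr * (y - p) * x          # gradient step
        return self
```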

## Performance and Evaluation

### Benchmarking
```python
from benchmark import Benchmark, SolverConfig

config = SolverConfig()
benchmark = Benchmark(config)
results = benchmark.run_benchmark("test_data.json")
print(f"Success rate: {results['performance_stats']['success_rate']:.3f}")
```

### Monitoring
The solver tracks detailed statistics:
- Success rates for enhanced vs. baseline methods
- Component usage (episodic hits, neural guidance, TTT adaptation)
- Timing breakdown per component
- Failure mode analysis

## Implementation Notes

### Kaggle Compatibility
- **Offline execution**: No internet access required
- **Dependency-light**: Uses only NumPy for core operations
- **Compute budget**: Optimized for the ~$0.42-per-task limit
- **Output format**: Exactly 2 attempts per test input, as required

### Code Quality
- **Type hints**: Full typing support for better maintainability
- **Documentation**: Comprehensive docstrings and comments
- **Error handling**: Robust fallback mechanisms
- **Testing**: Validation and benchmarking utilities

## Extending the Solver

### Adding New DSL Operations

1. Define the operation function in `dsl.py` (see the sketch after this list)
2. Add parameter generation in `sketches.py`
3. Update feature extraction in `features.py`
4. Retrain neural guidance if needed
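
Step 1 can use the `Op` class and `OPS` registry from `dsl.py` (shown later in this commit). The operation below, a wrapping row shift, is a hypothetical example, not part of the shipped DSL:

```python
import numpy as np
from arc_solver.dsl import OPS, Op, apply_program
from arc_solver.grid import Array  # type alias used throughout dsl.py

def op_roll_rows(a: Array, shift: int) -> Array:
    # Shift rows with wrap-around (unlike translate, which fills vacated
    # cells with the background color).
    return np.roll(a, shift, axis=0)

OPS["roll_rows"] = Op("roll_rows", op_roll_rows, 1, ["shift"])

# The new op now composes with existing primitives via apply_program:
grid = np.array([[1, 1], [2, 2], [3, 3]])
print(apply_program(grid, [("roll_rows", {"shift": 1})]))
# [[3 3]
#  [1 1]
#  [2 2]]
```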

### Improving Neural Components

1. **Better features**: Add domain-specific feature extractors
2. **Advanced models**: Replace the MLP with a transformer or GNN
3. **Meta-learning**: Implement few-shot adaptation algorithms
4. **Hybrid methods**: Combine symbolic and neural reasoning

### Advanced Techniques
- **Probabilistic programming**: Sample programs from learned distributions
- **Curriculum learning**: Train on tasks of increasing difficulty
- **Multi-agent reasoning**: Ensemble of specialized solvers
- **Causal reasoning**: Incorporate causal structure learning

## Research Foundation

This implementation is based on the research blueprint "ARC Prize 2025 & Human Fluid Intelligence", which draws on cognitive neuroscience findings about:

- **Multiple-demand (MD) network**: Neural guidance mimics executive control
- **Basal ganglia gating**: Operation selection and working memory control
- **Hippocampal-mPFC loop**: Episodic retrieval and schema integration
- **Test-time adaptation**: Rapid task-specific learning from few examples

The solver architecture directly maps these biological systems to computational components.

## Competition Strategy

### Short-term (Immediate)
- ✅ Strong symbolic baseline with neural enhancements
- ✅ Episodic retrieval for common patterns
- ✅ Test-time adaptation for task specialization
- ✅ Kaggle-ready submission format

### Medium-term (During Contest)
- Train neural guidance on public training data
- Mine program sketches from successful solutions
- Analyze semi-private feedback for failure modes
- Expand the DSL based on discovered patterns

### Long-term (Advanced Research)
- Probabilistic program synthesis
- Hybrid symbolic-neural architecture
- Broader cognitive priors and meta-learning
- Integration with large language models

## License

This code is designed to be open-sourced under an appropriate license, as required by ARC Prize 2025 rules.

## Citation

If you use this solver or build upon its ideas, please cite the research blueprint and this implementation.

## Contributing

Contributions are welcome! Focus areas:
- Neural architecture improvements
- New DSL operations based on failure analysis
- Advanced meta-learning techniques
- Performance optimizations for Kaggle constraints

---

**Ready to win ARC Prize 2025!** 🏆

arc_solver/dsl.py

Lines changed: 114 additions & 0 deletions

"""
Domain-specific language (DSL) primitives for ARC program synthesis.

This module defines a small set of composable operations that act on grids.
Each operation is represented by an `Op` object with a name, a function, and
metadata about its parameters. Programs are sequences of these operations.
"""

from __future__ import annotations

import numpy as np
from typing import Any, Callable, Dict, List, Tuple

from .grid import Array, rotate90, flip, transpose, translate, color_map, crop, pad_to, bg_color


class Op:
    """Represents a primitive transformation on a grid.

    Attributes
    ----------
    name : str
        Human-readable name of the operation.
    fn : Callable
        Function implementing the operation.
    arity : int
        Number of input grids (arity=1 for single-grid ops).
    param_names : List[str]
        Names of parameters accepted by the operation.
    """

    def __init__(self, name: str, fn: Callable[..., Array], arity: int, param_names: List[str]):
        self.name = name
        self.fn = fn
        self.arity = arity
        self.param_names = param_names

    def __call__(self, *args, **kwargs) -> Array:
        return self.fn(*args, **kwargs)


# Primitive operations (single-grid)
def op_identity(a: Array) -> Array:
    return a


def op_rotate(a: Array, k: int) -> Array:
    return rotate90(a, k)


def op_flip(a: Array, axis: int) -> Array:
    return flip(a, axis)


def op_transpose(a: Array) -> Array:
    return transpose(a)


def op_translate(a: Array, dy: int, dx: int) -> Array:
    return translate(a, dy, dx, fill=bg_color(a))


def op_recolor(a: Array, mapping: Dict[int, int]) -> Array:
    return color_map(a, mapping)


def op_crop_bbox(a: Array, top: int, left: int, height: int, width: int) -> Array:
    # ensure cropping stays inside bounds
    h, w = a.shape
    top = max(0, min(top, h - 1))
    left = max(0, min(left, w - 1))
    height = max(1, min(height, h - top))
    width = max(1, min(width, w - left))
    return crop(a, top, left, height, width)


def op_pad(a: Array, out_h: int, out_w: int) -> Array:
    return pad_to(a, (out_h, out_w), fill=bg_color(a))


# Register operations in a dictionary for easy lookup
OPS: Dict[str, Op] = {
    "identity": Op("identity", op_identity, 1, []),
    "rotate": Op("rotate", op_rotate, 1, ["k"]),
    "flip": Op("flip", op_flip, 1, ["axis"]),
    "transpose": Op("transpose", op_transpose, 1, []),
    "translate": Op("translate", op_translate, 1, ["dy", "dx"]),
    "recolor": Op("recolor", op_recolor, 1, ["mapping"]),
    "crop": Op("crop", op_crop_bbox, 1, ["top", "left", "height", "width"]),
    "pad": Op("pad", op_pad, 1, ["out_h", "out_w"]),
}


def apply_program(a: Array, program: List[Tuple[str, Dict[str, Any]]]) -> Array:
    """Apply a sequence of operations (program) to the input array.

    Parameters
    ----------
    a : Array
        Input grid.
    program : List of (op_name, params)
        Sequence of operations with parameters. The operations are looked up
        in OPS.

    Returns
    -------
    Array
        Resulting grid after applying the program.
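
    Examples
    --------
    A trivial illustration using the ``identity`` op:

    >>> import numpy as np
    >>> apply_program(np.array([[1, 2], [3, 4]]), [("identity", {})])
    array([[1, 2],
           [3, 4]])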
    """
    out = a
    for name, params in program:
        op = OPS[name]
        out = op(out, **params)
    return out
