Synthetic Dataset for Leveraging Human‑Intuitive Analogies to Elevate AI Reasoning
By embedding robust human-intuitive analogies into ARC-style tasks, GIFARC guides AI agents to evaluate a task analogically before engaging in brute-force pattern search, thus efficiently reducing problem complexity and building a more concise, human-understandable solution.
A GIF will turn into ... GIFARC!
- 1,614 ARC-style puzzles generated from GIFs, each paired with an analogy.
- Pair‑wise ground‑truth mappings + rich textual rationales for supervised or in‑context use.
- Easy plug-and-play generation pipeline: extend or remix analogy families with new GIFs in a few minutes.
- Friendly Hugging Face dataset & interactive web demo for instant exploration.
We highly recommend using Docker. For Docker setup instructions, see SETUP.md.
git clone <GIT_url>
cd gifarc
pip install -r requirements.txt
pip install -r requirements-dev.txt
from datasets import load_dataset
ds = load_dataset("DumDev/gif_arc")
Once your setup is done, open description_executor.ipynb and run the code.
| Split | #Tasks | #Unique GIFs | Size |
|---|---|---|---|
| Train | 1,614 | 1,614 | < 100 MB |
Every task package looks as follows:
{
"source": "<source code>", # python code string
"examples": [
[<input_grid_1>,<output_grid_1>], # pair 1
[<input_grid_2>,<output_grid_2>], # pair 2
...
],
"seeds": [
"<file_name_1>",
"<file_name_2>",
...,
"<file_name_N>",
"<Concept_and_description>"
],
"url": "<minified_url>"
}
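As a minimal sketch of how a record with the layout above could be unpacked, the snippet below splits `examples` into input/output pairs and separates the trailing concept description from the seed file names. The `unpack_task` helper and the toy record are hypothetical and not part of this repository. The sketch assumes that grids are nested lists of integers and that the last entry of `seeds` is the concept description, as the schema suggests.

```python
from typing import Any

Grid = list[list[int]]


def unpack_task(task: dict[str, Any]) -> dict[str, Any]:
    """Split a task record into pairs, seed file names, and the concept text.

    Hypothetical helper: assumes the field layout shown in the schema above.
    """
    pairs: list[tuple[Grid, Grid]] = []
    for example in task["examples"]:
        if len(example) != 2:
            raise ValueError("each example must be an [input, output] pair")
        input_grid, output_grid = example
        pairs.append((input_grid, output_grid))

    # Assumption: the last seed entry is the concept/description string,
    # and everything before it is a seed file name.
    *seed_files, concept = task["seeds"]

    return {
        "source": task["source"],
        "pairs": pairs,
        "seed_files": seed_files,
        "concept": concept,
        "url": task["url"],
    }


# Toy record for illustration only; none of these values come from the dataset.
toy_task = {
    "source": "# placeholder for the python code string",
    "examples": [
        [[[0, 1], [1, 0]], [[1, 0], [0, 1]]],
        [[[2, 2], [0, 0]], [[0, 0], [2, 2]]],
    ],
    "seeds": ["seed_file_1", "seed_file_2", "toy concept and description"],
    "url": "https://example.com/toy",
}

unpacked = unpack_task(toy_task)
print(len(unpacked["pairs"]), unpacked["seed_files"], unpacked["concept"])
```

If the real records store grids or seeds differently, only the unpacking lines inside `unpack_task` would need to change.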
See the full dataset card for licensing, intended use, and data statements.
- Modular & easy generation – After putting your GIFs in data/GIF, just click the "Run All" button in description_executor.ipynb to generate your own data!
- Stable environment – Docker and devcontainer support make setup easy.
- All intermediate artifacts are cached for reproducibility.
Detailed instructions live in GENERATION.md.
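Before running the notebook, it can help to confirm which GIFs the `data/GIF` folder actually contains. The snippet below is a small pre-check sketch, not part of the pipeline: `find_gifs` is a hypothetical helper, and it assumes the GIFs sit directly inside `data/GIF` with a `.gif` extension. How the notebook itself discovers files is not described here.

```python
from pathlib import Path


def find_gifs(gif_dir: Path) -> list[Path]:
    """Return the .gif files directly inside gif_dir, sorted by path.

    Hypothetical helper: returns an empty list if the folder does not exist.
    """
    if not gif_dir.is_dir():
        return []
    return sorted(
        p for p in gif_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".gif"
    )


# Assumption: the script is run from the repository root.
gifs = find_gifs(Path("data/GIF"))
print(f"Found {len(gifs)} GIF(s) in data/GIF")
for gif_path in gifs:
    print(" -", gif_path.name)
```

An empty result usually means the script was not run from the repository root or the GIFs were placed in a different folder.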
## Project Structure
./GIFARC
├── data
│ └── GIF
├── description_executor.ipynb # use this to execute
├── docker-compose.yml
├── docs
│ ├── EXPERIMENTS.md
│ ├── GENERATION.md
│ ├── project_directory_tree.txt
│ └── SETUP.md
├── loggings
├── README.md
├── requirements-dev.txt
├── requirements.txt
├── results # this will generate automatically
└── src
    ├── execution.py
    ├── experiments.py
    ├── generate_descriptions.py
    ├── generate_problems.py
    ├── generate_visualization_html.py
    ├── GIFARC_data_batch
    ├── GIFARC_utils
    ├── misc
    ├── parse_batch_description_samples.py
    ├── prompts
    ├── seeds
    ├── utility
    └── visualize_problems.py
@misc{gifarc2025,
title = {GIFARC: Synthetic Dataset for Leveraging Human-Intuitive Analogies to Elevate AI Reasoning},
author = { Anonymous },
year = {2025},
note = {Under review at NeurIPS Datasets \& Benchmarks 2025},
url = {}
}
- GIPHY for powering the GIF search API.
- BARC – our generation pipeline stands on the shoulders of this excellent project.
- GIFARC wouldn’t be possible without the open‑source community and our amazing reviewers.
Distributed under the MIT License.



