This repository covers the experiments for the paper Visual Sudoku Puzzle Classification: A Suite of Collective Neuro-Symbolic Tasks. These scripts were run and tested on Linux.
To run experiments, you first have to choose the data you would like to use.
- All data is available here.
- This repository contains pointers to specific data chunks.
- Sample datasets that only have one split: without overlap and with overlap.
Note that most of these scripts have configurable paths (check with --help),
but this README will use the defaults.
After you get the data, put it in the data directory.
Move your data (starting with the dimension::* dir(s)) into the data/raw directory.
For example, the sample dataset (with overlap) above will produce a data directory like:
data
└── raw
    └── dimension::4
        └── datasets::mnist
            └── strategy::simple
                └── numTrain::00100
                    └── numTest::00100
                        └── numValid::00100
                            └── corruptChance::0.50
                                └── overlap::1.00
                                    └── split::01
                                        ├── options.json
                                        ├── test_cell_labels.txt
                                        ├── test_puzzle_labels.txt
                                        ├── test_puzzle_notes.txt
                                        ├── test_puzzle_pixels.txt
                                        ├── train_cell_labels.txt
                                        ├── train_puzzle_labels.txt
                                        ├── train_puzzle_notes.txt
                                        ├── train_puzzle_pixels.txt
                                        ├── valid_cell_labels.txt
                                        ├── valid_puzzle_labels.txt
                                        ├── valid_puzzle_notes.txt
                                        └── valid_puzzle_pixels.txt
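Before converting, it can help to confirm that each split directory under data/raw is complete. The following is a minimal sketch, not part of this repository: the helper names (find_split_dirs, missing_raw_files) are hypothetical, and the expected file list is simply read off the example tree above, so other datasets may contain different files.

```python
import os

# File names taken from the example tree above (an assumption for other datasets).
SPLITS = ("train", "test", "valid")
KINDS = ("cell_labels", "puzzle_labels", "puzzle_notes", "puzzle_pixels")
EXPECTED_FILES = ["options.json"] + [
    "%s_%s.txt" % (split, kind) for split in SPLITS for kind in KINDS
]


def find_split_dirs(root):
    """Yield every directory under root whose name starts with 'split::'."""
    for dirpath, dirnames, _ in os.walk(root):
        for name in dirnames:
            if name.startswith("split::"):
                yield os.path.join(dirpath, name)


def missing_raw_files(split_dir):
    """Return the expected file names that are absent from split_dir, sorted."""
    present = set(os.listdir(split_dir))
    return sorted(name for name in EXPECTED_FILES if name not in present)


if __name__ == "__main__":
    # Hypothetical usage: report the state of every split under the default raw directory.
    for split_dir in find_split_dirs(os.path.join("data", "raw")):
        missing = missing_raw_files(split_dir)
        status = "OK" if not missing else "missing: " + ", ".join(missing)
        print(split_dir, status)
```

Run from the repository root, this prints one line per split directory. If data/raw does not exist yet, it prints nothing.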
Now, generate the PSL data from the raw data:
./scripts/convert-data.py
Your data directory should now include another directory named vspc where all the PSL data has been generated:
data
└── vspc
    └── dimension::4
        └── datasets::mnist
            └── strategy::simple
                └── numTrain::00100
                    └── numTest::00100
                        └── numValid::00100
                            └── corruptChance::0.50
                                └── overlap::1.00
                                    └── split::01
                                        ├── digit_model_untrained.h5
                                        ├── digit_model_untrained_tf
                                        │   ├── assets
                                        │   ├── fingerprint.pb
                                        │   ├── saved_model.pb
                                        │   └── variables
                                        │       ├── variables.data-00000-of-00001
                                        │       └── variables.index
                                        ├── eval
                                        │   ├── digit_features.txt
                                        │   ├── digit_labels.txt
                                        │   ├── digit_targets.txt
                                        │   ├── digit_truth.txt
                                        │   ├── label_id_map.txt
                                        │   ├── positive_digit_features.txt
                                        │   ├── positive_digit_targets.txt
                                        │   ├── positive_digit_truth.txt
                                        │   ├── positive_row_col_violation_targets.txt
                                        │   ├── positive_violation_targets.txt
                                        │   ├── positive_violation_truth.txt
                                        │   ├── row_col_violation_targets.txt
                                        │   ├── violation_targets.txt
                                        │   └── violation_truth.txt
                                        ├── learn
                                        │   ├── digit_features.txt
                                        │   ├── digit_labels.txt
                                        │   ├── digit_targets.txt
                                        │   ├── digit_truth.txt
                                        │   ├── first_positive_puzzle.txt
                                        │   ├── label_id_map.txt
                                        │   ├── positive_digit_features.txt
                                        │   ├── positive_digit_pinned_truth.txt
                                        │   ├── positive_digit_targets.txt
                                        │   ├── positive_digit_truth.txt
                                        │   ├── positive_row_col_violation_targets.txt
                                        │   ├── positive_violation_targets.txt
                                        │   ├── positive_violation_truth.txt
                                        │   ├── row_col_violation_targets.txt
                                        │   ├── violation_targets.txt
                                        │   └── violation_truth.txt
                                        └── options.json
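The directory names in both trees follow a key::value pattern (dimension::4, overlap::1.00, and so on). When working with many datasets, it may be convenient to recover those settings from a path. The sketch below is illustrative only: parse_option_path is a hypothetical helper, not something these scripts provide, and it keeps every value as a string so that zero-padded values such as 00100 and 01 are preserved.

```python
def parse_option_path(path):
    """Map each 'key::value' path component to a dict entry.

    Components without '::' (for example 'data' or 'vspc') are ignored.
    Values are kept as strings to preserve zero padding.
    """
    options = {}
    for part in path.replace("\\", "/").split("/"):
        key, sep, value = part.partition("::")
        if sep and key:
            options[key] = value
    return options


example = (
    "data/vspc/dimension::4/datasets::mnist/strategy::simple/"
    "numTrain::00100/numTest::00100/numValid::00100/"
    "corruptChance::0.50/overlap::1.00/split::01"
)
options = parse_option_path(example)
print(options["dimension"], options["datasets"], options["split"])  # prints: 4 mnist 01
```

Convert individual values with int() or float() only where you need numbers, since the padded strings are what appear in the directory names.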
Now you can run the baselines with:
./scripts/run-baselines.py
All results will be stored in the results directory.
(Before an experiment is run, the results directory is checked to see whether that experiment has already been run.)
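The general idea of checking for earlier results can be pictured with a small sketch. This is not the repository's code: the function run_if_needed and the marker file name done.txt are hypothetical, and the actual scripts may use a different test to decide whether an experiment has already been run.

```python
import os


def run_if_needed(out_dir, run_experiment, marker="done.txt"):
    """Run run_experiment(out_dir) unless out_dir already holds the marker file.

    Returns True if the experiment ran, False if it was skipped.
    The marker file name is an assumption made for this sketch.
    """
    marker_path = os.path.join(out_dir, marker)
    if os.path.isfile(marker_path):
        return False

    os.makedirs(out_dir, exist_ok=True)
    run_experiment(out_dir)

    # Write the marker only after the run finishes, so an interrupted run is retried.
    with open(marker_path, "w") as handle:
        handle.write("complete\n")
    return True
```

With a scheme like this, deleting an experiment's output directory (or its marker file) is enough to force that experiment to run again.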
To run NeuPSL, use:
./scripts/run-psl.sh
All results will be stored in the results directory.
Once results are in the results directory (the default location), you can use the parse-results script to pull out most of the values you should need:
./scripts/parse-results.py