This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
S-box Evolver - A research project implementing evolutionary algorithms for the design and optimization of cryptographic S-boxes (Substitution boxes).
This is a Master's thesis project from Brno University of Technology, Faculty of Information Technology (2009/2010), authored by Bedrich Hovorka. The project explores using genetic algorithms and other evolutionary computation methods to search for optimal S-boxes based on multiple cryptographic criteria.
Recent Updates (2025):
- January 2025: Migrated from Python 2.7 to Python 3.13
- Modernized Python syntax (print functions, range, dict methods)
- Added uv package manager for dependency management
- Updated dependencies (numpy 1.26.4, matplotlib 3.9+, pytest 8.3+)
- Environment-based library loading via SBOX_LIBRARY_PATH
- Optimized Docker multi-stage build (removed build tools from runtime)
- Debian Package Distribution: Docker build now creates
libsboxevolution_2.0.0_amd64.debpackage that installs library to/usr/lib/following FHS standards - All unit tests pass successfully with pytest 8.3+
- Known Limitation: Two Python C API functions (
simpleReportSearching()andbestPopulationStringsSearching()) are disabled due to architectural constraints with ctypes-loaded libraries. Usestatistics().bestPopulationOutputsfor S-box data instead.
- Restructured with improved organization, Docker support, unit tests, and enhanced documentation while preserving the original research implementation
sboxEvolver/
├── prog/ # Source code (Python + C++)
│ ├── Dockerfile # Multi-stage build container
│ ├── makefile # Build configuration
│ └── src/
│ ├── main/ # Source files (reorganized 2025)
│ │ ├── cpp/ # C++ implementation
│ │ │ ├── main.cpp # C++ core implementation (GAlib integration)
│ │ │ ├── main.h # Main header
│ │ │ ├── common.h # Common definitions
│ │ │ ├── cgp.cpp/h # Cartesian Genetic Programming representation
│ │ │ ├── softwareSbox.cpp/h # Software implementation of S-boxes
│ │ │ ├── eda.cpp/h # Estimation of Distribution Algorithms
│ │ │ ├── criterions.cpp/h # Cryptographic fitness functions
│ │ │ ├── multicriterial.cpp/h # VEGA and SPEA algorithms
│ │ │ ├── boxes.h # S-box lookup tables
│ │ │ └── combination.h # Combinatorial utilities
│ │ └── python/ # Python interface
│ │ ├── evolution.py # Python wrapper for evolutionary algorithms
│ │ └── experiments.py # Experimental framework with 6 experiments
│ └── test/ # Unit tests (added 2025)
│ └── python/
│ ├── test_routines.py # Core algorithm and routine tests
│ └── test_experiments.py # Experiment framework tests
├── text/ # LaTeX thesis document
│ ├── Dockerfile # LaTeX build container
│ ├── *.utf8.tex # Thesis chapters
│ ├── diplomka.bib # Bibliography
│ ├── img/ # Figures and diagrams
│ └── Makefile # LaTeX build
├── artifacts/ # Build outputs (gitignored)
│ ├── prog/
│ │ ├── libsboxevolution.so # Generated C++ shared library (for native builds)
│ │ └── libsboxevolution_2.0.0_amd64.deb # Debian package (for system installation)
│ └── text/
│ └── diplomka.pdf # Generated thesis PDF
├── outputs/ # Experiment results and logs
├── docker-compose.yml # Docker orchestration
├── galib247bh100329.tar.bz2 # GAlib library (genetic algorithm framework)
├── DIPxhovor07final.pdf # Final thesis PDF
└── README.md # Project documentation
The project uses a two-tier architecture:
- C++ Core (
prog/src/main/cpp/*.cpp/h): High-performance evolutionary algorithms using GAlib - Python Interface (
prog/src/main/python/evolution.py): Wraps C++ via ctypes for easy experimentation
- BINARY: Binary string representation (traditional GA)
- PERMUTATION: Permutation-based encoding (bijective S-boxes)
- CGP: Cartesian Genetic Programming (evolved circuits)
- SOFTWARE_IMPL: Software-optimized lookup table implementation
- Ga: Standard Genetic Algorithm
- Random: Random search baseline
- ParallelRandom: Mutate-best-parallel strategy
- Eda: Estimation of Distribution Algorithm (UMDA/BMDA variants)
- Vega: Vector Evaluated Genetic Algorithm (multi-objective)
- Spea: Strength Pareto Evolutionary Algorithm (multi-objective)
The S-boxes are evaluated against multiple cryptographic quality metrics:
- BENT_AND_MOSAC: Combined bentness and maximum-order SAC
- LP_MAX: Maximum Linear Probability (resistance to linear cryptanalysis)
- DP_MAX: Maximum Differential Probability (resistance to differential cryptanalysis)
- SAC: Strict Avalanche Criterion (output bit change probability)
- BF: Balance and Completeness Factor
- POLYNOMIAL_DEGREE: Nonlinearity measure
- BIJECTIVE_SCORE: Bijectivity constraint satisfaction
- MINIMAL_REALISATION: Circuit complexity
Uses ctypes to load the compiled shared library:
libsboxevolution = ctypes.CDLL("libsboxevolution.so")
# Create genome
genome = Genome(ChromozomeType.PERMUTATION, CriterionFunction.LP_MAX, True, 4, 4)
# Create search algorithm
searching = Searching("Ga", genome, popSize=1000, nGen=50, pMut=0.1, pCross=0.9)
# Run evolution
searching.simpleEvolve(TerminationCondition.GENERATION)The project includes Docker support for reproducible builds without manual dependency setup:
# Build both C++ library and thesis PDF
docker-compose up --build
# Build individually
docker-compose build prog # Builds libsboxevolution.so
docker-compose build text # Builds diplomka.pdf
# Artifacts appear in ./artifacts/
# - artifacts/prog/libsboxevolution.so (raw library for native builds)
# - artifacts/prog/libsboxevolution_2.0.0_amd64.deb (Debian package for system installation)
# - artifacts/text/diplomka.pdfDocker Architecture:
- prog service: Multi-stage optimized build
- Stage 1: Build GAlib 2.4.7
- Stage 2: Build libsboxevolution.so (C++ library) and create Debian package
- Stage 3: Build Python 3.13 dependencies (numpy compilation with gcc/g++)
- Stage 4: Clean Python 3.13 runtime base (installs Debian package to /usr/lib/, no build tools)
- Stage 5: Run pytest 8.3+ tests
- Stage 6: Production runtime (copies tested artifacts)
- text service: Single-stage LaTeX compilation with Czech language tools
- No inter-service dependencies: Services build in parallel
Advantages:
- No manual dependency installation (GAlib extraction, LaTeX tools, Czech typography packages)
- Reproducible builds across platforms
- Isolated build environments with minimal runtime image size
- Automatic makefile patching for modern compilers (g++-4.4 → g++, k8-sse3 → native)
- Python 3.13 support with modern pytest 8.3+
- Build tools (gcc/g++) only in build stages, not in runtime (reduces image size ~200MB)
- Resolves all platform-specific limitations mentioned in this document
Debian Package:
The Docker build creates a proper Debian package (libsboxevolution_2.0.0_amd64.deb) that can be installed on Debian-based systems:
# After Docker build, the package is available in artifacts/
ls artifacts/prog/libsboxevolution_2.0.0_amd64.deb
# Install on Debian/Ubuntu system
sudo dpkg -i artifacts/prog/libsboxevolution_2.0.0_amd64.deb
# Or install dependencies automatically
sudo apt-get install -y libgomp1
sudo apt-get install ./artifacts/prog/libsboxevolution_2.0.0_amd64.deb
# Verify installation
dpkg -l | grep libsboxevolution
ldconfig -p | grep libsboxevolution # Should show /usr/lib/libsboxevolution.so
# Uninstall
sudo dpkg -r libsboxevolutionPackage Details:
- Location: Library installs to
/usr/lib/libsboxevolution.so(standard FHS path) - Dependencies: libgomp1 (OpenMP), libc6 (>= 2.28)
- Architecture: amd64 (x86_64) only
- Debug symbols: Included (useful for debugging with gdb)
- Library loading: Python wrapper (
evolution.py) will automatically find the library via system paths
For development without Docker:
cd prog/
# Extract and compile GAlib (one-time setup)
tar -xjf ../galib247bh100329.tar.bz2 -C ..
cd ../galib247
make
cd ../prog
# Compile the S-box evolution library
make clean
make
# This produces libsboxevolution.so (build artifact, not in git)Dependencies:
g++(C++98 standard support)- OpenMP support (
-fopenmp) - Python 3.13+ development headers
- GAlib 2.4.7 (extracted from included tarball)
- Line 18:
CXX = g++-4.4→CXX = g++(if g++-4.4 unavailable) - Line 26:
-march=k8-sse3→-march=native(for modern CPUs) - Python include paths: Update to your Python 3.13 installation path
Note: The shared library libsboxevolution.so is a build artifact generated by the build process. It is not checked into version control.
The experiments.py file contains 6 numbered experiments from the thesis:
Docker (Recommended):
# Build and run experiment 1 (default)
docker-compose up --build prog
# Run different experiment (set environment variable)
EXPERIMENT_NUM=3 docker-compose up progLocal Development (Python 3.13 + uv):
cd prog/
# First, build C++ library with Docker (one-time setup)
docker-compose build prog
# Install Python dependencies with uv
uv sync
# Set library path to Docker-compiled artifact
export SBOX_LIBRARY_PATH=/home/beda/PycharmProjects/sboxEvolver/artifacts/prog
# Run specific experiment (1-6)
uv run python src/main/python/experiments.py <experiment_number>
# Examples:
uv run python src/main/python/experiments.py 1 # Mutation/crossover parameter tuning
uv run python src/main/python/experiments.py 2 # GA vs. EDA comparison
uv run python src/main/python/experiments.py 3 # ParallelRandom parameter search
uv run python src/main/python/experiments.py 4 # VEGA/SPEA parameter tuning
uv run python src/main/python/experiments.py 5 # Larger S-boxes (6x6, 8x8)
uv run python src/main/python/experiments.py 6 # Symbolic regressionOutput:
- Creates timestamped directory:
./results<YYYYMMDDHHMMSS>/(inprog/directory) - Generates EPS plots and LaTeX tables
- Saves serialized results (
.piclefiles)
Python unit tests validate core functionality (added in 2025 restructuring).
Docker (Recommended):
Tests are automatically run during the Docker build process using pytest:
# Tests run automatically as part of the build
docker-compose build prog
# The build will fail if tests don't pass
# Tests are executed in Stage 5 of the multi-stage Dockerfile using:
# pytest src/test/pythonThe Docker build uses:
- pytest 8.3+ (Python 3.13 compatible)
- assertpy 1.1+ (assertion library)
- Isolated environment with all dependencies
- Non-interactive matplotlib backend (Agg)
Native (Manual):
If running tests without Docker (requires compiled library):
cd prog/
# Ensure library path is set
export SBOX_LIBRARY_PATH=/path/to/sboxEvolver/artifacts/prog
# Run all tests with pytest
uv run pytest src/test/python/
# Or with pytest directly (if installed)
pytest src/test/python/
# Run specific test module
uv run pytest src/test/python/test_routines.py
uv run pytest src/test/python/test_experiments.py
# Run with verbose output
uv run pytest -v src/test/python/Test Coverage:
test_routines.py- Core algorithm and routine tests (S-box operations, fitness functions, genome handling)test_experiments.py- Experiment framework tests (parameter tuning, multi-objective optimization)
Prerequisites:
- Compiled
libsboxevolution.somust be present - Must run from
prog/directory for correct library path resolution - Python 3.13 with numpy and matplotlib
- pytest 8.3+ and assertpy 1.1+ (installed via
uv sync)
Docker Build (Recommended):
# Build thesis PDF in Docker
docker-compose up --build text
# Output appears at: ./artifacts/text/diplomka.pdfNative Build:
cd text/
# Compile LaTeX thesis (full build)
make
# Or clean and rebuild
make rebuild
# Clean auxiliary files only
make clean
# Remove all generated files including PDF
make mrproperRequired Tools:
LaTeX compilation requires the following tools to be installed:
-
Core LaTeX Tools:
pdflatex- LaTeX to PDF compilerbibtex- Bibliography processor for .bib files
-
Encoding Conversion:
cstocs- Czech text encoding converter (UTF-8 → ISO-8859-2/IL2)vlna- Czech typography tool (adds non-breaking spaces per Czech rules)
-
Image Processing:
epstopdf- EPS to PDF converter for plotsinkscape- SVG to PDF/EPS converterdia- Diagram editor and export tool
-
Optional:
gnuplot- For regenerating plots from .gp files (if present)
Installation (Fedora/RHEL):
sudo dnf install texlive texlive-epstopdf inkscape dia
# Note: cstocs and vlna may need to be installed from source or Czech-specific reposInstallation (Debian/Ubuntu):
sudo apt install texlive-latex-base texlive-latex-extra texlive-bibtex-extra \
texlive-fonts-recommended texlive-lang-czechslovak \
texmaker epstopdf inkscape diaBuild Process:
The Makefile performs these steps:
- Converts UTF-8 .tex files to ISO-8859-2 encoding (Czech standard)
- Applies Czech typographic rules with
vlna - Generates abbreviations list from
zkratky.dat - Converts all EPS images to PDF
- Runs
pdflatex→bibtex→pdflatex× 2 (for cross-references) - Produces
diplomka.pdf(thesis PDF)
Notes:
- The thesis is written in Czech language
- Source files use UTF-8 encoding (
.utf8.tex) - Output encoding is ISO-8859-2 (Latin-2)
- Three pdflatex passes are needed for correct TOC and references
- Genome: Permutation (4x4 S-boxes)
- Goal: Find optimal mutation and crossover probabilities
- Output: Box plots comparing parameter combinations, comparison with random search
- Genome: Binary representation
- Algorithms: GA, UMDA, BMDA, Random
- Criteria: All six cryptographic criteria
- Output: Statistical tables and box plots
- Genomes: CGP and SOFTWARE_IMPL
- Goal: Optimize mutation probability for parallel random search
- Output: Parameter sensitivity analysis
- Algorithms: VEGA and SPEA
- Goal: Find Pareto-optimal parameter settings
- Metric: Count of above-average individuals (weak dominance)
- Dimensions: 4x4, 6x4, 6x6, 8x8
- Algorithms: VEGA, SPEA on multiple representations
- Output: Non-dominated solution sets
- Goal: Evolve S-box as algebraic expressions using CGP
- Training: Random permutation target
from evolution import *
# 1. Define genome (representation + criterion)
genome = Genome(
ChromozomeType.PERMUTATION, # Representation
CriterionFunction.DP_MAX, # Fitness function
True, # Bijective constraint
4, # Input bits
4 # Output bits
)
# 2. Create search algorithm
search = Searching(
"Ga", # Algorithm type
genome,
popSize=1000, # Population size
nGen=50, # Number of generations
pMut=0.1, # Mutation probability
pCross=0.9, # Crossover probability
False # Elitism flag (Ga only)
)
# 3. Run evolution
search.simpleEvolve(TerminationCondition.GENERATION)
# 4. Get results
stats = search.statistics()
print("Best score:", stats.maxEver)
print("Best S-boxes:", search.bestPopulationStrings())
# 5. Cleanup
search.close()# Define multiple criteria
criterions = [
CriterionFunction.LP_MAX,
CriterionFunction.DP_MAX,
CriterionFunction.SAC
]
# Genome with no single criterion (multi-objective)
genome = Genome(ChromozomeType.CGP, CriterionFunction.NONE_FITNESS, 4, 4, 4, 3, 6)
# VEGA or SPEA
search = Searching("Vega", genome, 1000, 50, 0.1, 0.9, criterions)
search.simpleEvolve()
# Extract Pareto front
stats = search.statistics()
criterionsValues = stats.bestPopulationCriterionsValues # Array of shape [popSize, 8]# Create list of search instances
searches = [
Searching("Ga", genome, 1000, 50, 0.1, 0.9, False)
for _ in range(40) # 40 independent runs
]
# Run in parallel (uses OpenMP in C++)
Searching.parallelSearching(searches, TerminationCondition.GENERATION)
# Collect results
for s in searches:
print(s.statistics().maxEver)
s.close()stats = search.statistics()
# Scalar metrics
stats.online # Average of all scores
stats.offlineMax # Average of maximum scores
stats.maxEver # Best score ever found
stats.generation # Current generation
stats.convergence # Convergence ratio
# Population arrays (length: nBestGenomes)
stats.bestPopulationScores # Fitness values [popSize]
stats.bestPopulationCriterionsValues # All criteria [popSize, 8]
stats.bestPopulationOutputs # S-box lookup tables [popSize, 2^inputs]- Define genome and algorithm in
experiments.py - Use
runs()for single-criterion ormulticriterialRuns()for multi-objective - Process results with
runsStats()ornondominance() - Generate plots with
matplotlib(pylab) - Format output as LaTeX tables
- Edit
prog/src/main/cpp/criterions.cppto add/modify criterion logic - Add enum to
CriterionFunctioninprog/src/main/python/evolution.py - Update
CRITERIONS_COUNTconstant - Rebuild C++ library:
cd prog && make clean && make
- Implement new genome class inheriting from
GAGenomein C++ - Add factory function in
prog/src/main/cpp/main.cpp - Expose via
createGenome()C interface - Add enum to
ChromozomeTypeinprog/src/main/python/evolution.py
- ✅ Successfully migrated from Python 2.7 to Python 3.13
- All syntax modernized:
print()functions,range(), dict.items() - Uses uv for package management (see
prog/pyproject.toml) - Breaking change: Python 2 pickle files are not compatible
- Set
SBOX_LIBRARY_PATHenvironment variable for local development - All unit tests pass successfully with pytest 8.3+
- Docker multi-stage build optimized (6 stages, runtime image without build tools)
Known Architectural Limitation:
- Two C++ functions permanently disabled due to ctypes/Python C API incompatibility:
simpleReportSearching()- detailed evolution report (commented out)bestPopulationStringsSearching()- string representation of S-boxes (returns empty list)
- Root Cause: Libraries loaded via
ctypes.CDLLcannot safely call Python C API functions likePyUnicode_FromStringAndSize()to create Python objects. This would require converting the entire codebase to use Python extension modules instead of ctypes. - Workaround: All core functionality works perfectly. Use
statistics().bestPopulationOutputs(integer arrays) andstatistics().bestPopulationScoresinstead of string representations.
- The project relies on GAlib 2.4.7 (included as tarball)
- GAlib must be compiled first in
../galib247/directory - Makefile expects specific directory structure
- Native makefile uses
-march=k8-sse3(AMD K8 architecture from 2010) - Adjust
-marchflag for your CPU architecture (Docker auto-patches to-march=native) - OpenMP parallel execution assumes multi-core CPU
- Results saved as pickle files (Python 3 format)
- Plots generated as EPS (Encapsulated PostScript)
- LaTeX output uses Czech language strings
- S-boxes represented as lookup tables:
int[2^inputs] - Fitness values are
float32(NumPy arrays) - Criterion values stored as 8-element arrays (even if unused)
- Missing GAlib: Extract and compile
galib247bh100329.tar.bz2first - Python headers: Install
python2-devor adjust include path in makefile - OpenMP: Ensure compiler supports
-fopenmpflag ⚠️ Docker builds avoid all these issues automatically
- Library not found: Ensure
libsboxevolution.sois compiled and run fromprog/directory (or setLD_LIBRARY_PATH) - Segmentation fault: Check genome parameter validity (e.g., input/output sizes)
- Memory errors: Large populations may exhaust RAM (reduce
popSize)
- Docker builds failing at test stage: Check test output in build logs (Stage 4: tester)
- ImportError: No module named evolution (native): Must run tests from
prog/directory - OSError: libsboxevolution.so: cannot open shared object file (native): Compile C++ library first
- Test failures after code changes:
- Docker: Rebuild with
docker-compose build --no-cache prog - Native: Rebuild C++ library:
cd prog && make clean && make
- Docker: Rebuild with
- pytest not found (native): Install with
uv syncto install all dev dependencies including pytest 8.3+
- No output directory: Script creates
./results<timestamp>/automatically - Missing plots: Requires
matplotlib(pylab) and display backend - LaTeX errors: Tables use Czech characters, ensure UTF-8 encoding
- Build failures: Ensure Docker has sufficient memory (4GB+ recommended)
- Permission errors: Check that
artifacts/directory is writable - Image size warnings: Expected - TeX Live and build tools are large
This work explores the automated design of cryptographic S-boxes using evolutionary computation. S-boxes are critical components in symmetric encryption algorithms (e.g., DES, AES) that provide confusion in the cipher.
Key Research Questions:
- Can evolutionary algorithms discover S-boxes competitive with hand-designed ones?
- Which representation (binary, permutation, CGP) is most effective?
- How do single-objective and multi-objective approaches compare?
- What are optimal parameters for different algorithm types?
Main Contributions:
- Comparative study of 6+ evolutionary algorithms on S-box design
- Multi-objective optimization with 6 cryptographic criteria
- Novel CGP application to S-box evolution
- Extensive parameter sensitivity analysis (40 runs per configuration)
- Thesis PDF:
DIPxhovor07final.pdf(Czech language) - GAlib Documentation: https://lancet.mit.edu/ga/
- S-box Cryptanalysis: See
text/crypto.utf8.texfor background
- Original Implementation: January - May 2010
- Author: Bedrich Hovorka
- Institution: Brno University of Technology, Faculty of Information Technology
- Degree: Master's Thesis (2009/2010)
- Repository Restructuring: 2025
- Source code reorganization (src/main/ and src/test/)
- Docker containerization
- Unit test implementation
- Documentation updates
For Claude Code:
- This is a legacy research codebase (2010) with modern enhancements (2025)
- Fully migrated to Python 3.13 with modern syntax
- C++98 standard with GAlib 2.4.7 integration
- Unit tests validate core functionality but coverage is not exhaustive
- Testing uses pytest 8.3+ via Docker (Stage 5 of multi-stage build)
- Prefer Docker builds for consistency and reproducibility
- When modifying code:
- Use Python 3.13 syntax and conventions
- Rebuild C++ library after any
.cppchanges - Run unit tests to verify changes:
- Docker:
docker-compose build prog(pytest runs automatically) - Native:
uv run pytest src/test/python/
- Docker:
- Update documentation if adding new features