Figure: Trade-off between prediction performance and run time, colored by encoder (left) and by learner (right).
strable/
├── configs/
│ ├── exp_configs.py
│ ├── model_parameters.py
│ └── path_configs.py
├── data/
│ ├── download_datasets.py # Download benchmark data from Hugging Face
│ └── data_processed/ # (created after download)
├── scripts/
│ ├── analysis_setup.py # Shared setup for all analysis scripts
│ ├── compile_results.py # Aggregate individual run scores into one CSV
│ ├── datasets_representation.py # Collect dataset metadata
│ ├── download_fasttext.py # Download the fastText model
│ ├── script_evaluate.py # Main benchmark evaluation entry point
│ ├── script_extract_llm_embeddings.py # LLM embedding extraction
│ └── data_preprocessing_scripts/ # Dataset-specific preprocessing
├── src/
│ ├── encoding.py # Table embedding / encoding strategies
│ ├── inference.py # Model inference
│ ├── param_search.py # Hyper-parameter search
│ ├── utils_evaluation.py # Data loading, scoring, estimator assignment
│ ├── utils_preprocess.py # Data-cleaning helpers
│ └── utils_visualization.py # Critical-difference diagrams, etc.
├── plots/
│ ├── main/ # Figures for the main paper
│ └── appendix/ # Figures for the appendix
├── tables/
│ ├── main/ # Tables for the main paper
│ └── appendix/ # Tables for the appendix
├── requirements.txt
└── pyproject.toml
Install all dependencies:
pip install -r requirements.txt

Note: To install ContextTab, follow https://github.com/SAP-samples/sap-rpt-1-oss.
All paths are configured automatically through configs/path_configs.py.
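A centralized path-configuration module of this kind usually resolves all locations relative to the project root, so scripts work regardless of the current working directory. A minimal sketch of the pattern (all variable names below are hypothetical and are not taken from configs/path_configs.py):

```python
# Hypothetical sketch of a centralized path-configuration module.
# None of these names are guaranteed to match configs/path_configs.py.
from pathlib import Path

# Resolve the project root relative to this file, not the caller's CWD.
PROJECT_ROOT = Path(__file__).resolve().parent.parent

DATA_DIR = PROJECT_ROOT / "data" / "data_processed"
RESULTS_DIR = PROJECT_ROOT / "results" / "benchmark"
FASTTEXT_PATH = PROJECT_ROOT / "models" / "cc.en.300.bin"
```

Other modules then import these constants instead of hard-coding relative paths.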
python data/download_datasets.py

This mirrors the STRABLE-benchmark Hugging Face dataset repository into data/data_processed/.
python scripts/download_fasttext.py

Downloads the English fastText model (cc.en.300.bin) to the path specified in path_configs.
python scripts/script_extract_llm_embeddings.py

Extracts language-model embeddings for a given dataset. Results are saved under data/llm_embeding/ and timing information under data/llm_embed_time/.
python scripts/script_evaluate.py

This runs the full evaluation pipeline for all dataset variant (Num+Str, Str-only, Num-only) × encoder × learner combinations.
Individual scores are stored in results/benchmark/.
python scripts/compile_results.py

Aggregates every per-run CSV under results/benchmark/ into a single results file used by all downstream analysis scripts.
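The aggregation step above amounts to concatenating many small score tables into one. A standard-library sketch of the idea, using in-memory CSVs in place of the files under results/benchmark/ (the column names here are hypothetical, not the benchmark's actual schema):

```python
# Sketch of aggregating per-run score CSVs into one combined table.
# In the real pipeline these would be files under results/benchmark/;
# here two runs are inlined as strings. Column names are hypothetical.
import csv
import io

runs = [
    "dataset,encoder,learner,score\nadult,fasttext,xgboost,0.87\n",
    "dataset,encoder,learner,score\nadult,skrub,xgboost,0.85\n",
]

# Parse each run's CSV and collect all rows.
rows = []
for run_csv in runs:
    rows.extend(csv.DictReader(io.StringIO(run_csv)))

# Write the combined table to a single CSV (here: an in-memory buffer).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["dataset", "encoder", "learner", "score"])
writer.writeheader()
writer.writerows(rows)
combined = out.getvalue()
```

With files on disk, the `runs` list would instead come from something like `Path("results/benchmark").glob("*.csv")`.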
python scripts/datasets_representation.py

Produces the dataset summary table consumed by the figures and tables.
Each figure is a self-contained script.
Running it generates a PDF in results_pics/<today>/.
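The `results_pics/<today>/` convention can be implemented with a date-stamped output directory; a minimal sketch (the exact directory-name format used by the figure scripts is an assumption):

```python
# Sketch of the date-stamped output convention (results_pics/<today>/).
# The ISO date format is an assumption about the scripts' naming scheme.
from datetime import date
from pathlib import Path
import tempfile

base = Path(tempfile.mkdtemp())  # stand-in for the repository root
out_dir = base / "results_pics" / date.today().isoformat()
out_dir.mkdir(parents=True, exist_ok=True)

# A figure script would then save its PDF into this directory.
pdf_path = out_dir / "figure_1.pdf"
```

`exist_ok=True` lets several figure scripts share the same day's directory without racing on creation.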
Main paper:
python plots/main/figure_1.py
python plots/main/figure_2.py
# … through figure_11
python plots/main/figure_11.py

Appendix:
python plots/appendix/figure_C1.py
python plots/appendix/figure_E1.py
# … and so on for all appendix figures

Each table is likewise a self-contained script.
Running it generates a LaTeX file in results_tables/<today>/.
Main paper:
python tables/main/table_1.py

Appendix:
python tables/appendix/table_B1.py
python tables/appendix/table_C1.py
# … through table_E2

If you use STRABLE in your work, please cite:
@unpublished{strable2026,
  title={STRABLE: Benchmarking Tabular Machine Learning with Strings},
  author={Anonymous Authors},
  year={2026}
}

This project is released under the BSD 3-Clause License.