An Interpretable Deep Learning Approach for Morphological Script Type Analysis (IWCP 2024)
https://learnable-handwriter.github.io/
Malamatenia Vlachou Efstathiou, Ioannis Siglidis, Dominique Stutzmann and Mathieu Aubry
- For training without a local installation, we provide a standalone Colab notebook.
- For minimal inference on pre-trained and fine-tuned models without a local installation, we provide a standalone Colab notebook.
A figures.ipynb notebook is provided to reproduce the paper's results and graphs. To run it locally, first download and extract datasets.zip and runs.zip into the base folder; alternatively, run it directly in Colab.
Note
macOS is not supported due to compatibility issues with the available PyTorch version (affine transforms are not fully implemented or optimized in the macOS build). We recommend running the code on a Linux system (locally or on a server) with CUDA support. For training, the use of a GPU is strongly advised.
After cloning the repository and entering the base folder:
- Create a conda environment:
conda create --name lhr python=3.10
conda activate lhr
- Install PyTorch.
- If you're using pip:
python -m pip install -r requirements.txt
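After installation, a quick check along the following lines can confirm that the environment matches the recommendations in the note above (Linux, CUDA available). This is only a minimal sketch: the helper name `preflight` and its messages are hypothetical and not part of the repository.

```python
import platform


def preflight(system=None):
    """Return a list of human-readable warnings about the current environment."""
    # `system` can be overridden for testing; by default, ask the OS.
    system = system or platform.system()
    problems = []
    if system == "Darwin":
        problems.append("macOS is not supported; use a Linux system instead.")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed in this environment.")
    else:
        if not torch.cuda.is_available():
            problems.append("CUDA is not available; training without a GPU will be slow.")
    return problems


for warning in preflight():
    print("WARNING:", warning)
```

An empty list means none of the checked conditions were triggered; it does not guarantee that training will run.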
In this case you'll need to download & extract only the datasets.zip.
python scripts/train.py iwcp_south_north.yaml
python scripts/finetune_scripts.py -i runs/iwcp_south_north/train/ -o runs/iwcp_south_north/finetune/ --mode g_theta --max_steps 2500 --invert_sprites --script Northern_Textualis Southern_Textualis -a datasets/iwcp_south_north/annotation.json -d datasets/iwcp_south_north/ --split train
python scripts/finetune_docs.py -i runs/iwcp_south_north/train/ -o runs/iwcp_south_north/finetune/ --mode g_theta --max_steps 2500 --invert_sprites -a datasets/iwcp_south_north/annotation.json -d datasets/iwcp_south_north/ --split all
configs/dataset/<DATASET_ID>.yaml
...
DATASET-TAG:
path: <DATASET-NAME>/
sep: '' # How the character separator is denoted in the annotation.
space: ' ' # How the space is denoted in the annotation.
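As a rough illustration of what the `sep` and `space` fields describe, the sketch below splits an annotated line into character tokens. The function name `split_label` and the convention that an empty `sep` means "one token per character" are assumptions made for this example; this is not the repository's parsing code.

```python
def split_label(label, sep="", space=" "):
    """Split an annotated line into character tokens (illustrative only)."""
    # Assumption: an empty separator means every character is its own token.
    # (str.split("") raises ValueError, so that case is handled explicitly.)
    tokens = list(label) if sep == "" else label.split(sep)
    # Map the annotation's space symbol to a plain space and drop empty tokens.
    return [" " if t == space else t for t in tokens if t != ""]


print(split_label("ab c"))                          # ['a', 'b', ' ', 'c']
print(split_label("a|b|_|c", sep="|", space="_"))   # ['a', 'b', ' ', 'c']
```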
configs/<DATASET_ID>.yaml
...
For its structure, see the config file provided for our experiment.
datasets/<DATASET-NAME>
├── annotation.json
└── images
    ├── <image_id>.png
    └── ...
The annotation.json file should be a dictionary with entries of the form:
"<image_id>": {
    "split": "train",                     # {"train", "val", "test"} - "val" is ignored in the unsupervised case.
    "label": "A beautiful calico cat.",   # The text that corresponds to this line.
    "script": "Times_New_Roman"           # (optional) Corresponds to the script type of the image
},
You can completely ignore the annotation.json file in the case of unsupervised training without evaluation.
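Before training, it can help to check that annotation.json and the images folder agree with the layout described above. The following is a minimal sketch under stated assumptions: the helper `check_annotation` is hypothetical (not part of the repository), and it assumes that every entry's image is stored as `images/<image_id>.png`.

```python
import json
import tempfile
from pathlib import Path

VALID_SPLITS = {"train", "val", "test"}


def check_annotation(dataset_dir):
    """Return a list of problems found in <dataset_dir>/annotation.json."""
    dataset_dir = Path(dataset_dir)
    with open(dataset_dir / "annotation.json", encoding="utf-8") as f:
        annotation = json.load(f)
    problems = []
    for image_id, entry in annotation.items():
        if entry.get("split") not in VALID_SPLITS:
            problems.append(f"{image_id}: invalid split {entry.get('split')!r}")
        if not isinstance(entry.get("label"), str):
            problems.append(f"{image_id}: missing or non-string label")
        # Assumption: images live at images/<image_id>.png.
        if not (dataset_dir / "images" / f"{image_id}.png").is_file():
            problems.append(f"{image_id}: image file not found")
    return problems


# Self-contained demo on a temporary dataset folder with one faulty entry.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "images" / "line_001.png").touch()
    annotation = {
        "line_001": {"split": "train", "label": "A beautiful calico cat.",
                     "script": "Times_New_Roman"},
        "line_002": {"split": "dev", "label": "Missing image."},
    }
    (root / "annotation.json").write_text(
        json.dumps(annotation, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    demo_problems = check_annotation(root)
    for problem in demo_problems:
        print(problem)
```

Note that the `#` comments in the entry shown above are explanatory only: JSON has no comment syntax, so they must not appear in the actual file. Writing the file with `json.dumps`, as in the demo, avoids this.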
python scripts/train.py <CONFIG_NAME>.yaml
- On a group of documents defined by their "script" type with:
python scripts/finetune_scripts.py -i runs/<MODEL_PATH> -o <OUTPUT_PATH> --mode g_theta --max_steps <int> --invert_sprites --script '<SCRIPT_NAME>' -a <DATASET_PATH>/annotation.json -d <DATASET_PATH> --split <train or all>
- On individual documents with:
python scripts/finetune_docs.py -i runs/<MODEL_PATH> -o <OUTPUT_PATH> --mode g_theta --max_steps <int> --invert_sprites -a <DATASET_PATH>/annotation.json -d <DATASET_PATH> --split <train or all>
Note
To ensure a consistent set of characters for our analysis regardless of the annotation source, we internally use choco-mufin with a disambiguation-table.csv to normalize or exclude characters from the annotations. The current configuration suppresses allographs and edition signs (e.g., modern punctuation) to obtain a graphetic result.
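When fine-tuning on several datasets or script types, the fine-tuning invocations documented above can be assembled programmatically instead of being typed by hand. The sketch below only uses the flags shown in this README; the helper name `finetune_command` is hypothetical and not part of the repository.

```python
import shlex


def finetune_command(script, model_path, output_path, max_steps,
                     annotation, dataset, split, script_types=None):
    """Build the argument list for finetune_scripts.py or finetune_docs.py."""
    cmd = [
        "python", f"scripts/{script}",
        "-i", model_path,
        "-o", output_path,
        "--mode", "g_theta",
        "--max_steps", str(max_steps),
        "--invert_sprites",
    ]
    # --script is only shown for finetune_scripts.py in this README.
    if script_types:
        cmd += ["--script", *script_types]
    cmd += ["-a", annotation, "-d", dataset, "--split", split]
    return cmd


cmd = finetune_command(
    "finetune_scripts.py",
    model_path="runs/iwcp_south_north/train/",
    output_path="runs/iwcp_south_north/finetune/",
    max_steps=2500,
    annotation="datasets/iwcp_south_north/annotation.json",
    dataset="datasets/iwcp_south_north/",
    split="train",
    script_types=["Northern_Textualis", "Southern_Textualis"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the base folder of the repository to launch the run.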
@misc{vlachou2024interpretable,
title = {An Interpretable Deep Learning Approach for Morphological Script Type Analysis},
author = {Vlachou-Efstathiou, Malamatenia and Siglidis, Ioannis and Stutzmann, Dominique and Aubry, Mathieu},
publisher = {Document Analysis and Recognition--ICDAR 2021 Workshops: Athens, Greece, August 30--September 4, 2023, Proceedings},
year = {2024},
organization={Springer},
url={https://arxiv.org/abs/2408.11150}}
Check out also: Siglidis, I., Gonthier, N., Gaubil, J., Monnier, T., & Aubry, M. (2023). The Learnable Typewriter: A Generative Approach to Text Analysis.
This study was supported by the CNRS through MITI and the 80|Prime program (CrEMe Caractérisation des écritures médiévales), and by the European Research Council (ERC project DISCOVER, number 101076028). We thank Ségolène Albouy, Raphaël Baena, Sonat Baltacı, Syrine Kalleli, and Elliot Vincent for valuable feedback on the paper.
