Standalone phonological feature systems for alteruphono and other Python
libraries.
distfeat is the canonical home for:
- phonological feature datasets
- feature system protocols and registries
- feature geometry and distance logic
- built-in systems: ipa, tresoldi, distinctive, and the P-base-derived pbase-hc, pbase-jfh, pbase-spe, and pbase-uftc
The package is developed as a standalone Python library and can be used by
alteruphono or other downstream tools.
A FeatureDataset is the source of truth for feature data. It contains:
- sounds: grapheme to descriptive name
- classes: sound class definitions and class feature strings
- features: (value, feature) pairs
The built-in package dataset is bundled as TSV files, and users can also load their own datasets.
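To make the three dataset tables concrete, here is a minimal sketch of reading TSV text into a dataset-like container. It does not use distfeat itself: `ToyDataset`, `read_tsv`, and the column names (`GRAPHEME`, `NAME`, `CLASS`, `FEATURES`, `VALUE`, `FEATURE`) are assumptions made for illustration, and the bundled files may be laid out differently.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Illustrative stand-in; not distfeat's FeatureDataset."""

    sounds: dict = field(default_factory=dict)    # grapheme -> descriptive name
    classes: dict = field(default_factory=dict)   # class symbol -> class feature string
    features: set = field(default_factory=set)    # (value, feature) pairs


def read_tsv(text):
    """Parse tab-separated text with a header row into row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


# Hypothetical file contents; the bundled files may use other column names.
SOUNDS_TSV = "GRAPHEME\tNAME\na\topen front unrounded vowel\nt\tvoiceless alveolar plosive\n"
CLASSES_TSV = "CLASS\tFEATURES\nV\tvowel\n"
FEATURES_TSV = "VALUE\tFEATURE\nvoiceless\tphonation\nalveolar\tplace\n"

toy = ToyDataset(
    sounds={row["GRAPHEME"]: row["NAME"] for row in read_tsv(SOUNDS_TSV)},
    classes={row["CLASS"]: row["FEATURES"] for row in read_tsv(CLASSES_TSV)},
    features={(row["VALUE"], row["FEATURE"]) for row in read_tsv(FEATURES_TSV)},
)

print(toy.sounds["t"])
print(("alveolar", "place") in toy.features)
```

For real data, use `load_dataset(...)` rather than a hand-rolled reader like this one.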
A feature system implements the FeatureSystem protocol. Systems convert
between graphemes and native representations, handle class matching, and
expose distance calculations.
Built-in systems:
- ipa: compact categorical feature bundles
- tresoldi: broader categorical bundles preserving more modifiers
- distinctive: categorical features plus scalar conversions
- pbase-*: native multi-state feature tables derived from the bundled P-base segment table
For new code, prefer the native representation methods
(grapheme_to_representation(...), matches(...), segment_distance(...))
over the older set-based compatibility helpers.
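The idea of a protocol-based feature system can be sketched with `typing.Protocol`. The three method names come from the text above, but their signatures, the dictionary-based representation, and the classes `ToyFeatureSystem` and `DictSystem` are assumptions for illustration only; the actual FeatureSystem protocol may differ.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyFeatureSystem(Protocol):
    """Hypothetical protocol; the signatures are assumed, not distfeat's."""

    def grapheme_to_representation(self, grapheme: str) -> dict: ...

    def matches(self, representation: dict, pattern: dict) -> bool: ...

    def segment_distance(self, a: str, b: str) -> float: ...


class DictSystem:
    """Toy system whose native representation is a feature -> value mapping."""

    def __init__(self, table):
        self._table = {g: dict(feats) for g, feats in table.items()}

    def grapheme_to_representation(self, grapheme):
        return self._table[grapheme]

    def matches(self, representation, pattern):
        # A pattern matches when every feature it names has the same value.
        return all(representation.get(f) == v for f, v in pattern.items())

    def segment_distance(self, a, b):
        # Share of features (over the union of both sounds) whose values differ.
        fa, fb = self._table[a], self._table[b]
        keys = set(fa) | set(fb)
        if not keys:
            return 0.0
        return sum(fa.get(k) != fb.get(k) for k in keys) / len(keys)


system = DictSystem({
    "t": {"phonation": "voiceless", "place": "alveolar", "manner": "plosive"},
    "d": {"phonation": "voiced", "place": "alveolar", "manner": "plosive"},
})

print(isinstance(system, ToyFeatureSystem))
print(system.matches(system.grapheme_to_representation("t"), {"manner": "plosive"}))
print(system.segment_distance("t", "d"))
```

Because the protocol is structural, `DictSystem` satisfies it without inheriting from it, which is what lets several unrelated systems sit behind one interface.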
A Registry binds a dataset to one or more named systems. distfeat also
provides a lazily initialized default global registry so common use stays
simple.
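The registry pattern described above, including lazy initialization of a global default, can be sketched as follows. `ToyRegistry`, `register`, and `get_default_registry` are hypothetical names for this illustration, and only `get_system` mirrors a call shown elsewhere in this document; distfeat's own Registry may be organized differently.

```python
from typing import Optional


class ToyRegistry:
    """Binds one dataset to named system factories (illustrative only)."""

    def __init__(self, dataset):
        self.dataset = dataset
        self._factories = {}   # system name -> callable(dataset) -> system
        self._systems = {}     # cache of systems that have been built

    def register(self, name, factory):
        self._factories[name] = factory

    def get_system(self, name):
        # Build each system on first request, then reuse the cached instance.
        if name not in self._systems:
            self._systems[name] = self._factories[name](self.dataset)
        return self._systems[name]


_default_registry: Optional[ToyRegistry] = None


def get_default_registry():
    """Create the shared registry on first use instead of at import time."""
    global _default_registry
    if _default_registry is None:
        registry = ToyRegistry(dataset={"a": "open front unrounded vowel"})
        # Hypothetical factory: the "system" here is just a sorted grapheme list.
        registry.register("toy", lambda dataset: sorted(dataset))
        _default_registry = registry
    return _default_registry


first = get_default_registry()
second = get_default_registry()
print(first is second)
print(first.get_system("toy"))
```

Deferring construction keeps the import cheap, while explicitly created registries remain isolated from the global one.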
distfeat.geometry provides a feature hierarchy based on the Clements & Hume
tradition. It is used for:
- feature-value distance
- sound distance
- category-aware grouping across systems
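As a rough illustration of how a feature hierarchy yields distances, the sketch below measures the path length between two nodes in a small tree. The tree is a heavily simplified fragment loosely modeled on the Clements & Hume style of geometry, and `PARENT`, `path_to_root`, and `tree_distance` are hypothetical names; distfeat.geometry may use a different hierarchy and a different metric.

```python
# Child -> parent links for a toy feature tree (simplified, for illustration).
PARENT = {
    "laryngeal": "root",
    "oral cavity": "root",
    "voice": "laryngeal",
    "spread glottis": "laryngeal",
    "continuant": "oral cavity",
    "c-place": "oral cavity",
    "labial": "c-place",
    "coronal": "c-place",
    "dorsal": "c-place",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to and including the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges between two nodes via their lowest common ancestor."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    for steps_a, ancestor in enumerate(path_a):
        if ancestor in path_b:
            return steps_a + path_b.index(ancestor)
    raise ValueError("nodes do not share a root")


print(tree_distance("labial", "coronal"))  # siblings under c-place
print(tree_distance("labial", "voice"))    # related only through the root
```

Under this metric, features grouped under the same node count as closer than features that meet only at the root, which is the intuition behind geometry-based distances.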
distfeat.analysis provides higher-level helpers that operate across systems:
- features_to_graphemes(...)
- derive_class_features(...)
- minimal_matrix(...)
- tabulate_matrix(...)
- distance(...)
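One plausible reading of a "minimal matrix" is the subset of features that actually distinguishes the given sounds. The sketch below implements that reading on hand-written data. It is an assumption about the concept, not a description of how `minimal_matrix(...)` is implemented, and `TOY_FEATURES` and `toy_minimal_matrix` are hypothetical.

```python
# Hand-written feature bundles for illustration; not taken from distfeat's data.
TOY_FEATURES = {
    "t": {"phonation": "voiceless", "place": "alveolar", "manner": "plosive"},
    "d": {"phonation": "voiced", "place": "alveolar", "manner": "plosive"},
    "s": {"phonation": "voiceless", "place": "alveolar", "manner": "fricative"},
}


def toy_minimal_matrix(sounds, table):
    """Keep only the features whose values differ across the given sounds."""
    all_features = sorted({f for s in sounds for f in table[s]})
    contrastive = [
        f for f in all_features
        if len({table[s].get(f) for s in sounds}) > 1
    ]
    return {s: {f: table[s].get(f) for f in contrastive} for s in sounds}


matrix = toy_minimal_matrix(["t", "d", "s"], TOY_FEATURES)
for sound, row in matrix.items():
    print(sound, row)
```

Here `place` drops out because all three sounds are alveolar, leaving `phonation` and `manner` as the contrastive columns.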
For most users:
import distfeat
features = distfeat.get_features("a")
vowel_class = distfeat.get_class_features("V")
valued = distfeat.get_representation("a", system="pbase-hc")

For isolated experiments or custom datasets:
from distfeat import create_registry, load_dataset
dataset = load_dataset(directory="my_data")
registry = create_registry(dataset=dataset)
system = registry.get_system("ipa")

For analysis tasks:
import distfeat
matrix = distfeat.minimal_matrix(["t", "d", "s"])
print(distfeat.tabulate_matrix(matrix))