M. Sc. Project - Artificial Categorical Datasets
This project is also hosted on Outrank.
- Source code in src/generator.py
- Demo in src/CC Demo.ipynb
pip install catclass
from catclass import CategoricalClassification
cc = CategoricalClassification()
# Create a simple dataset: 10 features, 10,000 samples, every feature with cardinality 35
X = cc.generate_data(10,
                     10000,
                     cardinality=35,
                     ensure_rep=True,
                     random_values=True,
                     low=0,
                     high=40)
# Creates target labels via clustering
y = cc.generate_labels(X, n=2, class_relation='cluster')
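
After generating data, it can be useful to check that the result matches what was requested. The sketch below is not part of catclass: `column_cardinalities` and `label_counts` are hypothetical helper names, and the code assumes the dataset is a row-major 2D array-like and the labels are a flat sequence. It uses tiny stand-in data instead of the real `X` and `y`.

```python
from collections import Counter


def column_cardinalities(rows):
    """Count the distinct values in each column of a row-major 2D dataset."""
    return [len(set(column)) for column in zip(*rows)]


def label_counts(labels):
    """Count how many samples fall into each class label."""
    return dict(Counter(labels))


# Tiny stand-in data (hypothetical), used here instead of the real X and y.
X_demo = [[0, 5, 2],
          [1, 5, 3],
          [0, 6, 2],
          [1, 5, 4]]
y_demo = [0, 1, 0, 1]

print(column_cardinalities(X_demo))  # distinct values per feature
print(label_counts(y_demo))          # samples per class
```

Applied to the generated data, `column_cardinalities(X)` would be expected to report 35 for each of the 10 features if the generator behaves as the comment above describes, and `label_counts(y)` would show how the samples are split across the classes.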