Categorical Classification

M.Sc. Project - Artificial Categorical Datasets

This project is also hosted on Outrank.

  • Source code in src/generator.py
  • Demo in src/CC Demo.ipynb

Usage

pip install catclass

Creating a simple dataset

from catclass import CategoricalClassification

cc = CategoricalClassification()

# Creates a simple dataset with 10 features and 10,000 samples; every feature has a cardinality of 35
X = cc.generate_data(10, 
                     10000, 
                     cardinality=35, 
                     ensure_rep=True, 
                     random_values=True, 
                     low=0, 
                     high=40)

# Creates target labels via clustering
y = cc.generate_labels(X, n=2, class_relation='cluster')
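
To make the cardinality, ensure_rep, low and high arguments above more concrete, here is a minimal standard-library sketch of how one such feature column could be built. It is not the catclass implementation. The helper name sketch_categorical_column is hypothetical, and the sketch assumes that ensure_rep means every category value appears at least once and that values are integers drawn from the half-open range [low, high).

```python
import random


def sketch_categorical_column(n_samples, cardinality, low=0, high=40, seed=None):
    """Hypothetical helper (not part of catclass): build one categorical column.

    Assumptions: values are integers in [low, high), and every one of the
    `cardinality` distinct values must appear at least once.
    """
    if cardinality > high - low:
        raise ValueError("cardinality cannot exceed the number of available values")
    if n_samples < cardinality:
        raise ValueError("need at least one sample per category")

    rng = random.Random(seed)
    # Pick `cardinality` distinct category values from the allowed range.
    values = rng.sample(range(low, high), cardinality)
    # Start with one copy of every value so each category is represented,
    # then fill the rest by sampling uniformly with replacement.
    column = list(values) + rng.choices(values, k=n_samples - cardinality)
    rng.shuffle(column)
    return column


column = sketch_categorical_column(10_000, 35, low=0, high=40, seed=0)
print(len(column), len(set(column)))  # 10000 35
```

Seeding one copy of each value before filling the remainder guarantees the requested cardinality regardless of the random draw, which plain sampling with replacement does not.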