# CategoricalDistributions.jl

Probability distributions and measures for finite sample spaces whose
- elements are *labeled*.
+ elements are *labeled* (consist of the class pool of a
+ `CategoricalArray`).

Designed for performance in machine learning applications. For
example, probabilistic classifiers in
[MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) typically
predict the `UnivariateFiniteVector` objects defined in this package.

- For probability distributions over integers (unlabeled data) see the
+ For probability distributions over integers see the
[Distributions.jl](https://juliastats.org/Distributions.jl/stable/univariate/#Discrete-Distributions)
package, whose methods the current package extends.

@@ -31,22 +32,16 @@ this package is the class pool of a `CategoricalArray`:
```julia
using CategoricalDistributions
using CategoricalArrays
- julia> data = rand(["yes", "no", "maybe"], 10) |> categorical
- 10-element CategoricalArray{String,1,UInt32}:
-  "maybe"
-  "maybe"
-  "no"
-  "yes"
-  "maybe"
-  "no"
-  "no"
-  "no"
-  "no"
-  "yes"
-
- julia> d = fit(UnivariateFinite, data)
- UnivariateFinite{Multiclass{3}}(maybe=>0.3, no=>0.5, yes=>0.2)
-
+ import Distributions
+ data = ["no", "yes", "no", "maybe", "maybe", "no",
+         "maybe", "no", "maybe"] |> categorical
+ julia> d = Distributions.fit(UnivariateFinite, data)
+ UnivariateFinite{Multiclass{3}}
+       ┌                                        ┐
+ maybe ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.4
+    no ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5
+   yes ┤■■■■■■■ 0.1
+       └                                        ┘
julia> pdf(d, "no")
0.5

@@ -59,7 +54,11 @@ from a probability vector:

```julia
julia> d2 = UnivariateFinite(["no", "yes"], [0.15, 0.85], pool=data)
- UnivariateFinite{Multiclass{3}}(no=>0.15, yes=>0.85)
+ UnivariateFinite{Multiclass{3}}
+     ┌                                        ┐
+  no ┤■■■■■■ 0.15
+ yes ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.85
+     └                                        ┘
```

A `UnivariateFinite` distribution tracks all classes in the pool:
@@ -74,12 +73,12 @@ levels(d2)
julia> pdf(d2, "maybe")
0.0

- julia> pdf(d2, "okay")
+ julia> pdf(d2, "okay")https://github.com/JuliaAI/CategoricalDistributions.jl#measures-over-finite-labeled-sets
ERROR: DomainError with Value okay not in pool. :
```

Arrays of `UnivariateFinite` distributions are defined using the same
- constructor. Broadcasting methods, such as `pdf`, is optimized for
+ constructor. Broadcasting methods, such as `pdf`, are optimized for
such arrays:

```
@@ -148,8 +147,8 @@ over finite labeled sets.
Distributions.jl, with efficient broadcasting over the new array
type.

- - Implementation of `fit` from Distributions.jl for `UnivariateFinite`
- distributions.
+ - Implementation of `Distributions.fit` from Distributions.jl for
+ `UnivariateFinite` distributions.

- A single constructor for constructing `UnivariateFinite`
distributions and arrays thereof, from arrays of probabilities.
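
As a rough illustration of the two features listed in this hunk (the `Distributions.fit` method and the single constructor for distributions and arrays of them), here is a minimal sketch. It assumes CategoricalDistributions.jl, CategoricalArrays.jl, and Distributions.jl are installed. The sample data, the probability matrix, and the variable names `probs`, `v`, and `p_no` are made up for illustration and do not come from the README.

```julia
using CategoricalDistributions
using CategoricalArrays
import Distributions

# Hypothetical sample data; its pool contains the classes "maybe", "no", "yes".
data = categorical(["no", "yes", "no", "maybe"])

# Fit a single distribution to the observed labels.
d = Distributions.fit(UnivariateFinite, data)

# The same constructor builds an array of distributions when it is given
# a matrix of probabilities: one row per distribution, one column per
# class in the stated support (here "no", then "yes"). Rows sum to one.
probs = [0.9 0.1;
         0.3 0.7]
v = UnivariateFinite(["no", "yes"], probs, pool=data)

# Broadcast `pdf` over the array to get the probability of "no"
# under each distribution.
p_no = pdf.(v, "no")
```

Passing `pool=data` ties both `d` and the elements of `v` to the same class pool, so `"maybe"` remains a known class for `v` even though it is absent from the stated support.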