@@ -14,12 +14,15 @@ choosing `probs` to be an array of one higher dimension than the array
generated.

Here the word "probabilities" is an abuse of terminology as there is
- no requirement that probabilities actually sum to one, only that they
- be non-negative. So `UnivariateFinite` objects actually implement
- arbitrary non-negative measures over finite sets of labelled points. A
- `UnivariateDistribution` will be a bona fide probability measure when
- constructed using the `augment=true` option (see below) or when
- `fit` to data.
+ no requirement that the probabilities actually sum to one. The
+ only requirement is that the probabilities have a common type `T` for
+ which `zero(T)` is defined. In particular, `UnivariateFinite` objects
+ implement arbitrary non-negative, signed, or complex measures over
+ finite sets of labelled points. A `UnivariateDistribution` will be a
+ bona fide probability measure when constructed using the
+ `augment=true` option (see below) or when `fit` to data. And the
+ probabilities of a `UnivariateFinite` object `d` must be non-negative,
+ with a non-zero sum, for `rand(d)` to be defined and interpretable.

Unless `pool` is specified, `support` should have type
`AbstractVector{<:CategoricalValue}` and all elements are assumed to
@@ -37,28 +40,37 @@ constructor then returns an array of `UnivariateFinite` distributions
of size `(n1, n2, ..., nk)`.

```
- using CategoricalArrays
- v = categorical([:x, :x, :y, :x, :z])
-
- julia> UnivariateFinite(classes(v), [0.2, 0.3, 0.5])
- UnivariateFinite{Multiclass{3}}(x=>0.2, y=>0.3, z=>0.5)
-
- julia> d = UnivariateFinite([v[1], v[end]], [0.1, 0.9])
+ using CategoricalDistributions, CategoricalArrays, Distributions
+ samples = categorical(['x', 'x', 'y', 'x', 'z'])
+ julia> Distributions.fit(UnivariateFinite, samples)
+ UnivariateFinite{Multiclass{3}}
+ ┌ ┐
+ x ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.6
+ y ┤■■■■■■■■■■■■ 0.2
+ z ┤■■■■■■■■■■■■ 0.2
+ └ ┘
+
+ julia> d = UnivariateFinite([samples[1], samples[end]], [0.1, 0.9])
UnivariateFinite{Multiclass{3}}(x=>0.1, z=>0.9)
+ UnivariateFinite{Multiclass{3}}
+ ┌ ┐
+ x ┤■■■■ 0.1
+ z ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.9
+ └ ┘

julia> rand(d, 3)
3-element Array{Any,1}:
- CategoricalArrays.CategoricalValue{Symbol,UInt32} :z
- CategoricalArrays.CategoricalValue{Symbol,UInt32} :z
- CategoricalArrays.CategoricalValue{Symbol,UInt32} :z
+ CategoricalValue{Symbol,UInt32} 'z'
+ CategoricalValue{Symbol,UInt32} 'z'
+ CategoricalValue{Symbol,UInt32} 'z'

- julia> levels(d)
+ julia> levels(samples)
3-element Array{Symbol,1}:
- :x
- :y
- :z
+ 'x'
+ 'y'
+ 'z'

- julia> pdf(d, :y)
+ julia> pdf(d, 'y')
0.0
```

@@ -77,19 +89,27 @@ In the last case, specify `ordered=true` if the pool is to be
considered ordered.

```
- julia> UnivariateFinite([:x, :z], [0.1, 0.9], pool=missing, ordered=true)
- UnivariateFinite{OrderedFactor{2}}(x=>0.1, z=>0.9)
-
- julia> d = UnivariateFinite([:x, :z], [0.1, 0.9], pool=v) # v defined above
- UnivariateFinite(x=>0.1, z=>0.9) (Multiclass{3} samples)
-
- julia> pdf(d, :y) # allowed as `:y in levels(v)`
+ julia> UnivariateFinite(['x', 'z'], [0.1, 0.9], pool=missing, ordered=true)
+ UnivariateFinite{OrderedFactor{2}}
+ ┌ ┐
+ x ┤■■■■ 0.1
+ z ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.9
+ └ ┘
+
+ samples = categorical(['x', 'x', 'y', 'x', 'z'])
+ julia> d = UnivariateFinite(['x', 'z'], [0.1, 0.9], pool=samples)
+ ┌ ┐
+ x ┤■■■■ 0.1
+ z ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.9
+ └ ┘
+
+ julia> pdf(d, 'y') # allowed as `'y' in levels(samples)`
0.0

- v = categorical([:x, :x, :y, :x, :z, :w])
+ v = categorical(['x', 'x', 'y', 'x', 'z', 'w'])
probs = rand(100, 3)
probs = probs ./ sum(probs, dims=2)
- julia> UnivariateFinite([:x, :y, :z], probs, pool=v)
+ julia> d1 = UnivariateFinite(['x', 'y', 'z'], probs, pool=v)
100-element UnivariateFiniteVector{Multiclass{4},Symbol,UInt32,Float64}:
UnivariateFinite{Multiclass{4}}(x=>0.194, y=>0.3, z=>0.505)
UnivariateFinite{Multiclass{4}}(x=>0.727, y=>0.234, z=>0.0391)
@@ -107,6 +127,18 @@ for the classes `c2, c3, ..., cn`. The class `c1` probabilities are
chosen so that each `UnivariateFinite` distribution in the returned
array is a bona fide probability distribution.

+ ```julia
+ julia> UnivariateFinite([0.1, 0.2, 0.3], augment=true, pool=missing)
+ 3-element UnivariateFiniteArray{Multiclass{2}, String, UInt8, Float64, 1}:
+ UnivariateFinite{Multiclass{2}}(class_1=>0.9, class_2=>0.1)
+ UnivariateFinite{Multiclass{2}}(class_1=>0.8, class_2=>0.2)
+ UnivariateFinite{Multiclass{2}}(class_1=>0.7, class_2=>0.3)
+
+ d2 = UnivariateFinite(['x', 'y', 'z'], probs[:, 2:end], augment=true, pool=v)
+ julia> pdf(d1, levels(v)) ≈ pdf(d2, levels(v))
+ true
+ ```
+
---

UnivariateFinite(prob_given_class; pool=nothing, ordered=false)
@@ -142,6 +174,8 @@ struct UnivariateFinite{S,V,R,P}
prob_given_ref::LittleDict{R,P,Vector{R}, Vector{P}}
end

+ @doc DOC_CONSTRUCTOR UnivariateFinite
+
"""
UnivariateFiniteArray

@@ -160,6 +194,10 @@

const UnivariateFiniteVector{S,V,R,P} = UnivariateFiniteArray{S,V,R,P,1}

+ # private:
+ const SingletonOrArray{S,V,R,P} = Union{UnivariateFinite{S,V,R,P},
+ UnivariateFiniteArray{S,V,R,P}}
+

# # CHECKS AND ERROR MESSAGES
