@@ -4,10 +4,10 @@ Probability distributions and measures for finite sample spaces whose
 elements are *labeled* (consist of the class pool of a
 `CategoricalArray`).
 
-Designed for performance in machine learning applications. For
+Designed for performance in machine learning applications, where one is constructing large arrays of such distributions. For
 example, probabilistic classifiers in
 [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) typically
-predict the `UnivariateFiniteVector` objects defined in this package.
+predict the `UnivariateFiniteArray` objects defined in this package.
 
 For probability distributions over integers see the
 [Distributions.jl](https://juliastats.org/Distributions.jl/stable/univariate/#Discrete-Distributions)
@@ -50,6 +50,16 @@ julia> mode(d)
 CategoricalValue{String, UInt32} "no"
 ```
 
+Of course, a `UnivariateFinite` object can be sampled:
+
+```julia
+julia> rand(d, 3)
+3-element Vector{CategoricalValue{String, UInt32}}:
+ "no"
+ "no"
+ "maybe"
+```
+
 A `UnivariateFinite` distribution can also be constructed directly
 from a probability vector:
 
@@ -62,11 +72,11 @@ julia> d2 = UnivariateFinite(["no", "yes"], [0.15, 0.85], pool=data)
 └ ┘
 ```
 
-A `UnivariateFinite` distribution tracks all classes in the pool:
+A `UnivariateFinite` distribution tracks all classes (levels) in the pool:
 
 ```julia
 levels(d2)
-3-element Vector{String}:
+3-element CategoricalArray{String,1,UInt32}:
  "maybe"
  "no"
  "yes"
@@ -106,7 +116,7 @@ probability array:
 
 ```julia
 julia> L = levels(data)
-3-element Vector{String}:
+3-element CategoricalArray{String,1,UInt32}:
  "maybe"
  "no"
  "yes"
@@ -121,26 +131,28 @@ julia> pdf(v, L)
 
 ## Measures over finite labeled sets
 
-There is, in fact, no enforcement that probabilities in a
-`UnivariateFinite` distribution sum to one, only that they belong
-to a type `T` for which `zero(T)` is defined. In particular,
-`UnivariateFinite` objects implement arbitrary non-negative, signed,
-or complex measures over a finite labeled set.
+There is, in fact, no enforcement that probabilities in a `UnivariateFinite` distribution
+sum to one, only that they belong to a type `T` for which `zero(T)` is defined. In
+particular, `UnivariateFinite` objects implement arbitrary non-negative, signed, or complex
+measures over a finite labeled set.
+
+However, you cannot sample using `rand` unless the "probabilities" are non-negative (their type
+`T` must support `>` and addition).
 
 ## What does this package provide?
 
-- A new type `UnivariateFinite{S}` for representing probability
-  distributions over the pool of a `CategoricalArray`, that is, over
-  finite *labeled* sets. Here `S` is a subtype of `OrderedFactor`
-  from ScientificTypesBase.jl, if the pool is ordered, or of
-  `Multiclass` if the pool is unordered.
+- A new type `UnivariateFinite{S}` for representing probability distributions over the
+  pool of a `CategoricalArray`, that is, over finite *labeled* sets. Here `S` is a subtype
+  of `OrderedFactor` from
+  [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl), if the pool
+  is ordered, or of `Multiclass` if the pool is unordered.
 
 - A new array type `UnivariateFiniteArray{S} <:
   AbstractArray{<:UnivariateFinite{S}}` for efficiently manipulating
   arrays of `UnivariateFinite` distributions.
 
-- Implementations of `rand` for generating random samples of a
-  `UnivariateFinite` distribution.
+- Implementations of `rand` for generating random samples of a `UnivariateFinite`
+  distribution, in the case that "probabilities" come from an ordered field.
 
 - Implementations of the `pdf`, `logpdf`, `mode` and `modes` methods of
   Distributions.jl, with efficient broadcasting over the new array