Commit 74f7466

Merge pull request #9 from JuliaAI/dev
For a 0.1.1 release
2 parents 5cb55e8 + 1c3f73d commit 74f7466

6 files changed: +46 -33 lines changed

Project.toml

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,7 @@
 name = "CategoricalDistributions"
 uuid = "af321ab8-2d2e-40a6-b165-3d674595d28e"
 authors = ["Anthony D. Blaom <[email protected]>"]
-version = "0.1.0"
+version = "0.1.1"

 [deps]
 CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
@@ -10,13 +10,15 @@ Missings = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
 OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161"
+UnicodePlots = "b8865327-cd53-5732-bb35-84acbb429228"

 [compat]
 CategoricalArrays = "0.9, 0.10"
 Distributions = "0.25"
 Missings = "0.4, 1"
 OrderedCollections = "1.1"
 ScientificTypesBase = "2"
+UnicodePlots = "2"
 julia = "1.0"

 [extras]

README.md

Lines changed: 22 additions & 23 deletions
@@ -1,14 +1,15 @@
 # CategoricalDistributions.jl

 Probability distributions and measures for finite sample spaces whose
-elements are *labeled*.
+elements are *labeled* (consist of the class pool of a
+`CategoricalArray`).

 Designed for performance in machine learning applications. For
 example, probabilistic classifiers in
 [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) typically
 predict the `UnivariateFiniteVector` objects defined in this package.

-For probability distributions over integers (unlabeled data) see the
+For probability distributions over integers see the
 [Distributions.jl](https://juliastats.org/Distributions.jl/stable/univariate/#Discrete-Distributions)
 package, whose methods the current package extends.

@@ -31,22 +32,16 @@ this package is the class pool of a `CategoricalArray`:
 ```julia
 using CategoricalDistributions
 using CategoricalArrays
-julia> data = rand(["yes", "no", "maybe"], 10) |> categorical
-10-element CategoricalArray{String,1,UInt32}:
- "maybe"
- "maybe"
- "no"
- "yes"
- "maybe"
- "no"
- "no"
- "no"
- "no"
- "yes"
-
-julia> d = fit(UnivariateFinite, data)
-UnivariateFinite{Multiclass{3}}(maybe=>0.3, no=>0.5, yes=>0.2)
-
+import Distributions
+data = ["no", "yes", "no", "maybe", "maybe", "no",
+       "maybe", "no", "maybe"] |> categorical
+julia> d = Distributions.fit(UnivariateFinite, data)
+UnivariateFinite{Multiclass{3}}
+┌ ┐
+maybe ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.4
+no ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5
+yes ┤■■■■■■■ 0.1
+└ ┘
 julia> pdf(d, "no")
 0.5

@@ -59,7 +54,11 @@ from a probability vector:

 ```julia
 julia> d2 = UnivariateFinite(["no", "yes"], [0.15, 0.85], pool=data)
-UnivariateFinite{Multiclass{3}}(no=>0.15, yes=>0.85)
+UnivariateFinite{Multiclass{3}}
+┌ ┐
+no ┤■■■■■■ 0.15
+yes ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.85
+└ ┘
 ```

 A `UnivariateFinite` distribution tracks all classes in the pool:
@@ -74,12 +73,12 @@ levels(d2)
 julia> pdf(d2, "maybe")
 0.0

-julia> pdf(d2, "okay")
+julia> pdf(d2, "okay")https://github.com/JuliaAI/CategoricalDistributions.jl#measures-over-finite-labeled-sets
 ERROR: DomainError with Value okay not in pool. :
 ```

 Arrays of `UnivariateFinite` distributions are defined using the same
-constructor. Broadcasting methods, such as `pdf`, is optimized for
+constructor. Broadcasting methods, such as `pdf`, are optimized for
 such arrays:

 ```
@@ -148,8 +147,8 @@ over finite labeled sets.
   Distributions.jl, with efficient broadcasting over the new array
   type.

-- Implementation of `fit` from Distributions.jl for `UnivariateFinite`
-  distributions.
+- Implementation of `Distributions.fit` from Distributions.jl for
+  `UnivariateFinite` distributions.

 - A single constructor for constructing `UnivariateFinite`
   distributions and arrays thereof, from arrays of probabilities.
src/CategoricalDistributions.jl

Lines changed: 10 additions & 7 deletions
@@ -1,19 +1,14 @@
 module CategoricalDistributions

-export UnivariateFinite, UnivariateFiniteArray
-
-# re-eported from Distributions:
-export pdf, logpdf, support, mode
-
 import Distributions
-import ScientificTypesBase
+import ScientificTypesBase: Finite, Multiclass, OrderedFactor
 using OrderedCollections
 using CategoricalArrays
 import Missings
 using Random
+using UnicodePlots

 const Dist = Distributions
-const STB = ScientificTypesBase

 import Distributions: pdf, logpdf, support, mode

@@ -22,4 +17,12 @@ include("types.jl")
 include("methods.jl")
 include("arrays.jl")

+export UnivariateFinite, UnivariateFiniteArray
+
+# re-eport from Distributions:
+export pdf, logpdf, support, mode
+
+# re-export from ScientificTypesBase:
+export Multiclass, OrderedFactor
+
 end
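
Note on the import change above: replacing the `STB.` prefix with `import ScientificTypesBase: Finite, Multiclass, OrderedFactor`, and then adding `export Multiclass, OrderedFactor`, appears to follow a common Julia pattern: import selected names, use them unqualified, and re-export a subset. The sketch below illustrates that pattern with Base only. `Upstream`, `Downstream`, and `describe` are hypothetical stand-ins and are not names from this package or from ScientificTypesBase.

```julia
# Hypothetical stand-in for an upstream package that defines the types.
module Upstream
abstract type Finite{N} end
struct Multiclass{N} <: Finite{N} end
end

# Hypothetical stand-in for the downstream package.
module Downstream
# Import selected names so they can be used without a module prefix.
import ..Upstream: Finite, Multiclass

# Re-export one of the imported names for users of Downstream.
export Multiclass

# `Finite` is used unqualified here; `describe` itself is not exported.
describe(::Type{<:Finite{N}}) where {N} = "finite with $N classes"
end

using .Downstream

# `Multiclass` is now visible without loading Upstream explicitly.
result = Downstream.describe(Multiclass{3})
```

With this arrangement a user who only loads the downstream module can still write `Multiclass{3}`, which is presumably why the README output `UnivariateFinite{Multiclass{3}}` reads naturally after this commit.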

src/methods.jl

Lines changed: 9 additions & 0 deletions
@@ -79,6 +79,15 @@ function Base.show(stream::IO, d::UnivariateFinite)
     print(stream, "UnivariateFinite{$(d.scitype)}($arg_str)")
 end

+function Base.show(io::IO, mime::MIME"text/plain",
+                   d::UnivariateFinite{S}) where S
+    s = support(d)
+    x = string.(CategoricalArrays.DataAPI.unwrap.(s))
+    y = pdf.(d, s)
+    plt = barplot(x, y, title="UnivariateFinite{$S}")
+    show(io, mime, plt)
+end
+
 show_prefix(u::UnivariateFiniteArray{S,V,R,P,1}) where {S,V,R,P} =
     "$(length(u))-element"
 show_prefix(u::UnivariateFiniteArray) = join(size(u),'x')
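
Note on the new `show` method above: in Julia, the three-argument `show(io, ::MIME"text/plain", x)` is the method used for rich display (for example at the REPL), while the two-argument `show(io, x)` continues to serve `print`, string interpolation, and display inside containers. The sketch below shows the two methods side by side using Base only. The `Coin` type and its text bar are hypothetical and merely stand in for `UnivariateFinite` and the UnicodePlots bar plot.

```julia
# Hypothetical type used only for illustration.
struct Coin
    p::Float64   # probability of heads
end

# Compact one-line form: used by `print`, `string`, and inside containers.
Base.show(io::IO, c::Coin) = print(io, "Coin(p=$(c.p))")

# Rich multi-line form: used when the value is displayed as text/plain.
function Base.show(io::IO, ::MIME"text/plain", c::Coin)
    println(io, "Coin")
    bar = repeat("■", round(Int, 20 * c.p))   # crude text bar, 20 cells wide
    print(io, "  heads ", bar, " ", c.p)
end

c = Coin(0.25)
compact = sprint(show, c)                      # calls the two-argument method
rich = sprint(show, MIME("text/plain"), c)     # calls the three-argument method
```

Keeping the two-argument method in place, as the commit does, means that arrays of distributions and log messages can still use the one-line form.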

src/types.jl

Lines changed: 1 addition & 1 deletion
@@ -131,7 +131,7 @@ same size as the array.

 # extend Ditributions type hiearchy to account for non-euclidean
 # supports:
-abstract type Categorical{S<:STB.Finite} <: Dist.ValueSupport end
+abstract type Categorical{S<:Finite} <: Dist.ValueSupport end

 # not exported:
 const _UnivariateFinite_{S} =

src/utilities.jl

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@

 function scitype(c::CategoricalValue)
     nc = length(levels(c.pool))
-    return ifelse(c.pool.ordered, STB.OrderedFactor{nc}, STB.Multiclass{nc})
+    return ifelse(c.pool.ordered, OrderedFactor{nc}, Multiclass{nc})
 end

