
Commit 6ae0c4f

Merge pull request #132 from dwijenchawra/master
fixed issue #67 and #51

2 parents c5e1681 + 6bd6e88

23 files changed: +99 −127 lines

.github/workflows/TagBot.yml (6 additions, 2 deletions)

```diff
@@ -1,11 +1,15 @@
 name: TagBot
 on:
-  schedule:
-    - cron: 0 * * * *
+  issue_comment:
+    types:
+      - created
+  workflow_dispatch:
 jobs:
   TagBot:
+    if: github.event_name == 'workflow_dispatch' || github.actor == 'JuliaTagBot'
     runs-on: ubuntu-latest
     steps:
       - uses: JuliaRegistries/TagBot@v1
         with:
           token: ${{ secrets.GITHUB_TOKEN }}
+          ssh: ${{ secrets.DOCUMENTER_KEY }}
```

.github/workflows/main.yml (1 addition, 1 deletion)

```diff
@@ -16,7 +16,7 @@ jobs:
         with:
           version: '1.6'
       - name: Install LuaLatex
-        run: sudo apt-get install texlive-full && sudo apt-get install texlive-latex-extra && sudo mktexlsr && sudo updmap-sys
+        run: sudo apt-get update && sudo apt-get install texlive-full --fix-missing && sudo apt-get install texlive-latex-extra && sudo mktexlsr && sudo updmap-sys
       - name: Install dependencies
         run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
       - name: Build and deploy
```

.travis.yml (0 additions, 34 deletions)

This file was deleted.

docs/Project.toml (0 additions, 8 deletions)

This file was deleted.

docs/make.jl (3 additions, 2 deletions)

```diff
@@ -25,5 +25,6 @@ makedocs(
 )

 deploydocs(
-    repo = "github.com/sisl/BayesNets.jl.git",
-)
+    repo = "github.com/dwijenchawra/BayesNets.jl.git",
+)
+return true
```

docs/src/usage.md (18 additions, 14 deletions; several changed lines differ only in trailing whitespace)

````diff
@@ -187,7 +187,7 @@ rand(bn_gibbs, gsampler, 5)

 BayesNets.jl supports parameter learning for an entire graph.

-```julia
+```julia
 fit(BayesNet, data, (:a=>:b), [StaticCPD{Normal}, LinearGaussianCPD])
 ```

@@ -223,7 +223,7 @@ Inference methods for discrete Bayesian networks can be used via the `infer` method:
 bn = DiscreteBayesNet()
 push!(bn, DiscreteCPD(:a, [0.3,0.7]))
 push!(bn, DiscreteCPD(:b, [0.2,0.8]))
-push!(bn, DiscreteCPD(:c, [:a, :b], [2,2],
+push!(bn, DiscreteCPD(:c, [:a, :b], [2,2],
         [Categorical([0.1,0.9]),
          Categorical([0.2,0.8]),
          Categorical([1.0,0.0]),
@@ -283,7 +283,7 @@ data[1:3,:] # only display a subset...
 Here we use the K2 structure learning algorithm which runs in polynomial time but requires that we specify a topological node ordering.

 ```@example bayesnet
-parameters = K2GraphSearch([:Species, :SepalLength, :SepalWidth, :PetalLength, :PetalWidth],
+parameters = K2GraphSearch([:Species, :SepalLength, :SepalWidth, :PetalLength, :PetalWidth],
                            ConditionalLinearGaussianCPD,
                            max_n_parents=2)
 bn = fit(BayesNet, data, parameters)
@@ -300,7 +300,7 @@ Changing the ordering will change the structure.

 ```julia
 CLG = ConditionalLinearGaussianCPD
-parameters = K2GraphSearch([:Species, :PetalLength, :PetalWidth, :SepalLength, :SepalWidth],
+parameters = K2GraphSearch([:Species, :PetalLength, :PetalWidth, :SepalLength, :SepalWidth],
                            [StaticCPD{Categorical}, CLG, CLG, CLG, CLG],
                            max_n_parents=2)
 fit(BayesNet, data, parameters)
@@ -311,7 +311,7 @@ A `ScoringFunction` allows for extracting a scoring metric for a CPD given data.
 A `GraphSearchStrategy` defines a structure learning algorithm. The K2 algorithm is defined through `K2GraphSearch` and `GreedyHillClimbing` is implemented for discrete Bayesian networks and the Bayesian score:

 ```@example bayesnet
-data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3],
+data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3],
                  b=[1,1,1,2,2,2,2,1,1,2,1,1],
                  a=[1,1,1,2,1,1,2,1,1,2,1,1])
 parameters = GreedyHillClimbing(ScoreComponentCache(data), max_n_parents=3, prior=UniformPrior())
@@ -325,7 +325,7 @@ TikzPictures.save(SVG("plot9"), plot) # hide

 We can specify the number of categories for each variable in case it cannot be correctly inferred:

-```julia
+```@example bayesnet
 bn = fit(DiscreteBayesNet, data, parameters, ncategories=[3,3,2])
 ```

@@ -338,11 +338,17 @@ A whole suite of features are supported for DiscreteBayesNets. Here, we illustrate

 We also detail obtaining a bayesian score for a network structure in the next section.

-```julia
-count(bn, :a, data) # 1
-statistics(bn.dag, data) # 2
-table(bn, :b) # 3
-table(bn, :c, :a=>1) # 4
+```@example bayesnet
+count(bn, :a, data)
+```
+```@example bayesnet
+statistics(bn.dag, data)
+```
+```@example bayesnet
+table(bn, :b)
+```
+```@example bayesnet
+table(bn, :c, :a=>1)
 ```

 ## Reading from XDSL
@@ -363,12 +369,10 @@ TikzPictures.save(SVG("plot10"), plot) # hide
 The bayesian score for a discrete-valued BayesNet can can be calculated based only on the structure and data (the CPDs do not need to be defined beforehand). This is implemented with a method of ```bayesian_score``` that takes in a directed graph, the names of the nodes and data.

 ```@example bayesnet
-data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3],
+data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3],
                  b=[1,1,1,2,2,2,2,1,1,2,1,1],
                  a=[1,1,1,2,1,1,2,1,1,2,1,1])
 g = DAG(3)
 add_edge!(g,1,2); add_edge!(g,2,3); add_edge!(g,1,3)
 bayesian_score(g, [:a,:b,:c], data)
 ```
-
-
````
src/BayesNets.jl (3 additions, 1 deletion)

```diff
@@ -107,7 +107,9 @@ export
     adding_edge_preserves_acyclicity,
     bayesian_score_component,
     bayesian_score_components,
-    bayesian_score
+    bayesian_score,
+
+    nodenames


 include("bayes_nets.jl")
```
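The newly exported `nodenames` becomes callable by users of the package. A minimal sketch, assuming `nodenames(bn)` returns the node names of a network (the signature is inferred from the export above, not confirmed by this diff):

```julia
using BayesNets, Distributions

# build a small two-node discrete network, as in the package docs
bn = DiscreteBayesNet()
push!(bn, DiscreteCPD(:a, [0.3, 0.7]))
push!(bn, DiscreteCPD(:b, [:a], [2], [Categorical([0.5, 0.5]),
                                      Categorical([0.2, 0.8])]))

# assumed: returns the names of all nodes, here :a and :b
nodenames(bn)
```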

src/DiscreteBayesNet/discrete_bayes_net.jl (3 additions, 3 deletions)

```diff
@@ -71,15 +71,15 @@ function table(bn::DiscreteBayesNet, name::NodeName)
         d[!,name] = 1:ncategories(cpd(assignment))
     end

-    p = ones(size(d,1)) # the probability column
+    potential = ones(size(d,1)) # the probability column
     for i in 1:size(d,1)
         assignment = Assignment()
         for j in 1:length(varnames)
             assignment[varnames[j]] = d[i,j]
         end
-        p[i] = pdf(cpd, assignment)
+        potential[i] = pdf(cpd, assignment)
     end
-    d[!,:p] = p
+    d[!,:potential] = potential

     return Table(d)
 end
```
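After this rename, the `DataFrame` behind a `Table` carries its probabilities in a `:potential` column rather than `:p`. A minimal sketch of the new behavior, assuming a BayesNets build that includes this commit:

```julia
using BayesNets

bn = DiscreteBayesNet()
push!(bn, DiscreteCPD(:a, [0.3, 0.7]))

# table() returns a Table whose wrapped DataFrame (the .potential field)
# now has columns :a and :potential; callers that indexed :p must update
t = table(bn, :a)
t.potential[!, :potential]  # probabilities for :a = 1 and :a = 2
```

Downstream code that hard-coded the `:p` column name needs the same one-line rename.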

src/DiscreteBayesNet/io.jl (2 additions, 2 deletions)

```diff
@@ -157,8 +157,8 @@ function Base.write(io::IO, mime::MIME"text/plain", bn::DiscreteBayesNet)
     for name in arr_names
         cpd = get(bn, name)
         for D in cpd.distributions
-            for p in probs(D)[1:end-1]
-                str = @sprintf("%.16g", p)
+            for potential in probs(D)[1:end-1]
+                str = @sprintf("%.16g", potential)
                 print(io, space ? " " : "" , str)
                 space = true
             end
```

src/DiscreteBayesNet/tables.jl (17 additions, 17 deletions)

```diff
@@ -2,7 +2,7 @@
 DataFrames are used to represent factors
 https://en.wikipedia.org/wiki/Factor_graph

-:p is the column containing the probabilities, ::Float64
+:potential is the column containing the probabilities, ::Float64
 Each variable has its own column corresponding to its assignments and named with its name

 These can be obtained using the table() function
@@ -34,16 +34,16 @@ function Base.:*(t1::Table, t2::Table)
     f1 = t1.potential
     f2 = t2.potential

-    onnames = setdiff(intersect(propertynames(f1), propertynames(f2)), [:p])
-    finalnames = vcat(setdiff(union(propertynames(f1), propertynames(f2)), [:p]), :p)
+    onnames = setdiff(intersect(propertynames(f1), propertynames(f2)), [:potential])
+    finalnames = vcat(setdiff(union(propertynames(f1), propertynames(f2)), [:potential]), :potential)

     if isempty(onnames)
         j = join(f1, f2, kind=:cross, makeunique=true)
     else
         j = outerjoin(f1, f2, on=onnames, makeunique=true)
     end

-    j[!,:p] = broadcast(*, j[!,:p], j[!,:p_1])
+    j[!,:potential] = broadcast(*, j[!,:potential], j[!,:potential_1])

     return Table(j[!,finalnames])
 end
@@ -57,25 +57,25 @@ function sumout(t::Table, v::NodeNameUnion)
     f = t.potential

     # vcat works for single values and vectors alike (magic?)
-    remainingvars = setdiff(propertynames(f), vcat(v, :p))
+    remainingvars = setdiff(propertynames(f), vcat(v, :potential))

     if isempty(remainingvars)
         # they want to remove all variables except for prob column
         # uh ... 'singleton' table?
-        return Table(DataFrame(p = sum(f[!,:p])))
+        return Table(DataFrame(potential = sum(f[!,:potential])))
     else
         # note that this will fail miserably if f is too large (~1E4 maybe?)
         # nothing I can do; there is a github issue
-        return Table(combine(df -> DataFrame(p = sum(df[!,:p])), DataFrames.groupby(f, remainingvars)))
+        return Table(combine(df -> DataFrame(potential = sum(df[!,:potential])), DataFrames.groupby(f, remainingvars)))
     end
 end

 """
 Table normalization
-Ensures that the `:p` column sums to one
+Ensures that the `:potential` column sums to one
 """
 function LinearAlgebra.normalize!(t::Table)
-    t.potential[!,:p] ./= sum(t.potential[!,:p])
+    t.potential[!,:potential] ./= sum(t.potential[!,:potential])

     return t
 end
@@ -103,23 +103,23 @@ end

 """
 takes a list of observations of assignments represented as a DataFrame
-or a set of data samples (without :p),
+or a set of data samples (without :potential),
 takes the unique assignments,
 and estimates the associated probability of each assignment
 based on its frequency of occurrence.
 """
 function Distributions.fit(::Type{Table}, f::DataFrame)
     w = ones(size(f, 1))
     t = f
-    if hasproperty(f, :p)
-        t = f[:, propertynames(t) .!= :p]
-        w = f[!,:p]
+    if hasproperty(f, :potential)
+        t = f[:, propertynames(t) .!= :potential]
+        w = f[!,:potential]
     end
     # unique samples
     tu = unique(t)
     # add column with probabilities of unique samples
-    tu[!,:p] = Float64[sum(w[Bool[tu[j,:] == t[i,:] for i = 1:size(t,1)]]) for j = 1:size(tu,1)]
-    tu[!,:p] /= sum(tu[!,:p])
+    tu[!,:potential] = Float64[sum(w[Bool[tu[j,:] == t[i,:] for i = 1:size(t,1)]]) for j = 1:size(tu,1)]
+    tu[!,:potential] /= sum(tu[!,:potential])

     return Table(tu)
 end
@@ -133,8 +133,8 @@ end
 #     n = size(f, 1)
 #     p = zeros(n)
 #     w = ones(n)
-#     if hasproperty(f, :p)
-#         w = f[!,:p]
+#     if hasproperty(f, :potential)
+#         w = f[!,:potential]
 #     end
 #
 #     dfindex = find([hasproperty(a, n) for n in names(f)])
```
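Consumers of the factor operations see the same rename. A minimal sketch of factor multiplication and normalization over the renamed column, assuming a BayesNets build that includes this commit:

```julia
using BayesNets, DataFrames
import LinearAlgebra

# two factors over the shared variable :a, with probabilities in :potential
t1 = Table(DataFrame(a = [1, 2], potential = [0.4, 0.6]))
t2 = Table(DataFrame(a = [1, 2], potential = [0.5, 0.5]))

t = t1 * t2                  # joins on :a, multiplies the :potential columns
LinearAlgebra.normalize!(t)  # rescales :potential to sum to one
t.potential[!, :potential]   # back to ≈ [0.4, 0.6] after normalization
```

Any downstream package that built `DataFrame(p = ...)` factors by hand must construct them with a `potential` column instead.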

0 commit comments