Skip to content

Conversation

@vandenman
Copy link
Collaborator

Profiling shows that the majority of the time is spend in the gradient computation, which is to be expected. Sadly I wasn't able to improve this much for this PR.

I think that in principle a lot more optimization is possible, but that would require much more changes to the code. Basically, instead of separate log_p and gradient functions we'd need a class. That way, things like total_len can be easily computed once and cached, and the gradient vector grad can also be reused. In general, it should make memory reuse easier.

For now, this PR implements:

  • fewer allocations in bgmCompare
  • replace strings with enums, because comparing strings is more expensive than ints.
  • small tweak to the progress bar: the finish() method ensures that all progress bars are completely filled at the end of the loop. Now it no longer does this if there was a user interrupt.

Let's see if the R CMD CHECK runtime decreases (I doubt it though).

@MaartenMarsman
Copy link
Collaborator

I ran the following tests on the code and spotted differences in the bgmCompare output with and without these updates.

#run this with old software
fit_bgm = bgm(Wenchuan[, 1:5], seed = 1234)
x = Boredom[Boredom$language == "fr", 2:6][1:50,]
y = Boredom[Boredom$language != "fr", 2:6][1:50,]
z = Boredom[Boredom$language != "fr", 2:6][51:90, ]
group = c(rep(1, 50), rep(2, 50), rep(3, 40))
x = rbind(x,y,z)
fit_compare <- bgmCompare(x, group = group, seed = 1234)

#run this with new software
fit_bgm2 = bgm(Wenchuan[, 1:5], seed = 1234)
x = Boredom[Boredom$language == "fr", 2:6][1:50,]
y = Boredom[Boredom$language != "fr", 2:6][1:50,]
z = Boredom[Boredom$language != "fr", 2:6][51:90, ]
group = c(rep(1, 50), rep(2, 50), rep(3, 40))
x = rbind(x,y,z)
fit_compare2 <- bgmCompare(x, group = group, seed = 1234)

#tests

all.equal(fit_bgm$raw_samples, fit_bgm2$raw_samples)
[1] TRUE

all.equal(fit_compare$raw_samples, fit_compare2$raw_samples)
[1] "Component “main”: Component 1: Mean relative difference: 0.1814449"
[2] "Component “main”: Component 2: Mean relative difference: 0.1823261"
[3] "Component “main”: Component 3: Mean relative difference: 0.1826248"
[4] "Component “main”: Component 4: Mean relative difference: 0.1847298"
[5] "Component “pairwise”: Component 1: Mean relative difference: 0.4530948"
[6] "Component “pairwise”: Component 2: Mean relative difference: 0.4708279"
[7] "Component “pairwise”: Component 3: Mean relative difference: 0.4671137"
[8] "Component “pairwise”: Component 4: Mean relative difference: 0.4657075"
[9] "Component “indicator”: Component 1: Mean relative difference: 1.882105"
[10] "Component “indicator”: Component 2: Mean relative difference: 2.010225"
[11] "Component “indicator”: Component 3: Mean relative difference: 1.995624"
[12] "Component “indicator”: Component 4: Mean relative difference: 1.880952"

@vandenman
Copy link
Collaborator Author

I'll double check where the change comes from! Also Windows is now down to 1h 20m from 2h 30m 🎉

@MaartenMarsman MaartenMarsman merged commit e98edc2 into Bayesian-Graphical-Modelling-Lab:main Sep 26, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants